Talos Vulnerability Report

TALOS-2017-0285

AntennaHouse DMC HTMLFilter UnCompressUnicode Code Execution Vulnerability

May 4, 2017
CVE Number

CVE-2017-2793

Summary

An exploitable heap corruption vulnerability exists in the UnCompressUnicode functionality of AntennaHouse DMC HTMLFilter used by MarkLogic 8.0-6. A specially crafted XLS file can cause a heap corruption resulting in arbitrary code execution. An attacker can send or provide a malicious XLS file to trigger this vulnerability.

Tested Versions

AntennaHouse DMC HTMLFilter shipped with MarkLogic 8.0-6

fb1a22fa08c986ec3614284f4e912b0a  /opt/MarkLogic/Converters/cvtofc/libdhf_rdoc.so
15b0acc464fba28335239f722a62037f  /opt/MarkLogic/Converters/cvtofc/libdmc_comm.so
1eabb31236c675f9856a7d001b339334  /opt/MarkLogic/Converters/cvtofc/libdhf_rxls.so
1415cbc784f05db0e9db424636df581a  /opt/MarkLogic/Converters/cvtofc/libdhf_comm.so
4ae366fbd4540dd4c750e6679eb63dd4  /opt/MarkLogic/Converters/cvtofc/libdmc_conf.so
81db1b55e18a0cb70a78410147f50b9c  /opt/MarkLogic/Converters/cvtofc/libdhf_htmlif.so
d716dd77c8e9ee88df435e74fad687e6  /opt/MarkLogic/Converters/cvtofc/libdhf_whtml.so
e01d37392e2b2cea757a52ddb7873515  /opt/MarkLogic/Converters/cvtofc/convert

Product URLs

https://www.antennahouse.com/antenna1/

CVSSv3 Score

8.3 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:C/C:H/I:H/A:H

Details

This vulnerability is present in the AntennaHouse DMC HTMLFilter, which is used, among other things, to convert XLS files to (X)HTML form.
This product is mainly used by MarkLogic for XLS document conversions as part of their web-based document search and rendering engine. A specially crafted XLS file can lead to heap corruption and ultimately to remote code execution.

Let's investigate this vulnerability. Running the XLS to HTML converter under Valgrind with a malformed XLS file as input produces the following output:

~/bugs/cvtofc_86$ valgrind ./convert config_xls
==46749== Memcheck, a memory error detector
==46749== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==46749== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==46749== Command: ./convert config_xls
==46749== 
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
==46749== Source and destination overlap in strcpy(0x43e178d, 0x43e178d)
==46749==    at 0x402D56F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==46749==    by 0x635186F: DHF_WOpen (in /home/icewall/bugs/cvtofc_86/libdhf_whtml.so)
==46749==    by 0x4039779: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==46749== 
==46749== Invalid write of size 1
==46749==    at 0x40409B3: UnCompressUnicode (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749==    by 0x4045223: String (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749==    by 0x40532B8: DHF_RGetObject (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749==    by 0x403979E: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==46749==  Address 0x5429bb2 is 0 bytes after a block of size 65,538 alloc'd
==46749==    at 0x402C109: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==46749==    by 0x42DCD24: DMC_calloc (in /home/icewall/bugs/cvtofc_86/libdmc_comm.so)
==46749==    by 0x4041002: InitMem (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749==    by 0x40526BB: DHF_ROpen (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==46749==    by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==46749==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)

The out-of-bounds write occurs in the UnCompressUnicode function, but the buffer that is overflowed is allocated in InitMem and has a size of 65538 bytes. Let's examine the InitMem function:

signed int __cdecl InitMem(int a1)
{
  _DWORD *v1;
  int v2;

  v1 = *(_DWORD **)(a1 + 16164);
  v1[74] = DMC_calloc(65538, 1);
  v1[75] = DMC_calloc(65538, 1);
  v1[76] = DMC_calloc(65538, 1);
  v2 = DMC_calloc(256, 728);
  *(_DWORD *)(a1 + 9236) = v2;
  if ( !v1[76] || !v1[75] || !v1[74] || !v2 )
    return 12;
  GetEndian(v1);
  return 0;
}

As we can see, InitMem makes three allocations of the aforementioned size. The call stack at the point of the overflow shows that the operations involved relate to a string object. Things become clearer when we look at the following code from the DHF_RGetObject function:

    else
    {
      if ( v12 != 229 )
        goto LABEL_156;
      v13 = MergeCellRec(v3);
    }
    goto LABEL_155;
  }
  if ( v12 == 520 )
  {
    v13 = RowRec(v3);
    goto LABEL_155;
  }
  if ( v12 <= 520 )
  {
    if ( v12 == 517 )
    {
      v13 = BoolerrRec(v3, *(_BYTE *)(*(_DWORD *)(v3 + 296) + 6), *(_BYTE *)(*(_DWORD *)(v3 + 296) + 7));
    }
    else if ( v12 > 517 )
    {
      if ( v12 != 519 )
        goto LABEL_156;
      String();
    }
    else
    {
      if ( v12 != 515 )
        goto LABEL_156;

      v13 = NumberRec(a1, (struct_a2 *)v3);
    }

    v2 = v13;
    goto LABEL_156;
  }

This code calls the appropriate record handler based on the XLS record type. The buggy code path therefore parses an XLS String record, type 519 (0x207), documented in section 2.4.268 of the Excel Binary File Format: https://msdn.microsoft.com/en-us/library/dd923608(v=office.12).aspx. Further investigation reveals that the malformed String record is located at offset 0x11ED:

0x11ED :    07 02 07 00 03 F2 00 31 2F 34 22

0x207   - record type
0x7     - record length
0x03... - Data
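The breakdown above can be reproduced with a few lines of Python. This is only a sketch and not part of the original analysis: the helper name parse_biff_record is hypothetical, and it assumes the little-endian 2-byte type and 2-byte length header implied by the dump.

```python
import struct

# Bytes of the malformed record as shown in the dump at offset 0x11ED.
RECORD = bytes.fromhex("07 02 07 00 03 F2 00 31 2F 34 22")


def parse_biff_record(buf):
    """Split a record into (type, length, data).

    Hypothetical helper: assumes a little-endian 2-byte record type
    followed by a little-endian 2-byte payload length.
    """
    rec_type, rec_len = struct.unpack_from("<HH", buf, 0)
    data = buf[4:4 + rec_len]
    return rec_type, rec_len, data


rec_type, rec_len, data = parse_biff_record(RECORD)
cch = struct.unpack_from("<H", data, 0)[0]  # character count claimed by the record
f_high_byte = data[2] & 1                   # lowest bit of the flags byte

print(hex(rec_type), rec_len, cch, f_high_byte)
```

Note that the 7-byte payload leaves room for only 4 bytes of character data after the 3-byte header, far fewer than the claimed character count.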

According to the documentation:

string (variable): An XLUnicodeString structure that specifies the string value of a formula (section 2.2.2). The value of string.cch MUST be less than or equal to 32767.

The Data included in the record is described by section 2.5.294 XLUnicodeString (https://msdn.microsoft.com/en-us/library/dd922754(v=office.12).aspx). The most important information from there:

cch (2 bytes): An unsigned integer that specifies the count of CHARACTERS in the string.
A - fHighByte (1 bit): A bit that specifies whether the characters in rgb are double-byte characters. MUST be a value from the following table:
    Value   Meaning
    0x0     All the characters in the string have a high byte of 0x00 and only the low bytes are in rgb.
    0x1     All the characters in the string are saved as double-byte characters in rgb.
rgb (variable): An array of bytes that specifies the characters. If fHighByte is 0x0, the size of the array MUST be equal to cch. If fHighByte is 0x1, the size of the array MUST be equal to cch*2.
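The constraints quoted above translate directly into a small validity check. The following is a minimal sketch under stated assumptions: check_xl_unicode_string is a hypothetical name, the payload is assumed to start with cch followed by a single flags byte, and the 32767 limit is the one quoted for the String record. It does not describe how the converter itself is or should be implemented.

```python
import struct

MAX_FORMULA_CCH = 32767  # limit quoted above for the String record


def check_xl_unicode_string(data):
    """Return a list of problems found in an XLUnicodeString payload.

    Hypothetical validator: `data` holds cch (2 bytes), the flags byte,
    and rgb, as described in the quoted structure.
    """
    if len(data) < 3:
        return ["payload shorter than the 3-byte cch + flags header"]
    problems = []
    cch = struct.unpack_from("<H", data, 0)[0]
    f_high_byte = data[2] & 1
    expected_rgb = cch * 2 if f_high_byte else cch
    actual_rgb = len(data) - 3
    if cch > MAX_FORMULA_CCH:
        problems.append(f"cch {cch} exceeds {MAX_FORMULA_CCH}")
    if actual_rgb != expected_rgb:
        problems.append(f"rgb holds {actual_rgb} bytes, expected {expected_rgb}")
    return problems


# Payload of the malformed record from the sample file.
bad = bytes.fromhex("03 F2 00 31 2F 34 22")
# A well-formed single-byte string "abc" for comparison.
good = struct.pack("<HB", 3, 0) + b"abc"

print(check_xl_unicode_string(bad))
print(check_xl_unicode_string(good))
```

The malformed payload fails both checks, while the comparison string passes with an empty problem list.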

The check for the A bit (fHighByte) appears at line 15 of the String function. In our case the bit is 0x0, so the else branch is taken and UnCompressUnicode is called.

Line 1  signed int __usercall String(int a1@<ebp>)
Line 2  {
Line 3    struct_v1 *v1;
Line 4    int v3;
Line 5    size_t v4;
Line 6 
Line 7    v1 = *(struct_v1 **)(a1 + 8);
Line 8    *(_DWORD *)(a1 - 16) = (unsigned __int16)Exc_GetWord(v1, v1->dword128);
Line 9    *(_DWORD *)(a1 - 16) *= 2;
Line 10   *(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 4) = DMC_malloc(*(_DWORD *)(a1 - 16));
Line 11   if ( !*(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 4) )
Line 12     return 12;
Line 13   memset(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), 0, *(_DWORD *)(a1 - 16));
Line 14   v3 = v1->dword128;
Line 15   if ( *(_BYTE *)(v3 + 2) & 1 )
Line 16   {
Line 17     memcpy(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), (const void *)(v3 + 3), *(_DWORD *)(a1 - 16));
Line 18   }
Line 19   else
Line 20   {
Line 21     v4 = *(_DWORD *)(a1 - 16) >> 1;
Line 22     memcpy(v1->pvoid12C, (const void *)(v1->dword128 + 3), v4);
Line 23     v1->dword134 = v4;
Line 24     UnCompressUnicode((int)v1);
Line 25     memcpy(*(void **)(v1->dword9A0 + 24 * v1->dword9A4 - 4), v1->pvoid130, *(_DWORD *)(a1 - 16));
Line 26   }
Line 27   *(_DWORD *)(v1->dword9A0 + 24 * v1->dword9A4 - 8) = *(_DWORD *)(a1 - 16);
Line 28   return 0;
Line 29 }

The String function assumes that cch, the number of characters, does not exceed 32767, but performs no check. In our case the value is:

>>> 0xf203
61955

This is nearly twice the limit. The UnCompressUnicode function performs no additional checks:

char *__cdecl UnCompressUnicode(struct_a1_2 *stringObj)
{
  int i;
  char *result;

  for ( i = 0; i < stringObj->cch; ++i )
  {
    stringObj->globalInitMemPtr[2 * i] = stringObj->rawPtr[i];
    result = stringObj->globalInitMemPtr;
    result[2 * i + 1] = 0;
  }
  return result;
}

As we can see, the Unicode conversion expands every input character to two bytes, so with cch = 61955 the loop writes 123,910 bytes into the 65,538-byte buffer allocated by InitMem. The resulting heap corruption can lead to arbitrary code execution.
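The arithmetic behind the overflow can be replayed without the vulnerable library. The sketch below is illustrative only: first_oob_write and max_safe_cch are hypothetical helper names, and it assumes the destination is the 65538-byte InitMem buffer reported by Valgrind and that cch is the 0xF203 value read from the record.

```python
BUF_SIZE = 65538   # size passed to DMC_calloc in InitMem
CCH = 0xF203       # character count taken from the malformed record


def first_oob_write(cch, buf_size):
    """Replay the index arithmetic of the UnCompressUnicode loop.

    Returns (i, offset) for the first write that falls outside a
    destination of buf_size bytes, or None if every write fits.
    """
    for i in range(cch):
        for offset in (2 * i, 2 * i + 1):  # data byte, then the zero byte
            if offset >= buf_size:
                return i, offset
    return None


def max_safe_cch(buf_size):
    """Largest cch whose expanded form (2 bytes per character) still fits."""
    return buf_size // 2


print(first_oob_write(CCH, BUF_SIZE))   # first out-of-bounds (i, offset)
print(2 * CCH - BUF_SIZE)               # bytes written past the end of the buffer
print(max_safe_cch(BUF_SIZE))           # largest cch that would not overflow
```

Under these assumptions the first out-of-bounds write lands at offset 65538, which matches the Valgrind report of a write 0 bytes after the 65,538-byte block. Since a string of 32767 characters expands to 65534 bytes, enforcing the documented limit on cch before calling UnCompressUnicode would presumably be enough to prevent this particular overflow.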

Crash Information

Starting program: /home/icewall/bugs/cvtofc_86/convert config_xls
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
*** Error in `/home/icewall/bugs/cvtofc_86/convert': double free or corruption (!prev): 0x080b8720 ***

Program received signal SIGABRT, Aborted.
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0xe37b 
ECX: 0xe37b 
EDX: 0x6 
ESI: 0x67 ('g')
EDI: 0xf7f06000 --> 0x1aada8 
EBP: 0xfffec7e8 --> 0x80c8720 --> 0x0 
ESP: 0xfffec524 --> 0xfffec7e8 --> 0x80c8720 --> 0x0 
EIP: 0xf7fdacd9 (pop    ebp)
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf7fdacd3:  mov    ebp,esp
   0xf7fdacd5:  sysenter 
   0xf7fdacd7:  int    0x80
=> 0xf7fdacd9:  pop    ebp
   0xf7fdacda:  pop    edx
   0xf7fdacdb:  pop    ecx
   0xf7fdacdc:  ret    
   0xf7fdacdd:  and    edi,edx
[------------------------------------stack-------------------------------------]
0000| 0xfffec524 --> 0xfffec7e8 --> 0x80c8720 --> 0x0 
0004| 0xfffec528 --> 0x6 
0008| 0xfffec52c --> 0xe37b 
0012| 0xfffec530 --> 0xf7d89687 (xchg   ebx,edi)
0016| 0xfffec534 --> 0xf7f06000 --> 0x1aada8 
0020| 0xfffec538 --> 0xfffec5d4 --> 0x50 ('P')
0024| 0xfffec53c --> 0xf7d8cab3 (mov    edx,DWORD PTR gs:0x8)
0028| 0xfffec540 --> 0x6 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGABRT
0xf7fdacd9 in ?? ()
gdb-peda$ exploitable
Description: Heap error
Short description: HeapError (15/29)
Hash: e2d1b7f3a507d7d332ec76b419bab576.275feecd5f318bb3faa395083dd35f75
Exploitability Classification: EXPLOITABLE
Explanation: The target's backtrace indicates that libc has detected a heap error or that the target was executing a heap function when it stopped. This could be due to heap corruption, passing a bad pointer to a heap function such as free(), etc. Since heap errors might include buffer overflows, use-after-free situations, etc. they are generally considered exploitable.
Other tags: AbortSignal (27/29)

Timeline

2017-02-09 - Vendor Disclosure
2017-05-04 - Public Release

Credit

Discovered by Marcin 'Icewall' Noga of Cisco Talos.