Talos Vulnerability Report

TALOS-2017-0279

AntennaHouse DMC HTMLFilter FillRowFormat Code Execution Vulnerability

May 4, 2017

CVE Number

CVE-2017-2783

Summary

An exploitable heap corruption vulnerability exists in the FillRowFormat functionality of AntennaHouse DMC HTMLFilter as shipped with MarkLogic 8.0-6. A specially crafted XLS file can cause heap corruption resulting in arbitrary code execution. An attacker can send or otherwise provide a malicious XLS file to trigger this vulnerability.

Tested Versions

AntennaHouse DMC HTMLFilter shipped with MarkLogic 8.0-6

fb1a22fa08c986ec3614284f4e912b0a  /opt/MarkLogic/Converters/cvtofc/libdhf_rdoc.so
15b0acc464fba28335239f722a62037f  /opt/MarkLogic/Converters/cvtofc/libdmc_comm.so
1eabb31236c675f9856a7d001b339334  /opt/MarkLogic/Converters/cvtofc/libdhf_rxls.so
1415cbc784f05db0e9db424636df581a  /opt/MarkLogic/Converters/cvtofc/libdhf_comm.so
4ae366fbd4540dd4c750e6679eb63dd4  /opt/MarkLogic/Converters/cvtofc/libdmc_conf.so
81db1b55e18a0cb70a78410147f50b9c  /opt/MarkLogic/Converters/cvtofc/libdhf_htmlif.so
d716dd77c8e9ee88df435e74fad687e6  /opt/MarkLogic/Converters/cvtofc/libdhf_whtml.so
e01d37392e2b2cea757a52ddb7873515  /opt/MarkLogic/Converters/cvtofc/convert

Product URLs

https://www.antennahouse.com/antenna1/

CVSSv3 Score

8.3 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:C/C:H/I:H/A:H

CVSSv3 Calculator: https://www.first.org/cvss/calculator/3.0

Details

This vulnerability is present in the AntennaHouse DMC HTMLFilter, which is used, among other things, to convert XLS files to (X)HTML.
MarkLogic uses this product mainly for XLS document conversion as part of its web-based document search and rendering engine. A specially crafted XLS file can lead to heap corruption and ultimately to remote code execution.

Let's investigate this vulnerability. After running the XLS-to-HTML converter with a malformed XLS file as input, we can easily observe the following problem using Valgrind:

[email protected]:~/bugs/cvtofc_86$ valgrind ./convert config_xls/
==30152== Memcheck, a memory error detector
==30152== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==30152== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==30152== Command: ./convert config_xls/
==30152== 
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
==30152== Source and destination overlap in strcpy(0x43e178d, 0x43e178d)
==30152==    at 0x402D56F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x635186F: DHF_WOpen (in /home/icewall/bugs/cvtofc_86/libdhf_whtml.so)
==30152==    by 0x4039779: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==30152== 
==30152== Invalid write of size 4
==30152==    at 0x403087D: memset (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x404E896: FillRowFormat (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x4052633: FillCell (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x405327E: DHF_RGetObject (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x403979E: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==30152==  Address 0x57af9c8 is 0 bytes after a block of size 186,368 alloc'd
==30152==    at 0x402C109: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x42DCD24: DMC_calloc (in /home/icewall/bugs/cvtofc_86/libdmc_comm.so)
==30152==    by 0x404101A: InitMem (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x40526BB: DHF_ROpen (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)

We see that an out-of-bounds write occurs in the function FillRowFormat and that space for the overflowed buffer is allocated in InitMem. Let's first investigate the allocation site:

Line 1  signed int __cdecl InitMem(int a1)
Line 2  {
Line 3    _DWORD *v1;
Line 4    int v2;
Line 5 
Line 6    v1 = *(_DWORD **)(a1 + 16164);
Line 7    v1[74] = DMC_calloc(65538, 1);
Line 8    v1[75] = DMC_calloc(65538, 1);
Line 9    v1[76] = DMC_calloc(65538, 1);
Line 10   v2 = DMC_calloc(256, 728);
Line 11   *(_DWORD *)(a1 + 9236) = v2;
Line 12   if ( !v1[76] || !v1[75] || !v1[74] || !v2 )
Line 13     return 12;
Line 14   GetEndian(v1);
Line 15   return 0;
Line 16 }

The buffer overflowed in FillRowFormat is allocated at line 10. The allocation size, 256 * 728 = 186368, equals the block size reported by Valgrind. An important fact to note is that this size is fixed. Switching to the place where the overflow occurs, we see the following:

Line 1  int __cdecl FillRowFormat(struct_a1 *a1)
Line 2  {
Line 3  (...)
Line 4    v207 = -1;
Line 5    v208 = -1;
Line 6    v197 = a1->dword3F24;
Line 7    qmemcpy(&v225, (const void *)(*(_DWORD *)(v197 + 1956) + (*(_DWORD *)(v197 + 1968) << 6) - 64), 0x40u);
Line 8    memset(a1->pvoid2414, 0, 728 * (HIWORD(v226) - (unsigned __int16)v226));
Line 9  (...)

and information from the debugger:

[----------------------------------registers-----------------------------------]
EAX: 0x1a0d 
EBX: 0xf774ec24 --> 0x1fb2c 
ECX: 0x0 
EDX: 0x8aa9f5c --> 0x0 
ESI: 0x8b1f6c8 --> 0x8aa1870 --> 0x750054 ('T')
EDI: 0xffa185e0 --> 0xf7682420 --> 0x0 
EBP: 0xffa185f8 --> 0xffa18688 --> 0xffa186b8 --> 0xffa186e8 --> 0xffa18ac8 --> 0xffa28e38 --> 0x0 
ESP: 0xffa184b0 --> 0x8adc728 --> 0x0 
EIP: 0xf7742892 --> 0xfefb8de8
EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf7742887:  push   0x0
   0xf7742889:  mov    edx,DWORD PTR [ebp-0x9c]
   0xf774288f:  push   DWORD PTR [edx+0x18]
=> 0xf7742892:  call   0xf7732424 <memset@plt>
   0xf7742897:  add    esp,0x10
   0xf774289a:  mov    BYTE PTR [ebp-0x59],0x0
   0xf774289e:  mov    WORD PTR [ebp-0x64],0x0
   0xf77428a4:  mov    ecx,DWORD PTR [ebp-0x98]
Guessed arguments:
arg[0]: 0x8adc728 --> 0x0 
arg[1]: 0x0 
arg[2]: 0x5b2d8 

At line 8, the size argument of memset is the constant 728 multiplied by the difference of two WORD fields. In this case the size equals 0x5b2d8, which is much larger than the space allocated for this buffer (186368 == 0x2d800).

Looking for the initialization of these fields, we land here:

(rr) rni
$100 = 0x489a
[----------------------------------registers-----------------------------------]
EAX: 0x200 
EBX: 0xf774ec24 --> 0x1fb2c 
ECX: 0x8b3b0b0 --> 0x0 
EDX: 0x8b1f688 --> 0x8aa5d88 --> 0x750054 ('T')
ESI: 0x8aabd10 --> 0x0 
EDI: 0x8b1f688 --> 0x8aa5d88 --> 0x750054 ('T')
EBP: 0xffa18688 --> 0xffa186b8 --> 0xffa186e8 --> 0xffa18ac8 --> 0xffa28e38 --> 0x0 
ESP: 0xffa18660 --> 0x8aa000a --> 0x0 
EIP: 0xf773a8e9 --> 0x42896640
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf773a8de:  mov    ax,WORD PTR [ecx+eax*8+0x4]
   0xf773a8e3:  cmp    ax,WORD PTR [edx+0x1e]
   0xf773a8e7:  jb     0xf773a8ee
=> 0xf773a8e9:  inc    eax
   0xf773a8ea:  mov    WORD PTR [edx+0x1e],ax
   0xf773a8ee:  sub    esp,0x8
   0xf773a8f1:  mov    eax,DWORD PTR [esi+0x9a4]
   0xf773a8f7:  mov    edx,DWORD PTR [esi+0x9a0]
[------------------------------------stack-------------------------------------]
0000| 0xffa18660 --> 0x8aa000a --> 0x0 
0004| 0xffa18664 --> 0x8aac716 --> 0x1e1a 
0008| 0xffa18668 --> 0x6 
0012| 0xffa1866c --> 0xf773a10a --> 0x1ac3815b 
0016| 0xffa18670 --> 0xb ('\x0b')
0020| 0xffa18674 --> 0x6 
0024| 0xffa18678 --> 0xb ('\x0b')
0028| 0xffa1867c --> 0xf774ec24 --> 0x1fb2c 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Warning: not running or target is remote
0xf773a8e7 in NumberRec () from ./libdhf_rxls.so
(rr) bt
#0  0xf773a8e7 in NumberRec () from ./libdhf_rxls.so
#1  0xf77472c5 in DHF_RGetObject () from ./libdhf_rxls.so
#2  0xf775179f in FilterToHtml () from ./libdhf_htmlif.so
#3  0xf7750afc in DHF_GetHtml_V11 () from ./libdhf_htmlif.so
#4  0x08049af8 in main ()
#5  0xf74f0af3 in __libc_start_main (main=0x8049730 <main>, argc=0x2, argv=0xffa28ed4, init=0x8049f70 <__libc_csu_init>, fini=0x8049f60 <__libc_csu_fini>, rtld_fini=0xf7793160 <_dl_fini>, 
    stack_end=0xffa28ecc) at libc-start.c:287
#6  0x08048ad1 in _start ()

The value we are interested in above is the register eax, which equals 0x200. Looking at the pseudocode of the NumberRec function, we see the following:

Line 1 signed int __usercall NumberRec@<eax>(long double a1@<st0>, struct_a2 *a2)
Line 2 {
Line 3 (...)
Line 4   *((_DWORD *)a2->pvoid9A0 + 6 * a2->dword9A4 + 2) = (unsigned __int16)Exc_GetWord(a2, a2->dword128);
Line 5   *((_WORD *)a2->pvoid9A0 + 12 * a2->dword9A4 + 2) = Exc_GetWord(a2, a2->dword128 + 2);//XXX
Line 6   *((_WORD *)a2->pvoid9A0 + 12 * a2->dword9A4 + 3) = Exc_GetWord(a2, a2->dword128 + 4);
Line 7   *((_DWORD *)a2->pvoid9A0 + 6 * a2->dword9A4 + 3) = 0xFFFF;
Line 8   *((_BYTE *)a2->pvoid9A0 + 24 * a2->dword9A4) = 1;
Line 9   v5 = a2->dword7A4 + (a2->dword7B0 << 6) - 64;
Line 10  v6 = *((_DWORD *)a2->pvoid9A0 + 6 * a2->dword9A4 + 2);
Line 11  if ( v6 >= *(_DWORD *)(v5 + 24) )
Line 12    *(_DWORD *)(a2->dword7A4 + (a2->dword7B0 << 6) - 64 + 24) = v6 + 1;
Line 13  v7 = a2->dword7A4 + (a2->dword7B0 << 6) - 64;
Line 14  v8 = *((_WORD *)a2->pvoid9A0 + 12 * a2->dword9A4 + 2); //XXX
Line 15  if ( v8 >= *(_WORD *)(v7 + 30) )
Line 16    *(_WORD *)(a2->dword7A4 + (a2->dword7B0 << 6) - 64 + 30) = v8 + 1;//XXX

The 0x200 value is handled on the lines marked with an //XXX comment. At line 5 the value is read directly from the file via the Exc_GetWord function, and at line 16 it is stored, incremented by one. During this entire procedure the value is checked only once, at line 15, where it is compared against an existing field. There is no check against an upper limit. From the code above we now know that the value used as the memset size argument is read almost directly from the file. Looking for this value in our PoC file, we find it at offset 0x3155. The entire record looks as follows:

0x3150:     7E 02 0A 00 1B 00 00 02 1A 00 1A 1E 00 00

which is: type: 0x27E, Len: 0xA, Data: 0x1B ...

According to the documentation (https://msdn.microsoft.com/en-us/library/office/cc313154(v=office.12).aspx), a record with type 0x27E is an RK record (https://msdn.microsoft.com/en-us/library/office/dd907549%28v=office.14%29.aspx?f=255&MSPPError=-2147217396). The 0x200 value is col (2 bytes): a Col structure that specifies a column index. Its description reads:

`
An unsigned integer that specifies the zero-based column index of the column in the
sheet that contains this structure. MUST be greater than or equal to the colMic field of the
Dimensions record of the sheet that contains this structure and MUST be less than the colMac
field of the Dimensions record of the sheet that contains this structure. MUST be less than or equal
to 0x00FF.
`

We saw in the previous analysis that there was a check to see whether col is bigger than colMic, but that there was no check to ensure that col does not exceed colMac or 0x00FF, which led to the overflow.

Crash Information

[email protected]:~/bugs/cvtofc_86$ valgrind ./convert config_xls/
==30152== Memcheck, a memory error detector
==30152== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==30152== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==30152== Command: ./convert config_xls/
==30152== 
input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
==30152== Source and destination overlap in strcpy(0x43e178d, 0x43e178d)
==30152==    at 0x402D56F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x635186F: DHF_WOpen (in /home/icewall/bugs/cvtofc_86/libdhf_whtml.so)
==30152==    by 0x4039779: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==30152== 
==30152== Invalid write of size 4
==30152==    at 0x403087D: memset (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x404E896: FillRowFormat (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x4052633: FillCell (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x405327E: DHF_RGetObject (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x403979E: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)
==30152==  Address 0x57af9c8 is 0 bytes after a block of size 186,368 alloc'd
==30152==    at 0x402C109: calloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==30152==    by 0x42DCD24: DMC_calloc (in /home/icewall/bugs/cvtofc_86/libdmc_comm.so)
==30152==    by 0x404101A: InitMem (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x40526BB: DHF_ROpen (in /home/icewall/bugs/cvtofc_86/libdhf_rxls.so)
==30152==    by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc_86/libdhf_htmlif.so)
==30152==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc_86/convert)

input=/home/icewall/bugs/cvtofc_86/config_xls/toconv.xls
output=/home/icewall/bugs/cvtofc_86/config_xls/conv.html
type=2
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x800 
ECX: 0xb6b6 
EDX: 0x0 
ESI: 0xdffd8dc0 --> 0xdffdcfe4 --> 0x750054 ('T')
EDI: 0xe0289000 --> 0x0 
EBP: 0xfffec7f8 --> 0xfffec888 --> 0xfffec8b8 --> 0xfffec8e8 --> 0xfffeccc8 --> 0xffffd038 --> 0x0 
ESP: 0xfffec6a4 --> 0xfffec7e0 --> 0x1 
EIP: 0xf7e73a6a (rep stos DWORD PTR es:[edi],eax)
EFLAGS: 0x10246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf7e73a62:  mov    edx,ecx
   0xf7e73a64:  shr    ecx,0x2
   0xf7e73a67:  and    edx,0x3
=> 0xf7e73a6a:  rep stos DWORD PTR es:[edi],eax
   0xf7e73a6c:  je     0xf7e73a80
   0xf7e73a6e:  cmp    edx,0x2
   0xf7e73a71:  jb     0xf7e73a7e
   0xf7e73a73:  mov    WORD PTR [edi],ax
[------------------------------------stack-------------------------------------]
0000| 0xfffec6a4 --> 0xfffec7e0 --> 0x1 
0004| 0xfffec6a8 --> 0xf7fbbc24 --> 0x1fb2c 
0008| 0xfffec6ac --> 0xf7faf897 (add    esp,0x10)
0012| 0xfffec6b0 --> 0xe025b800 --> 0x0 
0016| 0xfffec6b4 --> 0x0 
0020| 0xfffec6b8 --> 0x5b2d8 
0024| 0xfffec6bc --> 0xf7faf819 (pop    ebx)
0028| 0xfffec6c0 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
__memset_sse2_rep () at ../sysdeps/i386/i686/multiarch/memset-sse2-rep.S:325
325 ../sysdeps/i386/i686/multiarch/memset-sse2-rep.S: No such file or directory.
gdb-peda$ 
gdb-peda$ exploitable
Description: Access violation on destination operand
Short description: DestAv (9/29)
Hash: 4dcfa6cfd291115355bf95afddf87f0b.c014683870b53e5d81fdd2e9450d6883
Exploitability Classification: EXPLOITABLE
Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
Other tags: AccessViolation (28/29)
gdb-peda$ bt
#0  __memset_sse2_rep () at ../sysdeps/i386/i686/multiarch/memset-sse2-rep.S:325
#1  0xf7faf897 in FillRowFormat () from ./libdhf_rxls.so
#2  0xf7fb3634 in FillCell () from ./libdhf_rxls.so
#3  0xf7fb427f in DHF_RGetObject () from ./libdhf_rxls.so
#4  0xf7fc179f in FilterToHtml () from ./libdhf_htmlif.so
#5  0xf7fc0afc in DHF_GetHtml_V11 () from ./libdhf_htmlif.so
#6  0x08049af8 in main ()
#7  0xf7d60af3 in __libc_start_main (main=0x8049730 <main>, argc=0x2, argv=0xffffd0d4, init=0x8049f70 <__libc_csu_init>, fini=0x8049f60 <__libc_csu_fini>, rtld_fini=0xf7feb160 <_dl_fini>, 
    stack_end=0xffffd0cc) at libc-start.c:287
#8  0x08048ad1 in _start ()

Timeline

2017-02-09 - Vendor Disclosure
2017-05-04 - Public Release

Credit

Discovered by Marcin 'Icewall' Noga of Cisco Talos.