Talos Vulnerability Report

TALOS-2016-0207

AntennaHouse DMC HTMLFilter Doc_SetSummary Code Execution Vulnerability

May 4, 2017

CVE Number

CVE-2016-8382

Summary

An exploitable heap corruption vulnerability exists in the Doc_SetSummary functionality of AntennaHouse DMC HTMLFilter. A specially crafted doc file can cause a heap corruption resulting in arbitrary code execution. An attacker can send a malicious doc file to trigger this vulnerability.

Tested Versions

AntennaHouse DMC HTMLFilter shipped with MarkLogic 8.0-5.5

1415cbc784f05db0e9db424636df581a  libdhf_comm.so
81db1b55e18a0cb70a78410147f50b9c  libdhf_htmlif.so
fb1a22fa08c986ec3614284f4e912b0a  libdhf_rdoc.so
b2622da4ce1aa7fa4aac10ee7d3407cf  libdhf_rppt.so
1eabb31236c675f9856a7d001b339334  libdhf_rxls.so
d716dd77c8e9ee88df435e74fad687e6  libdhf_whtml.so
15b0acc464fba28335239f722a62037f  libdmc_comm.so
4ae366fbd4540dd4c750e6679eb63dd4  libdmc_conf.so
84009641f744d88fd1737d59b7c71ab1  libdmc_dtct.so

Product URLs

https://www.antennahouse.com/antenna1/

CVSSv3 Score

8.3 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:C/C:H/I:H/A:H

CVSSv3 Calculator: https://www.first.org/cvss/calculator/3.0

Details

This vulnerability is present in the AntennaHouse DMC HTMLFilter, which is used, among other things, to convert doc files to (X)HTML.
The product is mainly used by MarkLogic for doc conversions as part of its web-based document search and rendering engine. A specially crafted DOC file can lead to heap corruption and ultimately to remote code execution.

Let’s investigate this vulnerability. After running the DOC-to-HTML converter with a malformed doc file as input, we can easily observe a couple of flaws using Valgrind:

$ valgrind ./convert config_doc/
==47051== Memcheck, a memory error detector
==47051== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==47051== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==47051== Command: ./convert config_doc/
==47051== 
input=/home/icewall/bugs/cvtofc/config_doc/toconv.doc
output=/home/icewall/bugs/cvtofc/config_doc/conv.html
type=1
info.options='0'
Return from GetFileInfo=0
HtmlInfo.GroupName=UTF-8
HtmlInfo.DefLangName=English
HtmlInfo.bBigEndian=0
HtmlInfo.options=0
HtmlInfo.SheetId=0
HtmlInfo.SlideId=0
HtmlInfo.lpFunc=(nil)
HtmlInfo.szImageFolder=
==47051== 
==47051== Invalid write of size 1
==47051==    at 0x402F04B: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==47051==    by 0x638FD10: Doc_SetSummary (in /home/icewall/bugs/cvtofc/libdhf_rdoc.so)
==47051==    by 0x635D32F: SimReadText (in /home/icewall/bugs/cvtofc/libdhf_rdoc.so)
==47051==    by 0x6342C7E: DHF_ROpen (in /home/icewall/bugs/cvtofc/libdhf_rdoc.so)
==47051==    by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc/libdhf_htmlif.so)
==47051==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc/libdhf_htmlif.so)
==47051==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc/convert)
==47051==  Address 0x43df8ac is 0 bytes after a block of size 16,172 alloc'd
==47051==    at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==47051==    by 0x42DACB1: DMC_malloc (in /home/icewall/bugs/cvtofc/libdmc_comm.so)
==47051==    by 0x40383A2: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc/libdhf_htmlif.so)
==47051==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc/convert)
==47051== 
==47051== Syscall param write(buf) points to uninitialised byte(s)
==47051==    at 0x41DD003: __write_nocancel (syscall-template.S:81)
==47051==    by 0x4170D20: _IO_file_write@@GLIBC_2.1 (fileops.c:1261)
==47051==    by 0x416FF5E: new_do_write (fileops.c:538)
==47051==    by 0x4171CCD: _IO_do_write@@GLIBC_2.1 (fileops.c:511)
==47051==    by 0x417159F: _IO_file_close_it@@GLIBC_2.1 (fileops.c:165)
==47051==    by 0x416567F: fclose@@GLIBC_2.1 (iofclose.c:59)
==47051==    by 0x42DAA49: DMC_FileClose (in /home/icewall/bugs/cvtofc/libdmc_comm.so)
==47051==    by 0x636221E: SimReadText (in /home/icewall/bugs/cvtofc/libdhf_rdoc.so)
==47051==    by 0x6342C7E: DHF_ROpen (in /home/icewall/bugs/cvtofc/libdhf_rdoc.so)
==47051==    by 0x4039765: FilterToHtml (in /home/icewall/bugs/cvtofc/libdhf_htmlif.so)
==47051==    by 0x4038AFB: DHF_GetHtml_V11 (in /home/icewall/bugs/cvtofc/libdhf_htmlif.so)
==47051==    by 0x8049AF7: main (in /home/icewall/bugs/cvtofc/convert)
==47051==  Address 0x40403fe is not stack'd, malloc'd or (recently) free'd
==47051== 
Return from GetHtml=25
==47051== 
==47051== HEAP SUMMARY:
==47051==     in use at exit: 2,078 bytes in 3 blocks
==47051==   total heap usage: 34,472 allocs, 34,469 frees, 41,026,839 bytes allocated
==47051== 
==47051== LEAK SUMMARY:
==47051==    definitely lost: 2,058 bytes in 2 blocks
==47051==    indirectly lost: 0 bytes in 0 blocks
==47051==      possibly lost: 0 bytes in 0 blocks
==47051==    still reachable: 20 bytes in 1 blocks
==47051==         suppressed: 0 bytes in 0 blocks
==47051== Rerun with --leak-check=full to see details of leaked memory
==47051== 
==47051== For counts of detected and suppressed errors, rerun with: -v
==47051== Use --track-origins=yes to see where uninitialised values come from
==47051== ERROR SUMMARY: 4574 errors from 4 contexts (suppressed: 0 from 0)

Let’s focus on the out-of-bounds write that occurs inside the Doc_SetSummary function. The pseudo code of this function looks like this:

Line 1  int __cdecl Doc_SetSummary(struct_a1 *a1, struct_a2 *a2)
Line 2  {
Line 3    __int16 v3; // [esp-20h] [ebp-38h]@1
Line 4    __int16 v4; // [esp-1Ch] [ebp-34h]@1
Line 5    __int16 v5; // [esp-18h] [ebp-30h]@1
Line 6    char *v6; // [esp-14h] [ebp-2Ch]@1
Line 7    __int16 v7; // [esp-10h] [ebp-28h]@1
Line 8    __int16 v8; // [esp-Ch] [ebp-24h]@1
Line 9    __int16 v9; // [esp-8h] [ebp-20h]@1
Line 10   char *v10; // [esp-4h] [ebp-1Ch]@1
Line 11   char *v11; // [esp+8h] [ebp-10h]@1
Line 12
Line 13   v10 = (_BYTE *)&loc_51C8A;
Line 14   a2->dword2668 = a1->dword3830->unsigned60;
Line 15   a2->dword2640 = a1->dword3830->dword90;
Line 16   memcpy(&a2->char242C, &a1->dword3830->char11A, a1->dword3830->signed116);
Line 17   memcpy(&a2->char252D, &a1->dword3830->char1E4, a1->dword3830->signed118);
Line 18   v11 = &a2->char3AE4;
Line 19   v6 = &a2->char3AE4;
Line 20   qmemcpy(&v3, &a1->dword3830->char6A, 0xCu);

Let’s examine the memcpy parameters at line 17 (the place where the out-of-bounds write occurs) under a debugger:

0xf7cb9d0b in Doc_SetSummary () from ./libdhf_rdoc.so
gdb-peda$ 
[----------------------------------registers-----------------------------------]
EAX: 0x80a16cc --> 0x0 
EBX: 0xf7ccea84 --> 0x6698c 
ECX: 0x8096d1d --> 0x0 
EDX: 0x4321 ('!C')
ESI: 0x0 
EDI: 0x1 
EBP: 0xfffeb128 --> 0xfffec8c8 --> 0xfffec8f8 --> 0xfffec928 --> 0xfffecd08 --> 0xffffd078 --> 0x0 
ESP: 0xfffeb100 --> 0x8096d1d --> 0x0 
EIP: 0xf7cb9d0c (call   0xf7c6c56c <memcpy@plt>)
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf7cb9d09:  push   edx
   0xf7cb9d0a:  push   eax
   0xf7cb9d0b:  push   ecx
=> 0xf7cb9d0c:  call   0xf7c6c56c <memcpy@plt>
   0xf7cb9d11:  mov    eax,DWORD PTR [ebp+0xc]
   0xf7cb9d14:  add    eax,0x3ae4
   0xf7cb9d19:  mov    DWORD PTR [ebp-0x10],eax
   0xf7cb9d1c:  push   eax
Guessed arguments:
arg[0]: 0x8096d1d --> 0x0 
arg[1]: 0x80a16cc --> 0x0 
arg[2]: 0x4321 ('!C')
[------------------------------------stack-------------------------------------]
0000| 0xfffeb100 --> 0x8096d1d --> 0x0 
0004| 0xfffeb104 --> 0x80a16cc --> 0x0 
0008| 0xfffeb108 --> 0x4321 ('!C')
0012| 0xfffeb10c --> 0xf7cb9c8a (pop    ebx)
0016| 0xfffeb110 --> 0x0 
0020| 0xfffeb114 --> 0x0 
0024| 0xfffeb118 --> 0x0 
0028| 0xfffeb11c --> 0xf7ccea84 --> 0x6698c 
[------------------------------------------------------------------------------]

We see that the third parameter (the size) is fully controlled by the attacker and equals 0x4321, while the dst buffer has the following amount of space:

(gdb) heap /b $ecx

[In-use]
[Address] 0x80947f0
[Size]    16180
[Offset]  +9517

space = (0x80947f0 + 16180) - (0x80947f0 + 9517), so the space equals 6663 bytes. The third parameter to memcpy (0x4321 = 17185 bytes) is much bigger, which consequently leads to heap corruption.
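
The arithmetic above can be reproduced with a small C sketch. The helper names remaining_space and overflow_bytes are hypothetical and are not part of the AntennaHouse libraries; the inputs are simply the numbers printed by the debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes left between dst and the end of the heap
 * chunk it points into. All inputs are numbers read from the debugger. */
static size_t remaining_space(uintptr_t chunk_base, size_t chunk_size,
                              uintptr_t dst)
{
    /* dst must lie inside the chunk for the subtraction to be meaningful. */
    assert(dst >= chunk_base && dst - chunk_base <= chunk_size);
    return chunk_size - (size_t)(dst - chunk_base);
}

/* Hypothetical helper: how many bytes a copy of len bytes writes past
 * the end of the available space (0 if the copy fits). */
static size_t overflow_bytes(size_t space, size_t len)
{
    return len > space ? len - space : 0;
}
```

With the values from this report, remaining_space(0x80947f0, 16180, 0x8096d1d) yields 6663 and overflow_bytes(6663, 0x4321) yields 10522, that is, the copy runs roughly 10 KB past the end of the chunk.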

Using the rr debugger, we can easily find the place where the memcpy size parameter is set:

(rr) rc
Continuing.
Warning: not running or target is remote
Hardware watchpoint 2: *0x9fbf2e8

Old value = <unreadable>
New value = 0x4321
0xf736721a in Doc_GetDop () from ./libdhf_rdoc.so
(rr) context
$60 = 0xbee9
[----------------------------------registers-----------------------------------]
EAX: 0x21 ('!')
EBX: 0xf7388a84 --> 0x6698c 
ECX: 0x9fbfe28 --> 0x21 ('!')
EDX: 0x9fbf1ea --> 0x10000 
ESI: 0x9fbfe28 --> 0x21 ('!')
EDI: 0x0 
EBP: 0xffb016f8 --> 0xffb02e98 --> 0xffb02ec8 --> 0xffb02ef8 --> 0xffb032d8 --> 0xffb13648 --> 0x0 
ESP: 0xffb016d0 --> 0x9fbfe28 --> 0x21 ('!')
EIP: 0xf736721a --> 0x7a848966
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xf736720f:  call   0xf732610c <Doc_GetWord@plt>
   0xf7367214:  mov    edx,DWORD PTR [ebp-0x10]
   0xf7367217:  add    edx,0x1a
=> 0xf736721a:  mov    WORD PTR [edx+edi*2+0x100],ax
   0xf7367222:  add    esi,0x2
   0xf7367225:  add    esp,0x10
   0xf7367228:  inc    edi
   0xf7367229:  cmp    edi,0x64
[------------------------------------stack-------------------------------------]
0000| 0xffb016d0 --> 0x9fbfe28 --> 0x21 ('!')
0004| 0xffb016d4 --> 0x0 
0008| 0xffb016d8 --> 0x0 
0012| 0xffb016dc --> 0xf736681e --> 0x66c3815b 
0016| 0xffb016e0 --> 0x0 
0020| 0xffb016e4 --> 0x9fbfdc8 --> 0x40020 
0024| 0xffb016e8 --> 0x9fbf1d0 --> 0x0 
0028| 0xffb016ec --> 0xf7388a84 --> 0x6698c 
[------------------------------------------------------------------------------]

The pseudo code of Doc_GetDop looks like this:

Line 1 signed int __cdecl Doc_GetDop(struct_a1_1 *a1)
Line 2 {
Line 3   unsigned __int16 v2;
Line 4   int bufferPtr;
Line 5   __int16 v4;
Line 6   unsigned __int16 v5;
Line 7   unsigned __int16 v6;
Line 8   unsigned __int16 v7;
Line 9   unsigned __int16 v8;
Line 10  unsigned __int16 v9;
Line 11  unsigned __int16 v10;
Line 12  unsigned __int16 v11;
Line 13  unsigned __int16 v12;
Line 14  unsigned __int16 v13;
Line 15  int v14;
Line 16  signed int v15;
Line 17  signed int v16;
Line 18  int v17;
Line 19  unsigned __int16 v18;
Line 20  unsigned __int16 v19;
Line 21  unsigned __int16 v20;
Line 22  unsigned __int16 v21;
Line 23  unsigned int v22;
Line 24  int v23;
Line 25  signed int v24;
Line 26  int v25;
Line 27  int fileHandler; // [esp+0h] [ebp-18h]@1
Line 28  int v27; // [esp+4h] [ebp-14h]@1
Line 29  void *s; // [esp+8h] [ebp-10h]@1
Line 30
Line 31  v27 = 0;
Line 32  fileHandler = a1->dword38EC;
Line 33  s = (void *)DMC_malloc(0x2EC);
Line 34  if ( !s )
Line 35    return 12;
Line 36  memset(s, 0, 0x2ECu);
Line 37  if ( a1->handle )
Line 38  {
Line 39    v27 = ReadBuf(a1->handle, a1->file->offset, a1->file->someSize);
Line 40    goto LABEL_8;
Line 41  }
Line 42  if ( !a1->dword1C )
Line 43    return 25;
Line 44  v27 = DMC_malloc(a1->file->someSize);
Line 45  if ( !v27 )
Line 46    return 12;
Line 47  DMC_FileSeek(a1->dword1C, a1->file->offset, 0);
Line 48  DMC_FileRead(v27, 1, a1->file->someSize, a1->dword1C);
Line 49 LABEL_8:
Line 50  if ( !v27 )
Line 51  {
Line 52    if ( s )
Line 53      FreeBuf(&s);
Line 54    return 25;
Line 55  }
Line 56  v2 = Doc_GetWord(v27, fileHandler);
(...)
Line 57  bufferPtr = v27 + 2;
Line 58  *((_WORD *)s + 140) = Doc_GetWord(bufferPtr, fileHandler);
Line 59  v14 = bufferPtr + 2;
Line 60  v15 = 0;
Line 61  do
Line 62  {
Line 63    *((_WORD *)s + v15 + 141) = Doc_GetWord(v14, fileHandler);
Line 64    v14 += 2;
Line 65    ++v15;
Line 66  }

We see that the lower WORD, 0x21, is stored in a structure field that is later used as part of the memcpy size parameter. Its value is read at line 63 directly from the buffer, whose content was read from the file at line 39. The name of the function we land in is also interesting: Doc_GetDop. According to the [MS-DOC]: Word (.doc) Binary File Format documentation, Dop is the name of a structure that contains document properties, so our memcpy parameter is a field in the Dop structure. Calculating the difference between the beginning of the buffer and the place where the value of this field appears gives the offset of that field in the Dop structure.

v14 - v27 at line 63
is
0x9fbfe28 - 0x9fbfdc8 = 96 [0x60]
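
The offset calculation, and the kind of 16-bit read performed at line 63, can be sketched in C as follows. This is only an illustration: get_word_le is a hypothetical stand-in for Doc_GetWord, whose real implementation is not shown in this report, and little-endian byte order is an assumption based on the .doc format and the HtmlInfo.bBigEndian=0 output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Doc_GetWord: read a 16-bit value stored in
 * little-endian byte order (assumed, not taken from the library). */
static uint16_t get_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Offset of the current read pointer from the start of the file buffer,
 * i.e. v14 - v27 in the pseudo code. */
static size_t field_offset(uintptr_t buffer_start, uintptr_t read_ptr)
{
    return (size_t)(read_ptr - buffer_start);
}
```

Calling field_offset(0x9fbfdc8, 0x9fbfe28) returns 0x60, matching the value computed above.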

Let’s go back to the place where the heap corruption appears and perform one step to confirm this theory:

0xf7cb9d0c in Doc_SetSummary () from ./libdhf_rdoc.so
=> 0xf7cb9d0c:  e8 5b 28 fb ff  call   0xf7c6c56c <memcpy@plt>
(gdb) heap
    Tuning params & stats:
        mmap_threshold=131072
        pagesize=4096
        n_mmaps=0
        n_mmaps_max=65536
        total mmap regions created=3
        mmapped_mem=0
        sbrk_base=0x804c000
    Main arena (0xf7f08420) owns regions:
        [0x804c008 - 0x80b2000] Total 407KB in-use 8689(332KB) free 17(41KB)

    There are 1 arenas Total 373KB
    Total 8689 blocks in-use of 332KB
    Total 17 blocks free of 41KB

(gdb) ni
0xf7cb9d11 in Doc_SetSummary () from ./libdhf_rdoc.so
=> 0xf7cb9d11:  8b 45 0c    mov    eax,DWORD PTR [ebp+0xc]
(gdb) heap
    Tuning params & stats:
        mmap_threshold=131072
        pagesize=4096
        n_mmaps=0
        n_mmaps_max=65536
        total mmap regions created=3
        mmapped_mem=0
        sbrk_base=0x804c000
    Main arena (0xf7f08420) owns regions:
        [0x804c008 - 0x80b2000] Total 407KBFailed to walk arena. The chunk at 0x8098720 may be corrupted. Its size tag is 0x0


1 Errors encountered while walking the heap!
[Error] Failed to walk heap
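
For illustration only, the check that appears to be missing before the memcpy calls in Doc_SetSummary can be sketched as below. This is not the vendor's fix. The name checked_copy and the dst_cap parameter are hypothetical, and treating the length as a signed 16-bit value is an assumption drawn from the decompiler's field name signed118.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bounds-checked copy: copies len bytes only when the
 * file-supplied length is non-negative and fits in the destination.
 * Returns 0 on success and -1 when the length is rejected. */
static int checked_copy(void *dst, size_t dst_cap, const void *src,
                        int16_t len)
{
    if (len < 0 || (size_t)len > dst_cap)
        return -1;              /* reject attacker-controlled oversize */
    memcpy(dst, src, (size_t)len);
    return 0;
}
```

The sketch validates only the destination capacity; a complete check would also confirm that len bytes are actually available in the source buffer.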

Timeline

2016-10-10 - Vendor Disclosure
2017-05-04 - Public Release

Credit

Discovered by Marcin 'Icewall' Noga of Cisco Talos.