Talos Vulnerability Report


Iceni Argus TrueType Font File Cmap Table Code Execution Vulnerability

February 27, 2017
CVE Number



Summary

An exploitable heap-based buffer overflow exists in Iceni Argus. When the tool converts a PDF containing a malformed font to XML, it uses a size taken from the font to search a linked list of buffers for one large enough to hold the data. Due to a signedness issue, a buffer smaller than the requested size can be returned. When the tool later populates this buffer, the overflow occurs, which can lead to code execution in the context of the user running the tool.

Tested Versions

Iceni Argus Version 6.6.04 (Sep 7 2012) NK

Product URLs


CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H


Details

This is a heap-based buffer overflow in a conversion tool that ships with Iceni Argus. The tool is used primarily by MarkLogic to convert PDF files to (X)HTML.

When parsing a font file embedded within a PDF, the tool calls the ipParseFontFile function, which calls GetTables to read the tables located within the font file. Inside GetTables, the tool eventually allocates space for the subtable offsets of each entry in the "cmap" table. For a corrupted font, the following code executes and clamps the entry count to 0x64, so that at most 0x64 * 4 (0x190) bytes are allocated.

 813ff14:   0f b7 c6                movzwl %si,%eax     ; from file
 813ff17:   c1 e0 08                shl    $0x8,%eax
 813ff1a:   09 c2                   or     %eax,%edx
 813ff1c:   83 fa 64                cmp    $0x64,%edx
 813ff1f:   7e 05                   jle    813ff26 <GetTables+0x996>
 813ff21:   ba 64 00 00 00          mov    $0x64,%edx       ; size to clamp to
 813ff26:   8b 8d fc fa ff ff       mov    -0x504(%ebp),%ecx
 813ff2c:   8d 83 c7 ae 49 ff       lea    -0xb65139(%ebx),%eax
 813ff32:   89 91 cc 04 00 00       mov    %edx,0x4cc(%ecx) ; XXX: write 0x64 to 0x4cc(%ecx)
 813ff38:   89 44 24 04             mov    %eax,0x4(%esp)   ; "cmap offsets"
 813ff3c:   8d 04 95 00 00 00 00    lea    0x0(,%edx,4),%eax    ; multiplied by 4
 813ff43:   89 04 24                mov    %eax,(%esp)      ; size
 813ff46:   e8 25 16 06 00          call   81a1570 <icnMalloc>
 813ff4b:   85 c0                   test   %eax,%eax
 813ff4d:   89 85 4c fb ff ff       mov    %eax,-0x4b4(%ebp)
 813ff53:   0f 84 c8 02 00 00       je     8140221 <GetTables+0xc91>
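
The clamping logic in the listing above can be approximated in C as follows. This is only a sketch: the function and macro names are hypothetical (they do not come from the binary), and the byte order used to assemble the count is an assumption based on the shift-and-or sequence.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp value seen in the disassembly (cmp $0x64,%edx). */
#define MAX_CMAP_SUBTABLES 0x64

/* Hypothetical helper: assemble a 16-bit count from two bytes of the file,
 * high byte first (assumed ordering). */
uint32_t read_be16(uint8_t hi, uint8_t lo)
{
    return ((uint32_t)hi << 8) | lo;
}

/* Limit the number of subtable entries to MAX_CMAP_SUBTABLES. */
uint32_t clamp_subtable_count(uint32_t count)
{
    return count > MAX_CMAP_SUBTABLES ? MAX_CMAP_SUBTABLES : count;
}

/* Bytes requested for the "cmap offsets" array: 4 bytes per clamped entry. */
size_t cmap_offsets_size(uint32_t count)
{
    return (size_t)clamp_subtable_count(count) * 4;
}
```

Under these assumptions, a count of 0xffff read from the file would yield a 0x190-byte request, which matches the clamp described above. Note that only the number of entries is bounded here; the values later stored in those entries are not.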

After allocating the space, the tool will read data from the offset table into this buffer. In the following code, the table will be searched for specific values. When the loop terminates, the resulting index will be left in -0x548(%ebp) and the font object in %esi.

 81408a5:   84 c0                   test   %al,%al
 81408a7:   75 5c                   jne    8140905 <GetTables+0x1375>
 81408a9:   89 8d b8 fa ff ff       mov    %ecx,-0x548(%ebp)    ; XXX: index that is aggregated
 81408af:   83 c1 01                add    $0x1,%ecx
 81408b2:   39 d1                   cmp    %edx,%ecx
 81408b4:   0f 84 05 f8 ff ff       je     81400bf <GetTables+0xb2f>
 81408ba:   8b b5 fc fa ff ff       mov    -0x504(%ebp),%esi    ; XXX: check font object
 81408c0:   0f b6 84 4e d0 04 00    movzbl 0x4d0(%esi,%ecx,2),%eax
 81408c7:   00
 81408c8:   3c 03                   cmp    $0x3,%al
 81408ca:   75 d9                   jne    81408a5 <GetTables+0x1315>

After finding the index, the tool uses it to fetch a size from the "cmap offsets" table. This size is then added to the base size property (offset 0xc) of the object loaded into %esi at the start of the listing, and the sum is left in %esi.

 813fe31:   8b b5 64 fb ff ff       mov    -0x49c(%ebp),%esi    ; object
 813fe37:   8b 85 f8 fa ff ff       mov    -0x508(%ebp),%eax
 813fe3d:   8b 7e 0c                mov    0xc(%esi),%edi       ; read base size
 81400cc:   8b 95 b8 fa ff ff       mov    -0x548(%ebp),%edx    ; read index back out
 81400f4:   8b 85 4c fb ff ff       mov    -0x4b4(%ebp),%eax
 81400fa:   8b 34 90                mov    (%eax,%edx,4),%esi   ; read bad size
 81400fd:   8b 95 f8 fa ff ff       mov    -0x508(%ebp),%edx
 8140103:   89 14 24                mov    %edx,(%esp)
 8140106:   e8 25 14 f7 ff          call   80b1530 <ipStreamReset>
 814010b:   85 c0                   test   %eax,%eax
 814010d:   0f 84 16 08 00 00       je     8140929 <GetTables+0x1399>
 8140929:   01 fe                   add    %edi,%esi        ; add %edi to size
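
To illustrate why this addition matters, the following C sketch models the `add %edi,%esi` instruction as a 32-bit wraparound addition whose result is later interpreted as a signed value. The function name and the sample values are hypothetical; only the arithmetic mirrors the listing.

```c
#include <stdint.h>

/* Models `add %edi,%esi`: 32-bit addition with wraparound, with the result
 * viewed as a signed integer, as the later `jl` comparison does. */
int32_t add_base_and_offset(int32_t base, uint32_t file_value)
{
    /* Unsigned arithmetic wraps modulo 2^32 and is well defined in C. */
    uint32_t sum = (uint32_t)base + file_value;

    /* Converting a value above INT32_MAX to int32_t is implementation-defined
     * before C23; common compilers use two's complement wraparound, which is
     * what the hardware does. */
    return (int32_t)sum;
}
```

With a base of 0x20, an attacker-controlled value of 0xffffff00 from the offset table would give a sum of 0xffffff20, which is -0xe0 when read as a signed 32-bit integer.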

Immediately afterwards, this size is passed to the icnBufferAlloc function, which contains a signedness issue. icnBufferAlloc searches a linked list for a buffer with enough space for the size passed to it, so if a negative value is passed as the size, an undersized buffer is returned.

 814092b:   8b bd 0c fb ff ff       mov    -0x4f4(%ebp),%edi
 8140931:   89 74 24 04             mov    %esi,0x4(%esp)   ; size
 8140935:   89 3c 24                mov    %edi,(%esp)      ; icnobject
 8140938:   e8 d3 23 f2 ff          call   8062d10 <icnBufferAlloc>
 814093d:   85 c0                   test   %eax,%eax
 814093f:   0f 84 30 0f 00 00       je     8141875 <GetTables+0x22e5>

The icnBufferAlloc function, which allocates memory out of a linked list, takes two arguments: the "icn" object and a size. The function has a signedness issue in how it handles the size. After adding 1 to the size, it walks the linked list pointed to by the first argument, checking whether the space available in each entry is large enough for the requested size. Because the comparison is signed, a requested size that is less than 0 passes the check for any buffer in the list, regardless of how much space that buffer actually has.

 8062d2f:   8b 45 0c                mov    0xc(%ebp),%eax       ; size
 8062d32:   8d b3 b6 34 44 ff       lea    -0xbbcb4a(%ebx),%esi
 8062d38:   89 75 e8                mov    %esi,-0x18(%ebp)
 8062d3b:   83 c0 01                add    $0x1,%eax
 8062d3e:   89 45 f0                mov    %eax,-0x10(%ebp)     ; size+1
 8062da6:   8b 4f 04                mov    0x4(%edi),%ecx
 8062da9:   89 d0                   mov    %edx,%eax
 8062dab:   29 c8                   sub    %ecx,%eax
 8062dad:   3b 45 f0                cmp    -0x10(%ebp),%eax     ; size+1
 8062db0:   7c 94                   jl     8062d46 <icnBufferAlloc+0x36>
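
The search described above might look roughly like the following C sketch. The `struct chunk` layout, field names, and function names are assumptions made for illustration; the only details taken from the disassembly are the `size + 1` adjustment, the subtraction that yields the available space, and the signed `jl` comparison. The second function is a hypothetical hardening, not the vendor's fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of one entry in the buffer list. */
struct chunk {
    int32_t capacity;    /* total bytes in this chunk (assumed field) */
    int32_t used;        /* bytes already handed out (assumed field)  */
    struct chunk *next;
};

/* Returns the first chunk whose free space is not less than size + 1,
 * using a signed comparison as the `jl` instruction does. */
struct chunk *find_chunk_signed(struct chunk *head, int32_t size)
{
    int32_t needed = (int32_t)((uint32_t)size + 1u);   /* size+1 */

    for (struct chunk *c = head; c != NULL; c = c->next) {
        int32_t avail = c->capacity - c->used;
        if (avail < needed)   /* signed: never true when needed is negative */
            continue;
        return c;
    }
    return NULL;
}

/* Hypothetical hardened variant: reject sizes that are negative or that
 * would wrap when 1 is added, before searching the list. */
struct chunk *find_chunk_checked(struct chunk *head, int32_t size)
{
    if (size < 0 || size == INT32_MAX)
        return NULL;
    return find_chunk_signed(head, size);
}
```

In this model, a completely full chunk (capacity 16, used 16) is rejected for a request of 3 bytes but accepted for a request of -0xe0 bytes, since 0 is not less than -0xdf in a signed comparison.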

After icnBufferAlloc returns, the convert tool resumes in GetTables and executes the following code. This code passes the potentially undersized buffer, along with the size, to ipDataFeedRead, which reads data from the file directly into the buffer. Because the signedness issue can leave the buffer smaller than the amount of data being read, a buffer overflow occurs.

 8140945:   8b 95 0c fb ff ff       mov    -0x4f4(%ebp),%edx
 814094b:   89 14 24                mov    %edx,(%esp)
 814094e:   e8 ad 1d f2 ff          call   8062700 <icnBufferGetMemory>
 8140953:   8b 8d f8 fa ff ff       mov    -0x508(%ebp),%ecx
 8140959:   89 74 24 08             mov    %esi,0x8(%esp)       ; size
 814095d:   89 44 24 04             mov    %eax,0x4(%esp)       ; buffer
 8140961:   8b 41 18                mov    0x18(%ecx),%eax
 8140964:   89 04 24                mov    %eax,(%esp)          ; source
 8140967:   e8 24 43 f4 ff          call   8084c90 <ipDataFeedRead>
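
The effect of handing a negative size to a read routine can be sketched in C as below. How ipDataFeedRead interprets its length argument is not shown in the listing, so treating it as an unsigned 32-bit quantity is an assumption here, and both function names are hypothetical.

```c
#include <stdint.h>

/* How a consumer that treats the length as unsigned would see the
 * signed 32-bit size computed earlier (assumed behaviour). */
uint32_t length_as_unsigned(int32_t size)
{
    return (uint32_t)size;   /* well-defined: reduces modulo 2^32 */
}

/* Returns 1 if copying `size` bytes into a buffer with `avail` free bytes
 * would write past the end of that buffer, 0 otherwise. */
int read_would_overflow(uint32_t avail, int32_t size)
{
    return length_as_unsigned(size) > avail;
}
```

Under that assumption, a size of -0xe0 becomes a request for 0xffffff20 bytes, far more than any buffer in the list can hold, so the copy runs off the end of the heap buffer.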

Crash Information

$ gdb --quiet --args /opt/MarkLogic/converters/cvtpdf/convert ~/config/

Reading symbols from /opt/MarkLogic/Converters/cvtpdf/convert...done.

(gdb) r

Starting program: /opt/MarkLogic/Converters/cvtpdf/convert /home/user/config/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
Analysing '/home/user/poc.pdf'
Pages 1 to 1

Catchpoint 4 (signal SIGSEGV), 0xf7ea902d in __memmove_ssse3_rep ()
   from /lib/libc.so.6

(gdb) bt 5

#0  0xf7ea902d in __memmove_ssse3_rep () from /lib/libc.so.6
#1  0x082574f7 in ipFlateFeedRead ()
#2  0x08084cd7 in ipDataFeedRead ()
#3  0x0814096c in GetTables ()
#4  0x08141c6e in ipParseFontFile ()
(More stack frames follow...)

(gdb) h

[eax: 0x0997c6c8] [ebx: 0xf7ea902a] [ecx: 0x6b016400] [edx: 0x099c8ff3]
[esi: 0x00000010] [edi: 0x980283d4] [esp: 0xfffbf9a8] [ebp: 0xfffbf9e8]
[eflags: NZ SF OF CF ND NI]

fffbf9a8 | 08f57000 082574f7 099c8ff3 0997c6c8 | .p...t%.........
fffbf9b8 | 00000010 08084fac 0997c2a0 00000000 | .....O..........
fffbf9c8 | fffbfa08 f7dd54ea 9803a13f 9803a003 | .....T..?.......
fffbf9d8 | fffbf9f8 08f57000 9803a13f 098eefd0 | .....p..?.......

=> 0xf7ea902d <__memmove_ssse3_rep+3773>:       mov    %ecx,0xc(%edx)
   0xf7ea9030 <__memmove_ssse3_rep+3776>:       mov    0x8(%eax),%ecx
   0xf7ea9033 <__memmove_ssse3_rep+3779>:       mov    %ecx,0x8(%edx)
   0xf7ea9036 <__memmove_ssse3_rep+3782>:       mov    0x4(%eax),%ecx
   0xf7ea9039 <__memmove_ssse3_rep+3785>:       mov    %ecx,0x4(%edx)
   0xf7ea903c <__memmove_ssse3_rep+3788>:       mov    (%eax),%ecx


Timeline

2016-10-10 - Vendor Disclosure
2017-02-27 - Public Release


Credit

Discovered by Marcin Noga of Cisco Talos and a Talos team member.