Iceni Argus PDF Font-Encoding GlyphMap Adjustment Code Execution Vulnerability

February 27, 2017

Summary

An exploitable arbitrary heap-overwrite vulnerability exists within Iceni Argus. When converting a malformed PDF to XML, the tool trusts an index stored in a font object and uses it to write the font’s name into one element of an array of objects. Because this index is trusted, an attacker can supply an out-of-bounds value, which causes a pointer to a string to be written outside the bounds of that array. This can lead to code execution in the context of the account running the tool.

Tested Versions

Iceni Argus Version 6.6.04 (Sep 7 2012) NK

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

Details


This is a heap-based arbitrary write vulnerability in Iceni Argus, a tool used primarily by MarkLogic Server to convert PDF files to (X)HTML. While adjusting the glyph map for a font embedded in a PDF file, the tool trusts an index and uses it to write a pointer to the font’s name outside the bounds of an array. Within the ipFontFromDict function, the tool allocates 0x17f4 bytes using icnChainAlloc and stores the returned pointer at -0xdc(%ebp). This pointer is later passed to ipFontInstallEncoding.

 809e887:	8b 95 1c ff ff ff    	mov    -0xe4(%ebp),%edx
 809e88d:	c7 44 24 04 f4 17 00 	movl   $0x17f4,0x4(%esp)
 809e894:	00
 809e895:	8b 82 34 02 00 00    	mov    0x234(%edx),%eax
 809e89b:	89 04 24             	mov    %eax,(%esp)
 809e89e:	e8 ed 48 fc ff       	call   8063190 <icnChainAlloc>
 809e8a3:	85 c0                	test   %eax,%eax
 809e8a5:	89 85 24 ff ff ff    	mov    %eax,-0xdc(%ebp)         ; allocated pointer
 809e8ab:	0f 84 9f 05 00 00    	je     809ee50 <ipFontFromDict+0x780>
 809ec93:	8b 85 24 ff ff ff    	mov    -0xdc(%ebp),%eax     ; allocated pointer
 809ec99:	8b 95 1c ff ff ff    	mov    -0xe4(%ebp),%edx
 809ec9f:	89 7c 24 08          	mov    %edi,0x8(%esp)
 809eca3:	89 44 24 04          	mov    %eax,0x4(%esp)       ; target
 809eca7:	89 14 24             	mov    %edx,(%esp)
 809ecaa:	e8 e1 ee ff ff       	call   809db90 <ipFontInstallEncoding>
 809ecaf:	85 c0                	test   %eax,%eax
 809ecb1:	0f 85 99 01 00 00    	jne    809ee50 <ipFontFromDict+0x780>

Inside ipFontInstallEncoding, the tool takes the pointer passed as its second argument and adds 0x10 to it. The resulting pointer is stored at -0x30(%ebp).

 809dd7b:	8b 4d 0c             	mov    0xc(%ebp),%ecx
 809dd7e:	83 c1 10             	add    $0x10,%ecx
 809dd81:	89 4d d0             	mov    %ecx,-0x30(%ebp)
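
The arithmetic implied by the allocation size and the 0x10 adjustment can be sketched as follows. This is only an illustration: the constant names are hypothetical, and the 4-byte slot size is taken from the scale factor of the write at 0x80bb076. The result is an upper bound on how many pointer-sized slots could fit before the end of the allocation, not a claim about the real layout of the structure.

```python
ALLOC_SIZE = 0x17F4  # bytes requested from icnChainAlloc
MAP_OFFSET = 0x10    # added to the allocation pointer in ipFontInstallEncoding
SLOT_SIZE = 4        # one 32-bit pointer per write (scale factor at 0x80bb076)


def slots_before_end_of_allocation():
    """Upper bound on 4-byte slots between (pointer + 0x10) and the end of the allocation."""
    return (ALLOC_SIZE - MAP_OFFSET) // SLOT_SIZE


print(slots_before_end_of_allocation())
```

Any index that is negative, or at or above this bound, necessarily addresses memory outside the allocated block.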

Within the same function, the tool searches a dictionary for the value stored under the key “Differences”. The lookup returns a pointer to an object containing the array, read from the file, that holds the bad index. This pointer is stored at -0x20(%ebp).

 809dd96:	8d 83 23 71 44 ff    	lea    -0xbb8edd(%ebx),%eax     ; "Differences"
 809dd9c:	c7 44 24 08 07 00 00 	movl   $0x7,0x8(%esp)
 809dda3:	00
 809dda4:	89 44 24 04          	mov    %eax,0x4(%esp)
 809dda8:	89 34 24             	mov    %esi,(%esp)
 809ddab:	e8 70 5b ff ff       	call   8093920 <ipDictFindType>
 809ddb0:	85 c0                	test   %eax,%eax
 809ddb2:	89 45 e0             	mov    %eax,-0x20(%ebp)         ; XXX: object
 809ddb5:	0f 84 a4 05 00 00    	je     809e35f <ipFontInstallEncoding+0x7cf>

The object containing the trusted index is then passed to ipGlyphMapAdjust. This function iterates through the array inside the object that was retrieved under the “Differences” key.

 809ddca:	8b 4d e0             	mov    -0x20(%ebp),%ecx         ; XXX: object with index
 809ddcd:	8b 41 04             	mov    0x4(%ecx),%eax
 809ddd0:	89 44 24 08          	mov    %eax,0x8(%esp)           ; object
 809ddd4:	8b 7d d0             	mov    -0x30(%ebp),%edi
 809ddd7:	89 7c 24 04          	mov    %edi,0x4(%esp)
 809dddb:	8b 45 0c             	mov    0xc(%ebp),%eax           ; destination
 809ddde:	89 04 24             	mov    %eax,(%esp)
 809dde1:	e8 ea d1 01 00       	call   80bafd0 <ipGlyphMapAdjust>
 809dde6:	8b 55 0c             	mov    0xc(%ebp),%edx
 809dde9:	88 42 03             	mov    %al,0x3(%edx)

Once inside ipGlyphMapAdjust, the following loop iterates over every glyph in the object retrieved from the dictionary. Within this loop, the tool compares an index it has read against a maximum value to ensure it is smaller. Due to a signedness issue, this check passes. With the provided sample, the bad index read out of the object is 0xe3ffffff, the value seen in %edi in the crash information below. This index is stored in the %edi register.

 80bb023:	8b 45 10             	mov    0x10(%ebp),%eax      ; object
 80bb026:	8d 34 d5 00 00 00 00 	lea    0x0(,%edx,8),%esi    ; grab index from object
 80bb02d:	89 55 c8             	mov    %edx,-0x38(%ebp)     ; store index
 80bb03b:	83 c2 01             	add    $0x1,%edx
 80bb03e:	8b 78 04             	mov    0x4(%eax),%edi       ; XXX: reads bad index into %edi
 80bb041:	39 55 d0             	cmp    %edx,-0x30(%ebp)     ; max
 80bb044:	89 55 c8             	mov    %edx,-0x38(%ebp)     ; index
 80bb047:	7e ce                	jle    80bb017 <ipGlyphMapAdjust+0x47>
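
The effect of a signed comparison on a value with the high bit set can be illustrated with a short sketch. The %edi value is taken from the register dump in the crash information; the upper bound used here is a hypothetical placeholder, since the advisory does not state the real limit.

```python
def as_signed32(value):
    """Interpret a 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


edi = 0xE3FFFFFF     # %edi from the register dump in the crash information
assumed_max = 0x100  # hypothetical upper bound, for illustration only

print(as_signed32(edi))                # negative when viewed as signed
print(as_signed32(edi) < assumed_max)  # signed "less than max" check passes
print(edi < assumed_max)               # the same check, unsigned, would fail
```

A signed "less than maximum" test alone therefore cannot reject a large unsigned index; a lower bound or an unsigned comparison is also needed.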

To determine the glyph name that will be written, the following code is executed. It first calls ipNameToStr, then glyphName. The resulting pointer is stored at -0x24(%ebp).

 80bb09e:	8b 40 04             	mov    0x4(%eax),%eax
 80bb0a1:	89 04 24             	mov    %eax,(%esp)
 80bb0a4:	e8 b7 2c 01 00       	call   80cdd60 <ipNameToStr>
 80bb0a9:	89 c6                	mov    %eax,%esi
 80bb0ab:	89 04 24             	mov    %eax,(%esp)          ; key
 80bb0ae:	e8 6d fe ff ff       	call   80baf20 <glyphName>
 80bb0b3:	39 c6                	cmp    %eax,%esi
 80bb0b5:	89 45 dc             	mov    %eax,-0x24(%ebp)     ; string name

After the name is retrieved, the stored index is compared against two local variables that track the smallest and largest index seen so far. Both comparisons are signed, but neither rejects or modifies the index, which is used unchanged at 0x80bb076 to write the glyph name. Because none of the checks alter the %edi register, its value is multiplied by 4 and trusted as an index into the destination array passed as the second argument. This allows an attacker to write a pointer to a string to an arbitrary location in memory, which can lead to memory corruption. Under the right circumstances this can lead to code execution.

 80bb060:	39 7d d4             	cmp    %edi,-0x2c(%ebp)         ; XXX: check index in %edi is less than one in lvar
 80bb063:	7e 03                	jle    80bb068 <ipGlyphMapAdjust+0x98>
 80bb065:	89 7d d4             	mov    %edi,-0x2c(%ebp)         ; XXX: write to lvar if so
 80bb068:	39 7d d8             	cmp    %edi,-0x28(%ebp)         ; XXX: check index in %edi is larger than one in lvar
 80bb06b:	7d 03                	jge    80bb070 <ipGlyphMapAdjust+0xa0>
 80bb06d:	89 7d d8             	mov    %edi,-0x28(%ebp)         ; XXX: write to lvar if not
 80bb070:	8b 45 dc             	mov    -0x24(%ebp),%eax         ; glyph name
 80bb073:	8b 55 0c             	mov    0xc(%ebp),%edx           ; destination
 80bb076:	89 04 ba             	mov    %eax,(%edx,%edi,4)       ; XXX: use %edi as index
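
A minimal sketch of the kind of bounds check that is missing before the write at 0x80bb076 is shown below. It is a Python model and not the vendor's fix: the function name is hypothetical, and the 256-entry map size is an assumption based on single-byte character codes.

```python
GLYPH_MAP_SIZE = 256  # assumption: one slot per single-byte character code


def store_glyph_name(glyph_map, index, name):
    """Write name into glyph_map[index] only if index lies within the map."""
    if not 0 <= index < len(glyph_map):
        raise IndexError(f"Differences code {index} is outside the glyph map")
    glyph_map[index] = name


glyph_map = [None] * GLYPH_MAP_SIZE
store_glyph_name(glyph_map, 39, "quotesingle")  # in range, accepted

try:
    # 0xe3ffffff interpreted as a signed 32-bit integer
    store_glyph_name(glyph_map, -469762049, "quotesingle")
except IndexError as err:
    print("rejected:", err)
```

Checking both the lower and the upper bound matters here, because a negative index would otherwise pass an upper-bound-only test.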

Crash Information

$ gdb --quiet --args /opt/MarkLogic/Converters/cvtpdf/convert ~/config/

Reading symbols from /opt/MarkLogic/Converters/cvtpdf/convert...done.

(gdb) r

Starting program: /opt/MarkLogic/Converters/cvtpdf/convert /home/user/config/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loading configuration...
Parsing macros...
Macro synth-bookmarks='true'
Macro image-output='true'
Macro text-output='true'
Macro zones='false'
Macro ignore-text='true'
Macro remove-overprint='false'
Macro illustrations='true'
Macro line-breaks='true'
Macro image-quality='75'
Macro page-start=''
Macro page-end=''
Macro document-start=''
Macro document-end=''
Analysing '/home/user/poc.pdf'
Pages 1 to 1

Catchpoint 4 (signal SIGSEGV), 0x080bb076 in ipGlyphMapAdjust ()

(gdb) bt 5

#0  0x080bb076 in ipGlyphMapAdjust ()
#1  0x0809dde6 in ipFontInstallEncoding ()
#2  0x0809ecaf in ipFontFromDict ()
#3  0x080b5000 in ipfSetFontNameSize ()
#4  0x080e7ee2 in ipDocExecStack ()
(More stack frames follow...)

(gdb) h

[eax: 0x083f7a44] [ebx: 0x08f57000] [ecx: 0x00000000] [edx: 0x09904e24]
[esi: 0x098af25c] [edi: 0xe3ffffff] [esp: 0xfffc0050] [ebp: 0xfffc0098]
[eflags: NZ SF OF CF ND NI]

fffc0050 | 098af25c 0995e5f0 098a857c 083eb488 | \.......|.....>.
fffc0060 | 0000001e 00000000 00000025 e3ffffff | ........%.......
fffc0070 | 0000001b 083f7a44 000000f8 0809396f | ....Dz?.....o9..
fffc0080 | 0990694c 098ad3f4 00000100 08f57000 | Li...........p..

=> 0x80bb076 <ipGlyphMapAdjust+166>:    mov    %eax,(%edx,%edi,4)
   0x80bb079 <ipGlyphMapAdjust+169>:    add    $0x1,%edi
   0x80bb07c <ipGlyphMapAdjust+172>:    addl   $0x1,-0x38(%ebp)
   0x80bb080 <ipGlyphMapAdjust+176>:    mov    -0x38(%ebp),%edx
   0x80bb083 <ipGlyphMapAdjust+179>:    cmp    %edx,-0x30(%ebp)
   0x80bb086 <ipGlyphMapAdjust+182>:    jle    0x80bb017 <ipGlyphMapAdjust+71>
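
The address targeted by the faulting instruction can be reconstructed from the register dump above. The sketch below reproduces the 32-bit effective-address calculation for `(%edx,%edi,4)`; the helper name is illustrative.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, index, scale=4):
    """Compute base + index * scale with 32-bit wraparound."""
    return (base + index * scale) & MASK32


edx = 0x09904E24  # destination array base from the register dump
edi = 0xE3FFFFFF  # attacker-controlled index from the register dump

print(hex(effective_address(edx, edi)))
```

The computed address lies far from the destination array, which is consistent with the SIGSEGV reported at 0x080bb076.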


Credit

Discovered by Marcin Noga of Cisco Talos.


Timeline

2016-10-10 - Vendor Disclosure
2017-02-27 - Public Release