Talos Vulnerability Report

TALOS-2018-0711

Atlantis Word Processor open document format unchecked NewAnsiString length remote code execution vulnerability

November 20, 2018

CVE Number

CVE-2018-4038

Summary

An exploitable arbitrary write vulnerability exists in the open document format parser of the Atlantis Word Processor, version 3.2.7.2, while trying to null-terminate a string. A specially crafted document can allow an attacker to pass an untrusted value as a length to a constructor. This constructor will miscalculate a length and then use it to calculate the position to write a null byte. This can allow an attacker to corrupt memory, which can result in code execution under the context of the application. An attacker must convince a victim to open a specially crafted document in order to trigger this vulnerability.

Tested Versions

Atlantis Word Processor 3.2.7.1, 3.2.7.2

Product URLs

https://www.atlantiswordprocessor.com/en/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-131: Incorrect Calculation of Buffer Size

Details

Atlantis Word Processor is a traditional word processor that provides a number of useful features for a variety of users. The software is fully compatible with other word processors, such as Microsoft Office Word 2007, and even has a similar interface to Microsoft Word. Atlantis also has the ability to encrypt document files and fully customize the interface. This application is written in Delphi and contains the majority of its capabilities within a single relocatable binary.

When opening a document that follows the open document format specification, the application first fingerprints it in order to determine the correct file format parser. This is performed by the following code, which first fetches the current TDoc object and then checks the field that holds the current file format enumeration. When the open document format is selected, this field has the value “3”, which results in the execution of case 3. At [1], the application then calls a function that fingerprints the document and continues parsing it.

awp+0x1b3139:
005b3139 8b45e8          mov     eax,dword ptr [ebp-18h]    // TDoc
005b313c 8b80dc000000    mov     eax,dword ptr [eax+0DCh]   // TDoc.fileFormatEnumeration
005b3142 83f805          cmp     eax,5
005b3145 776a            ja      awp+0x1b31b1 (005b31b1)
005b3147 ff24854e315b00  jmp     dword ptr awp+0x1b314e (005b314e)[eax*4]
...
awp+0x1b3193:
005b3193 55              push    ebp                        // Case 3
005b3194 e8fbc0ffff      call    awp+0x1af294 (005af294)    // [1]
005b3199 59              pop     ecx
005b319a 8885d7f8ffff    mov     byte ptr [ebp-729h],al
005b31a0 eb1c            jmp     awp+0x1b31be (005b31be)
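
The bounds-checked jump table above can be modelled roughly as follows. This is a minimal sketch, not the application's code: the handler names and return values are hypothetical, and only case 3 is described in this report, so every other slot is filled with a placeholder.

```python
# Hypothetical model of the jump table dispatch at awp+0x1b3142.
# Handler names are illustrative; only case 3 is described in the report.

def parse_open_document(doc):
    # Stands in for the call at [1] (awp+0x1af294).
    return "open document parser"

def other_format(doc):
    # Placeholder for the cases that the report does not describe.
    return "other format"

def default_case(doc):
    # Stands in for the branch taken at awp+0x1b31b1.
    return "default"

# Six slots (0..5), matching the `cmp eax,5` / `ja` bound.
JUMP_TABLE = [other_format] * 6
JUMP_TABLE[3] = parse_open_document

def dispatch(file_format, doc=None):
    # `ja` is an unsigned comparison, so a negative value also
    # ends up on the default path.
    index = file_format & 0xFFFFFFFF
    if index > 5:
        return default_case(doc)
    return JUMP_TABLE[index](doc)
```

Calling `dispatch(3)` selects the open document path, while any value above 5 falls through to the default branch.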

When processing the document, the application will execute the function at 0x5af294. After initializing a couple of data structures and creating an instance of the TUnpackedZip object, the function will parse a couple of XML files in the document. At [2], the application will extract the “content.xml” file from the document, and then parse the file with the function call at [3]. Once this is performed, the function will then later navigate through the different elements in the “content.xml” file.

awp+0x1af294:
005af294 55              push    ebp
005af295 8bec            mov     ebp,esp
005af297 81c4d8feffff    add     esp,0FFFFFED8h
005af29d 53              push    ebx
005af29e 56              push    esi
005af29f 57              push    edi
005af2a0 33c0            xor     eax,eax
...
005af3d5 8b17            mov     edx,dword ptr [edi]
005af3d7 8d85e0feffff    lea     eax,[ebp-120h]             // Filename
005af3dd b960f95a00      mov     ecx,offset awp+0x1af960 (005af960)
005af3e2 e8c141e5ff      call    awp+0x35a8 (004035a8)      // LStrCat3
005af3e7 8b95e0feffff    mov     edx,dword ptr [ebp-120h]   // Filename
005af3ed 8b4508          mov     eax,dword ptr [ebp+8]      // Frame
005af3f0 8b805cf9ffff    mov     eax,dword ptr [eax-6A4h]   // TUnpackedZip
005af3f6 33c9            xor     ecx,ecx
005af3f8 e89301f2ff      call    awp+0xcf590 (004cf590)     // [2] Extract file from ZIP
005af3fd 8b17            mov     edx,dword ptr [edi]
005af3ff 8d85e0feffff    lea     eax,[ebp-120h]             // Filename
005af405 b960f95a00      mov     ecx,offset awp+0x1af960 (005af960)
005af40a e89941e5ff      call    awp+0x35a8 (004035a8)      // LStrCat3
005af40f 8b85e0feffff    mov     eax,dword ptr [ebp-120h]   // Filename
005af415 8b95e8feffff    mov     edx,dword ptr [ebp-118h]
005af41b e830f1f1ff      call    awp+0xce550 (004ce550)     // [3] Parse XML
005af420 8945f8          mov     dword ptr [ebp-8],eax      // Stored here
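
For readers who want to inspect a document the same way, steps [2] and [3] can be approximated with Python's standard library. This is a sketch under stated assumptions: `build_sample_document` and `load_content_xml` are hypothetical helper names, and the sample document is a benign, minimal archive that holds only “content.xml”, not the proof of concept.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def build_sample_document():
    # Build a tiny in-memory archive that contains only content.xml.
    content = (
        '<office:document-content xmlns:office="' + OFFICE_NS + '" '
        'xmlns:text="' + TEXT_NS + '">'
        '<office:body><office:text>'
        '<text:p>a<text:s text:c="3"/>b</text:p>'
        '</office:text></office:body>'
        '</office:document-content>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("content.xml", content)
    return buf.getvalue()

def load_content_xml(data):
    # Roughly step [2]: extract content.xml from the ZIP container.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        raw = archive.read("content.xml")
    # Roughly step [3]: parse the extracted XML and return its root element.
    return ET.fromstring(raw)
```

The returned root element plays the role of the value that the application stores at [ebp-8].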

Later, inside the same function, the application uses the function calls at [4] to descend through the different nodes within the parsed “content.xml” document. The first element that the application searches for is labelled “office:body”. Immediately after locating this element, it searches for “office:text”. The resulting element is then passed directly in the eax register to the function call at [5] in order for the application to process its children.

awp+0x1af679:
005af679 55              push    ebp
005af67a ba40fb5a00      mov     edx,offset awp+0x1afb40 (005afb40) // "office:body"
005af67f 8b45f8          mov     eax,dword ptr [ebp-8]              // content.xml
005af682 e825dcf1ff      call    awp+0xcd2ac (004cd2ac)             // [4] Search through XML document for element
005af687 ba54fb5a00      mov     edx,offset awp+0x1afb54 (005afb54) // "office:text"
005af68c e81bdcf1ff      call    awp+0xcd2ac (004cd2ac)             // [4] Search through XML document for element
005af691 e8facdffff      call    awp+0x1ac490 (005ac490)            // [5] Processes element's children
005af696 59              pop     ecx
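
The same descent can be sketched with a namespace-aware lookup. The helper names below are hypothetical, and the embedded sample is a benign stand-in for a real “content.xml”; the sketch only illustrates the order of the two searches at [4] and the hand-off of the result to [5].

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

SAMPLE = (
    '<office:document-content '
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    '<office:body><office:text>'
    '<text:p>a<text:s text:c="3"/>b</text:p>'
    '</office:text></office:body>'
    '</office:document-content>'
)

def find_text_root(root):
    # First search at [4]: locate "office:body" under the document root.
    body = root.find("office:body", NS)
    if body is None:
        return None
    # Second search at [4]: locate "office:text" under the body.
    return body.find("office:text", NS)

def child_tags(element):
    # Rough analogue of [5]: enumerate the element's direct children.
    return [child.tag for child in element]
```

For the sample above, `child_tags(find_text_root(ET.fromstring(SAMPLE)))` yields a single paragraph element in the text namespace.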

The function at 0x5ac490 processes the child elements of an XML element. After storing some information to retain the current state of parsing, the application enters a loop at 0x5ac529. This loop iterates through each of the child elements belonging to the XML element that is currently being parsed. For each child element, the application passes the current index to the function call at [6], which returns the child element, and increments the index at the end of the iteration. The child element’s tag name is stored at offset +0x20. This tag name is passed to the function call at [7] in order to convert it into a token identifier. The token identifier is then used in a case statement to determine how to parse that particular element.

awp+0x1ac490:
005ac490 55              push    ebp
005ac491 8bec            mov     ebp,esp
005ac493 81c4e8feffff    add     esp,0FFFFFEE8h
005ac499 53              push    ebx
005ac49a 56              push    esi
005ac49b 57              push    edi
005ac49c 33d2            xor     edx,edx
...
awp+0x1ac529:
005ac529 8bd3            mov     edx,ebx                    // Index
005ac52b 8b45fc          mov     eax,dword ptr [ebp-4]
005ac52e e8d9b7e5ff      call    awp+0x7d0c (00407d0c)      // [6] Fetch child element
005ac533 8bf0            mov     esi,eax
005ac535 8b4620          mov     eax,dword ptr [esi+20h]    // XML Tag
005ac538 e81ba7ffff      call    awp+0x1a6c58 (005a6c58)    // [7] Convert XML Tag to Token Id
005ac53d 25ff000000      and     eax,0FFh
005ac542 83c0fb          add     eax,0FFFFFFFBh
005ac545 3db4000000      cmp     eax,0B4h
005ac54a 0f8710150000    ja      awp+0x1ada60 (005ada60)
005ac550 8a805dc55a00    mov     al,byte ptr awp+0x1ac55d (005ac55d)[eax]
005ac556 ff248512c65a00  jmp     dword ptr awp+0x1ac612 (005ac612)[eax*4]
...
005ada87 43              inc     ebx                        // Next Index
005ada88 4f              dec     edi
005ada89 0f859aeaffff    jne     awp+0x1ac529 (005ac529)
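
The case statement above is compiled as a two-level lookup: the token is rebased by 5, bounds-checked against 0xB4, mapped through a byte table, and the resulting byte indexes the jump table. A minimal sketch of that arithmetic follows. The table contents are hypothetical, and treating the report's “case 152” as the token identifier for “text:s” is an assumption made for illustration only.

```python
# Illustrative model of the two-level switch at awp+0x1ac53d.
DEFAULT_SLOT = 0     # hypothetical slot that stands for awp+0x1ada60
TEXT_S_SLOT = 1      # hypothetical slot for the "text:s" handler
TEXT_S_TOKEN = 152   # assumption: "case 152" is the token identifier

# The byte table covers rebased indices 0..0xB4 inclusive.
BYTE_TABLE = [DEFAULT_SLOT] * 0xB5
BYTE_TABLE[TEXT_S_TOKEN - 5] = TEXT_S_SLOT
HANDLERS = ["default", "text:s"]

def select_handler(token_id):
    # and eax,0FFh / add eax,0FFFFFFFBh (subtract 5 with 32-bit wraparound)
    index = ((token_id & 0xFF) - 5) & 0xFFFFFFFF
    # cmp eax,0B4h / ja: unsigned comparison, so a wrapped value is rejected
    if index > 0xB4:
        return HANDLERS[DEFAULT_SLOT]
    # mov al,[byte table + eax] / jmp [jump table + eax*4]
    return HANDLERS[BYTE_TABLE[index]]
```

Tokens below 5 wrap around to a large unsigned value after the subtraction, which is why a single unsigned comparison is enough to reject both ends of the range.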

When the application handles case 152 for “text:s”, the following code is executed. The “text:s” element represents a run of spaces at a given point within a document, and its “text:c” attribute holds the number of spaces. When processing this tag name, the application will first extract the “text:c” attribute via the function call at [8]. The attribute value is then converted from a WString into an ANSI string and parsed as an integer at [9]. In order to prevent integer overflow, the application performs a signedness check at [10] before passing the result to LStrSetLength at [11]. The provided proof-of-concept sets the value of “text:c” to 0x7fffffff, the largest positive signed 32-bit integer, which passes the signedness check.

awp+0x1ad474:
005ad474 8b4508          mov     eax,dword ptr [ebp+8]              // Frame
005ad477 8d48fc          lea     ecx,[eax-4]                        // XML element
005ad47a ba3cdd5a00      mov     edx,offset awp+0x1add3c (005add3c) // "text:c"
005ad47f 8bc6            mov     eax,esi
005ad481 e8eafbf1ff      call    awp+0xcd070 (004cd070)             // [8] Check property
005ad486 84c0            test    al,al
005ad488 747a            je      awp+0x1ad504 (005ad504)
005ad48a 8d85f8feffff    lea     eax,[ebp-108h]                     // Number
005ad490 8b5508          mov     edx,dword ptr [ebp+8]              // Frame
005ad493 8b52fc          mov     edx,dword ptr [edx-4]              // XML element
005ad496 e88960e5ff      call    awp+0x3524 (00403524)              // LStrFromWStr
005ad49b 8b85f8feffff    mov     eax,dword ptr [ebp-108h]           // Number
005ad4a1 33d2            xor     edx,edx
005ad4a3 e82cebe5ff      call    awp+0xbfd4 (0040bfd4)              // [9] StrToInt
005ad4a8 8bd0            mov     edx,eax
005ad4aa 33c0            xor     eax,eax
005ad4ac e88f13e6ff      call    awp+0xe840 (0040e840)              // [10] Signedness check
005ad4b1 8bd0            mov     edx,eax
005ad4b3 8d45e4          lea     eax,[ebp-1Ch]
005ad4b6 e8d163e5ff      call    awp+0x388c (0040388c)              // [11] LStrSetLength
005ad4bb 837de400        cmp     dword ptr [ebp-1Ch],0
005ad4bf 0f849b050000    je      awp+0x1ada60 (005ada60)
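
To see why the check at [10] does not help, consider the sketch below. It assumes that the function at 0x40e840 behaves like Max(0, value), which is a guess based on its arguments (eax = 0, edx = the parsed value) and on its description as a signedness check; the report does not show its body. The name `requested_length` is hypothetical.

```python
INT32_MIN = -0x80000000
INT32_MAX = 0x7FFFFFFF

def requested_length(text_c):
    # [9] StrToInt: parse the attribute as a signed 32-bit integer.
    count = int(text_c)
    if not INT32_MIN <= count <= INT32_MAX:
        raise ValueError("value does not fit in a signed 32-bit integer")
    # [10] Signedness check, modelled here as Max(0, count) (assumption).
    return max(0, count)
```

Under this model a negative count is clamped to zero, but `requested_length("2147483647")` is returned unchanged because the check places no upper bound on the length that reaches LStrSetLength.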

Unfortunately, the implementation of LStrSetLength behaves insecurely when the application requests a very long string. LStrSetLength is implemented by the following code. After performing a number of checks that determine whether the string needs to be resized or re-allocated, the implementation calls NewAnsiString at [12]. This function takes the length as a parameter, adds 9 to it, and passes the result to System.GetMemory at [13]. For a length of 0x7fffffff, the addition yields 0x80000008 in a 32-bit register, and due to the way System.GetMemory calculates and then aligns the size of a chunk, this can result in an undersized allocation. Later, at [14], the original length of 0x7fffffff is used as an offset from the start of the string data in the returned chunk to write the null terminator. This write lands outside the bounds of the returned allocation, which can cause heap corruption and allow for code execution under the context of the application.

awp+0x388c:
0040388c 53              push    ebx
0040388d 56              push    esi
0040388e 57              push    edi
0040388f 89c3            mov     ebx,eax
00403891 89d6            mov     esi,edx
00403893 31ff            xor     edi,edi
...
004038c2 89d0            mov     eax,edx                // Length
004038c4 e8c7faffff      call    awp+0x3390 (00403390)  // [12] NewAnsiString
...
awp+0x3390:
00403390 85c0            test    eax,eax
00403392 7e1c            jle     awp+0x33b0 (004033b0)
00403394 50              push    eax
00403395 83c009          add     eax,9                  // Add 9 to length
00403398 e8a3efffff      call    awp+0x2340 (00402340)  // [13] System.GetMemory
0040339d 83c008          add     eax,8
004033a0 5a              pop     edx
004033a1 8950fc          mov     dword ptr [eax-4],edx
004033a4 c740f801000000  mov     dword ptr [eax-8],1
004033ab c6041000        mov     byte ptr [eax+edx],0   // [14] Write out of bounds
004033af c3              ret
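
The pointer arithmetic in NewAnsiString can be followed with a short 32-bit model. This sketch only reproduces the additions visible in the listing; it does not model System.GetMemory, and the chunk address passed in is a hypothetical value.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    # Interpret a 32-bit register value as a signed integer.
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def new_ansi_string_model(length, chunk):
    """Return (requested_size, data_ptr, terminator_addr) as 32-bit values."""
    if to_signed32(length) <= 0:             # test eax,eax / jle
        return None
    requested = (length + 9) & MASK32        # add eax,9
    data = (chunk + 8) & MASK32              # add eax,8 (skip the 8-byte header)
    terminator = (data + length) & MASK32    # [14] byte ptr [eax+edx]
    return requested, data, terminator
```

For a 5-character string, the terminator address is the last byte of the 14-byte request. For a length of 0x7fffffff, the request becomes 0x80000008, which is negative when read as a signed value, and the terminator address ends up almost 2 GiB past the start of the chunk.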

Crash Information

eax=0892366c ebx=0018ec0c ecx=08923664 edx=7fffffff esi=7fffffff edi=00000000
eip=004033ab esp=0018eae4 ebp=0018ec28 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010206
awp+0x33ab:
004033ab c6041000        mov     byte ptr [eax+edx],0       ds:002b:8892366b=??

0:000> ub .
awp+0x3392:
00403392 7e1c            jle     awp+0x33b0 (004033b0)
00403394 50              push    eax
00403395 83c009          add     eax,9
00403398 e8a3efffff      call    awp+0x2340 (00402340)
0040339d 83c008          add     eax,8
004033a0 5a              pop     edx
004033a1 8950fc          mov     dword ptr [eax-4],edx
004033a4 c740f801000000  mov     dword ptr [eax-8],1

0:000> r @edx
edx=7fffffff

0:000> da poi(@ebp-108)
0babda90  "2147483647"

0:000> .formats 0n2147483647
Evaluate expression:
  Hex:     7fffffff
  Decimal: 2147483647
  Octal:   17777777777
  Binary:  01111111 11111111 11111111 11111111
  Chars:   ...
  Time:    ***** Invalid
  Float:   low 1.#QNAN high 0
  Double:  1.061e-314
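
The register values in the crash can be cross-checked against the faulting address with a few lines of arithmetic. The function name is hypothetical, and the constants are copied from the debugger output above.

```python
def faulting_address(eax, edx):
    # Effective address of `mov byte ptr [eax+edx],0` in 32-bit arithmetic.
    return (eax + edx) & 0xFFFFFFFF

EAX = 0x0892366C           # pointer to the string data at the time of the crash
EDX = 0x7FFFFFFF           # attacker-controlled length
TEXT_C = "2147483647"      # attribute string dumped from [ebp-108h]

FAULT = faulting_address(EAX, EDX)
```

`FAULT` evaluates to 0x8892366b, which matches the address that the debugger reports as unreadable, and `int(TEXT_C)` equals the value held in edx.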

Exploit Proof of Concept

To use the proof of concept, simply open up or preview the document in the target application. The application should crash at the address specified due to heap corruption.

Timeline

2018-11-16 - Vendor Disclosure
2018-11-20 - Vendor patched; Public Release

Credit

Discovered by a member of Cisco Talos.