Talos Vulnerability Report

TALOS-2018-0713

Atlantis Word Processor rich text format uninitialized TAutoList remote code execution vulnerability

November 20, 2018

CVE Number

CVE-2018-4040

Summary

An exploitable uninitialized pointer vulnerability exists in the rich text format parser of Atlantis Word Processor, version 3.2.7.2. A specially crafted document can cause certain RTF tokens to dereference an uninitialized pointer and then write to it. An attacker must convince a victim to open a specially crafted document in order to trigger this vulnerability.

Tested Versions

Atlantis Word Processor 3.2.7.1, 3.2.7.2

Product URLs

https://www.atlantiswordprocessor.com/en/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-457: Use of Uninitialized Variable

Details

Atlantis Word Processor is a traditional word processor that provides a number of useful features for a variety of users. The software is fully compatible with other word processors, such as Microsoft Office Word 2007, and even has a similar interface to Microsoft Word. Atlantis also has the ability to encrypt document files and fully customize the interface. This application is written in Delphi and contains the majority of its capabilities within a single relocatable binary.

When opening up an RTF document, the application will first fingerprint it in order to determine the correct file format parser via the following code. This code will first fetch the current TDoc object and then check one of its fields that represents the current file format enumeration. When RTF is selected, this field will have the value 0, which results in case 0 being executed. At [1], the application will then execute a function that fingerprints and parses the document. After calling the function, the application will allocate a number of data structures and then continue by executing the call at [2].

awp+0x1b3139:
005b3139 8b45e8          mov     eax,dword ptr [ebp-18h]    // TDoc
005b313c 8b80dc000000    mov     eax,dword ptr [eax+0DCh]   // TDoc.fileFormatEnumeration
005b3142 83f805          cmp     eax,5
005b3145 776a            ja      awp+0x1b31b1 (005b31b1)
005b3147 ff24854e315b00  jmp     dword ptr awp+0x1b314e (005b314e)[eax*4]
...
awp+0x1b3166:
005b3166 55              push    ebp                        // Case 0
005b3167 e858dcfdff      call    awp+0x190dc4 (00590dc4)    // [1] \
005b316c 59              pop     ecx
005b316d 8885d7f8ffff    mov     byte ptr [ebp-729h],al
005b3173 eb49            jmp     awp+0x1b31be (005b31be)
\
awp+0x190dc4:
00590dc4 55              push    ebp
00590dc5 8bec            mov     ebp,esp
00590dc7 81c44cf5ffff    add     esp,0FFFFF54Ch
00590dcd 53              push    ebx
00590dce 56              push    esi
00590dcf 57              push    edi
00590dd0 33c0            xor     eax,eax
...
00590fdd 55              push    ebp
00590fde 33c0            xor     eax,eax
00590fe0 e8c7a4ffff      call    awp+0x18b4ac (0058b4ac)    // [2]
00590fe5 59              pop     ecx
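
The call destinations annotated in the listing above can be checked directly from the instruction bytes. The following is a minimal sketch in Python, assuming only the standard encoding of a near relative call (opcode E8 followed by a signed 32-bit displacement); the helper name call_target is hypothetical and not part of the application or the debugger.

```python
import struct

def call_target(address: int, encoding: bytes) -> int:
    """Resolve the destination of an x86 `call rel32` (opcode E8)."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 call")
    # The displacement is a signed little-endian 32-bit value, relative
    # to the address of the instruction that follows the call.
    (rel32,) = struct.unpack("<i", encoding[1:])
    return (address + len(encoding) + rel32) & 0xFFFFFFFF

# awp+0x1b3139 is listed at 005b3139, which gives the module base.
IMAGE_BASE = 0x005B3139 - 0x1B3139

site1 = call_target(0x005B3167, bytes.fromhex("e858dcfdff"))  # call at [1]
site2 = call_target(0x00590FE0, bytes.fromhex("e8c7a4ffff"))  # call at [2]
print(hex(IMAGE_BASE), hex(site1), hex(site2))
```

Both results agree with the addresses shown next to [1] and [2], and the same arithmetic applies to every other call in the listings that follow.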

Once inside the function call at 0x58b4ac, the application will eventually execute the function at [3]. This function will check that the current character is alphabetic. This is used to determine whether the current function should recurse into itself in order to handle grouping or some of the other features provided by the application’s Rich Text Format parser. After checking the beginning of the document, the application will then enter the loop at [4]. This loop will continue to parse the different groups within the document. When a group has been identified by the parser, the function will recurse into itself at [5].

awp+0x18b4ac:
0058b4ac 55              push    ebp
0058b4ad 8bec            mov     ebp,esp
0058b4af 83c4e8          add     esp,0FFFFFFE8h
0058b4b2 53              push    ebx
0058b4b3 56              push    esi
0058b4b4 57              push    edi
0058b4b5 8845ff          mov     byte ptr [ebp-1],al
...
0058b4ca 8b07            mov     eax,dword ptr [edi]
0058b4cc 0303            add     eax,dword ptr [ebx]
0058b4ce 40              inc     eax
0058b4cf e8cc64eaff      call    awp+0x319a0 (004319a0)     // [3] Check alpha character
0058b4d4 84c0            test    al,al
0058b4d6 0f8486000000    je      awp+0x18b562 (0058b562)
...
awp+0x18b5bc:
0058b5bc 8b03            mov     eax,dword ptr [ebx]        // [4] Loop over RTF characters
0058b5be 8b17            mov     edx,dword ptr [edi]
0058b5c0 0fb60402        movzx   eax,byte ptr [edx+eax]
0058b5c4 83f87b          cmp     eax,7Bh
0058b5c7 7f21            jg      awp+0x18b5ea (0058b5ea)
0058b5c9 7442            je      awp+0x18b60d (0058b60d)
...
awp+0x18b60d:
0058b60d 8b07            mov     eax,dword ptr [edi]
0058b60f 0303            add     eax,dword ptr [ebx]
0058b611 e87663eaff      call    awp+0x3198c (0043198c)     // Check current position for '{*\'
0058b616 84c0            test    al,al
0058b618 742a            je      awp+0x18b644 (0058b644)
...
awp+0x18b644:
0058b644 8b4508          mov     eax,dword ptr [ebp+8]
0058b647 50              push    eax
0058b648 b001            mov     al,1
0058b64a e85dfeffff      call    awp+0x18b4ac (0058b4ac)    // [5] Recurse into current function
0058b64f 59              pop     ecx
0058b650 e934040000      jmp     awp+0x18ba89 (0058ba89)
...
awp+0x18bab2:
0058bab2 8b03            mov     eax,dword ptr [ebx]
0058bab4 8b5508          mov     edx,dword ptr [ebp+8]
0058bab7 8b5208          mov     edx,dword ptr [edx+8]
0058baba 3b42dc          cmp     eax,dword ptr [edx-24h]    // Distance to move
0058babd 0f8cf9faffff    jl      awp+0x18b5bc (0058b5bc)
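
The control flow of this loop can be approximated as follows. This is a simplified sketch in Python, not a reconstruction of the application's code: it models only the recursion on a left brace (0x7B) and the position check that bounds the loop, and it ignores tokens, the '{\*' check and escaped braces. The function names and the sample input are hypothetical.

```python
def parse_group(data: bytes, pos: int, depth: int, seen: list) -> int:
    """Walk one group and return the position just past its closing brace."""
    while pos < len(data):          # bounded by the buffer length, as in the jl at the end of the loop
        ch = data[pos]
        if ch == 0x7B:              # '{' opens a nested group: recurse, as at [5]
            seen.append(depth + 1)
            pos = parse_group(data, pos + 1, depth + 1, seen)
        elif ch == 0x7D:            # '}' closes the current group
            return pos + 1
        else:
            pos += 1                # any other byte is skipped in this model
    return pos

def group_depths(data: bytes) -> list:
    """Return the nesting depth of every group in the order it is opened."""
    seen = []
    parse_group(data, 0, 0, seen)
    return seen

depths = group_depths(rb"{\rtf1{\info{\title x}}{\fonttbl}}")
print(depths)  # [1, 2, 3, 2]
```

Each nested group costs one level of recursion, which is why the state the parser needs is reached through the chain of caller frames seen in the listings.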

Once the previous function has determined it needs to recurse into itself, the application will re-enter the function at 0x58b4ac. After determining that the currently parsed character is beginning a group or a valid token, the application will execute the function call at [6]. This function will verify that the current character position is pointing at characters that could make up an RTF token and return the token in %eax. After getting the token identifier, the application will then pass the identifier in %eax to the function call at [7] to parse it.

awp+0x18b4dc:
0058b4dc 8b4508          mov     eax,dword ptr [ebp+8]
0058b4df 50              push    eax
0058b4e0 8b4508          mov     eax,dword ptr [ebp+8]
0058b4e3 0508f7ffff      add     eax,0FFFFF708h
0058b4e8 e81fbbffff      call    awp+0x18700c (0058700c)    // [6] Check valid RTF token
0058b4ed 59              pop     ecx
0058b4ee 83f8ff          cmp     eax,0FFFFFFFFh             // Token Id
0058b4f1 7557            jne     awp+0x18b54a (0058b54a)
...
0058b54a 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b54d 52              push    edx
0058b54e 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b551 8b920cf7ffff    mov     edx,dword ptr [edx-8F4h]
0058b557 e8cc510000      call    awp+0x190728 (00590728)    // [7]
0058b55c 59              pop     ecx
0058b55d e97e050000      jmp     awp+0x18bae0 (0058bae0)

Once inside the function at 0x590728, the application will first assign the current token identifier into a variable on the stack. Eventually, this identifier will be used to select a particular case to continue parsing with at [8]. When parsing the token “\subject”, which has the identifier 0x179, the application will then execute the case at [9]. The case at [9] will first grab the current TDoc instance, and then compute the address of the TDoc’s description field (offset 0xEC) in %eax. This address is then passed, along with the pushed frame pointer, to the call at [10].

awp+0x190728:
00590728 55              push    ebp
00590729 8bec            mov     ebp,esp
0059072b 83c4f8          add     esp,0FFFFFFF8h
0059072e 53              push    ebx
0059072f 8bda            mov     ebx,edx
00590731 8945f8          mov     dword ptr [ebp-8],eax      // Case
...
00590775 0fb745f8        movzx   eax,word ptr [ebp-8]       // [8] Case
00590779 3ddf000000      cmp     eax,0DFh
0059077e 0f8f32010000    jg      awp+0x1908b6 (005908b6)
00590784 0f8442040000    je      awp+0x190bcc (00590bcc)
...
005908b6 3d35010000      cmp     eax,135h
005908bb 0f8fa1000000    jg      awp+0x190962 (00590962)
...
00590962 3d77010000      cmp     eax,177h
00590967 7f4a            jg      awp+0x1909b3 (005909b3)
...
005909b3 3da9010000      cmp     eax,1A9h
005909b8 7f1f            jg      awp+0x1909d9 (005909d9)
005909ba 0f848e030000    je      awp+0x190d4e (00590d4e)
005909c0 2d79010000      sub     eax,179h
005909c5 0f8451030000    je      awp+0x190d1c (00590d1c)
...
00590d1c 55              push    ebp                        // [9]
00590d1d 8b4508          mov     eax,dword ptr [ebp+8]      // Frame
00590d20 8b4008          mov     eax,dword ptr [eax+8]
00590d23 8b40e8          mov     eax,dword ptr [eax-18h]    // TDoc
00590d26 05ec000000      add     eax,0ECh                   // TDoc.description
00590d2b b201            mov     dl,1
00590d2d e802aeffff      call    awp+0x18bb34 (0058bb34)    // [10]
00590d32 59              pop     ecx
00590d33 eb7c            jmp     awp+0x190db1 (00590db1)

The function at 0x58bb34 is simply a wrapper around the call at [11]. The wrapper pushes the frame belonging to its caller so that the callee can reach the variables stored there, and then calls another function. Once inside the function at 0x58afa4, the application will then enter a loop at [12] in order to determine how to parse the different tokens in the document. This loop will read a byte from the current position in the file and then check to see if it is non-printable, a backslash, or one of the types of braces. When processing the token, the application will execute the case for the backslash at [13].

awp+0x18bb34:
0058bb34 55              push    ebp
0058bb35 8bec            mov     ebp,esp
0058bb37 8b4d08          mov     ecx,dword ptr [ebp+8]          // Caller frame
0058bb3a 8b4908          mov     ecx,dword ptr [ecx+8]          // Frame belonging to 0x590dc4
0058bb3d 51              push    ecx
0058bb3e 33c9            xor     ecx,ecx
0058bb40 e85ff4ffff      call    awp+0x18afa4 (0058afa4)        // [11] \
0058bb45 59              pop     ecx
0058bb46 8b4508          mov     eax,dword ptr [ebp+8]          // Caller frame
0058bb49 8b4008          mov     eax,dword ptr [eax+8]          // Frame belonging to 0x590dc4
0058bb4c 8b4008          mov     eax,dword ptr [eax+8]          // Frame belonging to 0x5b2a3c
0058bb4f ff40f8          inc     dword ptr [eax-8]
0058bb52 5d              pop     ebp
0058bb53 c3              ret
\
awp+0x18afa4:
0058afa4 55              push    ebp
0058afa5 8bec            mov     ebp,esp
0058afa7 83c4d4          add     esp,0FFFFFFD4h
0058afaa 53              push    ebx
0058afab 56              push    esi
0058afac 57              push    edi
0058afad 33db            xor     ebx,ebx
...
0058b003 8b03            mov     eax,dword ptr [ebx]        // [12] Loop
0058b005 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b008 8b5208          mov     edx,dword ptr [edx+8]      // Caller's Frame
0058b00b 8b52e0          mov     edx,dword ptr [edx-20h]    // File contents buffer
0058b00e 8a0402          mov     al,byte ptr [edx+eax]
0058b011 2c20            sub     al,20h
0058b013 0f8238030000    jb      awp+0x18b351 (0058b351)    // Non-printable
0058b019 2c3c            sub     al,3Ch
0058b01b 0f84d0000000    je      awp+0x18b0f1 (0058b0f1)    // [13] If character is "\"
0058b021 2c1f            sub     al,1Fh
0058b023 740d            je      awp+0x18b032 (0058b032)    // Left Brace
0058b025 2c02            sub     al,2
0058b027 0f8406030000    je      awp+0x18b333 (0058b333)    // Right Brace
0058b02d e926030000      jmp     awp+0x18b358 (0058b358)
...
awp+0x18b450:
0058b450 8b03            mov     eax,dword ptr [ebx]        // [12] Continue
0058b452 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b455 8b5208          mov     edx,dword ptr [edx+8]      // Caller's Frame
0058b458 3b42dc          cmp     eax,dword ptr [edx-24h]
0058b45b 0f8ca2fbffff    jl      awp+0x18b003 (0058b003)
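
The chain of subtractions at [12] is a compact way of comparing one byte against several constants. The sketch below replays the same arithmetic in Python with 8-bit wraparound; the function name classify and the returned labels are illustrative only.

```python
def classify(byte: int) -> str:
    """Replay the sub/jb/je chain from the loop at [12] on a single byte."""
    al = byte & 0xFF
    borrow = al < 0x20              # carry flag after `sub al,20h`
    al = (al - 0x20) & 0xFF
    if borrow:
        return "non-printable"      # jb: byte was below 0x20
    al = (al - 0x3C) & 0xFF
    if al == 0:
        return "backslash"          # 0x20 + 0x3C = 0x5C
    al = (al - 0x1F) & 0xFF
    if al == 0:
        return "left brace"         # 0x5C + 0x1F = 0x7B
    al = (al - 0x02) & 0xFF
    if al == 0:
        return "right brace"        # 0x7B + 0x02 = 0x7D
    return "other"

for sample in (0x0A, ord("\\"), ord("{"), ord("}"), ord("a")):
    print(hex(sample), classify(sample))
```

Summing the subtracted constants recovers the literal being tested at each step, which is how the comments in the listing arrive at the backslash and the two braces.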

When processing a backslash which prefixes a Rich Text Format token, the following code will be executed in order to identify the token. First at [14], the application will read a bigram from the current file position and use this to single out which array to match the token against. Once this is determined, at [15] the application will call a function that will do a comparison and store the actual token identifier. After doing a few checks to handle a specific case for another token, the application will then fetch the current paragraph from one of the caller’s frames and then pass it as an argument along with the token identifier to the call at [16].

awp+0x18b0f1:
0058b0f1 8b4508          mov     eax,dword ptr [ebp+8]      // Frame
0058b0f4 8b4008          mov     eax,dword ptr [eax+8]      // Caller's Frame
0058b0f7 8b40e0          mov     eax,dword ptr [eax-20h]    // File contents buffer
0058b0fa 0303            add     eax,dword ptr [ebx]
0058b0fc e89f68eaff      call    awp+0x319a0 (004319a0)     // [14] Read a bigram to identify token
0058b101 84c0            test    al,al
0058b103 0f84f5000000    je      awp+0x18b1fe (0058b1fe)
0058b109 8b4508          mov     eax,dword ptr [ebp+8]      // Frame
0058b10c 8b4008          mov     eax,dword ptr [eax+8]      // Caller's Frame
0058b10f 8b40e0          mov     eax,dword ptr [eax-20h]    // File contents buffer
0058b112 0303            add     eax,dword ptr [ebx]
0058b114 40              inc     eax
0058b115 8d55dc          lea     edx,[ebp-24h]
0058b118 e8af68eaff      call    awp+0x319cc (004319cc)     // Help single out which token array
0058b11d 8d45dc          lea     eax,[ebp-24h]
0058b120 e88f6aeaff      call    awp+0x31bb4 (00431bb4)     // [15] Return the token identifier
...
awp+0x18b12b:
0058b12b 3da7010000      cmp     eax,1A7h
0058b130 755e            jne     awp+0x18b190 (0058b190)
...
awp+0x18b190:
0058b190 8b5508          mov     edx,dword ptr [ebp+8]
0058b193 f6429240        test    byte ptr [edx-6Eh],40h
0058b197 743c            je      awp+0x18b1d5 (0058b1d5)
0058b199 3d7c010000      cmp     eax,17Ch
0058b19e 7535            jne     awp+0x18b1d5 (0058b1d5)
...
awp+0x18b1d5:
0058b1d5 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b1d8 52              push    edx
0058b1d9 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b1dc 8b5208          mov     edx,dword ptr [edx+8]      // Caller's Frame
0058b1df 8b9258f9ffff    mov     edx,dword ptr [edx-6A8h]   // Current Paragraph (TPar)
0058b1e5 83c214          add     edx,14h                    // Paragraph Property
0058b1e8 52              push    edx
0058b1e9 8b5508          mov     edx,dword ptr [ebp+8]      // Frame
0058b1ec 8d4aa4          lea     ecx,[edx-5Ch]
0058b1ef 8d55dc          lea     edx,[ebp-24h]              // Token Identifier
0058b1f2 92              xchg    eax,edx
0058b1f3 e8b8c4ffff      call    awp+0x1876b0 (005876b0)    // [16] Handle Token Operand
0058b1f8 59              pop     ecx
0058b1f9 e929020000      jmp     awp+0x18b427 (0058b427)
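
One way to picture the lookup at [14] and [15] is a two-stage search: the first two letters of the token choose a bucket, and the full token is then compared against the entries in that bucket. The sketch below is a hypothetical model in Python. The table layout, the function name and the -1 result for an unknown token are all assumptions; only the identifier 0x179 for “\subject” and the case number 232 for “\listsimple” are taken from this report, and 232 is used exactly as written there.

```python
# Hypothetical table: buckets keyed by the first two letters of a token.
TOKEN_BUCKETS = {
    b"su": {b"subject": 0x179},    # identifier quoted in the report
    b"li": {b"listsimple": 232},   # "case 232" as written in the report
}

def lookup_token(data: bytes, pos: int) -> int:
    """Return the identifier of the token after the backslash at `pos`, or -1."""
    start = pos + 1
    end = start
    # Collect the alphabetic characters that make up the token name.
    while end < len(data) and data[end:end + 1].isalpha():
        end += 1
    word = data[start:end]
    bucket = TOKEN_BUCKETS.get(word[:2], {})   # stage 1: pick the bucket by bigram
    return bucket.get(word, -1)                # stage 2: compare the full token

print(hex(lookup_token(rb"\subject text", 0)))  # 0x179
```

Narrowing by bigram first keeps the number of string comparisons per token small, which is a plausible reason for the split into two calls.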

Finally, when the application executes the function at 0x5876b0, it will begin to parse any operands belonging to the token. There are a number of different types of tokens that the application can parse, depending on the token identifier that was determined in the calling function. At [17], the application will use the token identifier to select the correct case to execute. When handling case 232 for the token “\listsimple”, the block of code beginning near [18] will be executed. The function call at [18] will simply extract a numerical argument out of the text that follows the token and then store it into the %ebx register. Afterwards, the function call at [19] will be used to fetch the last element of a TAutoList that should have been initialized earlier. Because the application does not check the state of the list before the function call at [19], the call can return an invalid value that depends on the capacity of the list. Due to Delphi’s technique of caching objects, an attacker may be able to get an object with a controlled capacity freed, at which point that capacity will be used to calculate a pointer. Following this at [20], the application will attempt to write the low byte of the integer that was parsed from the file into %ebx to an address relative to the returned pointer. This can allow an attacker to corrupt heap memory, which can lead to code execution under the context of the application.

awp+0x1876b0:
005876b0 55              push    ebp
005876b1 8bec            mov     ebp,esp
005876b3 83c4d4          add     esp,0FFFFFFD4h
005876b6 53              push    ebx
005876b7 56              push    esi
005876b8 57              push    edi
005876b9 894dfc          mov     dword ptr [ebp-4],ecx
...
005876c2 0fb7d0          movzx   edx,ax                     // [17]
005876c5 81fac7010000    cmp     edx,1C7h
005876cb 0f878b380000    ja      awp+0x18af5c (0058af5c)
005876d1 ff2495d8765800  jmp     dword ptr awp+0x1876d8 (005876d8)[edx*4]
...
awp+0x18942c:
0058942c 8bc6            mov     eax,esi
0058942e e84586eaff      call    awp+0x31a78 (00431a78)     // [18] Parse numerical argument
00589433 8bd8            mov     ebx,eax                    // User-controlled value
00589435 8b450c          mov     eax,dword ptr [ebp+0Ch]    // Frame
00589438 8b4008          mov     eax,dword ptr [eax+8]      // Caller's Frame
0058943b 8b8040f9ffff    mov     eax,dword ptr [eax-6C0h]   // TAutoList
00589441 e80aeae7ff      call    awp+0x7e50 (00407e50)      // [19] Return last element in TList (possibly uninitialized)
00589446 885818          mov     byte ptr [eax+18h],bl      // [20] Write byte to uninitialized address
00589449 e90e1b0000      jmp     awp+0x18af5c (0058af5c)

The function call to fetch the last item from a TList is as follows. If the items pointer of the TList points directly into the TList’s cache and the length is 0, the application will fetch an item from outside the bounds of the items array, which can result in an arbitrary value being returned. This value is then used as the base address for the write described above.

awp+0x7e50:
00407e50 8b4804          mov     ecx,dword ptr [eax+4]          // Length
00407e53 8b401c          mov     eax,dword ptr [eax+1Ch]        // Items (which can point into cached list)
00407e56 8b4488fc        mov     eax,dword ptr [eax+ecx*4-4]    // Return item from List cache
00407e5a c3              ret
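
The three instructions above can be modelled in a few lines to show why an empty list is a problem. The sketch below is written in Python and uses a dictionary as a stand-in for memory; the addresses, the stored values and the checked variant are hypothetical and are not taken from the application or from any fix.

```python
def read_dword(mem: dict, addr: int) -> int:
    return mem[addr]  # a KeyError stands in for an access violation

def last_item_unchecked(mem: dict, tlist: int) -> int:
    """Mirror of the function at 0x407e50: no check on the length."""
    length = read_dword(mem, tlist + 0x04)
    items = read_dword(mem, tlist + 0x1C)
    return read_dword(mem, (items + length * 4 - 4) & 0xFFFFFFFF)

def last_item_checked(mem: dict, tlist: int):
    """Hypothetical guarded variant: refuse to read from an empty list."""
    if read_dword(mem, tlist + 0x04) == 0:
        return None
    return last_item_unchecked(mem, tlist)

# A list at 0x1000 with two items stored at 0x2000.
full = {0x1004: 2, 0x101C: 0x2000, 0x2000: 0xAAAA, 0x2004: 0xBBBB}
# The same list with length 0: the read lands 4 bytes before the items.
empty = {0x1004: 0, 0x101C: 0x2000, 0x1FFC: 0xDEADBEEF, 0x2000: 0xAAAA}

print(hex(last_item_unchecked(full, 0x1000)))   # 0xbbbb
print(hex(last_item_unchecked(empty, 0x1000)))  # 0xdeadbeef
print(last_item_checked(empty, 0x1000))         # None
```

With a length of 0 the index expression evaluates to items - 4, so whatever happens to sit immediately before the items array is returned to the caller as if it were a list element.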

Crash Information

eax=00000001 ebx=00000bb3 ecx=00000000 edx=07354f14 esi=0018ea94 edi=0018fb00
eip=00589446 esp=0018e778 ebp=0018e7b0 iopl=0         nv up ei pl nz ac po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010212
awp+0x189446:
00589446 885818          mov     byte ptr [eax+18h],bl      ds:002b:00000019=??

0:000> ub .
awp+0x18942a:
0058942a 0000            add     byte ptr [eax],al
0058942c 8bc6            mov     eax,esi
0058942e e84586eaff      call    awp+0x31a78 (00431a78)
00589433 8bd8            mov     ebx,eax
00589435 8b450c          mov     eax,dword ptr [ebp+0Ch]
00589438 8b4008          mov     eax,dword ptr [eax+8]
0058943b 8b8040f9ffff    mov     eax,dword ptr [eax-6C0h]
00589441 e80aeae7ff      call    awp+0x7e50 (00407e50)

0:000> ? poi(poi(poi(@ebp+c)+8)-6c0)
Evaluate expression: 201277336 = 0bff3f98

0:000> $$>a<c:/users/user/audit/atlantis/scripts/TList.dbgscr poi(poi(poi(@ebp+c)+8)-6c0)
[0bff3f98] <type 'structure' name='TList' size=+0x20>
[0bff3f98] (+0) : p_InfoTable_0 : (00406898) 4221080
[0bff3f9c] (+4) : v_length_4 : (00000000) 0
[0bff3fa0] (+8) : v_capacity_8 : (00000001) 1
[0bff3fa4] (+c) : v_cache(4)_c : { 07366b34, 00000000, 00000000, 00000000 }
[0bff3fb4] (+1c) : p_items_1c : (0bff3fa4) 201277348
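
The values in this dump are enough to account for the faulting address. The following Python sketch only repeats the arithmetic of the function at 0x407e50 and of the write at [20] using the numbers shown above; the variable names are illustrative.

```python
# Values taken from the TList dump and the register state above.
TLIST = 0x0BFF3F98      # object address
LENGTH = 0x00000000     # v_length_4
CAPACITY = 0x00000001   # v_capacity_8
ITEMS = 0x0BFF3FA4      # p_items_1c, pointing at the inline cache (+0xc)
EBX = 0x00000BB3        # value parsed from the document

# mov eax,[eax+ecx*4-4] with ecx = length = 0
fetch_addr = (ITEMS + LENGTH * 4 - 4) & 0xFFFFFFFF
# The fetch lands on the capacity field at +8, so its value is returned.
lands_on_capacity = fetch_addr == TLIST + 0x08
eax = CAPACITY

# mov byte ptr [eax+18h],bl
write_target = eax + 0x18
written_byte = EBX & 0xFF

print(hex(fetch_addr), lands_on_capacity, hex(write_target), hex(written_byte))
```

The computed write target of 0x19 matches the operand shown at the faulting instruction, and the returned value of 1 matches the contents of %eax in the register dump.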

Exploit Proof of Concept

To use the proof of concept, simply open up or preview the document in the target application. The application should crash at the address specified due to heap corruption.

Mitigation

This vulnerability is triggered by simply opening a document file. The only way to mitigate it is to avoid opening document files from untrusted sources.

Timeline

2018-11-16 - Vendor Disclosure
2018-11-20 - Vendor Patched; Public Release

Credit

Discovered by a member of Cisco Talos.