Talos Vulnerability Report

TALOS-2017-0461

libxls xls_preparseWorkSheet MULRK Code Execution Vulnerability

November 15, 2017

CVE Number

CVE-2017-12109

Summary

An exploitable integer overflow vulnerability exists in the xls_preparseWorkSheet function of libxls 1.4 when handling a MULRK record. A specially crafted XLS file can cause memory corruption resulting in remote code execution. An attacker can send a malicious XLS file to trigger this vulnerability.

Tested Versions

libxls 1.4
readxl package 1.0.0 for R (tested using Microsoft R 3.4.1)

Product URLs

http://libxls.sourceforge.net/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CVSSv3 Calculator: https://www.first.org/cvss/calculator/3.0

CWE

CWE-680: Integer Overflow to Buffer Overflow

Details

libxls is a C library, supported on Windows, Mac, and Cygwin, that reads Microsoft Excel File Format (XLS) files. The library is used by the readxl package for the R programming language. The integer overflow occurs in the xls_preparseWorkSheet function, so let’s take a look at the vulnerable code:

Line 970   void xls_preparseWorkSheet(xlsWorkSheet* pWS)
Line 971   {
               (...)
Line 985       read = ole2_read(buf, 1,tmp.size,pWS->workbook->olestr);
Line 986       assert(read == tmp.size);
Line 987       // xls_showBOF(&tmp);
Line 988       switch (tmp.id)
Line 989       {
               (...)
Line 1004      /* If the ROW record is incorrect or missing, infer the information from
Line 1005       * cell data. */
Line 1006      case 0x00BD:        //MULRK
Line 1007          if (pWS->rows.lastcol<xlsShortVal(((MULRK*)buf)->col) + (tmp.size - 6)/6 - 1)
Line 1008              pWS->rows.lastcol=xlsShortVal(((MULRK*)buf)->col) + (tmp.size - 6)/6 - 1;
Line 1009          if (pWS->rows.lastrow<xlsShortVal(((MULRK*)buf)->row))
Line 1010              pWS->rows.lastrow=xlsShortVal(((MULRK*)buf)->row);
Line 1011          break;


The purpose of the `xls_preparseWorkSheet` function is to find the largest `col` and `row` values among the records present in the worksheet and store them in the `lastcol` and `lastrow` fields. As lines `1007-1008` show, the computed last column can overflow. This can have two potential impacts. In one case, `lastcol` is not updated even though the MULRK `col` field is greater than the value currently stored in `lastcol`. In the other case, `lastcol` is updated to the overflowed value.

The malformed MULRK record is located at offset 0xCB1F and looks as follows:

0xCB1F BD 00 20 00 AA AA FF FF
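
The eight bytes above can be read as four little-endian 16-bit fields: record id, record size, row, and col. The following is a minimal sketch of that decoding. The `mulrk_header` type and both helper functions are hypothetical names introduced for illustration, not part of libxls, which uses its own BOF and MULRK structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the fields discussed in this report. */
typedef struct {
    uint16_t id;    /* record type, 0x00BD for MULRK */
    uint16_t size;  /* length of the record body in bytes */
    uint16_t row;   /* first field of the MULRK body */
    uint16_t col;   /* second field of the MULRK body */
} mulrk_header;

/* Read a 16-bit little-endian value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the first 8 bytes of a record.
 * Returns 0 on success, -1 if fewer than 8 bytes are available. */
static int decode_mulrk_header(const uint8_t *buf, size_t len, mulrk_header *out)
{
    if (len < 8)
        return -1;
    out->id   = read_le16(buf);
    out->size = read_le16(buf + 2);
    out->row  = read_le16(buf + 4);
    out->col  = read_le16(buf + 6);
    return 0;
}
```

Applied to `BD 00 20 00 AA AA FF FF`, this yields id 0x00BD, size 0x0020, row 0xAAAA, and col 0xFFFF.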

Setting a breakpoint at line 1007, we obtain the following information:

(gdb) p/x tmp
$1 = {id = 0xbd, size = 0x20}
(gdb) p/x *((MULRK*)buf)
$3 = {row = 0xaaaa, col = 0xffff, rk = 0x60f3e4}
(gdb) p/x pWS->rows.lastcol
$4 = 0x0

Stepping past line 1008:

(gdb) p/x pWS->rows.lastcol
$5 = 0x2

We see that the lastcol field has been updated with the overflowed value 0x2.
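
The value 0x2 can be reproduced by hand from the debugger output. With col = 0xffff and size = 0x20, the expression on lines 1007-1008 evaluates to 0xffff + (0x20 - 6)/6 - 1 = 0x10002, and keeping only the low 16 bits gives 0x2. The sketch below repeats that arithmetic. It assumes lastcol is a 16-bit field, which is consistent with the observed result but is not shown in the excerpt above. Both function names are hypothetical.

```c
#include <stdint.h>

/* Last column index implied by a MULRK record, computed in 32 bits.
 * Mirrors the expression on lines 1007-1008: col + (size - 6)/6 - 1.
 * Assumes the record holds at least one RK entry (size >= 12). */
static uint32_t mulrk_last_col(uint16_t col, uint16_t size)
{
    return (uint32_t)col + ((uint32_t)size - 6u) / 6u - 1u;
}

/* Assumption: lastcol is stored in a 16-bit field, so the assignment
 * keeps only the low 16 bits of the computed value. */
static uint16_t store_as_lastcol(uint32_t value)
{
    return (uint16_t)value;
}
```

For the record in this report, `mulrk_last_col(0xFFFF, 0x20)` is 0x10002 and `store_as_lastcol` of that value is 0x2, matching the gdb output.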

This has further consequences in the xls_makeTable function, where the array of cells for each row is allocated based on the lastcol field:

Line 409   void xls_makeTable(xlsWorkSheet* pWS)
Line 410   {
Line 412       struct st_row_data* tmp;
Line 418       for (t=0;t<=pWS->rows.lastrow;t++)
               {
Line 420           tmp=&pWS->rows.row[t];
                   (...)
Line 425           tmp->cells.count = pWS->rows.lastcol+1;
Line 426           tmp->cells.cell=(struct st_cell_data *)calloc(tmp->cells.count,sizeof(struct st_cell_data));

Next, during the final parsing of the malformed MULRK record in xls_addCell, an out-of-bounds write occurs because the col value used as the index into the cell array is greater than the number of elements allocated for that array.

Line 446   struct st_cell_data *xls_addCell(xlsWorkSheet* pWS,BOF* bof,BYTE* buf)
Line 447   {
Line 448       struct st_cell_data*    cell;
               (...)
Line 457       cell=&row->cells.cell[xlsShortVal(((COL*)buf)->col)];
Line 458       cell->id=bof->id;
Line 459       cell->xf=xlsShortVal(((COL*)buf)->xf);

At line 457 the cell pointer ends up pointing outside the bounds of the array, which leads to an out-of-bounds write at lines 458 and 459:
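
One way to guard against this class of bug is to compare the column index with the number of allocated cells before taking the element's address. The following is a minimal sketch of such a check, not the vendor's fix. The `cell`, `cell_row`, and `checked_cell` names are hypothetical stand-ins for the libxls types and are reduced to the fields used in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libxls cell and row types. */
struct cell {
    uint16_t id;
    uint16_t xf;
};

struct cell_row {
    uint32_t     count;  /* number of elements allocated in cells */
    struct cell *cells;
};

/* Return the cell at index col, or NULL when col lies outside
 * the allocated array. */
static struct cell *checked_cell(struct cell_row *row, uint16_t col)
{
    if (row == NULL || row->cells == NULL || col >= row->count)
        return NULL;
    return &row->cells[col];
}
```

With the values from this report (three allocated cells, col = 0xffff), `checked_cell` returns NULL, so a caller could reject the record instead of writing through the pointer.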

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bce936 in xls_addCell (pWS=0x60f3a0, bof=0x7fffffffdb80, buf=0x607c50 
    "\252\252\377\377\273\273\273\273\314\314\314\314\335\335\335\335", <incomplete sequence \345>) at xls.c:458
458         cell->id=bof->id;
gdb-peda$ p cell
$8 = (struct st_cell_data *) 0xdea898
gdb-peda$ vmmap 0xdea898
Warning: not found or cannot access procfs

Crash Information

Microsoft R crash

(94c.c74): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Users\Icewall\Documents\R\win-library\3.4\readxl\libs\x64\readxl.dll - 
readxl!xls_addCell+0x61:
00000000`6534b191 668903          mov     word ptr [rbx],ax ds:00000000`312f5f58=????
0:000> kb 10
 # RetAddr           : Args to Child                                                           : Call Site
00 00000000`6534c8e2 : 00000000`48332fd0 00000000`31135ff0 00000000`00000006 00000000`00000000 : readxl!xls_addCell+0x61
01 00000000`65382714 : 00000000`044079c0 00000000`00000000 00000000`00000000 00000000`00000000 : readxl!xls_parseWorkSheet+0x302
02 00000000`6534572b : 00000000`04407750 00000000`044076b0 00000000`00000000 00007ffe`179295c9 : readxl!ZN12XlsWorkSheetC1E11XlsWorkBookiN4Rcpp6VectorILi13ENS1_15PreserveStorageEEEb+0x2b4
03 00000000`65343ddb : 00000000`044079b0 00007ffe`02922b4f 00000000`00000000 00000000`00000000 : readxl!Z9read_xls_SsiN4Rcpp6VectorILi13ENS_15PreserveStorageEEEbNS_12RObject_ImplIS1_EES4_St6vectorISsSaISsEEbi+0x31b
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Program Files\Microsoft\R Open\R-3.4.1\bin\x64\R.dll - 
04 00000000`6c7977cf : 00000000`25cecec8 00000000`10cfdf98 00000000`6cbfdbd0 00000000`04407cd0 : readxl!readxl_read_xls_+0x1db
05 00000000`6c797f04 : 00000000`00000272 00000000`04407cc8 00000000`00000139 00000000`04407f00 : R!Rf_NewFrameConfirm+0x7b6f
06 00000000`6c7ee8c8 : 00000000`23747fa0 00000000`2607c7c0 00000000`23747fa0 00000000`6cbfdcb8 : R!Rf_NewFrameConfirm+0x82a4
07 00000000`6c7f11dd : 00000000`6cbfdc08 00000000`6cbfdd20 00000000`6cbfdc90 00000000`6c7c070c : R!Rf_eval+0x6f8
08 00000000`6c7ee6ba : 00000000`23748010 00000000`14595aa8 00000000`23748010 00000000`25e0b818 : R!R_execMethod+0x8bd
09 00000000`6c7f0550 : 00000000`31f17618 00000000`145dbd58 00000000`25f56dc0 00000000`264e3c08 : R!Rf_eval+0x4ea
0a 00000000`6c7f08d2 : 00000000`237b8cb8 00000000`261e92b0 00000000`237b8cb8 00000000`264e4270 : R!R_cmpfun1+0xf50
0b 00000000`6c7e28ba : 00000000`6cbfdbd0 00000000`2607f130 00000000`14590600 00000000`6c7f08d2 : R!Rf_applyClosure+0x192
0c 00000000`6c7ee341 : 00000000`31efd850 00000000`264e4270 00000000`2607cf08 00000000`6c7eebd5 : R!R_initAssignSymbols+0xf27a
0d 00000000`6c7eeb9c : 00000000`31dfaad8 0000002e`2375f578 00000000`00000000 00000000`2607c4e8 : R!Rf_eval+0x171
0e 00000000`6c7ee3dc : 00000000`00000000 00000000`14590670 00000000`2607cf08 00000000`324006f8 : R!Rf_eval+0x9cc
0f 00000000`6c82251b : 00000000`2fa88528 00000000`14590788 00000000`23992b28 00000000`6c7f08d2 : R!Rf_eval+0x20c
0:000> lmv m readxl
Browse full module list
start             end                 module name
00000000`65340000 00000000`6543f000   readxl     (export symbols)       C:\Users\Icewall\Documents\R\win-library\3.4\readxl\libs\x64\readxl.dll
	Loaded symbol image file: C:\Users\Icewall\Documents\R\win-library\3.4\readxl\libs\x64\readxl.dll
	Image path: C:\Users\Icewall\Documents\R\win-library\3.4\readxl\libs\x64\readxl.dll
	Image name: readxl.dll
	Browse all global symbols  functions  data
	Timestamp:        Wed Aug 30 18:38:23 2017 (59A6E9FF)
	CheckSum:         001018F6
	ImageSize:        000FF000
	Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4


Linux version	

[----------------------------------registers-----------------------------------]
RAX: 0xdea898 
RBX: 0xb6a8c0 --> 0xaaaa0201 
RCX: 0x7fffffffdb80 --> 0x2000bd 
RDX: 0xbd 
RSI: 0x7fffffffdb80 --> 0x2000bd 
RDI: 0xffffffff 
RBP: 0x7fffffffdb60 --> 0x7fffffffdc00 --> 0x7fffffffdc70 --> 0x401610 (<__libc_csu_init>:      push   r15)
RSP: 0x7fffffffdb00 --> 0x60f3a0 --> 0x8000000c90b 
RIP: 0x7ffff7bce936 (<xls_addCell+144>: mov    WORD PTR [rax],dx)
R8 : 0x60f360 --> 0x0 
R9 : 0x800000003000000 
R10: 0x6f ('o')
R11: 0x7ffff7bce8a6 (<xls_addCell>:     push   rbp)
R12: 0x400b60 (<_start>:        xor    ebp,ebp)
R13: 0x7fffffffdd50 --> 0x2 
R14: 0x0 
R15: 0x0
EFLAGS: 0x10202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7ffff7bce92b <xls_addCell+133>:    mov    rax,QWORD PTR [rbp-0x50]
   0x7ffff7bce92f <xls_addCell+137>:    movzx  edx,WORD PTR [rax]
   0x7ffff7bce932 <xls_addCell+140>:    mov    rax,QWORD PTR [rbp-0x28]
=> 0x7ffff7bce936 <xls_addCell+144>:    mov    WORD PTR [rax],dx
   0x7ffff7bce939 <xls_addCell+147>:    mov    rax,QWORD PTR [rbp-0x58]
   0x7ffff7bce93d <xls_addCell+151>:    movzx  eax,WORD PTR [rax+0x4]
   0x7ffff7bce941 <xls_addCell+155>:    cwde   
   0x7ffff7bce942 <xls_addCell+156>:    mov    edi,eax
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdb00 --> 0x60f3a0 --> 0x8000000c90b 
0008| 0x7fffffffdb08 --> 0x607c50 --> 0xbbbbbbbbffffaaaa 
0016| 0x7fffffffdb10 --> 0x7fffffffdb80 --> 0x2000bd 
0024| 0x7fffffffdb18 --> 0x60f3a0 --> 0x8000000c90b 
0032| 0x7fffffffdb20 --> 0x60f3a0 --> 0x8000000c90b 
0040| 0x7fffffffdb28 --> 0x60f360 --> 0x0 
0048| 0x7fffffffdb30 --> 0x800000003000000 
0056| 0x7fffffffdb38 --> 0xdea898 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00007ffff7bce936 in xls_addCell (pWS=0x60f3a0, bof=0x7fffffffdb80, buf=0x607c50 
"\252\252\377\377\273\273\273\273\314\314\314\314\335\335\335\335", <incomplete sequence \345>) at xls.c:458
458         cell->id=bof->id;
gdb-peda$ bt
#0  0x00007ffff7bce936 in xls_addCell (pWS=0x60f3a0, bof=0x7fffffffdb80, buf=0x607c50 
"\252\252\377\377\273\273\273\273\314\314\314\314\335\335\335\335", <incomplete sequence \345>) at xls.c:458
#1  0x00007ffff7bd0b1a in xls_parseWorkSheet (pWS=0x60f3a0) at xls.c:1150
#2  0x0000000000400ff8 in main (argc=0x2, argv=0x7fffffffdd58) at xls2csv.c:149
#3  0x00007ffff781c830 in __libc_start_main (main=0x400d78 <main>, argc=0x2, argv=0x7fffffffdd58, init=<optimized out>, 
fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdd48) at ../csu/libc-start.c:291
#4  0x0000000000400b89 in _start ()
gdb-peda$ exploitable -m
__main__:102: UserWarning: GDB v7.11 may not support required Python API
Warning: machine string printing is deprecated and may be removed in a future release.
EXCEPTION_FAULTING_ADDRESS:0x00000000dea898
EXCEPTION_CODE:0xb
FAULTING_INSTRUCTION:mov    WORD PTR [rax],dx
MAJOR_HASH:34754c1fb7e7f36e18e374f783e4c876
MINOR_HASH:34754c1fb7e7f36e18e374f783e4c876
STACK_DEPTH:3
STACK_FRAME:/home/icewall/bugs/libxls-1.4.0/build/lib/libxlsreader.so.1.2.1!xls_addCell+0x0
STACK_FRAME:/home/icewall/bugs/libxls-1.4.0/build/lib/libxlsreader.so.1.2.1!xls_parseWorkSheet+0x0
STACK_FRAME:/home/icewall/bugs/libxls-1.4.0/build/bin/xls2csv!main+0x0
INSTRUCTION_ADDRESS:0x007ffff7bce936
INVOKING_STACK_FRAME:0
DESCRIPTION:Access violation on destination operand
SHORT_DESCRIPTION:DestAv (9/29)
OTHER_RULES:AccessViolation (28/29)
CLASSIFICATION:EXPLOITABLE
EXPLANATION:The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
Description: Access violation on destination operand
Short description: DestAv (9/29)
Hash: 34754c1fb7e7f36e18e374f783e4c876.34754c1fb7e7f36e18e374f783e4c876
Exploitability Classification: EXPLOITABLE
Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
Other tags: AccessViolation (28/29)

Timeline

2017-10-25 - Vendor Disclosure
2017-11-15 - Public Release

Credit

Discovered by Marcin 'Icewall' Noga of Cisco Talos.