Talos Vulnerability Report

TALOS-2017-0403

libxls xls_mergedCells Code Execution Vulnerability

November 15, 2017

CVE Number

CVE-2017-2896

Summary

An exploitable out-of-bounds write vulnerability exists in the xls_mergedCells function of libxls 1.4. A specially crafted XLS file can cause memory corruption resulting in remote code execution. An attacker can send a malicious XLS file to trigger this vulnerability.

Tested Versions

libxls 1.4
readxl package 1.0.0 for R (tested using Microsoft R 4.3.1)

Product URLs

http://libxls.sourceforge.net/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-787: Out-of-bounds Write

Details

libxls is a C library, supported on Windows, Mac and Linux, that reads Microsoft Excel File Format (XLS) files. The library is used by the readxl package for the R programming language. An out-of-bounds write occurs in the xls_mergedCells function. Let’s take a look at the vulnerable code:

Line 606	void xls_mergedCells(xlsWorkSheet* pWS,BOF* bof,BYTE* buf)
Line 607	{
Line 608		int count=*((WORD*)buf);
Line 609		int i,c,r;
Line 610		struct MERGEDCELLS* span;
Line 611		verbose("Merged Cells");
Line 612		for (i=0;i<count;i++)
Line 613		{
Line 614			span=(struct MERGEDCELLS*)(buf+(2+i*sizeof(struct MERGEDCELLS)));
Line 615			//		printf("Merged Cells: [%i,%i] [%i,%i] \n",span->colf,span->rowf,span->coll,span->rowl);
Line 616			for (r=span->rowf;r<=span->rowl;r++)
Line 617				for (c=span->colf;c<=span->coll;c++)
Line 618					pWS->rows.row[r].cells.cell[c].ishiden=1;
Line 619			pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
Line 620			pWS->rows.row[span->rowf].cells.cell[span->colf].rowspan=(span->rowl-span->rowf+1);
Line 621			pWS->rows.row[span->rowf].cells.cell[span->colf].ishiden=0;
Line 622		}
Line 623	}	

The important variables are buf and bof, whose contents have been read in raw form from the file. At line 612 the count value, taken at line 608 from the first WORD of buf, controls the outer loop. At line 614 the span variable is pointed at successive parts of the buf buffer. Because the span structure is based on data read directly from the file, and the listing performs no check of its fields against the dimensions of the worksheet, an attacker fully controls not only the number of iterations of the for loops at lines 616 and 617 but also the offsets used for writes into the pWS->rows structure. Using our PoC we can observe the following values during a crash:

Starting program: /home/icewall/bugs/libxls-1.4.0/build/bin/xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bd1e57 in xls_mergedCells (pWS=0x605830, bof=0x7fffffffdc10, buf=0x607230 "\b") at xls.c:619
619             pWS->rows.row[span->rowf].cells.cell[span->colf].colspan=(span->coll-span->colf+1);
(gdb) p/x *span
$1 = {rowf = 0xabcd, rowl = 0x1122, colf = 0x3344, coll = 0x6655}
(gdb) p/x *pWS->rows.row
$3 = {index = 0x0, fcell = 0x0, lcell = 0x0, height = 0x0, flags = 0x0, xf = 0x0, xfflags = 0x0, cells = {count = 0x0, cell = 0x607280}}

The MergedCells record starts at offset 7ED4Ch and has the form: [BOF][SPAN][SPAN]…BOF.size*sizeof(MERGEDCELLS)…[SPAN]

Crash Information

Crash in the Microsoft R platform:

> library(readxl)
> path <- readxl_example("49a5608059427ce2f2c479e33c5e3ae4.xls")
> lapply(excel_sheets(path), read_excel, path = path)
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168
fread: wanted 1 got 0 loc=519168

 *** caught segfault ***
address 0x54, cause 'memory not mapped'

Traceback:
 1: .Call("readxl_read_xls_", PACKAGE = "readxl", path, sheet_i,     limits, shim, col_names, col_types, na, trim_ws, guess_max)
 2: read_fun(path = path, sheet = sheet, limits = limits, shim = shim,     col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws,     guess_max = guess_max)
 3: tibble::as_tibble(read_fun(path = path, sheet = sheet, limits = limits,     shim = shim, col_names = col_names, col_types = col_types,     na = na, trim_ws = trim_ws, 
        guess_max = guess_max), validate = FALSE)
 4: tibble::repair_names(tibble::as_tibble(read_fun(path = path,     sheet = sheet, limits = limits, shim = shim, col_names = col_names,     col_types = col_types, na = na,
        trim_ws = trim_ws, guess_max = guess_max),     validate = FALSE), prefix = "X", sep = "__")
 5: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names,     col_types = col_types, na = na, trim_ws = trim_ws, skip = skip,     n_max = n_max, guess_max 
        = guess_max, excel_format(path))
 6: FUN(X[[i]], ...)
 7: lapply(excel_sheets(path), read_excel, path = path)


Crash directly in the libxls library:

==70269== Memcheck, a memory error detector
==70269== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==70269== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==70269== Command: ./xls2csv ./crashes/49a5608059427ce2f2c479e33c5e3ae4
==70269== 
==70269== Invalid write of size 1
==70269==    at 0x4C3106F: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269==    by 0x4E42F05: xls_open (xls.c:927)
==70269==    by 0x400956: main (xls2csv.c:45)
==70269==  Address 0x5425415 is 0 bytes after a block of size 21 alloc'd
==70269==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==70269==    by 0x4E42EE3: xls_open (xls.c:926)
==70269==    by 0x400956: main (xls2csv.c:45)
==70269== 
==70269== Invalid read of size 8
==70269==    at 0x4E41E57: xls_mergedCells (xls.c:619)
==70269==    by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269==    by 0x400AEC: main (xls2csv.c:90)
==70269==  Address 0x55e8b66 is 1,095,926 bytes inside an unallocated block of size 3,358,064 in arena "client"
==70269== 
==70269== Invalid write of size 2
==70269==    at 0x4E41E92: xls_mergedCells (xls.c:619)
==70269==    by 0x4E42CBC: xls_parseWorkSheet (xls.c:861)
==70269==    by 0x400AEC: main (xls2csv.c:90)
==70269==  Address 0x7cf7f is not stack'd, malloc'd or (recently) free'd

Timeline

2017-08-29 - Vendor Disclosure
2017-11-14 - Public Release

Credit

Discovered by Marcin 'Icewall' Noga of Cisco Talos.