Talos Vulnerability Report

TALOS-2023-1815

GTKWave VZT vzt_rd_block_vch_decode dict parsing integer overflow vulnerabilities

January 8, 2024

CVE Number

CVE-2023-38653, CVE-2023-38652

SUMMARY

Multiple integer overflow vulnerabilities exist in the VZT vzt_rd_block_vch_decode dict parsing functionality of GTKWave 3.3.115. A specially crafted .vzt file can lead to memory corruption. A victim would need to open a malicious file to trigger these vulnerabilities.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

GTKWave 3.3.115

PRODUCT URLS

GTKWave - https://gtkwave.sourceforge.net

CVSSv3 SCORE

7.0 - CVSS:3.1/AV:L/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-190 - Integer Overflow or Wraparound

DETAILS

GTKWave is a wave viewer, often used to analyze FPGA simulations and logic analyzer captures. It includes a GUI to view and analyze traces, and it can convert between several file formats (.lxt, .lxt2, .vzt, .fst, .ghw, .vcd, .evcd) either through the UI or through its command line tools. GTKWave is available for Linux, Windows and macOS. Trace files can be shared within teams or organizations, for example to compare results of simulation runs across different design implementations, to analyze protocols captured with logic analyzers, or just as a reference when porting design implementations.

GTKWave registers MIME types for its supported extensions. So, for example, it’s enough for a victim to double-click on a wave file received by e-mail to cause the gtkwave program to be executed and load a potentially malicious file.

VZT (Verilog Zipped Trace) files are parsed by the functions found in vzt_read.c. These functions are used in the vzt2vcd file conversion utility, in vztminer, and by the GUI portion of GTKWave. Thus all of them are affected by the issue described in this report.

To parse VZT files, the function vzt_rd_init_smp is called:

     struct vzt_rd_trace *vzt_rd_init_smp(const char *name, unsigned int num_cpus) {
[1]      struct vzt_rd_trace *lt = (struct vzt_rd_trace *)calloc(1, sizeof(struct vzt_rd_trace));
         ...

[2]      if (!(lt->handle = fopen(name, "rb"))) {
             vzt_rd_close(lt);
             lt = NULL;
         } else {
             vztint16_t id = 0, version = 0;
             ...
[3]          if (!fread(&id, 2, 1, lt->handle)) {
                 id = 0;
             }
             if (!fread(&version, 2, 1, lt->handle)) {
                 id = 0;
             }
             if (!fread(&lt->granule_size, 1, 1, lt->handle)) {
                 id = 0;
             }
         ...

At [1] the lt structure is initialized. This is the structure that will contain all the information about the input file.
The input file is opened [2] and 3 fields are read [3] to make sure the input file is a supported VZT file.

         ...
         rcf = fread(&lt->numfacs, 4, 1, lt->handle);
[4]      lt->numfacs = rcf ? vzt_rd_get_32(&lt->numfacs, 0) : 0;
         ...
         rcf = fread(&lt->numfacbytes, 4, 1, lt->handle);
         lt->numfacbytes = rcf ? vzt_rd_get_32(&lt->numfacbytes, 0) : 0;
         rcf = fread(&lt->longestname, 4, 1, lt->handle);
         lt->longestname = rcf ? vzt_rd_get_32(&lt->longestname, 0) : 0;
         rcf = fread(&lt->zfacnamesize, 4, 1, lt->handle);
         lt->zfacnamesize = rcf ? vzt_rd_get_32(&lt->zfacnamesize, 0) : 0;
         rcf = fread(&lt->zfacname_predec_size, 4, 1, lt->handle);
         lt->zfacname_predec_size = rcf ? vzt_rd_get_32(&lt->zfacname_predec_size, 0) : 0;
         rcf = fread(&lt->zfacgeometrysize, 4, 1, lt->handle);
         lt->zfacgeometrysize = rcf ? vzt_rd_get_32(&lt->zfacgeometrysize, 0) : 0;
         rcf = fread(&lt->timescale, 1, 1, lt->handle);
         ...

Several fields are then read from the file [4]:

  • numfacs: the number of facilities (elements in facnames)
  • numfacbytes: unused
  • longestname: keeps the longest length of all defined facilities’ names
  • zfacnamesize: compressed size of facnames
  • zfacname_predec_size: decompressed size of facnames
  • zfacgeometrysize: compressed size of facgeometry

Then, the facnames and facgeometry structures are extracted. They can be compressed with either gzip, bzip2 or lzma, depending on the first 2 bytes within the structure buffer.

Right after these two structures, there’s a sequence of blocks that can be arbitrarily long.

     for (;;) {
         ...
[5]      b = calloc(1, sizeof(struct vzt_rd_block));
         b->last_rd_value_idx = ~0;

[6]      rcf = fread(&b->uncompressed_siz, 4, 1, lt->handle);
         b->uncompressed_siz = rcf ? vzt_rd_get_32(&b->uncompressed_siz, 0) : 0;
         rcf = fread(&b->compressed_siz, 4, 1, lt->handle);
         b->compressed_siz = rcf ? vzt_rd_get_32(&b->compressed_siz, 0) : 0;
         rcf = fread(&b->start, 8, 1, lt->handle);
         b->start = rcf ? vzt_rd_get_64(&b->start, 0) : 0;
         rcf = fread(&b->end, 8, 1, lt->handle);
         b->end = rcf ? vzt_rd_get_64(&b->end, 0) : 0;
         pos = ftello(lt->handle);

[7]      if ((b->rle = (b->start > b->end))) {
             vztint64_t tb = b->start;
             b->start = b->end;
             b->end = tb;
         }

         ...
         if ((b->uncompressed_siz) && (b->compressed_siz) && (b->end)) {
             /* fprintf(stderr, VZT_RDLOAD"block [%d] %lld / %lld\n", lt->numblocks, b->start, b->end); */
             fseeko(lt->handle, b->compressed_siz, SEEK_CUR);

             lt->numblocks++;
             if (lt->numblocks <= lt->pthreads) {
                 vzt_rd_pthread_mutex_init(lt, &b->mutex, NULL);
                 vzt_rd_decompress_blk_pth(lt, b); /* prefetch first block */
             }

[8]          if (lt->block_curr) {
                 b->prev = lt->block_curr;
                 lt->block_curr->next = b;
                 lt->block_curr = b;
                 lt->end = b->end;
             } else {
                 lt->block_head = lt->block_curr = b;
                 lt->start = b->start;
                 lt->end = b->end;
             }
         } else {
             free(b);
             break;
         }

         pos += b->compressed_siz;
     }

At [5] the block structure is initialized. At [6] some fields are extracted. At [7], if start is greater than end, the two values are swapped so that start is always less than or equal to end, and rle is set to record that the swap happened. Finally, the block is saved inside a linked list [8].

From this code we can see the file structure for a block is as follows:

  • uncompressed_siz - unsigned big endian 32-bit
  • compressed_siz - unsigned big endian 32-bit
  • start_time - unsigned big endian 64-bit
  • end_time - unsigned big endian 64-bit
  • compressed data of size compressed_siz
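The layout above can be sketched as a small standalone decoder. This is only an illustration under stated assumptions: struct block_header, read_be and parse_block_header are hypothetical names that do not exist in GTKWave, and the sketch reads from a memory buffer rather than from a FILE handle as vzt_rd_init_smp does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the on-disk block header; not a GTKWave type. */
struct block_header {
    uint32_t uncompressed_siz;
    uint32_t compressed_siz;
    uint64_t start;
    uint64_t end;
    int rle; /* set when start > end in the file, as at [7] */
};

/* Read an unsigned big-endian integer of n bytes (n <= 8). */
static uint64_t read_be(const unsigned char *p, size_t n) {
    uint64_t v = 0;
    assert(n <= 8);
    for (size_t i = 0; i < n; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Decode the 24-byte header at buf.
 * Returns 0 on success, -1 if fewer than 24 bytes are available. */
static int parse_block_header(const unsigned char *buf, size_t len,
                              struct block_header *h) {
    if (len < 24) {
        return -1;
    }
    h->uncompressed_siz = (uint32_t)read_be(buf, 4);
    h->compressed_siz = (uint32_t)read_be(buf + 4, 4);
    h->start = read_be(buf + 8, 8);
    h->end = read_be(buf + 16, 8);

    /* Same convention as [7]: a reversed time range flags RLE,
     * and the two values are swapped back into order. */
    h->rle = (h->start > h->end);
    if (h->rle) {
        uint64_t tb = h->start;
        h->start = h->end;
        h->end = tb;
    }
    return 0;
}
```

A file that stores start_time = 9 and end_time = 2 would therefore decode to start = 2, end = 9 and rle = 1, which is the condition that later selects the RLE dictionary path.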

Upon return from the current vzt_rd_init_smp function, the blocks are parsed inside vzt_rd_iter_blocks.

Eventually, a call to vzt_rd_decompress_blk decompresses the contents of the block and sets b->mem to point to the contents of the decompressed data.

Once b->mem is set, we reach a call to vzt_rd_block_vch_decode that parses the decompressed block contents.

     static void vzt_rd_block_vch_decode(struct vzt_rd_trace *lt, struct vzt_rd_block *b) {
         vzt_rd_pthread_mutex_lock(lt, &b->mutex);

         if ((!b->times) && (b->mem)) {
             vztint64_t *times = NULL;
             vztint32_t *change_dict = NULL;
             vztint32_t *val_dict = NULL;
             unsigned int num_time_ticks, num_sections, num_dict_entries;
[9]          unsigned char *pnt = b->mem;
             vztint32_t i, j, m, num_dict_words;
             /* vztint32_t *block_end = (vztint32_t *)(pnt + b->uncompressed_siz); */
             vztint32_t *val_tmp;
             unsigned int num_bitplanes;
             uintptr_t padskip;
             ...

At [9] pnt is set to point to the decompressed block data.

     ...
[10] num_sections = vzt_rd_get_v32(&pnt);
     num_dict_entries = vzt_rd_get_v32(&pnt);
     padskip = ((uintptr_t)pnt)&3; pnt += (padskip) ? 4-padskip : 0; /* skip pad to next 4 byte boundary */

     /* fprintf(stderr, "num_sections: %d, num_dict_entries: %d\n", num_sections, num_dict_entries); */

At [10] num_sections and num_dict_entries are extracted as 32-bit varints from the decompressed block data.
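The pad-skip step that follows the two varint reads rounds pnt up to the next 4-byte boundary. A minimal sketch of the same arithmetic on a plain integer address is shown here; pad_to_4 is a hypothetical helper name, not a GTKWave function.

```c
#include <stdint.h>

/* Round addr up to the next multiple of 4, using the same
 * expression as the padskip line in vzt_rd_block_vch_decode. */
static uintptr_t pad_to_4(uintptr_t addr) {
    uintptr_t padskip = addr & 3;
    return addr + (padskip ? 4 - padskip : 0);
}
```

An address that is already aligned is left unchanged, while any other address moves forward by one to three bytes.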

At this point, we’ll describe two similar issues separately.
Both issues lead to memory corruption. Because block decompression can run in worker threads and GTKWave itself is a multi-threaded application, an out-of-bounds write in the decompression thread may, depending on the threading and heap implementations, corrupt memory belonging to a nearby thread, possibly leading to arbitrary code execution.

CVE-2023-38652 - val_dict allocation

If b->rle is set (that is, if start_time is bigger than end_time in the input file), we enter this block:

     ...
     if(b->rle)
        {
        vztint32_t *curr_dec_dict;
        vztint32_t first_bit = 0, curr_bit = 0;
        vztint32_t runlen;

[11]    val_dict = calloc(1, b->num_rle_bytes = (num_dict_words = num_sections * num_dict_entries) * sizeof(vztint32_t));
[11]    curr_dec_dict = val_dict;

        vzt_rd_pthread_mutex_lock(lt, &lt->mutex);
        lt->block_mem_consumed += b->num_rle_bytes;
        vzt_rd_pthread_mutex_unlock(lt, &lt->mutex);

[12]    for(i=0;i<num_dict_entries;i++)
            {
            vztint32_t curr_dec_bit = 0, curr_dec_word = 0;
            for(;;)
                {
                runlen = vzt_rd_get_v32(&pnt);
                if(!runlen)
                    {
                    first_bit = first_bit ^ 1;
                    }
                curr_bit ^= 1;
                if((!curr_dec_word)&&(!curr_dec_bit))
                    {
                    curr_bit = first_bit;
                    }

                for(j=0;j<runlen;j++)
                    {
[13]                if(curr_bit) *curr_dec_dict |= (1<<curr_dec_bit);
                    curr_dec_bit++;
                    if(curr_dec_bit != 32) continue;

                    curr_dec_bit = 0;
                    curr_dec_dict++;
                    curr_dec_word++;
                    if(curr_dec_word == num_sections) goto iloop;
                    }
                }
            iloop: i+=0; /* deliberate...only provides a jump target to loop bottom */
            }

        goto bpcalc;
        }

At [11] val_dict (aliased by curr_dec_dict) is allocated with a size calculated as (num_sections * num_dict_entries) * sizeof(vztint32_t). This calculation can overflow, leading to an overly small allocation of the val_dict buffer. The same happens if num_sections is 0.
At [12] a loop runs num_dict_entries times, advancing curr_dec_dict as bits are decoded, and the word it points to is read and written at [13]. Because the buffer is too small to hold num_dict_entries entries, this leads to an out-of-bounds write on the heap.
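The wraparound in the size calculation can be illustrated in isolation. The sketch below assumes 32-bit unsigned arithmetic throughout, as on a 32-bit build (the addresses in the crash output further down are 32-bit). The function names dict_bytes_wrapped and dict_words_exact are hypothetical and do not appear in GTKWave.

```c
#include <stdint.h>

/* Allocation size in bytes as computed with 32-bit unsigned arithmetic.
 * Both multiplications are reduced modulo 2^32. */
static uint32_t dict_bytes_wrapped(uint32_t num_sections,
                                   uint32_t num_dict_entries) {
    uint32_t num_dict_words = num_sections * num_dict_entries; /* may wrap */
    return num_dict_words * (uint32_t)sizeof(uint32_t);        /* may wrap */
}

/* Number of 32-bit words the decoding loops are prepared to write,
 * computed in 64 bits so that the product of two 32-bit values fits. */
static uint64_t dict_words_exact(uint32_t num_sections,
                                 uint32_t num_dict_entries) {
    return (uint64_t)num_sections * (uint64_t)num_dict_entries;
}
```

With num_sections = 0x10000 and num_dict_entries = 0x10000, the wrapped byte count is 0 while the loops expect 2^32 words. With num_sections = 0x40000001 and num_dict_entries = 1, the word count does not wrap but the multiplication by 4 does, giving a 4-byte allocation for more than a billion expected words.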

CVE-2023-38653 - change_dict allocation

Later on, in the same function, we reach this code to populate change_dict:

[14] num_dict_words = (num_sections * num_dict_entries) * sizeof(vztint32_t);
     change_dict = malloc(num_dict_words ? num_dict_words : sizeof(vztint32_t)); /* scan-build */
     m = 0;
     for(i=0;i<num_dict_entries;i++)
        {
        vztint32_t pbit = 0;
        for(j=0;j<num_sections;j++)
            {
            vztint32_t k = val_dict[m];
            vztint32_t l = k^((k<<1)^pbit);
[15]        change_dict[m++] = l;
            pbit = k >> 31;
            }
        }

At [14], change_dict is allocated with a size calculated as (num_sections * num_dict_entries) * sizeof(vztint32_t). This calculation can overflow, leading to an overly small allocation of the change_dict buffer.
Eventually change_dict is written to at [15]. The loops are bounded by num_sections and num_dict_entries, which, in case of overflow, do not reflect the real size of the buffer. This leads to an out-of-bounds write on the heap.
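One general way to avoid this class of bug is to validate the multiplication before allocating. The following is a minimal sketch of that idea, not the vendor's actual fix in 3.3.118. checked_dict_size is a hypothetical helper, and rejecting zero counts is an assumption made here because a zero num_sections is also described above as problematic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes a * b * elem_size into *out.
 * Returns 1 on success. Returns 0 if any factor is zero or if the
 * product would not fit in size_t, leaving *out untouched. */
static int checked_dict_size(uint32_t a, uint32_t b, size_t elem_size,
                             size_t *out) {
    size_t words;

    if (a == 0 || b == 0 || elem_size == 0) {
        return 0;
    }
    if ((size_t)a > SIZE_MAX / (size_t)b) {
        return 0; /* a * b would wrap */
    }
    words = (size_t)a * (size_t)b;
    if (words > SIZE_MAX / elem_size) {
        return 0; /* words * elem_size would wrap */
    }
    *out = words * elem_size;
    return 1;
}
```

A caller would only invoke malloc or calloc when the helper returns 1, and would treat a return of 0 as a malformed block. A complete check would presumably also compare the result against the amount of decompressed data actually available, which this sketch does not attempt.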

Crash Information

==401223==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf5e00354 at pc 0x56559492 bp 0xffffd478 sp 0xffffd46c
WRITE of size 4 at 0xf5e00354 thread T0
    #0 0x56559491 in vzt_rd_block_vch_decode src/helpers/vzt_read.c:458
    #1 0x5655ca7a in vzt_rd_process_block src/helpers/vzt_read.c:817
    #2 0x565619d4 in vzt_rd_iter_blocks src/helpers/vzt_read.c:1513
    #3 0x5656c5fc in process_vzt src/helpers/vzt2vcd.c:299
    #4 0x5656cf15 in main src/helpers/vzt2vcd.c:464
    #5 0xf7611294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #6 0xf7611357 in __libc_start_main_impl ../csu/libc-start.c:381
    #7 0x565574f6 in _start (vzt2vcd+0x24f6)

0xf5e00354 is located 0 bytes to the right of 4-byte region [0xf5e00350,0xf5e00354)
allocated by thread T0 here:
    #0 0xf7a55ffb in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x5655939c in vzt_rd_block_vch_decode src/helpers/vzt_read.c:449
    #2 0x5655ca7a in vzt_rd_process_block src/helpers/vzt_read.c:817
    #3 0x565619d4 in vzt_rd_iter_blocks src/helpers/vzt_read.c:1513
    #4 0x5656c5fc in process_vzt src/helpers/vzt2vcd.c:299
    #5 0x5656cf15 in main src/helpers/vzt2vcd.c:464
    #6 0xf7611294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: heap-buffer-overflow src/helpers/vzt_read.c:458 in vzt_rd_block_vch_decode
Shadow bytes around the buggy address:
  0x3ebc0010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3ebc0020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3ebc0030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3ebc0040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3ebc0050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x3ebc0060: fa fa fa fa fa fa fa fa fa fa[04]fa fa fa 00 fa
  0x3ebc0070: fa fa 04 fa fa fa fd fa fa fa fd fd fa fa fd fa
  0x3ebc0080: fa fa fd fd fa fa fd fa fa fa fd fd fa fa fd fa
  0x3ebc0090: fa fa fd fd fa fa fd fd fa fa 00 04 fa fa 00 05
  0x3ebc00a0: fa fa 00 04 fa fa 00 04 fa fa 00 04 fa fa 04 fa
  0x3ebc00b0: fa fa 04 fa fa fa 04 fa fa fa 04 fa fa fa 04 fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb

VENDOR RESPONSE

Fixed in version 3.3.118, available from https://sourceforge.net/projects/gtkwave/files/gtkwave-3.3.118/

TIMELINE

2023-08-02 - Vendor Disclosure
2023-12-31 - Vendor Patch Release
2024-01-08 - Public Release

CREDIT

Discovered by Claudio Bozzato of Cisco Talos.