Talos Vulnerability Report

TALOS-2023-1806

GTKWave VCD get_vartoken realloc use-after-free vulnerabilities

January 8, 2024

CVE Number

CVE-2023-37576, CVE-2023-37577, CVE-2023-37573, CVE-2023-37578, CVE-2023-37575, CVE-2023-37574

SUMMARY

Multiple use-after-free vulnerabilities exist in the VCD get_vartoken realloc functionality of GTKWave 3.3.115. A specially crafted .vcd file can lead to arbitrary code execution. A victim would need to open a malicious file to trigger these vulnerabilities.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

GTKWave 3.3.115

PRODUCT URLS

GTKWave - https://gtkwave.sourceforge.net

CVSSv3 SCORE

7.8 - CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-416 - Use After Free

DETAILS

GTKWave is a wave viewer, often used to analyze FPGA simulations and logic analyzer captures. It includes a GUI to view and analyze traces, as well as convert between several file formats (.lxt, .lxt2, .vzt, .fst, .ghw, .vcd, .evcd) either by using the UI or its command line tools. GTKWave is available for Linux, Windows and macOS. Trace files can be shared within teams or organizations, for example to compare results of simulation runs across different design implementations, to analyze protocols captured with logic analyzers or just as a reference when porting design implementations.

GTKWave sets up MIME types for its supported extensions. As a result, it’s enough for a victim to double-click on a wave file received by e-mail to trigger some of the vulnerabilities described in this advisory.

VCD (Value Change Dump) files are parsed by the vcd_parse function. This function is duplicated in several conversion utilities (vcd2lxt, vcd2lxt2, vcd2vzt) and in the GUI portion of GTKWave. In general the various implementations are very similar or identical, and in this case they are all affected by the issue described in this advisory.

Let’s describe the execution flow for the vcd2lxt utility. The other implementations have very similar behavior.

The function vcd_parse loops over each line in the file [1] and, depending on which token has been read [2], a different switch block is executed:

     static void vcd_parse(int linear) {
         int tok;

[1]      for (;;) {
[2]          switch (get_token()) {
                 ...

The get_token() function simply extracts a token from the file at the current cursor position, saving it to the global yytext buffer and assigning the token’s length to the global yylen.
Moreover, if the token starts with “$”, the token is considered a special symbol, and it has to match one of these tokens:

char *tokens[] = {"var", "end", "scope", "upscope",
                  "comment", "date", "dumpall", "dumpoff", "dumpon",
                  "dumpvars", "enddefinitions",
                  "dumpports", "dumpportsoff", "dumpportson", "dumpportsall",
                  "timescale", "version", "vcdclose", "timezero",
                  "", "", ""};

The return value of get_token is a token type, which is an index inside the tokens array above.
If the token does not start with “$”, the token is considered a string and the returned token type will be T_STRING.
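
The classification just described can be sketched as follows. This is a minimal illustration under stated assumptions, not GTKWave's actual get_token(): the function classify_token, the abbreviated keywords table, and the SK_STRING and SK_UNKNOWN_KEY values are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Keyword table, abbreviated from the tokens list above. */
const char *const keywords[] = {"var", "end", "scope", "upscope"};
#define NUM_KEYWORDS (sizeof(keywords) / sizeof(keywords[0]))

/* Hypothetical token types for this sketch: a keyword's type is its index
 * in keywords[]; two extra values follow the last keyword index. */
enum { SK_STRING = NUM_KEYWORDS, SK_UNKNOWN_KEY };

/* Return the token type for a NUL-terminated token. */
int classify_token(const char *text)
{
    size_t i;

    assert(text != NULL);
    if (text[0] != '$')
        return SK_STRING;              /* no "$" prefix: plain string */

    for (i = 0; i < NUM_KEYWORDS; i++) {
        if (strcmp(text + 1, keywords[i]) == 0)
            return (int)i;             /* type is the index in the table */
    }
    return SK_UNKNOWN_KEY;             /* "$" prefix but not a known keyword */
}
```

With this table, classify_token("$var") yields index 0, while a token such as "clk" is reported as a string.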

Going back to the switch above, if the parsed token is “$var”, the token type will be T_VAR and we will enter the block at [3]:

[3]  case T_VAR: {
         int vtok;
         struct vcdsymbol *v = NULL;

         var_prevch = 0;
         ...
[4]      vtok = get_vartoken(1);

A new token is read using get_vartoken() [4] and saved into vtok.
This function, similarly to get_token(), extracts a token from the file, separated by any of “ ”, “\t”, “\n”, or “\r”.
However, it has additional functionality, as it allows for specifying variable names using a range format to declare their width.

     static int get_vartoken(int match_kw) {
         int ch;
         int i, len = 0;

         ...
         if (!var_prevch) {
             for (;;) {
[5]              ch = getch();
                 if (ch < 0) return (V_END);
                 if ((ch == ' ') || (ch == '\t') || (ch == '\n') || (ch == '\r')) continue;
                 break;
             }
         } else {
             ch = var_prevch;
             var_prevch = 0;
         }
         ...

[6]      for (yytext[len++] = ch;; yytext[len++] = ch) {
[12]         if (len == T_MAX_STR) {
                yytext = (char *)realloc_2(yytext, (T_MAX_STR = T_MAX_STR * 2) + 1);
             }

[7]          ch = getch();
             if (ch == ' ') {
                 if (match_kw) break;
[8]              if (getch_peek() == '[') {
                     ch = getch();
[9]                  varsplit = yytext + len; /* keep looping so we get the *last* one */
                     continue;
                 }
             }

             if ((ch == ' ') || (ch == '\t') || (ch == '\n') || (ch == '\r') || (ch < 0)) break;
[10]         if ((ch == '[') && (yytext[0] != '\\')) {
[11]             varsplit = yytext + len; /* keep looping so we get the *last* one */
             } else if (((ch == ':') || (ch == ']')) && (!varsplit) && (yytext[0] != '\\')) {
                 var_prevch = ch;
                 break;
             }
         }
         ...

At [5], the first non-separator character is read. Then, the loop at [6] is used to read the current variable name into yytext.
A new character is read [7], and if it matches “[” [8, 10], the varsplit pointer is set to the current location within yytext [9, 11]. This is used to keep a reference to the start of the range notation within the variable name.

At the start of the loop (the first [12]) we can see a check for len == T_MAX_STR. The size of yytext is initially equal to T_MAX_STR. Whenever the number of characters written (tracked by len) reaches T_MAX_STR, yytext is reallocated to double its current size, so that it’ll be able to fit the rest of the data.
The issue is that realloc may move the buffer to a completely different location. While yytext will be reallocated correctly, varsplit may now point to a portion of memory that has been freed (by realloc).
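
A common way to avoid this class of bug is to remember the interior pointer as an offset and re-derive it after the buffer has been grown. The sketch below illustrates that pattern only; it is not necessarily the fix applied by the vendor. The struct tokbuf and the function tokbuf_grow are hypothetical stand-ins for the yytext, T_MAX_STR and varsplit globals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parser's token buffer state. */
struct tokbuf {
    char  *text;   /* heap buffer of cap + 1 bytes (role of yytext) */
    size_t cap;    /* current capacity (role of T_MAX_STR) */
    char  *split;  /* NULL, or a pointer into text (role of varsplit) */
};

/* Double the buffer and rebase split onto the new allocation.
 * Returns 0 on success, -1 on allocation failure (state left unchanged). */
int tokbuf_grow(struct tokbuf *b)
{
    size_t off = 0;
    int had_split;
    char *p;

    assert(b != NULL && b->text != NULL && b->cap > 0);

    /* Take the offset while both pointers still refer to the live buffer. */
    had_split = (b->split != NULL);
    if (had_split)
        off = (size_t)(b->split - b->text);

    p = realloc(b->text, b->cap * 2 + 1);
    if (p == NULL)
        return -1;

    b->text = p;
    b->cap *= 2;
    /* The old address may now be freed; only the offset is carried over. */
    b->split = had_split ? p + off : NULL;
    return 0;
}
```

After a successful call, split refers to the same character as before, regardless of whether realloc moved the block.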

Keeping in mind that varsplit now points to freed memory, let’s see how the function continues after the loop:

         ...
         yytext[len] = 0; /* absolute terminator */
[12]     if ((varsplit) && (yytext[len - 1] == ']')) {
             char *vst;
[13]         vst = malloc_2(strlen(varsplit) + 1);
             strcpy(vst, varsplit);

[14]         *varsplit = 0x00; /* zero out var name at the left bracket */
[15]         len = varsplit - yytext;

[16]         varsplit = vsplitcurr = vst;
             var_prevch = 0;
         } else {
             varsplit = NULL;
         }

         if (match_kw)
             for (i = 0; i < NUM_VTOKENS; i++) {
                 if (!strcmp(yytext, vartypes[i])) {
                     return (varenums[i]);
                 }
             }

[17]     yylen = len;
         return (V_STRING);
     }

At the second [12] (after the loop), if the variable definition ends with “]” (end of the range definition), a new vst buffer is allocated [13], with a size of strlen(varsplit) + 1. Since varsplit is dangling, this strlen is an out-of-bounds read of freed memory, but it has limited consequences, as the strcpy that comes afterwards won’t be able to write out-of-bounds. At [14], however, a null byte is written to freed memory. This write may already overwrite heap metadata, depending on the heap implementation being used. Also note that it is possible to choose the offset where varsplit points within the old yytext buffer, so it is possible to precisely write the null byte in a specific spot within the current freed chunk’s metadata.

There is also another avenue for exploitation, since len is calculated as the difference between varsplit and yytext [15]. This difference depends on where the new yytext buffer is located, but with careful heap manipulation it can be set indirectly, by controlling where yytext is placed when realloc is called. We can thus assume at this point that we have some degree of control over len as well.
At [16], varsplit is set to point to vst, so from this point on it will point to a valid memory location.

Finally, at [17], yylen is set to len. Upon return of this function, yytext should contain a text of size yylen. However, as previously stated, this won’t be true, as yylen can be controlled thanks to the use-after-free on varsplit.
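
For comparison, the post-loop split can be written in terms of an offset that is validated against the token length, so that the returned length always describes the buffer it came from. This is a sketch under assumptions: split_range and its parameters are hypothetical, and the caller is assumed to pass a NUL-terminated token with text[len] == 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Split "name[range]" at split_off when the token ends in ']'.
 * On a split: text is truncated to the name, *range_out receives a
 * malloc'ed copy of the range part, and the new name length is returned.
 * Otherwise (no closing bracket, offset outside the token, or allocation
 * failure): text is untouched, *range_out is NULL, and len is returned. */
size_t split_range(char *text, size_t len, size_t split_off, char **range_out)
{
    char *copy;

    assert(text != NULL && range_out != NULL);
    *range_out = NULL;

    if (len == 0 || split_off >= len || text[len - 1] != ']')
        return len;

    copy = malloc(len - split_off + 1);
    if (copy == NULL)
        return len;

    /* Copy the range part including its terminating NUL. */
    memcpy(copy, text + split_off, len - split_off + 1);
    text[split_off] = '\0';   /* cut the name at the left bracket */
    *range_out = copy;
    return split_off;         /* equals strlen(text) after the cut */
}
```

Because the offset is checked against len before it is used, the value returned can never disagree with the contents of text, which is the invariant that the vulnerable code loses.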

Back in the T_VAR case above, one usage of get_vartoken is for reading the symbol name:

[18] vtok = get_vartoken(0);
     if (vtok != V_STRING) goto err;
     if (slisthier_len) {
         v->name = (char *)malloc_2(slisthier_len + 1 + yylen + 1);
         strcpy(v->name, slisthier);
         strcpy(v->name + slisthier_len, vcd_hier_delimeter);
         strcpy(v->name + slisthier_len + 1, yytext);
     } else {
[19]     v->name = (char *)malloc_2(yylen + 1);
[20]     strcpy(v->name, yytext);
     }

At [18] get_vartoken(0) is called, which sets any arbitrary content in yytext (read from file) and also sets an arbitrary yylen (controlled via the use-after-free). If an attacker manages to set yylen to a small value (smaller than the length of yytext’s text), v->name will be created with a relatively small buffer [19], and the strcpy at [20] will write out-of-bounds.

For this reason, this issue has potential to be used to achieve arbitrary code execution.
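
The overflow at [20] relies on the allocation size and the copied string coming from two values that are assumed to agree. A defensive variant, sketched below, derives the size from the string itself and refuses to copy when the separately tracked length is out of sync. The helper dup_token is hypothetical and only illustrates the idea; it is not taken from GTKWave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a NUL-terminated token without trusting claimed_len.
 * Returns a malloc'ed copy, or NULL if claimed_len does not match the
 * actual string length or if allocation fails. */
char *dup_token(const char *text, size_t claimed_len)
{
    size_t actual;
    char *copy;

    assert(text != NULL);

    actual = strlen(text);
    if (actual != claimed_len)
        return NULL;          /* length bookkeeping is inconsistent */

    copy = malloc(actual + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, text, actual + 1);   /* bounded by the size just allocated */
    return copy;
}
```

A caller in the position of the T_VAR block would treat a NULL return as a parse error instead of continuing with a mismatched buffer.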

As mentioned before, this issue affects all vcd_parse implementations (6 different source files), which we list separately below.

CVE-2023-37573 - VCD GUI recoder

The GUI’s recoder VCD parsing code (default parser) may use memory after it has been freed at line vcd_recoder.c:1057. This may lead to arbitrary code execution as described above.

This issue does not need any special command-line switch to be triggered when starting GTKWave.

CVE-2023-37574 - VCD GUI legacy

The GUI’s legacy VCD parsing code may use memory after it has been freed at line vcd.c:552. This may lead to arbitrary code execution as described above.

This issue can be triggered by using the -L flag when starting GTKWave.

CVE-2023-37575 - VCD GUI interactive

The GUI’s interactive VCD parsing code may use memory after it has been freed at line vcd_partial.c:528. This may lead to arbitrary code execution as described above.

This issue can be triggered by using the -I flag when starting GTKWave.

CVE-2023-37576 - vcd2vzt

The VCD parsing code in the vcd2vzt conversion utility may use memory after it has been freed at line vcd2vzt.c:566. This may lead to arbitrary code execution as described above.

CVE-2023-37577 - vcd2lxt2

The VCD parsing code in the vcd2lxt2 conversion utility may use memory after it has been freed at line vcd2lxt2.c:564. This may lead to arbitrary code execution as described above.

CVE-2023-37578 - vcd2lxt

The VCD parsing code in the vcd2lxt conversion utility may use memory after it has been freed at line vcd2lxt.c:559. This may lead to arbitrary code execution as described above.

Crash Information

==209676==ERROR: AddressSanitizer: heap-use-after-free on address 0xf5703789 at pc 0x5655a73d bp 0xffffd728 sp 0xffffd71c
WRITE of size 1 at 0xf5703789 thread T0
    #0 0x5655a73c in get_vartoken src/helpers/vcd2lxt.c:593
    #1 0x5655e53a in vcd_parse src/helpers/vcd2lxt.c:1233
    #2 0x56561640 in vcd_main src/helpers/vcd2lxt.c:1704
    #3 0x56562dad in main src/helpers/vcd2lxt.c:1959
    #4 0xf7647294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #5 0xf7647357 in __libc_start_main_impl ../csu/libc-start.c:381
    #6 0x565583f6 in _start (vcd2lxt+0x33f6)

0xf5703789 is located 9 bytes inside of 1025-byte region [0xf5703780,0xf5703b81)
freed by thread T0 here:
    #0 0xf7a55144 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
    #1 0x56578b08 in realloc_2 src/helpers/v2l_debug.c:108
    #2 0x5655a499 in get_vartoken src/helpers/vcd2lxt.c:559
    #3 0x5655e53a in vcd_parse src/helpers/vcd2lxt.c:1233
    #4 0x56561640 in vcd_main src/helpers/vcd2lxt.c:1704
    #5 0x56562dad in main src/helpers/vcd2lxt.c:1959
    #6 0xf7647294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

previously allocated by thread T0 here:
    #0 0xf7a55ffb in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x56578a6d in malloc_2 src/helpers/v2l_debug.c:92
    #2 0x56561274 in vcd_main src/helpers/vcd2lxt.c:1654
    #3 0x56562dad in main src/helpers/vcd2lxt.c:1959
    #4 0xf7647294 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: heap-use-after-free src/helpers/vcd2lxt.c:593 in get_vartoken
Shadow bytes around the buggy address:
  0x3eae06a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3eae06b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3eae06c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3eae06d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3eae06e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x3eae06f0: fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eae0700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eae0710: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eae0720: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eae0730: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eae0740: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb

VENDOR RESPONSE

Fixed in version 3.3.118, available from https://sourceforge.net/projects/gtkwave/files/gtkwave-3.3.118/

TIMELINE

2023-08-01 - Vendor Disclosure
2023-12-31 - Vendor Patch Release
2024-01-08 - Public Release

Credit

Discovered by Claudio Bozzato of Cisco Talos.