Talos Vulnerability Report

TALOS-2022-1667

Open Babel CSR format title out-of-bounds write vulnerability

July 21, 2023

CVE Number

CVE-2022-41793

SUMMARY

An out-of-bounds write vulnerability exists in the CSR format title functionality of Open Babel 3.1.1 and master commit 530dbfa3. A specially crafted malformed file can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

Open Babel 3.1.1
Open Babel master commit 530dbfa3

PRODUCT URLS

Open Babel - https://openbabel.org/

CVSSv3 SCORE

9.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-120 - Buffer Copy without Checking Size of Input (‘Classic Buffer Overflow’)

DETAILS

Open Babel is a popular library for converting between chemical file formats, currently supporting about 130 of them. It implements bindings for several programming languages. Because many online chemical format converters and molecule viewers may use Open Babel in their backend for parsing and conversion, we consider this software to be potentially accessible over the network.

Open Babel ships a simple converter application called obabel that can be used to trigger the issue described in this advisory. obabel supports -i and -o parameters, which select the input and output formats to perform the conversion. obabel supports multiple input and output files (as does the Open Babel library itself): this technically allows multiple vulnerabilities to trigger in sequence, which in turn could make some vulnerabilities easier to exploit. In this advisory, however, we focus on only one input file and a corresponding output file.

When a single input file and output file are supplied, obabel.cpp records the input and output formats (if supplied), and calls OBConversion::FullConvert in obconversion.cpp. Inside this function, there’s a call to OpenAndSetFormat, which uses FormatFromExt to derive the input format from the filename extension if no -i parameter was supplied. Similarly, OpenInAndOutFiles can be used to derive both input and output formats from the filename extensions when none are supplied.

Depending on how the obabel application is invoked, different code paths may be taken. Eventually, the pInFormat and pOutFormat objects (of base class OBFormat) are allocated, which are instances of the classes that implement the selected input and output formats.

The code then proceeds with a call to OBConversion::Convert, which eventually leads to calling pInFormat->ReadMolecule and pOutFormat->WriteMolecule.
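
The flow above can be pictured with a much reduced model. The sketch below is not Open Babel code: Molecule, Format, TitleLineFormat, EchoFormat and this Convert function are hypothetical stand-ins, chosen only to show how data read by one format class reaches the write routine of another through virtual dispatch.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for OBMol; the real class carries far more state.
struct Molecule { std::string title; };

// Hypothetical stand-in for the OBFormat base class.
struct Format {
  virtual ~Format() = default;
  virtual bool ReadMolecule(Molecule &, std::istream &) { return false; }
  virtual bool WriteMolecule(const Molecule &, std::ostream &) { return false; }
};

// Input format that takes the title from the first line of the input.
struct TitleLineFormat : Format {
  bool ReadMolecule(Molecule &mol, std::istream &in) override {
    return static_cast<bool>(std::getline(in, mol.title));
  }
};

// Output format that emits the title verbatim.
struct EchoFormat : Format {
  bool WriteMolecule(const Molecule &mol, std::ostream &out) override {
    out << mol.title << '\n';
    return static_cast<bool>(out);
  }
};

// Simplified analogue of the conversion step: read through the input
// format, then write the same object through the output format.
bool Convert(Format &inFormat, Format &outFormat,
             std::istream &in, std::ostream &out)
{
  Molecule mol;
  return inFormat.ReadMolecule(mol, in) && outFormat.WriteMolecule(mol, out);
}
```

With string streams in place of files, a title supplied in the input reaches the output routine unchanged, which is why a field set by one format can matter to a bug in another.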

In this advisory, we describe an issue in the CSR file format (formats/CSRformat.cpp) when writing an output file via WriteMolecule.

    bool CSRFormat::WriteMolecule(OBBase* pOb, OBConversion* pConv)
    {
      OBMol* pmol = dynamic_cast<OBMol*>(pOb);
      if (pmol == nullptr)
        return false;

      //Define some references so we can use the old parameter names
      ostream &ofs = *pConv->GetOutStream();
      OBMol &mol = *pmol;

      //  if (FirstTime)
      if(pConv->GetOutputIndex()==1)
        {
          WriteCSRHeader(ofs,mol);
          //FirstTime = false;
          MolCount=1;
        }

[1]   WriteCSRCoords(ofs,mol);
      MolCount++;

      return(true);
    }

Let’s follow the call to WriteCSRCoords at [1].

    void CSRFormat::WriteCSRCoords(ostream &ofs,OBMol &mol)
    {
      int the_size,jconf;
      double x,y,z,energy;
      char title[100];
      char *tag;

      the_size = sizeof(int) + sizeof(double) + (80 * sizeof(char));

      jconf = 1;
      energy = -2.584565;

[2]   snprintf(title, 80, "%s:%d",mol.GetTitle(),MolCount);
[3]   tag = PadString(title,80);

      WriteSize(the_size,ofs);
      ofs.write((char*)&jconf,sizeof(int));
      ofs.write((char*)&energy,sizeof(double));
      ofs.write(tag,80*sizeof(char));
      WriteSize(the_size,ofs);

      WriteSize(mol.NumAtoms()*sizeof(double),ofs);
      ...

At [2], the molecule’s title is retrieved and formatted into the title buffer. Then, at [3], PadString is called with title and a size of 80.

    char* CSRFormat::PadString(char *input, int size)
    {
      char *output;

[4]   output = new char[size];
      memset(output, ' ', size);
[5]   strncpy(output, input, strlen(input));
      output[ size - 1] = '\0';
      return(output);
    }

At [4], a new char array of the requested size is allocated, and at [5] it is filled with input using strncpy. The length limit passed to strncpy is not derived from the destination buffer size. Rather, it is the length of the input (that is, the title), which is potentially controlled by the input file. If the supplied title is longer than the destination buffer, an out-of-bounds write on the heap occurs, which can lead to arbitrary code execution.
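
For comparison, a bounded variant of the copy at [5] might look like the sketch below. This is not the vendor's fix: PadStringBounded is a hypothetical name, and returning std::string instead of a raw new[] buffer is a design choice made here for illustration. It aims to keep the layout the original function intends to produce (input, then space padding, then a trailing NUL byte) while limiting the copy by the destination size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical bounded counterpart of CSRFormat::PadString.
// Returns a field of exactly `size` bytes: the input truncated to at most
// size - 1 bytes, right-padded with spaces, ending in a NUL byte.
std::string PadStringBounded(const char *input, std::size_t size)
{
  assert(input != nullptr && size > 0);
  std::string output(size, ' ');
  // Bound the copy by the destination size, not by strlen(input) alone.
  const std::size_t n = std::min(std::strlen(input), size - 1);
  std::memcpy(&output[0], input, n);
  output[size - 1] = '\0';
  return output;
}
```

A caller would then write the field with something like ofs.write(tag.data(), 80), with no manual deallocation required.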

Crash Information

$ ./bin/obabel -i caccrt strncpy.oobw.caccrt_csr -o csr
=================================================================
==1329606==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf560ed24 at pc 0xf79ff8c4 bp 0xffffb9e8 sp 0xffffb5c0
WRITE of size 169 at 0xf560ed24 thread T0
    #0 0xf79ff8c3 in __interceptor_strncpy ../../../../src/libsanitizer/asan/asan_interceptors.cpp:470
    #1 0xf4f3b6a4 in OpenBabel::CSRFormat::PadString(char*, int) ./src/formats/CSRformat.cpp:192
    #2 0xf4f3b7d0 in OpenBabel::CSRFormat::WriteCSRHeader(std::ostream&, OpenBabel::OBMol&) ./src/formats/CSRformat.cpp:105
    #3 0xf4f3c86a in OpenBabel::CSRFormat::WriteMolecule(OpenBabel::OBBase*, OpenBabel::OBConversion*) ./src/formats/CSRformat.cpp:89
    #4 0xf7516072 in OpenBabel::OBMoleculeFormat::WriteChemObjectImpl(OpenBabel::OBConversion*, OpenBabel::OBFormat*) ./src/obmolecformat.cpp:174
    #5 0xf63c355c in OpenBabel::OBMoleculeFormat::WriteChemObject(OpenBabel::OBConversion*) ./include/openbabel/obmolecformat.h:120
    #6 0xf72a23e6 in OpenBabel::OBConversion::Convert() ./src/obconversion.cpp:607
    #7 0xf72c717a in OpenBabel::OBConversion::Convert(std::istream*, std::ostream*) ./src/obconversion.cpp:481
    #8 0xf72cf4f3 in OpenBabel::OBConversion::FullConvert(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) ./src/obconversion.cpp:1514
    #9 0x565594ea in main ./tools/obabel.cpp:370
    #10 0xf77923b4 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #11 0xf779247e in __libc_start_main_impl ../csu/libc-start.c:389
    #12 0x5655c356 in _start (./bin/obabel+0x7356)

0xf560ed24 is located 0 bytes to the right of 100-byte region [0xf560ecc0,0xf560ed24)
allocated by thread T0 here:
    #0 0xf7a58bb3 in operator new[](unsigned int) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:98
    #1 0xf4f3b67c in OpenBabel::CSRFormat::PadString(char*, int) ./src/formats/CSRformat.cpp:190
    #2 0xf4f3b7d0 in OpenBabel::CSRFormat::WriteCSRHeader(std::ostream&, OpenBabel::OBMol&) ./src/formats/CSRformat.cpp:105
    #3 0xf4f3c86a in OpenBabel::CSRFormat::WriteMolecule(OpenBabel::OBBase*, OpenBabel::OBConversion*) ./src/formats/CSRformat.cpp:89
    #4 0xf7516072 in OpenBabel::OBMoleculeFormat::WriteChemObjectImpl(OpenBabel::OBConversion*, OpenBabel::OBFormat*) ./src/obmolecformat.cpp:174
    #5 0xf63c355c in OpenBabel::OBMoleculeFormat::WriteChemObject(OpenBabel::OBConversion*) ./include/openbabel/obmolecformat.h:120
    #6 0xf72a23e6 in OpenBabel::OBConversion::Convert() ./src/obconversion.cpp:607
    #7 0xf72c717a in OpenBabel::OBConversion::Convert(std::istream*, std::ostream*) ./src/obconversion.cpp:481
    #8 0xf72cf4f3 in OpenBabel::OBConversion::FullConvert(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) ./src/obconversion.cpp:1514
    #9 0x565594ea in main ./tools/obabel.cpp:370
    #10 0xf77923b4 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

SUMMARY: AddressSanitizer: heap-buffer-overflow ../../../../src/libsanitizer/asan/asan_interceptors.cpp:470 in __interceptor_strncpy
Shadow bytes around the buggy address:
  0x3eac1d50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3eac1d60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3eac1d70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3eac1d80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x3eac1d90: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
=>0x3eac1da0: 00 00 00 00[04]fa fa fa fa fa fa fa fa fa fd fd
  0x3eac1db0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x3eac1dc0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
  0x3eac1dd0: fd fa fa fa fa fa fa fa fa fa fd fd fd fd fd fd
  0x3eac1de0: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
  0x3eac1df0: fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
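
The shadow dump above can be cross-checked by hand. The sketch below assumes the shadow mapping AddressSanitizer uses on 32-bit x86 Linux, shadow = (addr >> 3) + 0x20000000, which is consistent with the 32-bit addresses in the trace. The function and constant names are illustrative only.

```cpp
#include <cstdint>

// Assumed ASan shadow offset for 32-bit x86 Linux.
constexpr std::uint32_t kShadowOffset32 = 0x20000000u;

// Shadow byte address that describes the 8-byte granule containing `addr`.
constexpr std::uint32_t ShadowAddress(std::uint32_t addr)
{
  return (addr >> 3) + kShadowOffset32;
}

// Number of 8-byte granules fully covered by a region of `size` bytes
// (each appears as a 00 shadow byte).
constexpr std::uint32_t FullGranules(std::uint32_t size) { return size / 8; }

// Value of the trailing "partially addressable" shadow byte (0 if none).
constexpr std::uint32_t PartialBytes(std::uint32_t size) { return size % 8; }

// Values taken from the report: 100-byte region [0xf560ecc0, 0xf560ed24),
// WRITE of size 169 starting at the beginning of that region.
static_assert(ShadowAddress(0xf560ecc0u) == 0x3eac1d98u, "region start");
static_assert(ShadowAddress(0xf560ed24u) == 0x3eac1da4u, "faulting address");
static_assert(FullGranules(100) == 12, "twelve 00 shadow bytes");
static_assert(PartialBytes(100) == 4, "final shadow byte is 04");
static_assert(169 - 100 == 69, "bytes written past the end of the region");
```

Under that assumption, the 100-byte region allocated in PadString maps to twelve 00 shadow bytes starting at 0x3eac1d98, followed by a single 04 at 0x3eac1da4. That is the bracketed entry in the dump, and it is also where the faulting address 0xf560ed24 lands. The reported write of 169 bytes therefore runs 69 bytes past the end of the region.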

Exploit Proof of Concept

To show how to control the title and trigger this vulnerability, let’s use the “Cacao Cartesian format” as input, which makes it easy to choose a title that is independent of the input filename:

    bool CacaoFormat::ReadMolecule(OBBase* pOb, OBConversion* pConv)
    {

      OBMol* pmol = pOb->CastAndClear<OBMol>();
      if (pmol == nullptr)
        return false;

      //Define some references so we can use the old parameter names
      istream &ifs = *pConv->GetInStream();
      OBMol &mol = *pmol;
[6]   mol.SetTitle( pConv->GetTitle()); //default title is the filename

      char buffer[BUFF_SIZE];
      int natoms;
      double A,B,C,Alpha,Beta,Gamma;
      matrix3x3 m;

[7]   ifs.getline(buffer,BUFF_SIZE);
[8]   mol.SetTitle(buffer);
      ifs.getline(buffer,BUFF_SIZE);
[9]   sscanf(buffer,"%d",&natoms);
      ...

This format takes its default title from the filename [6]. However, the first line of the file overrides it [7, 8], so a long title can be chosen freely for easier exploitation. At [9], the file should declare at least one atom to make sure WriteMolecule is called, and it must also satisfy the remaining fields of this specific input format in order to eventually trigger the vulnerability in CSRFormat.

VENDOR RESPONSE

Since the maintainer of this software did not release a patch during the 90-day window specified in our policy, we have decided to release the information regarding this vulnerability to make users of the software aware of the problem. See Cisco’s Coordinated Vulnerability Disclosure Policy for more information: https://tools.cisco.com/security/center/resources/vendor_vulnerability_policy.html

TIMELINE

2022-12-20 - Initial Vendor Contact
2023-01-12 - Vendor Disclosure
2023-07-21 - Public Release

Credit

Discovered by Claudio Bozzato of Cisco Talos.