Talos Vulnerability Report

TALOS-2022-1666

Open Babel translationVectors parsing out-of-bounds write vulnerabilities

July 21, 2023

CVE Number

CVE-2022-46292, CVE-2022-46295, CVE-2022-46294, CVE-2022-46293, CVE-2022-46291

SUMMARY

Multiple out-of-bounds write vulnerabilities exist in the translationVectors parsing functionality in multiple supported formats of Open Babel 3.1.1 and master commit 530dbfa3. A specially-crafted malformed file can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

Open Babel 3.1.1
Open Babel master commit 530dbfa3

PRODUCT URLS

Open Babel - https://openbabel.org/

CVSSv3 SCORE

9.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-119 - Improper Restriction of Operations within the Bounds of a Memory Buffer

DETAILS

Open Babel is a popular library for converting chemical file formats, currently supporting about 130 different file formats. It implements bindings for several programming languages. Because of the nature of the library, and since many online chemical format converters and molecule viewers might use Open Babel in their backend for parsing and conversion, we consider this software to be potentially reachable over the network.

Open Babel ships a simple converter application called obabel that can be used to trigger the issue described in this advisory. obabel supports -i and -o parameters, which select the input and output formats to perform the conversion. obabel supports multiple input and output files (as does the Open Babel library itself): this technically allows multiple vulnerabilities to trigger in sequence, which in turn could make some vulnerabilities easier to exploit. In this advisory, however, we focus on only one input file and a corresponding output file.

When a single input file and output file are supplied, obabel.cpp records the input and output formats (if supplied), and calls OBConversion::FullConvert in obconversion.cpp. Inside this function, there’s a call to OpenAndSetFormat, which uses FormatFromExt to derive the input format from the filename extension if no -i parameter was supplied. Similarly, OpenInAndOutFiles can be used to derive both input and output formats from the filename extensions when none are supplied.

Depending on how the obabel application is invoked, different code paths may be taken. Eventually, pInFormat and pOutFormat (of base class OBFormat) objects are allocated, which are instances of the classes that implement the selected input and output formats.

The code then proceeds with a call to OBConversion::Convert, which eventually leads to calling pInFormat->ReadMolecule and pOutFormat->WriteMolecule.

In this advisory, we describe an issue which is present in 4 different file formats (it’s a piece of code that has likely been copy/pasted between formats): MSI, MOPAC (two issues), MOPAC Cartesian and Gaussian. Let’s describe the issue in each file format.

Note that for all the described file formats, this issue leads to writing out-of-bounds on the stack, which, depending on how the code is compiled and the stack is laid out, can in turn lead to arbitrary code execution.

CVE-2022-46291 - MSI PeriodicType

The msi file format (formats/msiformat.cpp) parses an input file via ReadMolecule.

    bool MSIFormat::ReadMolecule(OBBase* pOb, OBConversion* pConv)
    {
      OBMol* pmol = pOb->CastAndClear<OBMol>();
      if (pmol == nullptr)
        return false;

      //Define some references so we can use the old parameter names
      istream &ifs = *pConv->GetInStream();
      OBMol &mol = *pmol;
      const char* title = pConv->GetTitle();
[1]   char buffer[BUFF_SIZE];

      stringstream errorMsg;

      if (!ifs)
        return false; // we're attempting to read past the end of the file

[2]   if (!ifs.getline(buffer,BUFF_SIZE))
        {
          obErrorLog.ThrowError(__FUNCTION__,
                                "Problems reading an MSI file: Cannot read the first line.", obWarning);
          return(false);
        }

[3]   if (!EQn(buffer, "# MSI CERIUS2 DataModel File", 28))
        {
          obErrorLog.ThrowError(__FUNCTION__,
                                "Problems reading an MSI file: The first line must contain the MSI header.", obWarning);
          return(false);
        }
      ...

The function defines several variables. We’re especially interested in buffer at [1], which is used to read lines in the input file. At [2] the first line is read, and the check at [3] requires it to begin with the string “# MSI CERIUS2 DataModel File”.

    ...
    unsigned int openParens = 0; // the count of "open parentheses" tags
    unsigned int startBondAtom, endBondAtom, bondOrder;
    bool atomRecord = false;
    bool bondRecord = false;
    OBAtom *atom;
    vector<string> vs;
    const SpaceGroup *sg;
    bool setSpaceGroup = false;
    double x,y,z;
[4] vector3 translationVectors[3];
[5] int numTranslationVectors = 0;

    mol.BeginModify();
[6] while (ifs.getline(buffer,BUFF_SIZE))
      {
        ...

[7]     if (strstr(buffer, "PeriodicType") != nullptr) {
          ifs.getline(buffer,BUFF_SIZE); // next line should be translation vector
[8]       tokenize(vs,buffer);
[9]            while (vs.size() == 6) {
              x = atof((char*)vs[3].erase(0,1).c_str());
              y = atof((char*)vs[4].c_str());
              z = atof((char*)vs[5].c_str());

[10]           translationVectors[numTranslationVectors++].Set(x, y, z);
              if (!ifs.getline(buffer,BUFF_SIZE))
                break;
              tokenize(vs,buffer);
            }
        }
        ...

At [4], an array of 3 vector3 elements is defined, together with an index called numTranslationVectors [5]. At [6], there’s a while loop that only stops when there are no more lines in the file. The block at [7] is entered if the current line contains the string “PeriodicType”; it then reads the next line in the file. The line is then tokenized at [8]: the buffer string is split on whitespace, and each token is put in the vs vector. If 6 tokens are found [9], the code reaches [10], where translationVectors is populated and the numTranslationVectors index is incremented.

This loop continues as long as the current line contains 6 tokens. Since nothing else terminates the loop and numTranslationVectors is never checked against the array length of 3, the index can grow arbitrarily, and the fourth matching line already writes past the end of translationVectors. The Set function is invoked with controlled values x, y and z:

    void Set(const double inX, const double inY, const double inZ)
    {
      _vx = inX;
      _vy = inY;
      _vz = inZ;
    }

This leads to writing out-of-bounds in the stack, which, depending on how the code is compiled and the stack is laid out, can in turn lead to arbitrary code execution.
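
One way to remove this class of bug is to refuse to store more vectors than the array can hold. The following is a minimal sketch under stated assumptions, not the vendor's patch: Vec3, kMaxTranslationVectors and StoreTranslationVector are hypothetical names standing in for vector3, the array length of 3, and the write at [10].

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for OpenBabel::vector3 (three doubles).
struct Vec3 {
  double x = 0.0, y = 0.0, z = 0.0;
  void Set(double inX, double inY, double inZ) { x = inX; y = inY; z = inZ; }
};

// Assumed capacity, matching "vector3 translationVectors[3]" in the advisory.
constexpr std::size_t kMaxTranslationVectors = 3;

// Stores (x, y, z) only while there is room in the array.
// Returns false once the array is full, so the caller can stop or report an error.
bool StoreTranslationVector(std::array<Vec3, kMaxTranslationVectors>& vectors,
                            std::size_t& count, double x, double y, double z) {
  if (count >= vectors.size())
    return false;  // extra vectors in the file are ignored instead of written
  vectors[count++].Set(x, y, z);
  return true;
}
```

With a helper like this, a parsing loop could break as soon as the function returns false, so a file with more than three vector lines can no longer move the index past the end of the array.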

CVE-2022-46292 - MOPAC Unit Cell Translation

The MOPAC file format (formats/mopacformat.cpp) parses an input file via ReadMolecule.

If a line contains “UNIT CELL TRANSLATION” [1], the code enters this block:

    ...
[1] else if (strstr(buffer, "UNIT CELL TRANSLATION") != nullptr)
      {
        numTranslationVectors = 0; // ignore old translationVectors
        ifs.getline(buffer,BUFF_SIZE);  // blank
        ifs.getline(buffer,BUFF_SIZE);  // column headings
        ifs.getline(buffer,BUFF_SIZE);
[2]     tokenize(vs,buffer);
[3]     while (vs.size() == 5)
          {
            x = atof((char*)vs[2].c_str());
            y = atof((char*)vs[3].c_str());
            z = atof((char*)vs[4].c_str());

[4]         translationVectors[numTranslationVectors++].Set(x, y, z);
[5]         if (!ifs.getline(buffer,BUFF_SIZE))
              break;
            tokenize(vs,buffer);
          }
      }

The line is tokenized at [2], and, as long as the read line contains 5 tokens [3, 5], the loop will continue. At [4], the index numTranslationVectors is increased at each iteration, eventually writing out-of-bounds as already described in the MSI format.

CVE-2022-46293 - MOPAC Final Point and Derivatives

The MOPAC file format (formats/mopacformat.cpp) parses an input file via ReadMolecule.

If a line contains “FINAL  POINT  AND  DERIVATIVES” (note the two spaces between words in the matched string) [1], the code enters this block:

    ...
[1] else if (strstr(buffer, "FINAL  POINT  AND  DERIVATIVES") != nullptr)
      {
        numTranslationVectors = 0; // Reset
        ifs.getline(buffer,BUFF_SIZE);  // blank
        ifs.getline(buffer,BUFF_SIZE);  // column headings
        ifs.getline(buffer,BUFF_SIZE);
[2]     tokenize(vs,buffer);
[3]     while (vs.size() == 8)
          {
            // Skip coords -- these would be overwritten by the later
            // CARTESIAN COORDINATES block anyway
            if (strcmp(vs.at(2).c_str(), "Tv") != 0)
              {
                if (!ifs.getline(buffer,BUFF_SIZE))
                  break;
                tokenize(vs,buffer);
                continue;
              }
[4]         const char coord = vs[4].at(0);
            double val = atof(vs[5].c_str());
            bool isZ = false;
[4]         switch (coord) {
            case 'X':
              x = val;
              break;
            case 'Y':
              y = val;
              break;
[5]         case 'Z':
              z = val;
              isZ = true;
              break;
            default:
              cerr << "Reading MOPAC Tv values: unknown coordinate '"
                   << coord << "', value: " << val << endl;
              break;
            }

            if (isZ)
[6]           translationVectors[numTranslationVectors++].Set(x, y, z);

[7]         if (!ifs.getline(buffer,BUFF_SIZE))
              break;
            tokenize(vs,buffer);
          }

The line is tokenized at [2], and, as long as the read line contains 8 tokens [3, 7], the loop will continue.
In order to reach the issue at [6], the third token must be “Tv” (otherwise the row is skipped), and the token at position 4 [4] must start with “Z” [5]. This will set isZ to true, which will in turn land us at [6]. Here, the index numTranslationVectors is incremented on each such row, eventually writing out-of-bounds as already described in the MSI format.
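
The conditions for reaching [6] can be modelled in isolation. The sketch below is a simplified model, not Open Babel code: SplitOnWhitespace is a stand-in for tokenize, CountTvStores is a hypothetical helper, and the model starts at the first data row (the three getline calls before [2] are left out). It only counts how many times the write at [6] would be reached.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Simplified whitespace tokenizer; a stand-in for Open Babel's tokenize().
std::vector<std::string> SplitOnWhitespace(const std::string& line) {
  std::istringstream in(line);
  std::vector<std::string> tokens;
  for (std::string tok; in >> tok;) tokens.push_back(tok);
  return tokens;
}

// Counts how many times the write at [6] would be reached for the rows in `in`.
std::size_t CountTvStores(std::istream& in) {
  std::size_t stores = 0;
  std::string line;
  while (std::getline(in, line)) {
    const std::vector<std::string> vs = SplitOnWhitespace(line);
    if (vs.size() != 8) break;       // loop condition at [3]
    if (vs[2] != "Tv") continue;     // rows that are not "Tv" are skipped
    if (vs[4][0] == 'Z') ++stores;   // only a Z row sets isZ and reaches [6]
  }
  return stores;
}
```

Any result above 3 means the corresponding input would index translationVectors past its last element in the original loop.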

CVE-2022-46294 - MOPAC Cartesian

The MOPACCART file format (formats/mopacformat.cpp) parses an input file via ReadMolecule.

    ...
[1] while (ifs.getline(buffer,BUFF_SIZE))
      {
        isotopeMass = 0;
        elementSymbol = "";

        //First see if this is a comment line - skip comment lines
        if (buffer[0] == '*') continue;

        //First see if there is a label defined
[2]     tokenize(vs,buffer,"()");
        if (vs.size() > 3) //Only one label allowed per line
            ...
        else if (1 < vs.size() && vs.size() <= 3) //There is a label
          {
[3]         elementSymbol = vs[0];
            atomLabel = vs[1];
            strcpy(buffer,vs[2].c_str());
          }
        else //no label, reset buffer
          strcpy(buffer,vs[0].c_str());

        //Now parse the rest of the line
        //There should be three cases:
        //1. There are 7 tokens and the first token is a number specifying the isotope mass
        //2. There are 7 tokens and the first token is a string containing the element symbol
        //3. There are 6 tokens and the first token is a number specifying the Cartesian x coordinate
[4]     tokenize(vs,buffer);
        if (vs.size() == 0)
          break;
        ...
[5]     else if (vs.size() == 7)
          {
            if (elementSymbol == "")
[6]           elementSymbol = vs[0];
            else
              isotopeMass = atof((char*)vs[0].c_str());

            x = atof((char*)vs[1].c_str());
            y = atof((char*)vs[3].c_str());
            z = atof((char*)vs[5].c_str());
          }
        else //vs.size() == 6
          {
            x = atof((char*)vs[0].c_str());
            y = atof((char*)vs[2].c_str());
            z = atof((char*)vs[4].c_str());
          }

[7]     if (elementSymbol == "Tv") //MOPAC translation vector
          {
[8]         translationVectors[numTranslationVectors++].Set(x, y, z);
          }
        ...
      }

As long as there are lines in the file, the while loop above [1] will be executed. Each line is tokenized with “(” and “)” as delimiters [2], rather than with whitespace. If there are 2 or 3 tokens defined this way, elementSymbol is set [3] to the first token. Then, the line is tokenized again [4], this time using whitespace. If there are 7 tokens [5] present and there’s no elementSymbol defined yet, then elementSymbol is set to the first token [6]. Whenever elementSymbol is equal to “Tv” [7], the line at [8] is executed, where the index numTranslationVectors is incremented, eventually writing out-of-bounds as already described in the MSI format.

Note that “Tv” has to be present in each line, since elementSymbol is reset at every iteration. For clarity, these are the two ways for setting elementSymbol:

(Tv(x(x 2 3 4 5 6 or Tv(x(x 2 3 4 5 6 can be used to set it at [3]
Tv 2 3 4 5 6 7 can be used to set it at [6]
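
The two tokenization passes can be reproduced with a small model. This is a sketch under assumptions: Split is a simplified stand-in for Open Babel's tokenize and is assumed to skip empty tokens (which is what the leading “(” in the first example above implies), and ElementSymbolFor is a hypothetical helper that models only the three-token label case and the no-label case.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for tokenize(): splits on any character in `delims`
// and skips empty tokens.
std::vector<std::string> Split(const std::string& s, const std::string& delims) {
  std::vector<std::string> out;
  std::size_t start = s.find_first_not_of(delims);
  while (start != std::string::npos) {
    const std::size_t end = s.find_first_of(delims, start);
    out.push_back(s.substr(
        start, end == std::string::npos ? std::string::npos : end - start));
    start = (end == std::string::npos) ? end : s.find_first_not_of(delims, end);
  }
  return out;
}

// Returns the elementSymbol that the logic at [2]-[6] would end up with.
std::string ElementSymbolFor(const std::string& line) {
  std::string elementSymbol;
  std::string rest = line;
  std::vector<std::string> vs = Split(line, "()");
  if (vs.size() == 3) {         // label form, handled at [3]
    elementSymbol = vs[0];
    rest = vs[2];
  } else if (vs.size() == 1) {  // no label
    rest = vs[0];
  }
  vs = Split(rest, " \t\r\n");
  if (vs.size() == 7 && elementSymbol.empty())
    elementSymbol = vs[0];      // first-token form, handled at [6]
  return elementSymbol;
}
```

For each of the three example lines above, ElementSymbolFor returns "Tv", which is the condition checked at [7].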

CVE-2022-46295 - Gaussian orientation

The Gaussian file format (formats/gaussformat.cpp) parses an input file via ReadMolecule.

    // Translation vectors (if present)
    vector3 translationVectors[3];
    int numTranslationVectors = 0;
    ...
    int i=0;
    bool no_symmetry=false;
    char coords_type[25];

    //Prescan file to find second instance of "orientation:"
    //This will be the kind of coords used in the chk/fchk file
    //Unless the "nosym" keyword has been requested
    while (ifs.getline(buffer,BUFF_SIZE))
      {
        if (strstr(buffer, "Symmetry turned off by external request.") != nullptr)
          {
            // The "nosym" keyword has been requested
            no_symmetry = true;
          }
[1]     if (strstr(buffer, "orientation:") != nullptr)
          {
            i++;
            tokenize (vs, buffer);
            // gotta check what types of orientation are present
[2]         strncpy (coords_type, vs[0].c_str(), 24);
            strcat (coords_type, " orientation:");
          }
        if ((no_symmetry && i==1) || i==2)
           break;
      }
    // Reset end-of-file pointers etc.
[3] ifs.clear();
    ifs.seekg(0);  //rewind

At [1], if the “orientation:” string is found in a line, coords_type is filled using the first token of that line [2] (the code expects a line like “vs0 orientation: …”).
The stream is then reset, restarting the file parsing from the first line of the file [3].

    mol.BeginModify();
    while (ifs.getline(buffer,BUFF_SIZE))
      {
        if(strstr(buffer, "Entering Gaussian") != nullptr)
        ...
[4]     else if (strstr(buffer, coords_type) != nullptr)
          {
            numTranslationVectors = 0; // ignore old translationVectors
[5]         ifs.getline(buffer,BUFF_SIZE);      // ---------------
            ifs.getline(buffer,BUFF_SIZE);      // column headings
            ifs.getline(buffer,BUFF_SIZE);  // column headings
            ifs.getline(buffer,BUFF_SIZE);  // ---------------
            ifs.getline(buffer,BUFF_SIZE);
            tokenize(vs,buffer);
[6]         while (vs.size()>4)
              {
                int corr = vs.size()==5 ? -1 : 0; //g94; later versions have an extra column
                x = atof((char*)vs[3+corr].c_str());
                y = atof((char*)vs[4+corr].c_str());
                z = atof((char*)vs[5+corr].c_str());
[7]             int atomicNum = atoi((char*)vs[1].c_str());

                if (atomicNum > 0) // translation vectors are "-2"
                  {
                    if (natoms == 0) { // first time reading the molecule, create each atom
                      atom = mol.NewAtom();
                      atom->SetAtomicNum(atoi((char*)vs[1].c_str()));
                    }
                    coordinates.push_back(x);
                    coordinates.push_back(y);
                    coordinates.push_back(z);
                  }
                else {
[8]               translationVectors[numTranslationVectors++].Set(x, y, z);
                }

                if (!ifs.getline(buffer,BUFF_SIZE)) {
                  break;
                }
                tokenize(vs,buffer);
              }

At [4], if the “orientation:” line set at [2] is found (actually the last orientation line), the next 4 lines are discarded, and the fifth is tokenized. If more than 4 tokens are found [6], the atomicNum is extracted from the second token [7], and, if it is 0 or negative, the line at [8] is executed. Here, the index numTranslationVectors is incremented on each such row, eventually writing out-of-bounds as already described in the MSI format.
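
The row handling at [6]-[8] can be summarized as a small classifier. This is an illustrative sketch, not Open Babel code: RowKind, ParsedRow and ClassifyRow are hypothetical names. It also makes one detail explicit: because atoi returns 0 when the token does not start with a number, a row whose second token is not numeric is treated as a translation vector too.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

enum class RowKind { EndOfBlock, Atom, TranslationVector };

struct ParsedRow {
  RowKind kind;
  double x, y, z;
};

// Classifies one already-tokenized row the way the loop at [6]-[8] does.
ParsedRow ClassifyRow(const std::vector<std::string>& vs) {
  if (vs.size() <= 4)
    return {RowKind::EndOfBlock, 0.0, 0.0, 0.0};  // loop condition at [6] fails
  // Five-column rows (g94) keep the coordinates one column to the left.
  const std::size_t first = (vs.size() == 5) ? 2 : 3;
  const double x = std::atof(vs[first].c_str());
  const double y = std::atof(vs[first + 1].c_str());
  const double z = std::atof(vs[first + 2].c_str());
  // atoi yields 0 for non-numeric text, so such rows are not counted as atoms.
  const int atomicNum = std::atoi(vs[1].c_str());
  return {atomicNum > 0 ? RowKind::Atom : RowKind::TranslationVector, x, y, z};
}
```

Every row classified as TranslationVector corresponds to one execution of [8], so more than three such rows in a block exceed the array.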

Crash Information

$ ./bin/obabel -i gal translationVectors.oobw.gal -o sdf
=================================================================
==1274683==ERROR: AddressSanitizer: stack-buffer-overflow on address 0xffff3428 at pc 0xf47a4bbd bp 0xffff1508 sp 0xffff14fc
WRITE of size 8 at 0xffff3428 thread T0
    #0 0xf47a4bbc in OpenBabel::vector3::Set(double, double, double) ./include/openbabel/math/vector3.h:89
    #1 0xf47a4bbc in OpenBabel::GaussianOutputFormat::ReadMolecule(OpenBabel::OBBase*, OpenBabel::OBConversion*) ./src/formats/gaussformat.cpp:635
    #2 0xf751a915 in OpenBabel::OBMoleculeFormat::ReadChemObjectImpl(OpenBabel::OBConversion*, OpenBabel::OBFormat*) ./src/obmolecformat.cpp:102
    #3 0xf63c358c in OpenBabel::OBMoleculeFormat::ReadChemObject(OpenBabel::OBConversion*) ./include/openbabel/obmolecformat.h:116
    #4 0xf72a204e in OpenBabel::OBConversion::Convert() ./src/obconversion.cpp:545
    #5 0xf72c717a in OpenBabel::OBConversion::Convert(std::istream*, std::ostream*) ./src/obconversion.cpp:481
    #6 0xf72cf4f3 in OpenBabel::OBConversion::FullConvert(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) ./src/obconversion.cpp:1514
    #7 0x565594ea in main ./tools/obabel.cpp:370
    #8 0xf77923b4 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #9 0xf779247e in __libc_start_main_impl ../csu/libc-start.c:389
    #10 0x5655c356 in _start (./bin/obabel+0x7356)

Address 0xffff3428 is located in stack of thread T0 at offset 7384 in frame
    #0 0xf478912f in OpenBabel::GaussianOutputFormat::ReadMolecule(OpenBabel::OBBase*, OpenBabel::OBConversion*) ./src/formats/gaussformat.cpp:437

  This frame has 228 object(s):
    [32, 33) '<unknown>'
    [48, 49) '<unknown>'
    [64, 65) '<unknown>'
    [80, 81) '<unknown>'
    [96, 97) '<unknown>'
    [112, 113) '<unknown>'
    [128, 129) '<unknown>'
    [144, 145) '<unknown>'
    [160, 161) '<unknown>'
    [176, 177) '<unknown>'
    [192, 193) '<unknown>'
    [208, 209) '<unknown>'
    [224, 225) '<unknown>'
    [240, 241) '<unknown>'
    [256, 257) '<unknown>'
    [272, 273) '<unknown>'
    [288, 289) '<unknown>'
    [304, 305) '<unknown>'
    [320, 321) '<unknown>'
    [336, 337) '<unknown>'
    [352, 353) '<unknown>'
    [368, 369) '<unknown>'
    [384, 385) '<unknown>'
    [400, 401) '<unknown>'
    [416, 417) '<unknown>'
    [432, 433) '<unknown>'
    [448, 449) '<unknown>'
    [464, 465) '<unknown>'
    [480, 481) '<unknown>'
    [496, 497) '<unknown>'
    [512, 513) '<unknown>'
    [528, 529) '<unknown>'
    [544, 545) '<unknown>'
    [560, 561) '<unknown>'
    [576, 577) '<unknown>'
    [592, 593) '<unknown>'
    [608, 609) '<unknown>'
    [624, 625) '<unknown>'
    [640, 641) '<unknown>'
    [656, 657) '<unknown>'
    [672, 673) '<unknown>'
    [688, 689) '<unknown>'
    [704, 705) '<unknown>'
    [720, 721) '<unknown>'
    [736, 737) '<unknown>'
    [752, 753) '<unknown>'
    [768, 769) '<unknown>'
    [784, 785) '<unknown>'
    [800, 801) '<unknown>'
    [816, 817) '<unknown>'
    [832, 833) '<unknown>'
    [848, 849) '<unknown>'
    [864, 865) '<unknown>'
    [880, 881) '<unknown>'
    [896, 898) '<unknown>'
    [912, 916) 'tmpCoords' (line 648)
    [928, 932) 'fgpi' (line 879)
    [944, 948) '<unknown>'
    [960, 964) '<unknown>'
    [976, 980) '<unknown>'
    [992, 996) '<unknown>'
    [1008, 1012) '__guard'
    [1024, 1028) '__dnew'
    [1040, 1044) '__guard'
    [1056, 1060) '__guard'
    [1072, 1076) '__guard'
    [1088, 1092) '__guard'
    [1104, 1108) '__guard'
    [1120, 1124) '__guard'
    [1136, 1140) '__guard'
    [1152, 1156) '__dnew'
    [1168, 1172) '__guard'
    [1184, 1188) '__dnew'
    [1200, 1204) '__guard'
    [1216, 1220) 'd' (line 436)
    [1232, 1236) 'd' (line 436)
    [1248, 1252) 'd' (line 436)
    [1264, 1268) 'd' (line 436)
    [1280, 1284) 'd' (line 436)
    [1296, 1300) 'd' (line 436)
    [1312, 1316) 'd' (line 436)
    [1328, 1332) 'd' (line 436)
    [1344, 1348) 'd' (line 436)
    [1360, 1364) 'd' (line 436)
    [1376, 1380) 'd' (line 436)
    [1392, 1396) 'd' (line 436)
    [1408, 1412) 'd' (line 436)
    [1424, 1428) 'd' (line 436)
    [1440, 1444) 'd' (line 436)
    [1456, 1460) 'd' (line 436)
    [1472, 1476) 'd' (line 436)
    [1488, 1496) 'x' (line 449)
    [1520, 1528) 'y' (line 449)
    [1552, 1560) 'z' (line 449)
    [1584, 1592) 'xx' (line 701)
    [1616, 1624) 'xy' (line 701)
    [1648, 1656) 'yy' (line 701)
    [1680, 1688) 'xz' (line 701)
    [1712, 1720) 'yz' (line 701)
    [1744, 1752) 'zz' (line 701)
    [1776, 1784) '<unknown>'
    [1808, 1816) '<unknown>'
    [1840, 1848) '<unknown>'
    [1872, 1880) 'x' (line 848)
    [1904, 1912) 'y' (line 848)
    [1936, 1944) 'z' (line 848)
    [1968, 1976) 'x' (line 868)
    [2000, 2008) 'y' (line 868)
    [2032, 2040) 'z' (line 868)
    [2064, 2072) '<unknown>'
    [2096, 2104) '<unknown>'
    [2128, 2136) '<unknown>'
    [2160, 2168) '<unknown>'
    [2192, 2200) '<unknown>'
    [2224, 2232) 'wavelength' (line 1114)
    [2256, 2264) 'force' (line 1115)
    [2288, 2296) 's' (line 1127)
    [2320, 2328) 's' (line 1138)
    [2352, 2360) 's' (line 1149)
    [2384, 2392) '<unknown>'
    [2416, 2424) '<unknown>'
    [2448, 2456) '<unknown>'
    [2480, 2488) '<unknown>'
    [2512, 2524) 'vs' (line 451)
    [2544, 2556) 'vs2' (line 451)
    [2576, 2588) 'Scomponents' (line 461)
    [2608, 2620) 'vconf' (line 474)
    [2640, 2652) 'coordinates' (line 475)
    [2672, 2684) 'confDimensions' (line 482)
    [2704, 2716) 'confEnergies' (line 483)
    [2736, 2748) 'confForces' (line 484)
    [2768, 2780) 'Lx' (line 487)
    [2800, 2812) 'Frequencies' (line 488)
    [2832, 2844) 'Intensities' (line 488)
    [2864, 2876) 'RotConsts' (line 490)
    [2896, 2908) 'Forces' (line 499)
    [2928, 2940) 'Wavelengths' (line 499)
    [2960, 2972) 'EDipole' (line 499)
    [2992, 3004) 'RotatoryStrengthsVelocity' (line 500)
    [3024, 3036) 'RotatoryStrengthsLength' (line 500)
    [3056, 3068) 'orbitals' (line 503)
    [3088, 3100) 'symmetries' (line 504)
    [3120, 3132) 'MPA_q' (line 748)
    [3152, 3164) '<unknown>'
    [3184, 3196) 'HPA_q' (line 796)
    [3216, 3228) 'CM5_q' (line 797)
    [3248, 3260) '<unknown>'
    [3280, 3292) '<unknown>'
    [3312, 3324) 'ESP_q' (line 927)
    [3344, 3356) '<unknown>'
    [3376, 3388) 'vib1' (line 997)
    [3408, 3420) 'vib2' (line 997)
    [3440, 3452) 'vib3' (line 997)
    [3472, 3484) '<unknown>'
    [3504, 3516) '<unknown>'
    [3536, 3548) '<unknown>'
    [3568, 3580) 'betaOrbitals' (line 1304)
    [3600, 3612) 'betaSymmetries' (line 1305)
    [3632, 3644) '<unknown>'
    [3664, 3676) '<unknown>'
    [3696, 3708) '<unknown>'
    [3728, 3740) '<unknown>'
    [3760, 3772) '<unknown>'
    [3792, 3804) '<unknown>'
    [3824, 3836) '<unknown>'
    [3856, 3868) '<unknown>'
    [3888, 3904) '<unknown>'
    [3920, 3944) 'str' (line 448)
    [3984, 4008) 'str1' (line 448)
    [4048, 4072) 'str2' (line 448)
    [4112, 4136) 'thermo_method' (line 448)
    [4176, 4200) 'chargeModel' (line 455)
    [4240, 4264) 'comment' (line 543)
    [4304, 4328) '<unknown>'
    [4368, 4392) '<unknown>'
    [4432, 4456) '<unknown>'
    [4496, 4520) '<unknown>'
    [4560, 4584) '<unknown>'
    [4624, 4648) '<unknown>'
    [4688, 4712) '<unknown>'
    [4752, 4776) '<unknown>'
    [4816, 4840) '<unknown>'
    [4880, 4904) '<unknown>'
    [4944, 4968) '<unknown>'
    [5008, 5032) '<unknown>'
    [5072, 5096) '<unknown>'
    [5136, 5160) '<unknown>'
    [5200, 5224) '<unknown>'
    [5264, 5288) '<unknown>'
    [5328, 5352) '<unknown>'
    [5392, 5416) '<unknown>'
    [5456, 5480) '<unknown>'
    [5520, 5544) '<unknown>'
    [5584, 5608) '<unknown>'
    [5648, 5672) '<unknown>'
    [5712, 5736) 'label' (line 1059)
    [5776, 5800) '<unknown>'
    [5840, 5864) '<unknown>'
    [5904, 5928) '<unknown>'
    [5968, 5992) 'shift' (line 1172)
    [6032, 6056) '<unknown>'
    [6096, 6120) '<unknown>'
    [6160, 6184) 'search' (line 1262)
    [6224, 6248) 'mymeth' (line 1263)
    [6288, 6312) 'myindex' (line 1264)
    [6352, 6376) '<unknown>'
    [6416, 6440) '<unknown>'
    [6480, 6504) '<unknown>'
    [6544, 6568) '<unknown>'
    [6608, 6632) '<unknown>'
    [6672, 6696) '<unknown>'
    [6736, 6760) '<unknown>'
    [6800, 6824) '<unknown>'
    [6864, 6888) '<unknown>'
    [6928, 6952) '<unknown>'
    [6992, 7016) '<unknown>'
    [7056, 7080) '<unknown>'
    [7120, 7144) '<unknown>'
    [7184, 7208) '<unknown>'
    [7248, 7273) 'coords_type' (line 510)
    [7312, 7384) 'translationVectors' (line 495) <== Memory access at offset 7384 overflows this variable
    [7424, 7496) 'Q' (line 680)
    [7536, 7608) 'quad' (line 689)
    [7648, 7720) '<unknown>'
    [7760, 7832) 'Q' (line 707)
    [7872, 7944) 'pol' (line 716)
    [7984, 8056) '<unknown>'
    [8096, 40864) 'buffer' (line 447)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow ./include/openbabel/math/vector3.h:89 in OpenBabel::vector3::Set(double, double, double)
Shadow bytes around the buggy address:
  0x3fffe630: f2 f2 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 f2
  0x3fffe640: f2 f2 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 f2
  0x3fffe650: f2 f2 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 f2
  0x3fffe660: f2 f2 f2 f2 00 00 00 f2 f2 f2 f2 f2 00 00 00 f2
  0x3fffe670: f2 f2 f2 f2 00 00 00 01 f2 f2 f2 f2 00 00 00 00
=>0x3fffe680: 00 00 00 00 00[f2]f2 f2 f2 f2 00 00 00 00 00 00
  0x3fffe690: 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00
  0x3fffe6a0: 00 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2
  0x3fffe6b0: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2 f2 f2
  0x3fffe6c0: f2 f2 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2
  0x3fffe6d0: 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 f2 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
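
The offsets in the frame listing above are consistent with a write to the fourth array element. The following check is a sketch that assumes vector3 holds exactly three 8-byte doubles with no padding (Vec3 is a hypothetical stand-in); the constants are copied from the report, where translationVectors occupies [7312, 7384) and the faulting access is at offset 7384.

```cpp
#include <cstddef>

// Hypothetical stand-in for vector3: three doubles, as written by Set().
struct Vec3 {
  double x, y, z;
};

constexpr std::size_t kArrayStart = 7312;   // start of translationVectors
constexpr std::size_t kArrayEnd = 7384;     // one past its last byte
constexpr std::size_t kFaultOffset = 7384;  // "Memory access at offset 7384"

// Assumption: 8-byte double and no padding, so each element is 24 bytes.
static_assert(sizeof(Vec3) == 24, "sketch assumes a 24-byte vector3");
// The 72-byte object is exactly three elements.
static_assert(kArrayEnd - kArrayStart == 3 * sizeof(Vec3), "array holds 3 elements");

// Index of the element that the faulting write belongs to.
constexpr std::size_t FaultingIndex() {
  return (kFaultOffset - kArrayStart) / sizeof(Vec3);
}
static_assert(FaultingIndex() == 3, "first write past the last valid index (2)");
```

The 8-byte write at the start of element 3 matches the first assignment in Set, and it lands in the redzone that AddressSanitizer places right after translationVectors.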

VENDOR RESPONSE

Since the maintainer of this software did not release a patch during the 90 day window specified in our policy, we have now decided to release the information regarding this vulnerability, to make users of the software aware of this problem. See Cisco’s Coordinated Vulnerability Disclosure Policy for more information: https://tools.cisco.com/security/center/resources/vendor_vulnerability_policy.html

TIMELINE

2022-12-20 - Initial Vendor Contact
2023-01-12 - Vendor Disclosure
2023-07-21 - Public Release

Credit

Discovered by Claudio Bozzato of Cisco Talos.
