Talos Vulnerability Report

TALOS-2018-0637

Epee Levin Packet Deserialization Code Execution Vulnerability

September 25, 2018

CVE Number

CVE-2018-3972

Summary

An exploitable code execution vulnerability exists in the Levin deserialization functionality of the epee library. A specially crafted network packet can cause a logic flaw, resulting in code execution. An attacker can send a packet to trigger this vulnerability.

Tested Versions

Monero ‘Lithium Luna’ (v0.12.2.0-master-ffab6700)

Product URLs

https://github.com/sabelnikov/epee/tree/master/include
https://github.com/monero-project/monero

CVSSv3 Score

10.0 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H

CWE

CWE-704: Incorrect Type Conversion or Cast

Details

The Levin network protocol is an implementation of peer-to-peer (P2P) communications found in a large number of cryptocurrencies, including nearly every cryptocurrency descended from the CryptoNote project. Several implementations of Levin exist, but the one of interest here is the epee library, an open-source Levin library. The epee library can be found in a large subset of projects that use Levin, most notably the cryptocurrency Monero.

Before examining the bug in depth, a brief overview of the Levin protocol is required. An example of a normal P2P handshake between Monero nodes is shown below with some explanation. (Note: All numeric values are little-endian unless otherwise stated.)

// Levin COMMAND_HANDSHAKE_T
# Signature
\x01\x21\x01\x01\x01\x01\x01\x01 		//[1]
# Size (~total packet-0x20)
\xe2\x00\x00\x00\x00\x00\x00\x00     	//[2]
# response or request?
\x01
# opcode (handshake)
\xe9\x03\x00\x00\x00\x00\x00\x00		//[3]
# more signature.
\x01\x00\x00\x00\x01\x00\x00\x00\x01\x11\x01\x01\x01\x01\x02\x01\x01

The above struct can be found in every Levin request: a hardcoded QWORD signature [1], the QWORD size of the total packet [2], a byte that likely determines whether it’s a request or a response, a QWORD opcode [3], and, finally, more hardcoded signature bytes. The above bytes are for the COMMAND_HANDSHAKE_T command, opcode 1001 (0x3e9), and the exact definition can be found in p2p_protocol_defs.h.
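
To make the little-endian layout concrete, here is a minimal sketch that decodes the leading fields of the dump above. The struct and function names are hypothetical, and the field widths follow this report's description of the bytes rather than epee's own header definition.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Decode an unsigned little-endian integer of `len` bytes (len <= 8).
static uint64_t read_le(const unsigned char* p, std::size_t len) {
    uint64_t v = 0;
    for (std::size_t i = 0; i < len; ++i)
        v |= static_cast<uint64_t>(p[i]) << (8 * i);
    return v;
}

// Hypothetical view of the leading header fields, laid out as the report
// describes them; this is not epee's actual header struct.
struct LevinHeaderSketch {
    uint64_t signature;  // [1]
    uint64_t size;       // [2]
    uint8_t  flag;       // request/response byte
    uint64_t opcode;     // [3]
};

// Parse the first 25 bytes (8 + 8 + 1 + 8) of a packet.
static LevinHeaderSketch parse_header_sketch(const unsigned char* p,
                                             std::size_t avail) {
    if (avail < 25) throw std::runtime_error("header too short");
    LevinHeaderSketch h;
    h.signature = read_le(p, 8);
    h.size      = read_le(p + 8, 8);
    h.flag      = p[16];
    h.opcode    = read_le(p + 17, 8);
    return h;
}
```

Fed the handshake bytes shown above, the opcode field decodes to 1001 and the size field to 0xe2.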

Assuming all the hardcoded signature checks pass, the rest of the data starts to get deserialized, with the code now looking for opcode specific structures and data. Referring back to the COMMAND_HANDSHAKE_T opcode, the following structures must be found for a valid parse:

    struct request
    {
      basic_node_data node_data;
      t_playload_type payload_data;

      BEGIN_KV_SERIALIZE_MAP()
        KV_SERIALIZE(node_data)
        KV_SERIALIZE(payload_data)
      END_KV_SERIALIZE_MAP()
   };

Within epee::serialization::portable_storage_from_bin.h, the code starts looking for a Levin message with two sections, a node_data section and a payload_data section. A dump of some bytes will explain what this entails more clearly:

# read_varint() => # of section entries. (8>>2 => 2 sections in message)
\x08 				//[1]
# Section Entry 1
\x09node_data			//[2]
# Object, 4 members
\x0c\x10			//[3] 

The first byte read [1] is the number of sections within the serialized packet, read in via the read_varint() method. In brief, the lowest two bits of the first byte determine how many bytes make up the encoded value (1, 2, 4, or 8). Thus, if the first byte is \x82, we’d read in another three bytes from the packet, append them to \x82 & 0xFC, and treat the result as a little-endian value; as the 8>>2 in the dump above shows, that value is then shifted right by two bits to drop the size bits, yielding the decoded count or length.
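
The encoding described above can be sketched as follows. This is an illustrative decoder written from the description in this report, not a copy of epee's read_varint(); the function name and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Sketch of the varint scheme: the low two bits of the first byte select the
// encoded width (0 -> 1 byte, 1 -> 2, 2 -> 4, 3 -> 8). The bytes are read as
// a little-endian integer and the two size bits are shifted out.
static uint64_t read_varint_sketch(const unsigned char* p, std::size_t avail,
                                   std::size_t* consumed) {
    if (avail == 0) throw std::runtime_error("empty buffer");
    const std::size_t width = static_cast<std::size_t>(1) << (p[0] & 0x03);
    if (avail < width) throw std::runtime_error("truncated varint");
    uint64_t raw = 0;
    for (std::size_t i = 0; i < width; ++i)
        raw |= static_cast<uint64_t>(p[i]) << (8 * i);
    *consumed = width;
    return raw >> 2;  // drop the size bits
}
```

With this sketch, the single byte \x08 decodes to 2 (the two sections of the handshake), and \x10 decodes to 4 (the four members of node_data).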

The next portion at [2] is easier: a byte determining the length of a section name, followed by the section name itself. The names must match, one-to-one, the section names given in the request struct.

At [3], the first section’s data is read in and deserialized. The first byte determines the type, and then based off of that type, the code flow changes. The list of available types is as follows:

#define SERIALIZE_TYPE_INT64     1
#define SERIALIZE_TYPE_INT32     2
#define SERIALIZE_TYPE_INT16     3
#define SERIALIZE_TYPE_INT8      4
#define SERIALIZE_TYPE_UINT64    5
#define SERIALIZE_TYPE_UINT32    6
#define SERIALIZE_TYPE_UINT16    7
#define SERIALIZE_TYPE_UINT8     8
#define SERIALIZE_TYPE_DUOBLE    9
#define SERIALIZE_TYPE_STRING    0xA
#define SERIALIZE_TYPE_BOOL      0xB
#define SERIALIZE_TYPE_OBJECT    0xC
#define SERIALIZE_TYPE_ARRAY     0xD

Since the type was \x0C, it’s defined as an object, and the epee::serialization::read_se<section>() function is called, which tries to read in a section struct from the packet, which is defined as follows:

	//portable_storage_base.h
	struct section{
	  std::map<std::string, storage_entry> m_entries;
	};

To help lay some more groundwork, here are some assorted structure definitions that will be referred to later:

typedef boost::variant<uint64_t, uint32_t, uint16_t, uint8_t, int64_t, int32_t, int16_t, int8_t, double, bool, std::string, section, array_entry> storage_entry;

typedef boost::make_recursive_variant<
  array_entry_t<section>,
  array_entry_t<uint64_t>,
  array_entry_t<uint32_t>,
  array_entry_t<uint16_t>,
  array_entry_t<uint8_t>,
  array_entry_t<int64_t>,
  array_entry_t<int32_t>,
  array_entry_t<int16_t>,
  array_entry_t<int8_t>,
  array_entry_t<double>,
  array_entry_t<bool>,
  array_entry_t<std::string>,
  array_entry_t<section>,
  array_entry_t<boost::recursive_variant_>
>::type array_entry;

template<class t_entry_type>
struct array_entry_t {
  //[…] truncated for brevity.
  // It’s just a fancy container for an array of <t_entry_type>
  std::list<t_entry_type> m_array;
};

To recap on our basic example, we’re currently reading in the next set of bytes from our packet as a node_data object. In order to actually deserialize into the node_data struct, the definition is needed:

  struct basic_node_data
  {
    uuid network_id;
    uint64_t local_time;
    uint32_t my_port;
    peerid_type peer_id;

    BEGIN_KV_SERIALIZE_MAP()
      KV_SERIALIZE_VAL_POD_AS_BLOB(network_id)
      KV_SERIALIZE(peer_id)
      KV_SERIALIZE(local_time)
      KV_SERIALIZE(my_port)
    END_KV_SERIALIZE_MAP()
  };

Which on the wire looks like:

# Each entry follows <Byte nameLen><name><type><value> schema. 
# Qword local_time    //[1]
\x0alocal_time\x05\x2atC[\x00\x00\x00\x00
# Dword my_port 
\x07my_port\x06\x00\x00\x00\x00
# String network_id    //[2]
\x0anetwork_id\x0a@\x120\xf1qa\x04Aa\x171\x00\x82\x16\xa1\xa1\x11
# Qword peer_id        //[3]
\x07peer_id\x05\x07\x90\xaa7\xaeO\x06\xaa
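
The <Byte nameLen><name><type><value> schema used in the dump above can be illustrated with a small parser for the name and type portion of one entry. This is a hedged sketch: the struct and function names are hypothetical, and it covers only the single-byte name length seen in these examples.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical result of parsing the leading part of one entry.
struct EntryHeaderSketch {
    std::string name;          // e.g. "my_port"
    uint8_t     type;          // one of the SERIALIZE_TYPE_* codes
    std::size_t value_offset;  // where the value bytes begin
};

// Parse <nameLen><name><type>, leaving the value bytes for a type-specific
// reader. Every length is checked against the bytes actually available.
static EntryHeaderSketch parse_entry_header_sketch(const unsigned char* p,
                                                   std::size_t avail) {
    if (avail < 1) throw std::runtime_error("missing name length");
    const std::size_t name_len = p[0];
    if (avail < 1 + name_len + 1) throw std::runtime_error("truncated entry");
    EntryHeaderSketch e;
    e.name.assign(reinterpret_cast<const char*>(p + 1), name_len);
    e.type = p[1 + name_len];
    e.value_offset = 1 + name_len + 1;
    return e;
}
```

For the my_port entry above, this yields the name "my_port", type 6 (SERIALIZE_TYPE_UINT32), and a value that starts nine bytes in.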

Assuming that the type of each field matches the structure definition (a mismatch will throw an error), or that the provided type can be appropriately converted to the destination type (a note on this later), each individual field is sanitized as it is read in, and the values themselves must then pass checks for the variables that they fill. For example, the network ID field [2] must correspond to one of three hardcoded strings, depending on whether the node is on the testnet, staging, or production Monero P2P network.

Regarding the type conversions, there are codified conversions that can be performed if the type found in the network packet is compatible with the type found in the serialization struct, for example, unsigned int to signed int or vice versa. Most interesting is that, due to compatibility issues with other Monero clients, strings can be converted to uint64_t values if they match certain regular expressions.

The rest of the packet is similarly deserialized according to the COMMAND_HANDSHAKE type, resulting in the remainder of the packet deserializing into the CORE_SYNC_DATA struct found below:

struct CORE_SYNC_DATA
{
  uint64_t current_height;
  uint64_t cumulative_difficulty;
  crypto::hash top_id;
  uint8_t top_version;

  BEGIN_KV_SERIALIZE_MAP()
    KV_SERIALIZE(current_height)
    KV_SERIALIZE(cumulative_difficulty)
    KV_SERIALIZE_VAL_POD_AS_BLOB(top_id)
    KV_SERIALIZE_OPT(top_version, (uint8_t)0)
  END_KV_SERIALIZE_MAP()
};

Which corresponds to the following network bytes:

\x0cpayload_data
# Object, 4 entries. (SYNC_DATA)
\x0c\x10
# Qword cumulative_difficulty
\x15cumulative_difficulty\x05\xb1m\x00\x00\x00\x00\x00\x00
# Qword current_height (referring to blockchain height)
\x0ecurrent_height\x05\x12\x00\x00\x00\x00\x00\x00\x00
# String top_id (of top block)
\x06top_id\x0a\x80\x09\xd7\xd4G\x07\x1d\x90\x86o\xc3\xb0\xef\xda\x8a\xa0\xc6\xe8\xb6	\x01\x8aQ\x0dF\x1ds\xd5jzb\x166\x82
# Byte, version of top block. 
\x0btop_version\x08\x01

Now that the basic example has been covered, we can start to examine the actual bug, which occurs in the epee::serialization::throwable_buffer_reader class from portable_storage_from_bin.h, which controls the deserialization of each individual data type.

All the code from the first object read described above (node_data) onward is located within this class, so there are only a few more new things to discuss, the first of which is the throwable_buffer_reader::load_storage_entry() function, which is where the code flow diverges based on the deserialization object type.

inline storage_entry throwable_buffer_reader::load_storage_entry() {
  RECURSION_LIMITATION();
  uint8_t ent_type = 0;
  read(ent_type);
  if(ent_type&SERIALIZE_FLAG_ARRAY)
    return load_storage_array_entry(ent_type);     // [1]

  switch(ent_type)
  {
    case SERIALIZE_TYPE_INT64:  return read_se<int64_t>();
    case SERIALIZE_TYPE_INT32:  return read_se<int32_t>();
    //….
    case SERIALIZE_TYPE_STRING: return read_se<std::string>();
    case SERIALIZE_TYPE_OBJECT: return read_se<section>();
    case SERIALIZE_TYPE_ARRAY:  return read_se<array_entry>();
  }
  //...
}

If the type has SERIALIZE_FLAG_ARRAY set at [1], then instead of loading a single data entry, we load an array of entries, the code for which is:

inline storage_entry throwable_buffer_reader::load_storage_array_entry(uint8_t type)
{
  RECURSION_LIMITATION();
  type &= ~SERIALIZE_FLAG_ARRAY;
  switch(type){
    case SERIALIZE_TYPE_INT64:  return read_ae<int64_t>();
    case SERIALIZE_TYPE_INT32:  return read_ae<int32_t>();
    case SERIALIZE_TYPE_INT16:  return read_ae<int16_t>();
    case SERIALIZE_TYPE_INT8:   return read_ae<int8_t>();
    case SERIALIZE_TYPE_UINT64: return read_ae<uint64_t>();
    case SERIALIZE_TYPE_UINT32: return read_ae<uint32_t>();
    case SERIALIZE_TYPE_UINT16: return read_ae<uint16_t>();
    case SERIALIZE_TYPE_UINT8:  return read_ae<uint8_t>();
    case SERIALIZE_TYPE_DUOBLE: return read_ae<double>();
    case SERIALIZE_TYPE_BOOL:   return read_ae<bool>();
    case SERIALIZE_TYPE_STRING: return read_ae<std::string>();
    case SERIALIZE_TYPE_OBJECT: return read_ae<section>();
    case SERIALIZE_TYPE_ARRAY:  return read_ae<array_entry>();
    default: 
      CHECK_AND_ASSERT_THROW_MES(false, "unknown entry_type code = " << type);
  }
}  
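
The two dispatch functions above differ only in whether the array flag is set on the type byte. The sketch below shows how a single type byte would be routed. It assumes SERIALIZE_FLAG_ARRAY is 0x80, a value this report does not state, and the constant and function names are hypothetical.

```cpp
#include <cstdint>
#include <string>

// ASSUMPTION: the array flag is the high bit of the type byte (0x80).
constexpr uint8_t kFlagArraySketch = 0x80;
// SERIALIZE_TYPE_ARRAY from the type list above.
constexpr uint8_t kTypeArraySketch = 0x0D;

// Return the name of the reader a given type byte would be routed to,
// distinguishing only the array_entry element type from everything else.
static std::string dispatch_sketch(uint8_t ent_type) {
    if (ent_type & kFlagArraySketch) {
        // Mirrors `type &= ~SERIALIZE_FLAG_ARRAY` in load_storage_array_entry().
        const uint8_t elem = static_cast<uint8_t>(ent_type & ~kFlagArraySketch);
        return elem == kTypeArraySketch ? "read_ae<array_entry>"
                                        : "read_ae<other>";
    }
    return ent_type == kTypeArraySketch ? "read_se<array_entry>"
                                        : "read_se<other>";
}
```

Under that assumption, a type byte of 0x8D selects read_ae<array_entry>(), the path that appears in the crash backtrace later in this report.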

In essence, the only real difference is that read_ae is used instead of read_se, so what’s the difference between those functions? A quick listing of the function definitions provides a clue:

// Start read_se definitions:
template<class t_type> storage_entry throwable_buffer_reader::read_se()
template<> inline storage_entry throwable_buffer_reader::read_se<std::string>()
template<> inline storage_entry throwable_buffer_reader::read_se<section>()
template<> inline storage_entry throwable_buffer_reader::read_se<array_entry>()
// Start read_ae definition:
template<class type_name> storage_entry throwable_buffer_reader::read_ae()

Since this is inline, template-based C++ code, its readability (and reversibility) is not great, but we can quickly see that read_se() has separate definitions for the string, section, and array_entry types, whereas read_ae() does not. Keeping that in mind, let’s examine the read_ae<type_name>() source:

template<class type_name> storage_entry throwable_buffer_reader::read_ae(){
      RECURSION_LIMITATION();
      //for pod types
      array_entry_t<type_name> sa;   //[1]
      size_t size = read_varint();      
      //TODO: add some optimization here later
      while(size--)
        sa.m_array.push_back(read<type_name>());    //[2]
      return storage_entry(array_entry(sa));
}

At [1], an array_entry_t<type_name> is created, depending on whatever type we give it, followed by a call to read<type_name>() at [2], which specifically refers to the epee::serialization::throwable_buffer_reader::read<type_name>() function. Looking through the source again, the different read<type_name>() definitions are given:

 inline void throwable_buffer_reader::read(section& sec)
 inline void throwable_buffer_reader::read(std::string& str)
 template<class t_pod_type>void throwable_buffer_reader::read(t_pod_type& pod_val) //[1]
 template<class t_type> t_type throwable_buffer_reader::read()   

The most important thing to note here is the lack of a specific definition for an array_entry object, unlike those for the section and string object types. With no dedicated overload available, the compiler resolves the call to the generic template at [1], in effect treating array_entry as a t_pod_type. That function is rather simple:

template<class t_pod_type> void throwable_buffer_reader::read(t_pod_type& pod_val){
  		RECURSION_LIMITATION();
  		read(&pod_val, sizeof(pod_val)); //[1]
}
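
The overload selection at work here can be reproduced in isolation. The following sketch uses hypothetical stand-in types (it does not use Boost or epee) to show that a catch-all template accepts any type lacking a dedicated overload, including one that wraps a std::list and therefore is not safe to fill with a raw byte copy.

```cpp
#include <list>
#include <string>
#include <type_traits>

// Stand-in reader with one dedicated overload and one catch-all template,
// analogous to read(std::string&) and read(t_pod_type&) above.
struct ReaderSketch {
    const char* read(std::string&) { return "string overload"; }

    // Nothing here restricts T to plain-old-data types, so any type without
    // a dedicated overload ends up in this function.
    template <class T>
    const char* read(T&) { return "generic pod template"; }
};

// Stand-in for array_entry_t<T>, which wraps a std::list per the
// definitions shown earlier.
struct ListHolderSketch {
    std::list<int> m_array;
};

// The wrapper is not trivially copyable, so overwriting its bytes with
// memcpy has undefined behavior.
static_assert(!std::is_trivially_copyable<ListHolderSketch>::value,
              "a std::list wrapper is not trivially copyable");
```

Calling read() on a std::string picks the dedicated overload, while calling it on a ListHolderSketch silently falls through to the generic template.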

And then we arrive at the last function definition, the one called at [1]:

inline void throwable_buffer_reader::read(void* target, size_t count){
  		RECURSION_LIMITATION();
  		CHECK_AND_ASSERT_THROW_MES(m_count >= count, " attempt to read " << count << " bytes from buffer with " << m_count << " bytes remained");
  		memcpy(target, m_ptr, count);
  		m_ptr += count;
  		m_count -= count;
}
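
The length check above guards only against reading past the end of the input buffer; it says nothing about whether the destination may legally be filled by memcpy. As an illustration only, and not the vendor's patch, the sketch below pairs the same bounds check with a compile-time guard on the destination type. All names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>

class BufferReaderSketch {
public:
    BufferReaderSketch(const unsigned char* p, std::size_t n)
        : ptr_(p), count_(n) {}

    // Bounds-checked raw read, mirroring read(void*, size_t) above.
    void read(void* target, std::size_t count) {
        if (count_ < count)
            throw std::runtime_error("attempt to read past end of buffer");
        std::memcpy(target, ptr_, count);
        ptr_ += count;
        count_ -= count;
    }

    // Typed read that refuses, at compile time, any destination that is not
    // trivially copyable (for example, a type holding a std::list).
    template <class T>
    void read_pod(T& out) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "raw byte copy requires a trivially copyable type");
        read(&out, sizeof(out));
    }

    std::size_t remaining() const { return count_; }

private:
    const unsigned char* ptr_;
    std::size_t count_;
};
```

With a guard of this kind, instantiating read_pod() for a container-backed type would fail to compile instead of producing a reader that overwrites the object's internal pointers.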

The issue lies in the definition of what we are actually reading into. If we select an array_entry object here, we end up doing a memcpy directly into that object. A refresher as to the object type:

	typedef  boost::make_recursive_variant<
	  array_entry_t<section>, 
	  array_entry_t<uint64_t>, 
	  array_entry_t<uint32_t>, 
	  array_entry_t<uint16_t>, 
	  array_entry_t<uint8_t>, 
	  array_entry_t<int64_t>, 
	  array_entry_t<int32_t>, 
	  array_entry_t<int16_t>, 
	  array_entry_t<int8_t>, 
	  array_entry_t<double>, 
	  array_entry_t<bool>, 
	  array_entry_t<std::string>,
	  array_entry_t<section>, 
	  array_entry_t<boost::recursive_variant_> 
	>::type array_entry; 

It will end up directly copying onto a boost::recursive_variant object, resulting in a crash in the following disassembly:

.text:00005607F7ECB301                 mov     rax, [rsp+0F8h+type_ptr_copy_1]
.text:00005607F7ECB309                 mov     [newObj+18h], rdx
.text:00005607F7ECB30D                 mov     [newObj+20h], rax
.text:00005607F7ECB311                 mov     [rax], rdi      		//[1]
.text:00005607F7ECB314                 mov     [rdx+8], rdi
.text:00005607F7ECB318                 mov     rax, [rsp+0F8h+var_50]
.text:00005607F7ECB320                 mov     [rsp+0F8h+var_50], 0
.text:00005607F7ECB32C                 mov     [newObj+28h], rax
.text:00005607F7ECB330                 mov     eax, [rsp+0F8h+ptr_to_read2_dst]
.text:00005607F7ECB337                 mov     [rsp+0F8h+type_ptr_copy_0], rbp
.text:00005607F7ECB33F                 mov     [rsp+0F8h+type_ptr_copy_1], rbp

At [1], $rax is controlled directly from the serialized packet sent, with $rdi pointing to a user-controlled buffer:

<(^_^)> info reg rax rdi
rax            0x4141414141414141       0x4141414141414141
rdi            0x7fff8c003818   0x7fff8c003818
<(^_^)> x/10gx $rdi
0x7fff8c003818: 0x4343434343434343      0x4141414141414141
0x7fff8c003828: 0x0000000000000000      0x0000000000000000

Crash Information

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 50161.50211]
────[ registers ]────
$rax   : 0x00007f12e4003710  →  0x0f00000060c0c748
$rbx   : 0x00007f12e40037c0  →  0x0000000000000000
$rcx   : 0xccccccccccccc305
$rdx   : 0x0f00000060c0c748
$rsp   : 0x00007f12efffcbf0  →  0x00007f12efffcc60  →  0x00007f12efffcc60  →  [loop detected]
$rbp   : 0x00007f12efffcc88  →  0xffffffffff600000
$rsi   : 0xffffffffff600000  →  0xffffffffff600000
$rdi   : 0x0000000000000000
$rip   : 0x000055c8a8a2f46d  →  <boost::variant<unsigned+0> mov QWORD PTR [rcx], rax
$r8    : 0x00007f12e40000c8  →  0x00007f12e40000b8  →  0x00007f12e40000a8  →  0x00007f12e4000098  →  0x00007f12e4000088  →  0x00007f12e4003730  →  "ion_cont1"
$r9    : 0x0000000000000030
$r10   : 0x00007f12efffdee0  →  0x00007f12e40038e0  →  0x0102010101011101
$r11   : 0x0000000000000001
$r12   : 0x00007f12efffcc50  →  0x00007f12efffd930  →  0x0000000000000007
$r13   : 0x00007f12efffcc80  →  0x000000000000000d
$r14   : 0x00007f12efffd930  →  0x0000000000000007
$r15   : 0x00007f12efffcc40  →  0x00007f12efffd930  →  0x0000000000000007
$eflags: [carry PARITY ADJUST zero SIGN trap INTERRUPT direction overflow RESUME virtualx86 identification]
──────────────[ stack ]────
0x00007f12efffcbf0│+0x00: 0x00007f12efffcc60  →  0x00007f12efffcc60  →  [loop detected]  ← $rsp
0x00007f12efffcbf8│+0x08: 0x0000000000000020
0x00007f12efffcc00│+0x10: 0x00007f12efffd920  →  0x00007f12e4003966  →  0x00000000000003e9
0x00007f12efffcc08│+0x18: 0xffffffffff600000
0x00007f12efffcc10│+0x20: 0x00007f12efffd360  →  0x000055c80000000a
0x00007f12efffcc18│+0x28: 0x00007f12efffcc30  →  0x00007f12efffd930  →  0x0000000000000007
0x00007f12efffcc20│+0x30: 0x0000000000000000
0x00007f12efffcc28│+0x38: 0x00007f12e4003924  →  0x6141046171f13012
────────────[ code:i386:x86-64 ]────
   0x55c8a8a2f459 <boost::variant<unsigned+0> je     0x55c8a8a2f4f9 <_ZN4epee13serialization23throwable_buffer_reader7read_aeIN5boost7variantINS3_6detail7variant14recursive_flagINS0_13array_entry_tINS0_7sectionEEEEEJNS8_ImEENS8_IjEENS8_ItEENS8_IhEENS8_IlEENS8_IiEENS8_IsEENS8_IaEENS8_IdEENS8_IbEENS8_INSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEESA_NS8_INS3_18recursive_variant_EEEEEEEENS4_ImJjthlisadbSR_S9_SV_EEEv+889>
   0x55c8a8a2f45f <boost::variant<unsigned+0> mov    rsi, rcx
   0x55c8a8a2f462 <boost::variant<unsigned+0> mov    QWORD PTR [rax], rdx
   0x55c8a8a2f465 <boost::variant<unsigned+0> mov    rcx, QWORD PTR [rcx+0x8]
   0x55c8a8a2f469 <boost::variant<unsigned+0> mov    QWORD PTR [rax+0x8], rcx
 → 0x55c8a8a2f46d <boost::variant<unsigned+0> mov    QWORD PTR [rcx], rax
   0x55c8a8a2f470 <boost::variant<unsigned+0> mov    rcx, rsi
   0x55c8a8a2f473 <boost::variant<unsigned+0> mov    QWORD PTR [rdx+0x8], rax
   0x55c8a8a2f477 <boost::variant<unsigned+0> mov    rdx, QWORD PTR [rsi+0x10]
   0x55c8a8a2f47b <boost::variant<unsigned+0> mov    QWORD PTR [rax+0x10], rdx
   0x55c8a8a2f47f <boost::variant<unsigned+0> mov    QWORD PTR [rcx], rsi
────────────[ trace ]────
[#0] 0x55c8a8a2f46d → Name: boost::variant<unsigned long, unsigned int, unsigned short, unsigned char, long, int, sho...
[#1] 0x55c8a8a2f62b → Name: epee::serialization::throwable_buffer_reader::load_storage_array_entry[abi:cxx11](unsigne...
[#2] 0x55c8a8a30216 → Name: epee::serialization::throwable_buffer_reader::load_storage_entry[abi:cxx11]()()...
[#3] 0x55c8a8a3057f → Name: epee::serialization::throwable_buffer_reader::read(epee::serialization::section&)()...
[#4] 0x55c8a8a3004a → Name: epee::serialization::throwable_buffer_reader::load_storage_entry[abi:cxx11]()()...
[#5] 0x55c8a8a3057f → Name: epee::serialization::throwable_buffer_reader::read(epee::serialization::section&)()...
[#6] 0x55c8a8a30ee7 → Name: epee::serialization::portable_storage::load_from_binary(std::__cxx11::basic_string<char, ...
[#7] 0x55c8a8a7b8b4 → Name: int epee::net_utils::buff_to_t_adapter<nodetool::node_server<cryptonote::t_cryptonote_pro...
────────────────────────────────────────────────────────
0x000055c8a8a2f46d in boost::variant<unsigned long, unsigned int, unsigned short, unsigned char, long, int, short, signed char, double, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, epee::serialization::section, boost::variant<boost::detail::variant::recursive_flag<epee::serialization::array_entry_t<epee::serialization::section> >, epee::serialization::array_entry_t<unsigned long>, epee::serialization::array_entry_t<unsigned int>, epee::serialization::array_entry_t<unsigned short>, epee::serialization::array_entry_t<unsigned char>, epee::serialization::array_entry_t<long>, epee::serialization::array_entry_t<int>, epee::serialization::array_entry_t<short>, epee::serialization::array_entry_t<signed char>, epee::serialization::array_entry_t<double>, epee::serialization::array_entry_t<bool>, epee::serialization::array_entry_t<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, epee::serialization::array_entry_t<epee::serialization::section>, epee::serialization::array_entry_t<boost::recursive_variant_> > > epee::serialization::throwable_buffer_reader::read_ae<boost::variant<boost::detail::variant::recursive_flag<epee::serialization::array_entry_t<epee::serialization::section> >, epee::serialization::array_entry_t<unsigned long>, epee::serialization::array_entry_t<unsigned int>, epee::serialization::array_entry_t<unsigned short>, epee::serialization::array_entry_t<unsigned char>, epee::serialization::array_entry_t<long>, epee::serialization::array_entry_t<int>, epee::serialization::array_entry_t<short>, epee::serialization::array_entry_t<signed char>, epee::serialization::array_entry_t<double>, epee::serialization::array_entry_t<bool>, epee::serialization::array_entry_t<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, epee::serialization::array_entry_t<epee::serialization::section>, epee::serialization::array_entry_t<boost::recursive_variant_> > >() ()

Timeline

2018-08-02 - Vendor Disclosure
2018-08-07 - Vendor acknowledged testing of patch
2018-09-25 - Vendor released patch & publicly disclosed

Credit

Discovered by Lilith (>_>) of Cisco Talos.