Talos Vulnerability Report

TALOS-2021-1262

Microsoft Azure Sphere Kernel pwm_ioctl_apply_state kfree() code execution vulnerability

April 13, 2021

CVE Number

CVE-2021-28460

Summary

A code execution vulnerability exists in the kernel pwm_ioctl_apply_state functionality of Microsoft Azure Sphere 21.01. A specially crafted ioctl can lead to arbitrary kfree. An attacker can issue an ioctl to trigger this vulnerability.

Tested Versions

Microsoft Azure Sphere 21.01

Product URLs

https://azure.microsoft.com/en-us/services/azure-sphere/

CVSSv3 Score

8.1 - CVSS:3.0/AV:L/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H

CWE

CWE-590 - Free of Memory not on the Heap

Details

Microsoft’s Azure Sphere is a platform for the development of internet-of-things applications. It features a custom SoC that consists of a set of cores that run both high-level and real-time applications, enforces security and manages encryption (among other functions). The high-level applications execute on a custom Linux-based OS, with several modifications to make it smaller and more secure, specifically for IoT applications.

Among the device capabilities that Azure Sphere provides to developers, this advisory focuses on the “pwm” capability. If a developer defines "Pwm": [ "$MT3620_RDB_PWM_CONTROLLER0" ] within their app_manifest.json, they can use the /dev/pwm* chardev drivers to control connected modules using pulse-width modulation. Note that before Azure Sphere 20.10 this capability was not strictly required, since a /dev/security-monitor ioctl could be used without authorization to enable /dev/pwm0; this is no longer the case. Assuming that our Azure Sphere device exposes /dev/pwm* and that we have gained userland code execution on it, what can be done? To start, let us examine the device driver’s ioctls:

static long pwm_chardev_ioctl(struct file *filp, unsigned int cmd,
                  unsigned long arg_)
{
    void __user *arg = (void __user *)arg_;
    struct pwm_dev *data = ((struct seq_file *)filp->private_data)->private;
    long res = 0;

    mutex_lock(&data->lock);
    if (!data->valid) {
        dev_err(data->device, "accessing deinitialized pwm\n");
        res = -ENODEV;
        goto out;
    }

    switch (cmd) {
    case PWM_APPLY_STATE:
        res = pwm_ioctl_apply_state(arg, data);
        break;
    case PWM_GET_STATE:
        res = pwm_ioctl_get_state(arg, data);
        break;
    case PWM_EXPORT:
        res = pwm_ioctl_export(arg, data);
        break;
    case PWM_UNEXPORT:
        res = pwm_ioctl_unexport(arg, data);
        break;
    default:
        if (data->chip->ops->ioctl)
            res = data->chip->ops->ioctl(data->chip, cmd, arg_);
        else
            res = -ENOTTY;
        break;
    }

out:
    mutex_unlock(&data->lock);
    return res;
}

Nothing fancy or concerning: four named ioctls, plus a default case that forwards anything else to the chip driver. To save time and the reader’s patience, let’s look directly at pwm_ioctl_apply_state():

// drivers/pwm/chardev.c 
static int pwm_ioctl_apply_state(void __user *arg, struct pwm_dev *data)
{
    int ret = 0;
    struct pwm_chardev_params input_data; // [1]
    void __user *user_extended_state;

    memset(&input_data, 0, sizeof(input_data));

    if (copy_from_user(&input_data, arg, sizeof(input_data))) {  //[2]
        ret = -EFAULT;
        goto out;
    }

    user_extended_state = input_data.state.extended_state;

    if (user_extended_state) {
        input_data.state.extended_state = kzalloc( //[...]

Starting at [2], the ioctl data is read into a struct on the kernel stack, in this case a pwm_chardev_params struct [1], which is laid out as follows:

/*  // include/uapi/linux/pwm.h
 *
 * struct pwm_state - state of a PWM channel
 * @period: PWM period (in nanoseconds)
 * @duty_cycle: PWM duty cycle (in nanoseconds)
 * @polarity: PWM polarity
 * @enabled: PWM enabled status
 * @extended_state: optional driver-specific state data
 * @extended_state_size: size of data pointed to by extended_state
 */
struct pwm_state {
    unsigned int period;
    unsigned int duty_cycle;
    enum pwm_polarity polarity;
    bool enabled;

    void *extended_state;
    size_t extended_state_size;
};

struct pwm_chardev_params {
    unsigned int pwm_index;
    struct pwm_state state;
};
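To make the field positions concrete, the following sketch mirrors the two structs with fixed-width types and computes where the user-supplied pointer and size sit inside the ioctl argument. It assumes a 32-bit ARM target where pointers, size_t and the enum are each 4 bytes and bool is 1 byte; the *32 type names and the helper functions are hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of struct pwm_state, assuming a 32-bit
 * target: 4-byte pointer, 4-byte size_t, 4-byte enum, 1-byte bool. */
struct pwm_state32 {
    uint32_t period;
    uint32_t duty_cycle;
    uint32_t polarity;            /* enum pwm_polarity, assumed 4 bytes */
    uint8_t  enabled;             /* bool, followed by 3 bytes of padding */
    uint32_t extended_state;      /* void * on the assumed target */
    uint32_t extended_state_size; /* size_t on the assumed target */
};

/* Hypothetical fixed-width mirror of struct pwm_chardev_params. */
struct pwm_chardev_params32 {
    uint32_t pwm_index;
    struct pwm_state32 state;
};

/* Byte offset of the user-supplied pointer within the ioctl argument. */
static size_t extended_state_offset(void)
{
    return offsetof(struct pwm_chardev_params32, state)
         + offsetof(struct pwm_state32, extended_state);
}

/* Byte offset of the user-supplied size within the ioctl argument. */
static size_t extended_state_size_offset(void)
{
    return offsetof(struct pwm_chardev_params32, state)
         + offsetof(struct pwm_state32, extended_state_size);
}

/* Total number of bytes the first copy_from_user would request. */
static size_t params_size(void)
{
    return sizeof(struct pwm_chardev_params32);
}

/* Sanity check of the layout under the assumptions stated above. */
static void check_layout(void)
{
    assert(extended_state_offset() == 20);
    assert(extended_state_size_offset() == 24);
    assert(params_size() == 28);
}
```

Under these assumptions the pointer occupies bytes 20 to 23 and the size occupies bytes 24 to 27, so extended_state_size is the last field of the 28-byte argument.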

Continuing on within pwm_ioctl_apply_state, after our pwm_chardev_params have been read:

user_extended_state = input_data.state.extended_state;

if (user_extended_state) {
    input_data.state.extended_state = kzalloc( // [1]
        input_data.state.extended_state_size, GFP_KERNEL);

    if (!input_data.state.extended_state) {
        ret = -ENOMEM;
        goto out;
    }
    if (copy_from_user(input_data.state.extended_state,  // [2]
               user_extended_state,
               input_data.state.extended_state_size)) {
        ret = -EFAULT;
        goto out;
    }
}

At [1], assuming our input structure has extra state data, the driver allocates a kernel chunk of corresponding size, and at [2], another copy_from_user copies this extra data from a pointer we control into this newly allocated chunk of kernel memory. Looking at what happens to this extra memory:

if (input_data.pwm_index >= data->chip->npwm) {  // [1]
    dev_err(data->device, "pwm_index %u does not exist, npwm: %u\n",
        input_data.pwm_index, data->chip->npwm);
    ret = -ENODEV;
    goto out;
}

ret = pwm_apply_state(&data->chip->pwms[input_data.pwm_index],  // [2]
              &input_data.state);
if (ret) {
    dev_err(data->device, "pwm_apply_state error: %d\n", ret);
    goto out;
}

out:
    // pwm_apply_state keeps a copy of extended_state, so free this
    kfree(input_data.state.extended_state);  //[3]
    return ret;
}

After basic error checking at [1], the data is applied to the PWM chip via pwm_apply_state at [2]. Finally, at [3], the allocated chunk holding the extra copied data is freed. The vulnerability is now in sight but still somewhat camouflaged, so let us follow one particular code flow through pwm_ioctl_apply_state and strip away every path not taken:

static int pwm_ioctl_apply_state(void __user *arg, struct pwm_dev *data)
{
    int ret = 0;
    struct pwm_chardev_params input_data;
    void __user *user_extended_state;

    memset(&input_data, 0, sizeof(input_data));

    if (copy_from_user(&input_data, arg, sizeof(input_data))) { //[1]
        ret = -EFAULT;
        goto out;
    }
    
//[....]

out:
    // pwm_apply_state keeps a copy of extended_state, so free this
    kfree(input_data.state.extended_state);  // [2]
    return ret;
}

The best-known failure case of copy_from_user is an invalid address, e.g. when a sneaky user passes a kernel address where a userland pointer is expected. Another case is when copy_from_user can copy only some of the requested bytes. Assume, for instance, that the last three bytes of the pwm_chardev_params buffer we pass in as arg are either unmapped or in unreadable memory. In this case, copy_from_user [1] fails partway: it returns the number of bytes it could not copy and zeroes that many bytes at the end of the destination buffer. Here copy_from_user [1] returns 0x3, and we jump down to the out: label, followed immediately by the kfree at [2].
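The partial-copy behaviour described above can be modelled in plain userland C. This is only a sketch: model_copy_from_user is a hypothetical stand-in, not the kernel routine, and its readable parameter is an assumption that replaces a real page fault. It also copies byte by byte, whereas the real routine may move data in larger units, so the exact point at which a real copy stops can differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userland model of a copy that faults partway through.
 * Only the first `readable` bytes of `src` are treated as accessible.
 * The accessible prefix is copied, the rest of `dst` is zeroed, and the
 * number of bytes that could NOT be copied is returned (0 means success). */
static size_t model_copy_from_user(void *dst, const void *src,
                                   size_t n, size_t readable)
{
    size_t copied = readable < n ? readable : n;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, copied);                              /* readable prefix */
    memset((unsigned char *)dst + copied, 0, n - copied);  /* cleared tail    */
    return n - copied;
}
```

With a 28-byte request of which only 25 bytes are readable, the model returns 3, leaves the first 25 destination bytes holding caller-supplied data, and zeroes the last three.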

Continuing with this example, let’s look again at the pwm_chardev_params struct:

struct pwm_state {
    unsigned int period;
    unsigned int duty_cycle;
    enum pwm_polarity polarity;
    bool enabled;

    void *extended_state;              // [1]
    size_t extended_state_size;    // [2]
};

struct pwm_chardev_params {
    unsigned int pwm_index;
    struct pwm_state state;
};

Assuming the last three bytes of our struct are unreadable, input_data.state.extended_state_size [2] is not fully written and its unread bytes are cleared out, but everything before extended_state_size is still written, including void *extended_state at [1]. Thus, taking a final look back at pwm_ioctl_apply_state:

out:
    // pwm_apply_state keeps a copy of extended_state, so free this
    kfree(input_data.state.extended_state);  // [1]
    return ret;
}

Since input_data.state.extended_state is written to and is not cleared out when copy_from_user fails, the argument of the kfree at [1] is completely controlled by the attacker. This allows an arbitrary kfree() on any address, which can lead to code execution. Note that, in order to trigger the vulnerability, the input pwm_chardev_params buffer must not be 32-bit aligned when passed to pwm_ioctl_apply_state. Otherwise copy_from_user will optimize and copy both void *extended_state and size_t extended_state_size in one fell swoop, failing on both and leaving input_data.state.extended_state as 0x0.
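The difference between the vulnerable cleanup path and a hardened one can be sketched with a small userland model. Everything here is hypothetical: params32 is an assumed flattened 32-bit layout of pwm_chardev_params, model_copy_from_user stands in for the kernel routine with a byte-granular partial copy, and the "hardened" variant is just one possible mitigation, not a description of the vendor's actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened 32-bit mirror of struct pwm_chardev_params. */
struct params32 {
    uint32_t pwm_index;
    uint32_t period;
    uint32_t duty_cycle;
    uint32_t polarity;
    uint8_t  enabled;             /* followed by 3 bytes of padding */
    uint32_t extended_state;      /* user-supplied pointer value */
    uint32_t extended_state_size;
};

/* Model of a partial copy: copy `readable` bytes, zero the rest,
 * return the number of bytes not copied. */
static size_t model_copy_from_user(void *dst, const void *src,
                                   size_t n, size_t readable)
{
    size_t copied = readable < n ? readable : n;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, copied);
    memset((unsigned char *)dst + copied, 0, n - copied);
    return n - copied;
}

/* Value the cleanup path would hand to kfree() in the flow shown above. */
static uint32_t freed_pointer_original(const void *user, size_t readable)
{
    struct params32 in;

    memset(&in, 0, sizeof(in));
    if (model_copy_from_user(&in, user, sizeof(in), readable))
        return in.extended_state; /* goto out: pointer is still user data */
    return 0;                     /* success path is not modelled here */
}

/* Same flow, but the untrusted pointer is dropped before cleanup. */
static uint32_t freed_pointer_hardened(const void *user, size_t readable)
{
    struct params32 in;

    memset(&in, 0, sizeof(in));
    if (model_copy_from_user(&in, user, sizeof(in), readable)) {
        in.extended_state = 0;    /* kfree(NULL) is a no-op in Linux */
        return in.extended_state;
    }
    return 0;                     /* success path is not modelled here */
}
```

In this model, a 28-byte argument whose last three bytes are unreadable leaves the caller's pointer value intact on the original path and yields 0 on the hardened path. If only the first 20 bytes are readable, the pointer field itself is zeroed and both paths yield 0, which matches the aligned case described above.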

As for the actual exploitation of this vulnerability, an information leak is needed as well, since the kernel heap is always in an indeterminate state at the time of exploitation, even if KASLR is not enabled. To work around this, one can use a vulnerability such as TALOS-2020-1211 to leak the heap address of certain kernel slab objects, kfree the object with the current vulnerability, and subsequently overlay two objects at the same kernel heap address. An actual Azure Sphere exploit implementation is left as an exercise for the reader, but we have confirmed it is possible to achieve arbitrary code execution in the kernel with this vulnerability and TALOS-2020-1211.

Timeline

2021-02-26 - Vendor Disclosure
2021-04-13 - Public Release

Credit

Discovered by Lilith >_> of Cisco Talos.