Talos Vulnerability Report

TALOS-2021-1342

Microsoft Azure Sphere Security Monitor SMSyscallStageBaseManifests image validation signature check bypass vulnerability

November 9, 2021

CVE Number

CVE-2021-42300

Summary

A signature check bypass vulnerability exists in the Security Monitor SMSyscallStageBaseManifests image validation functionality of Microsoft Azure Sphere 21.01. A specially crafted manifest can lead to a firmware downgrade. An attacker can use syscalls to trigger this vulnerability.

Tested Versions

Microsoft Azure Sphere 21.01

Product URLs

https://azure.microsoft.com/en-us/services/azure-sphere/

CVSSv3 Score

6.0 - CVSS:3.0/AV:L/AC:L/PR:H/UI:N/S:C/C:N/I:H/A:N

CWE

CWE-347 - Improper Verification of Cryptographic Signature

Details

Microsoft’s Azure Sphere is a platform for the development of internet-of-things applications. It features a custom SoC consisting of a set of cores that run both high-level and real-time applications, enforce security and manage encryption (among other functions). The high-level applications execute on a custom Linux-based OS, with several modifications to make it smaller and more secure, specifically for IoT applications.

Processes holding AZURE_SPHERE_CAP_* capabilities are allowed to interact with Pluton and Security Monitor, but only via the syscalls that they are allowed to access. For instance, a process that holds the AZURE_SPHERE_CAP_UPDATE_IMAGE capability is allowed to use the following Secmon syscalls:

static azure_sphere_sm_syscall_permission_t azure_sphere_sm_syscall_required_capabilities[] = {
    {.number = SMSyscallInvalidateImage, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallOpenImageForStaging, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallWriteBlockToStageImage, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallCommitImageStaging, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallAbortImageStaging, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallInstallStagedImages, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetComponentCount, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetComponentSummary, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallStageComponentManifests, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetCountOfMissingImagesToDownload, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetMissingImagesToDownload, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallStageBaseManifests, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetCountOfMissingBaseImagesToDownload, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetMissingBaseImagesToDownload, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    {.number = SMSyscallGetSoftwareRollbackInfo, .caps = AZURE_SPHERE_CAP_UPDATE_IMAGE, .linux_caps = 0},
    // ... entries for other capabilities omitted ...
};

This advisory deals with the SMSyscallStageBaseManifests syscall. It is important to stress that an attacker would already need to have elevated privileges or to have gained the AZURE_SPHERE_CAP_UPDATE_IMAGE capability.

To start, let us examine the parameters that SMSyscallStageBaseManifests requires:

struct azure_sphere_syscall syscall = {};
syscall.number = SMSyscallStageBaseManifests;
syscall.flags = 0x454;
syscall.args[0] = offset;
syscall.args[1] = manifest_buffer;
syscall.args[2] = manifest_length;

The manifest_buffer must point to valid userspace memory, while manifest_length and offset are plain integers without any specific restriction. When the kernel passes this call to Secmon, the only restriction that applies is that the manifest_buffer must be smaller than 0x1060 bytes, which is a generic constraint on Security Monitor syscalls rather than something specific to this one.

Let’s see how the SMSyscallStageBaseManifests function is implemented in Secmon:

uint32_t SMSyscallStageBaseManifests(int offset, char *buffer, uint manifest_len) {
    int some_global;
    uint32_t ret;

    some_global = get_global_image_struct();
    ret = FUN_803e2020(*(undefined4 **)(some_global + 0x14), (int *)(buffer + offset),
                     manifest_len);
    return ret;
}

This is simply calling the function FUN_803e2020, passing as parameters a global structure, a pointer to an offset inside the input buffer (which should point to the manifest), and the length of the manifest. FUN_803e2020 in turn passes the manifest buffer to the function FUN_803e1d34, which takes care of parsing and validating the manifest. If this is successful, FUN_803e2020 fills a global structure that keeps track of which images are due to be flashed via other syscalls.

Let’s look into FUN_803e1d34, the function that parses and validates the manifest:

int FUN_803e1d34(int *manifest_buf,uint manifest_len,undefined *param_3,char *param_4,
                 manifest_entry_list *mentry,undefined4 someglobal) {
    // ... multiple manifests logic ... // [1]
    manifest_counter = 0;
    while( true ) {
        if (manifest_count == manifest_counter) { // [2]
            return 0;
        }
        if (manifest_len2 < 4) {
            return DAT_803e1ff8;
        }
        if (*(short *)manifest_buf == 0) {
            return DAT_803e1ff8;
        }
        manifest_parse_header(&manifest,(ushort *)manifest_buf);
        if ((char)manifest.ok == '\0') {
            return DAT_803e1ff8;
        }
        if (manifest_len2 < manifest.header_size) {
            return DAT_803e1ff8;
        }
        if (manifest.entry_size < 0x28) {
            return DAT_803e1ff8;
        }
        num_objs = (uint)*(ushort *)((int)manifest_buf + 2);
        if ((manifest_len2 - manifest.header_size) / manifest.entry_size < num_objs) {
            return DAT_803e1ff8;
        }
        uVar5 = num_objs * manifest.entry_size + manifest.header_size;
        if (uVar5 == 0) break;
        if (manifest_len2 < uVar5) {
            return DAT_803e1ff8;
        }
        uVar7 = 0x803e1e85;
        manifest_parse_header(&manifest,(ushort *)manifest_buf);
        if ((char)manifest.ok == '\0') {
            panic(DAT_803e200c,DAT_803e2008,0);
        }
        manifest._0_8_ = VectorShiftRight(CONCAT44(uVar7,uVar7),0x20);
        manifest.ok = (uint)PTR_FUN_803d37f0+1_803e2010;
        CreateImageMetadataParser(&aStack144,&manifest,manifest_len2); // [3]
        local_fc = 0;
        uVar5 = FindMetadataSection(&aStack144,0x4449,&meta_sect,0x24,&local_fc); // [4]
        if (((uVar5 & 0xfffff) == 0) && ((*param_4 == '\0' || (meta_sect.img_type != 0x19)))) { // [5]
            dummy = 0;
            if (meta_sect.img_type - 0x17 < 2) { // [6]
                local_f8 = 0;
                local_f0 = 0;
                local_e8 = 0;
                local_e0 = 0;
                manifest._0_8_ = 0;
                manifest.ok = 0;
                VectorShiftRight(CONCAT44(&local_e8,&local_e8),0x30);
                ptr_entry = validate_image(someglobal,&manifest,2,0,0,&dummy); // [7]
                if (ptr_entry == 0) {
                    print(PTR_s__Invalid_Manifest_Signature_803e201c);
                    return DAT_803e1ff8;
                }
            }
            ptr_entry = manifest.header_size + (int)manifest_buf;
            ptr_end = (manifest.entry_size & 0xffff) * num_objs + ptr_entry;
            while (ptr_end != ptr_entry) { // [8]
                // ... fill linked list
            }
        }
        manifest_counter = manifest_counter + 1;
        manifest_buf = (int *)((int)manifest_buf + manifest_len2);
        if ((int)manifest_counter < (int)manifest_count) {
            manifest_len2 = (uint)*(ushort *)((int)local_d8 + manifest_counter * 2 + 0xe);
        }
    }
    return DAT_803e1ff8;
}

At the beginning [1] we have the logic that deals with multiple manifests, which is not relevant for this advisory. Suffice it to say that it sets the manifest_count and manifest_counter variables accordingly; from here on we assume that we are dealing with only one manifest.
At [2] a series of sanity checks begins, making sure that the manifest fields are consistent. These checks are also not relevant for this advisory, and we can assume that the manifest fields are normally structured.
At [3] CreateImageMetadataParser is used to initialize the parsing of the image metadata: this is the usual metadata found in Azure Sphere images, beginning with the “4X4M” magic. At [4] the section 0x4449 is searched for: this is the “Identity” section, where an image stores its image type, component ID and image ID. At [5] we can see a check against the image type: if it is 0x19, the whole manifest is skipped, and the manifest image is verified by Pluton at [7] only if the type is smaller than 0x19 [6]. The logic then continues at [8] by filling a linked list with the component IDs of the images found in the manifest, which is passed to the parent function as the result of the parsing.
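The size checks that start at [2] can be summarized with a minimal sketch. The struct and function names below are hypothetical and only mirror the fields visible in the decompiled listing; the real layout is not known from this report, and the check on the first 16-bit field of the manifest is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the values the checks at [2] rely on.
 * Field names follow the decompiled listing; the layout is an assumption. */
struct manifest_sizes {
    uint32_t header_size; /* manifest.header_size */
    uint32_t entry_size;  /* manifest.entry_size */
    uint16_t num_objs;    /* 16-bit entry count read at offset 2 */
};

/* Returns true when the declared entries fit inside manifest_len bytes,
 * following the order of the comparisons in FUN_803e1d34. */
bool manifest_sizes_ok(const struct manifest_sizes *m, uint32_t manifest_len)
{
    if (manifest_len < 4)
        return false;
    if (manifest_len < m->header_size)
        return false;
    if (m->entry_size < 0x28)
        return false;
    /* Comparing via division means the entry count is bounded before
     * the total size is computed with a multiplication. */
    if ((manifest_len - m->header_size) / m->entry_size < m->num_objs)
        return false;
    uint32_t total = (uint32_t)m->num_objs * m->entry_size + m->header_size;
    if (total == 0 || manifest_len < total)
        return false;
    return true;
}
```

With an assumed 8-byte header and two 0x28-byte entries, a length of 88 bytes is accepted, while 87 bytes is rejected because only one full entry fits.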

At this point it is evident that, if a manifest has any image type larger than 0x19, the check at [5] will succeed, while the one at [6] will fail, so the signature check at [7] is skipped. The code at [8] will then be executed on a manifest whose signature was never verified.
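The interaction between the two conditions can be modeled in isolation. The following is a minimal sketch, not Secmon code: the function names are hypothetical, FindMetadataSection is assumed to have succeeded, and the width and signedness of img_type are assumptions, since the listing does not show its declaration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the image-type part of the condition at [5]: the manifest entries
 * are processed unless *param_4 is set and the image type is 0x19. */
bool manifest_is_processed(uint32_t img_type, bool param_4_set)
{
    return !param_4_set || img_type != 0x19;
}

/* Models the condition at [6]: validate_image() at [7] is only reached
 * when this returns true. */
bool signature_is_checked(uint32_t img_type)
{
    return img_type - 0x17 < 2;
}

/* True when the loop at [8] runs although [7] was never reached. */
bool staged_without_signature(uint32_t img_type, bool param_4_set)
{
    return manifest_is_processed(img_type, param_4_set) &&
           !signature_is_checked(img_type);
}
```

Under these assumptions, image types 0x17 and 0x18 go through the signature check, type 0x19 with *param_4 set is skipped entirely, and a type such as 0x1a reaches [8] without any verification.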

This means that it is possible to stage manifests with arbitrary contents and without any signature, as long as the image type is larger than 0x19. As a result, arbitrary component IDs can be added to the manifest, which allows the corresponding images to be flashed later on via SMSyscallInstallStagedImages. Note that normally, after installing one image declared in the base manifest, the device reboots automatically. This can be avoided by opening and committing multiple images, and calling SMSyscallInstallStagedImages only at the end of the process.
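The ordering described above can be illustrated with a small simulation. This is only a sketch: it does not issue any real Secmon syscall, the enum values are placeholders rather than actual syscall numbers, and the assumption that each image needs one open, one write and one commit step is made purely for illustration.

```c
#include <stddef.h>

/* Placeholder steps standing in for SMSyscallOpenImageForStaging,
 * SMSyscallWriteBlockToStageImage, SMSyscallCommitImageStaging and
 * SMSyscallInstallStagedImages. These are not real syscall numbers. */
enum step { STEP_OPEN, STEP_WRITE, STEP_COMMIT, STEP_INSTALL };

#define MAX_STEPS 64
enum step step_log[MAX_STEPS];
size_t step_count = 0;

/* Hypothetical stand-in for issuing a syscall: it only records the step. */
void record_step(enum step s)
{
    if (step_count < MAX_STEPS)
        step_log[step_count++] = s;
}

/* Stages every image first and triggers the installation once at the end,
 * so that no reboot happens between individual images. */
void stage_all_then_install(size_t image_count)
{
    for (size_t i = 0; i < image_count; i++) {
        record_step(STEP_OPEN);
        record_step(STEP_WRITE);
        record_step(STEP_COMMIT);
    }
    record_step(STEP_INSTALL);
}
```

Calling stage_all_then_install(3) records nine staging steps followed by a single install step, which is the ordering that avoids the automatic reboot after the first image.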

Normally, by using SMSyscallStageComponentManifests to stage a recovery manifest, it is only possible to re-flash the device’s firmware with the currently running version. However, via this issue, an attacker would be able to downgrade the whole device’s firmware to a previous version. We’ve successfully tested the downgrade from 21.01 to 20.10 (note that all the images being flashed still need to pass the signature checks performed at boot time).
Clearly, it is also possible to mix and match different versions of the images being flashed: for example, one could flash Pluton from version 21.01 and Secmon from version 20.10. Because of this, there could be several impacts for this issue. The most obvious one is a denial of service, since flashing two incompatible versions (two images may not even be able to talk to each other if the interfaces have changed between versions) could end up bricking the device and requiring manual recovery.
A more interesting impact arises when an attacker finds two versions of (for example) Secmon and Pluton that can coexist. This would be trivially possible when a new version of Secmon only addresses a set of critical issues, without changing functionality. An attacker would flash just the previous version of Secmon and later exploit known vulnerabilities in the older version.
In the same vein, an attacker could simply downgrade the whole system, and later exploit older vulnerabilities still present therein.
Note that flashing individual images this way will not erase the content of the whole flash, so any product-specific application that is installed on the device will continue to run on the previous version (as long as it’s still compatible).

Timeline

2021-07-19 - Vendor Disclosure
2021-11-09 - Vendor Patch
2021-11-09 - Public Release

Credit

Discovered by Claudio Bozzato and Lilith >_> of Cisco Talos.