Monday, June 6, 2022

Exploiting the Wii U's USB Descriptor parsing

In this write-up we're going to take a look at exploiting the Wii U's USB Host Stack. Over the past few months I spent a lot of time reverse engineering USB-related things on the Wii U.

Overview

The Wii U contains an ARM chip running an embedded operating system called IOSU. IOSU consists of several modules which contain device drivers and other components not handled on the main PPC CPU.

The most relevant module for this write-up is IOS-USB, which handles all USB devices which can be plugged into the external USB ports.
IOS-USB contains UHS, which is the USB Host Stack of the Wii U.

USB Descriptor parsing

Every USB device contains several descriptors which describe different information about the device to the host. All devices have a device descriptor and one or more configuration descriptors. Each descriptor has a size and a type field.
After plugging in a device, UHS starts by reading and parsing the device descriptor. The device descriptor contains things like the USB version, the Vendor and Product ID of the device, and the number of configuration descriptors.
After that, all configuration descriptors are read from the device. The configuration descriptor is a bit more complicated than the device descriptor. Instead of having a fixed size, the configuration can have any size, which is specified in the wTotalLength field of the configuration descriptor. After the configuration descriptor, multiple interface and endpoint descriptors are appended to the configuration.
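To make the two layouts concrete, here is a minimal Python sketch that decodes the fields mentioned above. The field layouts are the standard USB 2.0 ones; the helper names and the example bytes are made up for illustration and have nothing to do with the UHS implementation.

```python
import struct

# Standard USB 2.0 layouts; multi-byte fields are little-endian on the wire.
DEVICE_DESC = struct.Struct("<BBHBBBBHHHBBBB")  # 18 bytes
CONFIG_DESC = struct.Struct("<BBHBBBBB")        # 9 bytes

def parse_device(data):
    """Return the device descriptor fields mentioned in the text."""
    f = DEVICE_DESC.unpack_from(data)
    return {"bcdUSB": f[2], "idVendor": f[7], "idProduct": f[8],
            "bNumConfigurations": f[13]}

def parse_config_header(data):
    """Return the leading fields of a configuration descriptor."""
    f = CONFIG_DESC.unpack_from(data)
    return {"bLength": f[0], "bDescriptorType": f[1],
            "wTotalLength": f[2], "bNumInterfaces": f[3]}

# Hypothetical example bytes, not taken from a real device.
EXAMPLE_DEVICE = bytes([18, 1, 0x00, 0x02, 0, 0, 0, 64,
                        0x34, 0x12, 0x78, 0x56, 0x00, 0x01, 1, 2, 3, 1])
EXAMPLE_CONFIG = bytes([9, 2, 0x20, 0x00, 1, 1, 0, 0x80, 50])

device = parse_device(EXAMPLE_DEVICE)
config = parse_config_header(EXAMPLE_CONFIG)
```

The important field for the rest of this write-up is wTotalLength: it covers the configuration descriptor itself plus every interface and endpoint descriptor that follows it.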

UHS loads up to 32 configurations into memory and then parses all interface and endpoint descriptors of the first configuration.

The bug

To read the full configuration into a buffer, UHS starts with reading the first 9 bytes of the configuration, which contains the configuration descriptor. It then allocates a buffer from the heap with the total configuration size, which is specified in the wTotalLength field. The full configuration is now read into that buffer.

Since all USB descriptor values are stored as little-endian and the ARM is running as big-endian, some descriptor fields need to be byteswapped.
To byteswap all of the endpoint descriptor fields of the configuration the following code is used. This code usually also parses interface descriptors, which has been left out for simplicity.
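As a rough illustration, the loop can be modelled in Python like this. This is a sketch reconstructed from the description in this write-up, not the actual UHS code; the function name and the example configuration are hypothetical.

```python
ENDPOINT_TYPE = 0x05  # bDescriptorType of an endpoint descriptor (USB 2.0)

def swap_endpoint_fields(buf, total_length):
    """Walk the descriptors in buf and byteswap wMaxPacketSize of each endpoint.

    Returns the offsets of the endpoint descriptors that were swapped.
    """
    offset = 0
    swapped = []
    while offset < total_length:
        b_length = buf[offset]
        b_type = buf[offset + 1]
        if b_length == 0:
            # The sketch bails out here; a loop without this check never advances.
            raise ValueError("bLength of 0 at offset %d" % offset)
        if b_type == ENDPOINT_TYPE:
            # wMaxPacketSize lives in bytes 4 and 5 of the endpoint descriptor.
            buf[offset + 4], buf[offset + 5] = buf[offset + 5], buf[offset + 4]
            swapped.append(offset)
        offset += b_length
    return swapped

# Hypothetical, well-formed configuration: config + interface + one endpoint.
config = bytearray(
    [9, 2, 25, 0, 1, 1, 0, 0x80, 50]      # configuration, wTotalLength = 25
    + [9, 4, 0, 0, 1, 0xFF, 0, 0, 0]      # interface
    + [7, 5, 0x81, 2, 0x40, 0x00, 0]      # endpoint, wMaxPacketSize = 64 (LE)
)
swapped = swap_endpoint_fields(config, 25)
max_packet_size = int.from_bytes(config[22:24], "big")
```

For a well-formed configuration this does what it should: the little-endian wMaxPacketSize ends up readable as a big-endian value.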

The loop goes over all endpoint descriptors and byteswaps them until wTotalLength is reached. So, where's the bug?
UHS doesn't verify that the wTotalLength of the fully read configuration matches the wTotalLength of the initially read configuration descriptor, which was used to determine the buffer size. This means the total length can be larger than the actual configuration buffer, which allows pointing endpoint descriptors past the end of that buffer, causing out-of-bounds byteswaps.

Exploiting a 16-bit byteswap

So can that byteswap be exploited?
There are several devices which make emulating a USB device possible. Microcontrollers like the Raspberry Pi Pico allow full control over descriptors when emulating a USB client.
To perform a byteswap of an endpoint descriptor, the second byte of the descriptor needs to be 0x05, which is the endpoint descriptor type. Additionally, the first byte cannot be 0, otherwise UHS will never break out of the byteswap loop and gets stuck since the offset never increases.
If these conditions are met, bytes 4 and 5 (wMaxPacketSize), containing the maximum packet size of that endpoint, will be swapped.
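The following Python sketch shows what these conditions mean for memory that sits behind the configuration buffer. It is a simplified model under assumed sizes and byte values, not the UHS code: a 16-byte buffer is followed by 16 bytes of unrelated data, while the configuration header claims a wTotalLength of 32.

```python
ENDPOINT_TYPE = 0x05

def swap_endpoints(mem, start, total_length):
    """Model of the swap loop; returns the offsets where a swap happened."""
    offset, end, touched = start, start + total_length, []
    while offset < end:
        b_length, b_type = mem[offset], mem[offset + 1]
        if b_length == 0:
            break  # the sketch stops; the described loop would spin forever
        if b_type == ENDPOINT_TYPE:
            mem[offset + 4], mem[offset + 5] = mem[offset + 5], mem[offset + 4]
            touched.append(offset)
        offset += b_length
    return touched

CONFIG_SIZE = 16  # hypothetical buffer size, taken from the first 9-byte read
config = bytes([9, 2, 32, 0, 1, 1, 0, 0x80, 50,  # header now claims 32 bytes
                7, 0xFF, 0, 0, 0, 0, 0])         # 7-byte filler, not an endpoint
neighbour = bytes([8, 0x05, 0xAA, 0xBB, 0x11, 0x22, 0xCC, 0xDD,   # looks like an endpoint
                   8, 0x04, 0xAA, 0xBB, 0x11, 0x22, 0xCC, 0xDD])  # wrong type, untouched
mem = bytearray(config + neighbour)

claimed = int.from_bytes(mem[2:4], "little")
touched = swap_endpoints(mem, 0, claimed)
out_of_bounds = [off for off in touched if off >= CONFIG_SIZE]
```

Only the first 8 neighbouring bytes are modified, because they happen to start with a non-zero byte followed by 0x05. The second group has the same payload but the wrong type byte and stays intact, which is why finding a useful target is the hard part.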

After looking for days for structures stored after the configuration on the heap which meet these conditions, I gave up. This seemed extremely hard, if not impossible, to exploit.

After working on other projects for a while, I decided to give this another try. This time I decided to look directly at the heap block headers. 

Heap Blocks

Each allocated block on the heap has a header which contains a magic value, the size of the block, a pointer to the previous block, and a pointer to the next block.

Example of a free heap block

The magic also indicates the state of this heap block. The following magic values are used:
  • 0xBABE0000: Free block
  • 0xBABE0001: Allocated outer block
  • 0xBABE0002: Allocated inner block
The magic never contains a 0x05, the size is rounded to a multiple of 0x10, and we can only allocate blocks up to 0x10000 bytes in size, so we can't use any of those fields.
What if the previous pointer contains a 0x05 though? This would swap 2 bytes in the next pointer! Ideally the previous pointer should look something like XXXX05XX. This would cause the 2 bytes in the middle of the next pointer (bytes 4 and 5 of the endpoint descriptor) to be swapped.
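A small Python sketch of that idea follows. It assumes the header consists of four big-endian 32-bit fields in the order listed above (magic, size, prev, next); the addresses are invented and the helper name is hypothetical.

```python
import struct

HEADER = struct.Struct(">IIII")  # magic, size, prev, next (assumed layout)

def swap_if_endpoint(mem, offset):
    """Apply the endpoint byteswap at offset if the two conditions hold."""
    if mem[offset] != 0 and mem[offset + 1] == 0x05:
        mem[offset + 4], mem[offset + 5] = mem[offset + 5], mem[offset + 4]
        return True
    return False

# Invented addresses; prev matches the XXXX05XX pattern.
header = bytearray(HEADER.pack(0xBABE0000, 0x100, 0x10150540, 0x1016ABC0))

# The fake endpoint descriptor starts at the second byte of prev (offset 8 + 1),
# so its type byte is the 0x05 and its bytes 4 and 5 are the middle of next.
PREV_OFFSET = 8
did_swap = swap_if_endpoint(header, PREV_OFFSET + 1)
magic, size, prev, next_ptr = HEADER.unpack(header)
```

With these values next changes from 0x1016ABC0 to 0x10AB16C0 while magic, size, and prev keep their values, so the corrupted pointer still looks like a heap address but points somewhere else.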

To achieve this we need to prepare the heap a bit. IOSU will merge consecutive free blocks, so we need a way to create "free holes" in the heap.
By placing an endpoint descriptor right at the end of the first configuration and setting the total size accordingly, we can swap 2 bytes in the magic of the next heap block.

Example of swapping the magic of the next configuration buffer

The magic value of Configuration 1's heap block is now 0xBEBA0001 instead of 0xBABE0001. Since the heap state is checked when a block is freed, this block can no longer be freed.
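The effect can be modelled with a few lines of Python. The header layout, the buffer size, and the try_free helper are assumptions made for this sketch; how IOSU actually reacts to a bad magic on free is not modelled beyond "the block is not freed".

```python
import struct

HEADER = struct.Struct(">IIII")  # magic, size, prev, next (assumed layout)
MAGIC_FREE, MAGIC_OUTER, MAGIC_INNER = 0xBABE0000, 0xBABE0001, 0xBABE0002

def try_free(mem, header_offset):
    """Hypothetical free(): refuses blocks whose magic is not an allocated one."""
    magic = struct.unpack_from(">I", mem, header_offset)[0]
    if magic not in (MAGIC_OUTER, MAGIC_INNER):
        return False
    struct.pack_into(">I", mem, header_offset, MAGIC_FREE)
    return True

# 16 bytes of configuration data, ending in a 4-byte stub that looks like the
# start of an endpoint descriptor, followed by the next block's header.
config0 = bytearray(12) + bytearray([4, 0x05, 0, 0])
mem = config0 + bytearray(HEADER.pack(MAGIC_OUTER, 0x20, 0, 0))
header_offset = len(config0)

# Bytes 4 and 5 of the stub descriptor are the first two bytes of the magic.
d = header_offset - 4
mem[d + 4], mem[d + 5] = mem[d + 5], mem[d + 4]

magic_after = struct.unpack_from(">I", mem, header_offset)[0]
freed = try_free(mem, header_offset)
```

After the swap the magic reads 0xBEBA0001, which matches none of the three known values, so the modelled free refuses the block and it stays allocated.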
We can have up to 32 configurations, which will be allocated before parsing the first configuration. If we carefully choose various sizes for these descriptors we can create the ideal heap header. 
Let's connect a device which uses 6 configurations. By controlling the size of buffer 1 we can make sure the address of buffer 2 contains a 0x05. This way we can create an ideal heap layout.
After disconnecting our emulated device, the heap looks like this:

The ideal heap layout
Red: corrupted blocks, Green: free blocks

So what if we reconnect the emulated device and point an endpoint descriptor to the prev pointer, which now matches the XXXX05XX pattern?

Swapping the next pointer of buffer 4

The next pointer gets swapped and now points into the middle of the heap!
If we now reconnect the device, the next configuration buffer gets allocated in the middle of the heap, as long as the value at the position where the size field would be in the memory that next now points to is large enough. By controlling the size of buffer 5 we can control the address of the free heap block, allowing us to roughly control the next pointer.
This now allows overwriting existing buffers on the heap.

My initial idea was to point the next block directly into the stack, which is also allocated on the heap. Since UHS memsets the buffer after allocation, this would only result in a crash though. So I needed to find some other structure on the heap to overwrite.

UhsCtrlXferMgr

As the name implies, the "UHS Control Transfer Manager (UhsCtrlXferMgr)" manages transfers on the control endpoint of the device. In front of the transfer manager on the heap is a large buffer called pEp0DmaBuf, which holds the data of transfers on the control endpoint (Endpoint 0). If we point the next pointer into this buffer, we can overwrite the transfer manager after it.
Since each configuration can "only" be 0xffff bytes in size, we'll use 2 buffers to reach the transfer manager.

Overwriting UhsCtrlXferMgr

With full control over the transfer manager we can insert a custom transfer event into its transfer queue. This allows us to transfer any amount of data anywhere.
It is now possible to start a transfer into the stack and get kernel code execution using a ROP chain.
The data for the kernel code can simply be placed into one of the existing config descriptors which have been allocated on the heap.
The ROP is based on the one in mocha and extremely similar to the one found in bluubomb, so check out the previous write-up if you're interested in details.

Conclusion

So what am I calling this?
UDPIH (pronounced like "mud pie" without the m), which stands for USB Descriptor Parsing Is Hard, since apparently it's really hard to properly parse these descriptors :P
It is now possible to get IOSU code execution by simply plugging a Raspberry Pi Pico/Zero or similar device into the console. This even works before the PPC side has booted properly, allowing for things like CBHC bricks to be fixed.

Since everything using IOS-USB will shift around the heap, UDPIH roughly works after you can see the "Wii U" logo. This is right after IOS-NET and IOS-FS have registered their drivers for USB Mass storage and USB Ethernet, and before the Wii U menu has booted and queries connected hard drives.

In addition to UDPIH itself, I'll also release a simple recovery menu which allows fixing several bricks from the IOSU side.

Hope this is useful for someone :)

Links: