MindShaRE: How to “Just Emulate It With QEMU”

May 27, 2020 | Vincent Lee

MindShaRE is our periodic look at various reverse engineering tips and tricks. The goal is to keep things small and discuss some everyday aspects of reversing. You can view previous entries in this series here.


Often, people dismiss router or IoT security research as easy. “Just emulate it with QEMU!” is usually what they say. They will probably also tell you to “just buy low, sell high” to make money in the stock market. The difficult, if not impossible, part of trading stocks is knowing when prices are at their lowest and highest. Similarly, knowing how to emulate firmware with QEMU is probably the hardest part for a newcomer to the embedded security research scene. While I cannot offer much help with trading advice, I can help with firmware emulation.

Before we begin, I’ll assume your firmware is UNIX-like rather than based on a real-time operating system (RTOS) such as VxWorks or Windows CE. I’ll also assume you have your firmware decrypted/de-obfuscated and the root file system extracted. If you are having trouble with encrypted firmware, check out my earlier blog post on dealing with encrypted firmware.

Begin with the End in Mind

A key part of device emulation is having an end goal. Do you want to run just one binary? How about just one service? Or do you want full device emulation? Since getting the firmware emulation to work properly is a time-consuming feat, your goal will greatly influence your emulation strategy. Sometimes, the countless hours lost to tweaking the emulation are reason enough to simply buy one of these low-cost devices instead.

For running a single binary such as a decryption routine, consider the more lightweight user-space QEMU emulation approach. If your goal is to write exploits, working with a physical device may be best, since exploits are ultimately used against real-world devices. Emulation may not account for subtle hardware behavior such as instruction caches, which could affect memory corruption exploits. However, emulation is perfectly fine for developing and testing exploits for higher-level vulnerabilities such as a CGI-script command injection or login logic flaws in PHP pages.

Determining CPU architecture and information gathering

The first step in emulating anything is determining the CPU architecture for our target. Usually, we can determine this without a device on hand. One way to determine the CPU type is through analyzing the firmware binaries. Running the file command on any binary can quickly tell us what CPU architecture we are dealing with:

Figure 1 – Outputs of the file and readelf commands

However, the file command does not provide the most detailed results. For ARM binaries, the readelf command with the -A option provides much more detailed CPU information that is vital for full system emulation and cross-compilation.
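As a rough illustration of this step, the sketch below maps the one-line description printed by file to the name of a matching QEMU user-mode emulator. The function name qemu_user_for is hypothetical, the patterns cover only a handful of architectures common in routers, and they assume the usual "ELF 32-bit LSB ..., MIPS, ..." wording, which may differ between versions of file.

```shell
#!/bin/sh
# Hypothetical helper: pick a QEMU user-mode emulator from a `file` description.
# Assumption: the description follows the common "ELF <bits> <LSB|MSB> ..., <CPU>, ..." layout.
qemu_user_for() {
    desc=$1
    case "$desc" in
        *"ELF 32-bit LSB"*MIPS*)     echo qemu-mipsel ;;   # little-endian MIPS
        *"ELF 32-bit MSB"*MIPS*)     echo qemu-mips ;;     # big-endian MIPS
        *"ELF 64-bit LSB"*aarch64*)  echo qemu-aarch64 ;;  # 64-bit ARM
        *"ELF 32-bit LSB"*ARM*)      echo qemu-arm ;;      # little-endian 32-bit ARM
        *"ELF 32-bit MSB"*ARM*)      echo qemu-armeb ;;    # big-endian 32-bit ARM
        *)
            echo "unrecognized description: $desc" >&2
            return 1
            ;;
    esac
}
```

It could be called as qemu_user_for "$(file -b squashfs-root/bin/busybox)", where the squashfs-root path is only an example of an extracted root file system. For ARM targets, readelf -A remains the better source for the exact CPU and floating-point details.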

Another way of determining the CPU architecture when working with wireless routers (using the TP-Link TL-WR841-ND as an example [1]) is by searching for the device model on the Internet. This will usually land us on an OpenWrt device page that provides information on the device hardware. A quick search can also tell us the main System on Chip (SoC) part number as well as the device FCC ID. We can then look for the datasheet for the corresponding SoC and determine the exact CPU architecture.

This is also a great time to search for the family-specific processor core datasheet to familiarize yourself with the CPU. This datasheet will provide device-specific information such as load address and low-level memory layout, which may help with emulation and analysis. You can also look up the FCC filing reports to get a glimpse of the device internals.

User-mode emulation

Per-process emulation is useful when only one binary needs to be emulated. There are two ways to emulate a single binary in user-mode QEMU. The first option is user-mode process emulation. This can be done with one of the following commands:

qemu-mipsel -L <prefix> <binary>
qemu-arm -L <prefix> <binary>
qemu-<arch> -L <prefix> <binary>

The -L option is important when the binary links against external dependencies such as uClibc or encryption libraries. It sets the path prefix under which QEMU looks for the binary's dynamic linker and shared libraries, which is normally the extracted firmware root. Below is an example of running the imgdecrypt binary for the D-Link DIR-882 router.

Figure 2 - imgdecrypt
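When a user-mode run fails with a "no such file" style error, a common cause is a prefix that does not contain the dynamic linker the binary asks for. The sketch below is one way to check this before running QEMU. Both function names are hypothetical, and parse_interp assumes the "[Requesting program interpreter: ...]" line that readelf -l normally prints for dynamically linked binaries.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -l` output on stdin and print the
# requested ELF interpreter path (prints nothing for static binaries).
parse_interp() {
    sed -n 's/.*Requesting program interpreter: \(.*\)\]$/\1/p'
}

# Hypothetical helper: succeed if <prefix><interpreter> exists.
# The -L test also accepts symlinks whose targets only resolve inside the firmware root.
prefix_resolves() {
    prefix=$1
    interp=$2
    [ -e "$prefix$interp" ] || [ -L "$prefix$interp" ]
}
```

A possible use, with ./squashfs-root standing in for the extracted root file system: interp=$(readelf -l ./squashfs-root/bin/imgdecrypt | parse_interp), followed by prefix_resolves ./squashfs-root "$interp" before invoking qemu-mipsel -L ./squashfs-root.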

Another way to emulate the process is to perform a cross-architectural chroot with QEMU. To do this, we copy the qemu-<arch>-static binary to the /usr/bin/ directory of the firmware root file system. We then chroot into the firmware root and obtain a working shell:

Figure 3 - Using QEMU to perform a cross-architectural chroot

This works because, during installation, QEMU registers itself with the kernel through the binfmt_misc mechanism as the handler for binaries with certain magic bytes. For the same reason, this technique is incompatible with the Scratchbox cross-compilation toolkit, which leverages the same mechanism. You can find a more detailed explanation of the cross-architectural chroot in this Stack Overflow post.
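The two preconditions of the cross-architectural chroot can be sketched as a pair of small helpers: one copies the static emulator into the firmware root, the other checks that a binfmt_misc handler is registered and enabled. The function names are hypothetical. The entry name passed to binfmt_enabled (for example qemu-mipsel) is an assumption based on how Debian-style qemu-user-static packages usually name their registrations, so check /proc/sys/fs/binfmt_misc on your own host.

```shell
#!/bin/sh
# Hypothetical helper: copy a static QEMU user-mode emulator into <rootfs>/usr/bin
# and print the path it will have inside the chroot.
stage_qemu_static() {
    rootfs=$1
    qemu_bin=$2
    [ -d "$rootfs" ]   || { echo "no such rootfs: $rootfs" >&2; return 1; }
    [ -f "$qemu_bin" ] || { echo "no such emulator: $qemu_bin" >&2; return 1; }
    mkdir -p "$rootfs/usr/bin" || return 1
    cp -p "$qemu_bin" "$rootfs/usr/bin/" || return 1
    echo "/usr/bin/$(basename "$qemu_bin")"
}

# Hypothetical helper: succeed if the named binfmt_misc entry exists and its
# first line reads "enabled". The directory argument exists only to ease testing.
binfmt_enabled() {
    name=$1
    dir=${2:-/proc/sys/fs/binfmt_misc}
    [ -r "$dir/$name" ] && [ "$(head -n 1 "$dir/$name")" = "enabled" ]
}
```

With both checks passing, for instance stage_qemu_static ./squashfs-root "$(command -v qemu-mipsel-static)" and binfmt_enabled qemu-mipsel, running chroot ./squashfs-root /bin/sh as root should drop into the firmware shell.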

This method is my preferred first attempt to emulate a device. It is quick to set up and allows me to experiment with different binaries within the firmware root file system without worrying too much about dependencies. Note that in this mode of emulation, none of the userland services is initialized in the chroot shell, so none of the system or network services are available. However, this could be sufficient for running just one binary or testing one small component.

Bring out the big guns: Full system emulation

Sometimes, we’ll need to analyze the firmware more comprehensively and will benefit from full system emulation. There are many ways to fully emulate a device. Here are a few of the most common emulation techniques. These techniques have been used by researchers to find real bugs that were subsequently submitted to the ZDI program.

In the first part of the emulation process, we will use QEMU to create a full Linux virtual machine running on the target architecture. We then transfer the firmware root file system into the VM and chroot into the root file system of the firmware. To create a full VM running in QEMU we typically need the following things:

-- A QEMU disk image file (qcow2)
-- A Linux kernel image compiled for the target architecture
-- (Sometimes) an initial RAM disk image (initrd)

To get the above items, you can certainly set up a cross-compiler, build the kernel, and download an installer to get the initial RAM disk. You could then install Linux onto the QEMU disk image file. However, cross-compiling a kernel is a substantial side quest for the casual bug hunter or Linux beginner. If you are interested in preparing these files yourself, check out the links in the further reading section. In this blog, we’ll take a simpler approach. We will download and use the pre-built Debian images prepared by Aurelien Jarno, a Debian developer. Alternatively, you could use the images provided by Chris (@_hugsy_), the author of the “gef” plugin.

With all the files in place, we can start a QEMU VM with the proper CPU architecture with one of the following commands:

The -M (or -machine) option selects the board model, that is, the target hardware platform that QEMU emulates. The -append option lets you set the command-line options passed to the Linux kernel. I like to put the QEMU command into a bash script to speed up making adjustments and starting the VM. Additionally, we should append the following options to the QEMU call to connect the network interfaces and add port forwarding settings:

-net user,hostfwd=tcp::80-:80,hostfwd=tcp::443-:443,hostfwd=tcp::2222-:22 \
-net nic

Adding these options will allow us to communicate with the VM via SSH through port 2222 of the host computer as well as the HTTP and HTTPS pages of the emulated firmware.

Figure 4 - Starting a pre-built Debian image
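Putting the options discussed above together, a launch command can be assembled along the following lines. This is a sketch under assumptions: the function name qemu_system_cmd is hypothetical, the board models (malta for MIPS, versatilepb for ARM) are the ones the pre-built Debian images are commonly booted with, and root=/dev/sda1 assumes the disk layout of those images. The kernel, disk, and initrd file names are whatever you downloaded.

```shell
#!/bin/sh
# Hypothetical helper: print a qemu-system command line for a pre-built image.
# Usage: qemu_system_cmd <mipsel|mips|arm> <kernel> <disk.qcow2> [initrd]
qemu_system_cmd() {
    arch=$1; kernel=$2; disk=$3; initrd=${4:-}
    case "$arch" in
        mipsel) emu=qemu-system-mipsel; machine=malta ;;
        mips)   emu=qemu-system-mips;   machine=malta ;;
        arm)    emu=qemu-system-arm;    machine=versatilepb ;;
        *) echo "unsupported arch: $arch" >&2; return 1 ;;
    esac
    cmd="$emu -M $machine -kernel $kernel -hda $disk"
    # The initial RAM disk is only needed by some images.
    if [ -n "$initrd" ]; then
        cmd="$cmd -initrd $initrd"
    fi
    # Assumption: the root file system lives on the first partition of the disk.
    cmd="$cmd -append \"root=/dev/sda1 console=tty0\""
    # Networking and port forwarding, exactly as given in the text.
    cmd="$cmd -net user,hostfwd=tcp::80-:80,hostfwd=tcp::443-:443,hostfwd=tcp::2222-:22 -net nic"
    printf '%s\n' "$cmd"
}
```

Printing the command instead of running it makes it easy to review or paste into a script. Note that forwarding host ports 80 and 443 normally requires elevated privileges on the host, so you may prefer higher host ports such as 8080 and 8443 when adapting it.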

Once the VM has booted into a working Debian system, the second part of the emulation begins. Transfer the root file system of the firmware to the VM using SCP or HTTP. I find that packing the whole root file system into a tarball is the most effective way to handle the transfer. We then need to mount the /proc, /dev, and /sys directories of the VM onto the corresponding directories in the firmware file system. Finally, we chroot into the firmware file system using the following command:

chroot ~/firmware_rootfs /bin/sh

The second argument tells chroot to run /bin/sh after changing the root directory. You may be required to change this command to /bin/bash or /bin/busybox to obtain a working shell.

Figure 5 - Busybox
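The mount and chroot steps inside the Debian VM can be sketched as follows. The names run and enter_firmware are hypothetical, and bind mounts are only one way to expose /proc, /dev, and /sys to the chroot. Setting DRY_RUN=1 prints the commands instead of executing them, which is a convenient way to review them before running as root.

```shell
#!/bin/sh
# Hypothetical helper: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: bind-mount the VM's pseudo file systems into the
# firmware root, then chroot into it.
# Usage: enter_firmware <rootfs> [command [args...]]   (default command: /bin/sh)
enter_firmware() {
    [ "$#" -ge 1 ] || { echo "usage: enter_firmware <rootfs> [command...]" >&2; return 1; }
    rootfs=$1
    shift
    [ "$#" -gt 0 ] || set -- /bin/sh
    for d in proc dev sys; do
        run mkdir -p "$rootfs/$d" || return 1
        run mount --bind "/$d" "$rootfs/$d" || return 1
    done
    run chroot "$rootfs" "$@"
}
```

For example, DRY_RUN=1 enter_firmware ~/firmware_rootfs prints the six preparation commands followed by the chroot call. If /bin/sh does not work, passing /bin/busybox sh as the command is a variation worth trying.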

With a working shell, we can navigate to /etc/rc.d or /etc/init.d and run the appropriate RC script to kick off the userland services. Closely analyze the rc.d folder and inspect the scripts. You’ll need to tweak the startup scripts to account for missing network interfaces, failing NVRAM library calls, and all sorts of fun stuff. This part of the emulation process is very much like dealing with encrypted firmware: each firmware will be an adventure of its own, which is the very definition of research. Often, you’ll want to tweak the rcS scripts just enough to get the target service to run properly. This part of the process can take weeks of investigation and additional work.

Emulation is a lot of work. Sometimes standing on the shoulders of giants is the better way to go. There are two main projects that help speed up the process of firmware emulation, namely Firmadyne and ARM-X.

60% of the time, it works every time: Firmadyne

Firmadyne is great when it works. It is a firmware emulation platform that attempts to automagically emulate Linux-based device firmware. Firmadyne supports both MIPS and ARM processors. It will extract the root file system, infer network interfaces, and create the QEMU disk image for emulation. It also attempts to emulate the NVRAM. If you need full system emulation for a new target, I recommend giving Firmadyne a try first. You can then attempt to fix some of the errors it runs into before trying other emulation techniques. I have experienced trouble running Firmadyne using newer QEMU releases. However, using Docker to install it typically avoids this problem.

ARM-X

The ARM-X Firmware Emulation Framework targets ARM-based firmware. It is a collection of kernels, scripts, and filesystems to emulate firmware with QEMU. It comes with a few emulation configuration examples to help you with your project. I recommend watching the hour-long Hack In The Box 2019 presentation recording by Saumil Shah (@therealsaumil) on YouTube before trying out the ARM-X VM. If you are completely new to IoT firmware research, the presentation is also a great resource to start with.

Conclusion

Hopefully, with the help of this blog, you are ready to “just emulate it with QEMU.” All the techniques demonstrated above (including ARM-X and Firmadyne) were used in various submissions to our program. All roads may lead to Rome, but there’s not a single, fixed way to emulate firmware with QEMU. Explore different techniques and see what works for you. Familiarize yourself with the knowledge to wield the beast named QEMU and you will be surprised at how it can help you in unexpected ways. Finally, I would love to learn about your emulation techniques and look forward to your next submission.

You can find me on Twitter @TrendyTofu, and follow the team for the latest in exploit techniques and security patches.

Further Reading

MIPSEL QEMU Image Preparation
https://blahcat.github.io/2017/07/14/building-a-debian-stretch-qemu-image-for-mipsel/

Debian on an emulated ARM machine
https://www.aurel32.net/info/debian_arm_qemu.php

Debian on an emulated MIPS(EL) machine
https://www.aurel32.net/info/debian_mips_qemu.php

Arm1176 (ARMv6) QEMU Emulation
https://azeria-labs.com/emulate-raspberry-pi-with-qemu/

AArch64 (ARMv8) QEMU Image Preparation
https://blahcat.github.io/2018/01/07/building-a-debian-stretch-qemu-image-for-aarch64/

ARM emulation
https://balau82.wordpress.com/arm-emulation/

Footnote

[1] This incredibly cheap router is a great target to start your vulnerability research journey. Check out my two-part series to get started with this device.