Firmware is a kind of software that is written to a hardware device in order to control user applications and various system functions. The firmware contains low level programming code that enables software to access hardware functions. Devices that run firmware are known as embedded systems which have limited hardware resources, such as storage capabilities as well as memory. Examples of embedded devices that run firmware are smartphones, traffic lights, connected vehicles, some types of computers, drones, and cable set-top boxes.
It is apparent that embedded technology and the firmware that runs on these devices controls our daily lives, from the critical infrastructure cities rely on, to bank ATMs and the homes that consumers live in. It is important to understand what a firmware binary consists of as well as its associated properties. Firmware is comprised of a bootloader, kernel, filesystem, and various other resources. There are different types of firmware built upon embedded Linux, embedded Windows, Windows IoT core, and various Real Time Operating Systems (RTOS). This book will be geared toward an embedded Linux environment, however, the principles will remain platform agnostic.
You can learn more about the firmware at this link:https://wiki.debian.org/Firmware
The following diagram represents what a piece of firmware contains: flash contents, the bootloader, the kernel, and a root filesystem:
Figure 1.1: Firmware contents
Let's first have a look at the bootloader. A bootloader's responsibility is to initialize RAM for volatile data storage, initialize serial port(s), detect the machine type, set up the kernel tagged list, load
initramfs (initial RAM filesystem), and call the kernel image. The bootloader initializes hardware drivers via a Board Support Package (BSP), which is usually developed by a third party. The bootloader resides on a separate Electrically Erasable Programmable Read-only Memory (EEPROM), which is less common, or directly on flash storage, which is more common. Think of the bootloader as a PC's BIOS upon start up. Discussing each of the bootloaders' responsibilities in detail is beyond the scope of this book; however, we will highlight where the bootloader works to our advantage. Some of the common bootloaders for ARM and MIPS architectures are: Redboot, u-boot, and barebox. Once the bootloader starts up the kernel, the filesystem is loaded.
There are many filesystem types employed within the firmware, and sometimes even proprietary file types are used depending on the device. However, some of most common types of filesystems are SquashFS, cramFS, JFFS2, YAFFS2, and ext2. The most common filesystem utilized in devices (especially consumer devices) is SquashFS. There are utilities, such as
unsquashfs and modified
unsquashfs that are used to extract data from squashed filesystems. Modified
unsquashfs tools are utilized when vendors change SquashFS to use non-supported compressions, such as LZMA (prior to SquashFS 4.0, the only officially supported compression was
.zlib), and will have a different offset of where the filesystem starts than regular SquashFS filesystems. We will address locating and identifying offsets later in this book.
For additional reading on filesystems for embedded Linux, please visit the following link: http://elinux.org/images/b/b1/Filesystems-for-embedded-linux.pdf.
Sasquatch is a handy tool to utilize for extracting modified SquashFS filesystems. Sasquash can be found at the following link:https://github.com/devttys0/sasquatch
Similarly, there are many types of file compression utilized for firmware images, such as LZMA,
.arj, to name a few. Each has pros and cons such as the size after compression, compression time, decompression time, as well as the business needs for the device itself. For our purposes, we will think of the filesystem as the location that contains configuration files, services, account passwords, hashes, and application code, as well as start up scripts. In the next chapter, we will walk you through how to find the filesystem in use as well as the compression in use.
Within the filesystem, device-specific code resides, written in C, C++, or other programming languages, such as Lua. Device-specific code, or even all of the firmware itself, can be a mix of third-party developers contracted out, known as Original Design Manufacturers (ODM), or in-house developers working with the Original Equipment Manufacturer (OEM). ODMs are an important piece of the embedded device development supply chain. They are often small companies in Asia and are a dime a dozen. Some OEMs have trusted ODMs they work with on product lines, while others will do business with ODMs that have the lowest fees for only one product. Depending on the industry, an ODM can also be referred to as a supplier. It is important to note that ODMs are free to work with a number of different OEMs and can even distribute the same code base. You may be familiar with this notion or even wondered why a critical public advisory affects ten plus device manufactures for a software bug. This occurs due to a lack of secure development life cycles processes by the ODM and verification by the OEM. Once an ODM completes their application deliverables, which may be an SDK or firmware to the OEM, the OEM will merge its code base(s) into the firmware, which may be as small as OEM logos on web interfaces. The implementation varies depending on how the ODM and OEM merge their code; however, it is not uncommon for an ODM to provide a binary file to the OEM. OEMs are responsible for distributing the firmware, managing firmware, and supporting the device itself. This includes firmware security issues reported by third-party researchers, which puts a strain on OEMs if ODMs retain the source code and the OEM only has access to a binary image.
In Chapter 3, Analyzing and Exploiting Firmware we will learn how to reverse engineer firmware binary images by recognizing the filesystem, identifying compression, and emulating binaries for testing, to take advantage of common firmware issues.