Linux Kernel Debugging

By: Kaiwan N. Billimoria

Overview of this book

The Linux kernel is at the very core of arguably the world’s best production-quality OS. Debugging it, though, can be a complex endeavor. Linux Kernel Debugging is a comprehensive guide to learning all about advanced kernel debugging. This book covers many areas in depth, such as instrumentation-based debugging techniques (printk and the dynamic debug framework), and shows you how to use Kprobes. Memory-related bugs tend to be a nightmare – two chapters are packed with tools and techniques devoted to debugging them. When the kernel gifts you an Oops, how exactly do you interpret it to be able to debug the underlying issue? We’ve got you covered. Concurrency tends to be an inherently complex topic, so a chapter on lock debugging will help you to learn precisely what data races are, including using KCSAN to detect them. Some thorny issues, both debug- and performance-wise, require detailed kernel-level tracing; you’ll learn to wield the impressive power of Ftrace and its frontends. You’ll also discover how to handle kernel lockups, hangs, and the dreaded kernel panic, as well as leverage the venerable GDB tool within the kernel (KGDB), along with much more. By the end of this book, you will have at your disposal a wide range of powerful kernel debugging tools and techniques, along with a keen sense of when to use which.

Table of Contents (17 chapters)

Part 1: A General Introduction and Approaches to Kernel Debugging
Part 2: Kernel and Driver Debugging Tools and Techniques
Part 3: Additional Kernel Debugging Tools and Techniques

Software debugging – what it is, origins, and myths

To a software practitioner, a bug is a defect or an error within code. A key, and often large, part of our job as software developers is to hunt bugs down and fix them, so that, as far as is humanly possible, the software is defect-free and runs precisely as designed.

Of course, to fix a bug, you first have to find it. Indeed, with non-trivial bugs, it's often the case that you aren't even aware there is a bug (or several) until some event occurs to expose it! Shouldn't we have a disciplined approach to finding bugs before shipping a product or project? Of course we should (and do) – it's the Quality Assurance (QA) process, more commonly known as testing. Though glossed over at times, testing remains one of the – if not the – most important facets of the software life cycle. (Would you voluntarily fly in a new aircraft that's never been tested? Well, unless you're the lucky test pilot...)

Okay, back to bugs; once a bug has been identified (and filed), your job as a software developer is to determine what exactly is causing it – what the actual underlying root cause is. A large portion of this book is devoted to tools, techniques, and ways of thinking about exactly how to do this. Once the root cause is identified, and you have clearly understood the underlying issue, you will, in all probability, be able to fix it. Yay!

This process of identifying a bug – using tools, techniques, and some hard thinking to figure out its root cause – and then fixing it is what the word debugging refers to. Without going into detail, there's a popular story regarding the origin of the word: on Tuesday, September 9, 1947, at Harvard University, Admiral Grace Hopper's staff discovered a moth caught in a relay panel of a Mark II computer. As the system malfunctioned because of it, they removed the moth, thus de-bugging the system! Well, as it turns out: one, Admiral Hopper herself stated that she didn't coin the term debugging; two, its origins seem to be rooted in aeronautics. Nevertheless, the term debugging has stuck.

The following figure shows the picture at the heart of this story – the unfortunate but posthumously famous moth that inadvertently caught itself in the system that had to be debugged!

Figure 1.1 – The famous moth (by courtesy of the Naval Surface Warfare Center, Dahlgren, VA., 1988. - U.S. Naval Historical Center Online Library Photograph NH 96566-KN. Public Domain, https://commons.wikimedia.org/w/index.php?curid=165211)

Having understood what a bug and debugging basically are, let's move on to something both interesting and important – we'll briefly examine a few real-world cases where a software bug (or bugs) has been the cause of some unfortunate and tragic accidents.