
Linux Kernel Debugging

By: Kaiwan N. Billimoria

Overview of this book

The Linux kernel is at the very core of arguably the world’s best production-quality OS. Debugging it, though, can be a complex endeavor. Linux Kernel Debugging is a comprehensive guide to learning all about advanced kernel debugging. This book covers many areas in-depth, such as instrumentation-based debugging techniques (printk and the dynamic debug framework), and shows you how to use Kprobes. Memory-related bugs tend to be a nightmare – two chapters are packed with tools and techniques devoted to debugging them. When the kernel gifts you an Oops, how exactly do you interpret it to be able to debug the underlying issue? We’ve got you covered. Concurrency tends to be an inherently complex topic, so a chapter on lock debugging will help you to learn precisely what data races are, including using KCSAN to detect them. Some thorny issues, both debug- and performance-wise, require detailed kernel-level tracing; you’ll learn to wield the impressive power of Ftrace and its frontends. You’ll also discover how to handle kernel lockups, hangs, and the dreaded kernel panic, as well as leverage the venerable GDB tool within the kernel (KGDB), along with much more. By the end of this book, you will have at your disposal a wide range of powerful kernel debugging tools and techniques, along with a keen sense of when to use which.
Table of Contents (17 chapters)
Part 1: A General Introduction and Approaches to Kernel Debugging
Part 2: Kernel and Driver Debugging Tools and Techniques
Part 3: Additional Kernel Debugging Tools and Techniques

Debugging – a few quick tips

I'll start off by saying this: debugging is both a science and an art, refined by experience – the mundane hands-on slogging through to reproduce and identify a bug and its root cause, and (possibly) fix it. I'm of the opinion that the following few debug tips are really nothing new; that said, we do tend to get caught up in the moment and often miss the obvious. The hope is that you'll find these tips useful and return to them time and again!

  • Assumptions – just say NO!

Churchill famously said, "Never, never, never, give up". We say "Never, never, never, make assumptions".

Assumptions are, very often, the root cause behind many, many bugs and defects. Think back, re-read the Software bugs – a few actual cases section!

In fact (hey, I am partially joking here), just look at the word assume: it just begs saying, "Don't make an ASS out of U and ME"!

Using assertions in your code is a great way to catch assumptions. The userspace way is to use the assert() macro; it's well documented in its man page. (We cover more on using assertion-style macros within the kernel in Chapter 12, A Few More Kernel Debugging Approaches, in the Assertions, warnings and BUG() macros section.)

  • Don't lose the forest for the trees!

At times, we do get lost in the twisted mazes of complex code paths. In these circumstances, it's really easy to lose sight of the big idea, the objective of the code. Try and zoom out and think of the bigger picture. It often helps spot the faulty assumption(s) that led to the error(s). Well-written documentation can be a lifesaver.

  • Think small

When faced with a difficult bug, try this: build (or configure, or otherwise obtain) the smallest possible version of your program that still causes the issue or bug you're facing to surface. This often helps you track down the root cause of the problem. In fact, very often (in my own experience), the mere act of doing this – or even just jotting down the problem you face in detail – makes the actual issue, and its solution, pop into your mind!

  • "It requires twice the brainpower to debug a piece of code as to write it"

This paraphrases Brian Kernighan in the book The Elements of Programming Style. So, should we not use our full brainpower while writing code? Ha, of course you should... but debugging is typically harder than writing code. The real point is this: take the trouble to do your groundwork carefully first: write a brief, very high-level design document describing what you expect the code to do, at a high level of abstraction. Then move on to the specifics (with a so-called low-level design doc). Good documentation will save you one day (and blessings shall be showered upon you!).

That reminds me of another quote: An ounce of design is worth a pound of refactoring – Karl Wiegers.

  • Employ "Zen Mind, Beginner's Mind"

Sometimes, the code can become too complex (spaghetti-like; it just smells). In many cases, just giving up and starting from scratch again, if viable, is perhaps the best thing to do.

This Zen-Beginner's Mind state also implies that we at least temporarily stop our (perhaps over-egotistical) thought patterns (I wrote this so well, how can it be wrong!?) and look at the situation from the point of view of somebody completely new to it. It is, in fact, one key reason why a colleague reviewing your code can spot bugs you'd never see! Plus, a good night's rest can do wonders.

  • Variable naming, comments

I recall a Q&A on Quora revealing that the hardest thing a programmer does is name variables well! This is truer than it might appear at first glance. Variable names stick; choose yours carefully. Don't go overboard, though: for a local loop index, int i is just fine (int theloopindex is just painful). The same goes for comments: they're there to explain the rationale, the design behind the code – what it's designed and implemented to achieve – not how the code works. Any competent programmer can figure that out.

  • Ignore logs at your peril!

It's self-evident perhaps, but we can often miss the obvious when under pressure... Carefully checking the kernel (and even app) logs often reveals the source of the issue you're facing. Logs can usually be displayed in reverse chronological order, giving you a view of what actually occurred; systemd's journalctl(1) utility is powerful – learn how to leverage it!

  • Testing can reveal the presence of errors but not their absence

A truism, unfortunately. Still, testing and QA are simply among the most critical parts of the software process; ignore them at your peril! The time and trouble taken to write exhaustive test cases – both positive and negative – pays large dividends in the long run, helping make the product or project a grand success. Negative test cases and fuzzing are critical for exposing (and subsequently fixing) security vulnerabilities in the code base. Then again, runtime testing only exercises the portions of code actually executed, so take the trouble to perform code coverage analysis; 100% code coverage – with runtime testing of that covered code – is the objective! (Again, we cover more on these key points in Chapter 12, A Few More Kernel Debugging Approaches, in the An introduction to kernel code coverage tools and testing frameworks section.)

  • Incurring technical debt

Every now and then, you realize deep down that though what you've coded works, it's not been done well enough (perhaps there still exist corner cases that will trigger bugs or undefined behavior); that nagging feeling that perhaps this design and implementation simply isn't the best. The temptation to quickly check it in and hope for the best can be high, especially as deadlines loom! Please don't; there is really a thing called technical debt. It will come and get you.

  • Silly mistakes

If I had a penny for each really silly mistake I've made when developing code, I'd be a rich man! For instance, I once spent nearly half a day racking my brains over why my C program simply refused to work correctly, until I realized I was editing the correct code but compiling an old version of it – performing the build in the wrong directory! (I am certain you've faced your share of such pesky frustrations.) Often, a break, or a good night's sleep, can do wonders.

  • Empirical model

The word empirical refers to validating something (anything) by actual, direct observation or experience rather than by relying on theory.

Figure 1.6 – Be empirical!

So, don't believe the book (this one is an exception of course!), don't believe the tutorial, the article, blog, tutor, or author: be empirical – try it out and see for yourself!

Years (decades, actually) back, on my very first day of work at a company I joined, a colleague emailed me a document that I still hold dear: The Ten Commandments for C Programmers, by Henry Spencer (https://www.electronicsweekly.com/open-source-engineering/linux/the-ten-commandments-for-c-programmers-2009-04/). Do check it out. In a similar, albeit clumsier, manner, I present a quick checklist for you.

A programmer's checklist – seven rules

Very important! Did you remember to do the following?

  • Check all APIs for their failure case.
  • Compile with warnings on (definitely with -Wall and possibly -Wextra or even -Werror; yes, treating warnings as errors is going to make its way into the kernel!); eliminate all warnings as far as is possible.
  • Never trust (user) input; validate it.
  • Eliminate unused (or dead) code from the code base immediately.
  • Test thoroughly; 100% code coverage is the objective. Take the time and trouble to learn how to use powerful tools: memory checkers, static and dynamic analyzers, security checkers (checksec, lynis, and several others), fuzzers, code coverage tools, fault injection frameworks, and so on. Don't ignore security!
  • With regard to kernels and especially drivers, after eliminating software issues, be aware that (peripheral) hardware issues could be the root cause of the bug. Don't discount it out of hand! (You'll learn this the hard way.)
  • Do not assume anything (assume: make an ASS out of U and ME); using assertions helps catch assumptions, and thus bugs.

We shall elaborate on several of these points in the coming material.