Hands-On System Programming with Linux

By : Kaiwan N. Billimoria, Tigran Aivazian

Hands-On System Programming with Linux

By: Kaiwan N. Billimoria, Tigran Aivazian

Overview of this book

The Linux OS and its embedded and server applications are critical components of today’s software infrastructure in a decentralized, networked universe. The industry's demand for proficient Linux developers is only rising with time. Hands-On System Programming with Linux gives you a solid theoretical base and practical industry-relevant descriptions, and covers the Linux system programming domain. It delves into the art and science of Linux application programming— system architecture, process memory and management, signaling, timers, pthreads, and file IO. This book goes beyond the use API X to do Y approach; it explains the concepts and theories required to understand programming interfaces and design decisions, the tradeoffs made by experienced developers when using them, and the rationale behind them. Troubleshooting tips and techniques are included in the concluding chapter. By the end of this book, you will have gained essential conceptual design knowledge and hands-on experience working with Linux system programming interfaces.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Free Chapter

Linux System Architecture

Technical requirements

Linux and the Unix operating system

The Unix philosophy in a nutshell

Linux system architecture

Execution contexts within the kernel

Summary

Virtual Memory

Technical requirements

Virtual memory

Process memory layout

Summary

Resource Limits

Resource limits

Granularity of resource limits

Hard and soft limits

Summary

Dynamic Memory Allocation

The glibc malloc(3) API family

Beyond the basics

Summary

Linux Memory Issues

Common memory issues

Summary

Debugging Tools for Memory Issues

Tool types

Some key points

Summary

Process Credentials

The traditional Unix permissions model

Summary

Process Capabilities

The modern POSIX capabilities model

Miscellaneous

Summary

Process Execution

Technical requirements

Process execution

Summary

Process Creation

Process creation

Summary

Signaling - Part I

Why signals?

Available signals

Handling signals

Summary

Signaling - Part II

Gracefully handling process crashes

Signaling – caveats and gotchas

Real-time signals

Sending signals

Alternative signal-handling techniques

Summary

Timers

Older interfaces

The newer POSIX (interval) timers mechanism

A quick mention

Summary

Multithreading with Pthreads Part I - Essentials

Multithreading concepts

Thread management – the essential pthread APIs

Summary

Multithreading with Pthreads Part II - Synchronization

The racing problem

Locking concepts

Using the pthread APIs for synchronization

Summary

Multithreading with Pthreads Part III

Thread safety

Thread cancelation and cleanup

Threads and signaling

Threads vs processes – look again

Pthreads – a few random tips and FAQs

Summary

CPU Scheduling on Linux

The Linux OS and the POSIX scheduling model

Exploiting Linux's soft real-time capabilities

RTL – Linux as an RTOS

Summary

Advanced File I/O

I/O performance recommendations

Summary

Troubleshooting and Best Practices

Troubleshooting tools

Best practices

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Customer Reviews

5 star

4 star

3 star

2 star

1 star

The Unix philosophy in a nutshell

To understand anyone (or anything), one must strive to first understand their (or its) underlying philosophy; to begin to understand Linux is to begin to understand the Unix philosophy. Here, we shall not attempt to delve into every minute detail; rather, an overall understanding of the essentials of the Unix philosophy is our goal. Also, when we use the term Unix, we very much also mean Linux!

The way that software (particularly, tools) is designed, built, and maintained on Unix slowly evolved into what might even be called a pattern that stuck: the Unix design philosophy. At its heart, here are the pillars of the Unix philosophy, design, and architecture:

Everything is a process; if it's not a process, it's a file
One tool to do one task
Three standard I/O channel
Combine tools seamlessly
Plain text preferred
CLI, not GUI
Modular, designed to be repurposed by others
Provide the mechanism, not the policy

Let's examine these pillars a little more closely, shall we?

Everything is a process – if it's not a process, it's a file

A process is an instance of a program in execution. A file is an object on the filesystem; beside regular file with plain text or binary content; it could also be a directory, a symbolic link, a device-special file, a named pipe, or a (Unix-domain) socket.

The Unix design philosophy abstracts peripheral devices (such as the keyboard, monitor, mouse, a sensor, and touchscreen) as files – what it calls device files. By doing this, Unix allows the application programmer to conveniently ignore the details and just treat (peripheral) devices as though they are ordinary disk files.

The kernel provides a layer to handle this very abstraction – it's called the Virtual Filesystem Switch (VFS). So, with this in place, the application developer can open a device file and perform I/O (reads and writes) upon it, all using the usual API interfaces provided (relax, these APIs will be covered in a subsequent chapter).

In fact, every process inherits three files on creation:

Standard input (stdin: fd 0): The keyboard device, by default
Standard output (stdout: fd 1): The monitor (or terminal) device, by default
Standard error (stderr: fd 2): The monitor (or terminal) device, by default

fd is the common abbreviation, especially in code, for file descriptor; it's an integer value that refers to the open file in question.

Also, note that we mention it's a certain device by default – this implies the defaults can be changed. Indeed, this is a key part of the design: changing standard input, output, or error channels is called redirection, and by using the familiar <, > and 2> shell operators, these file channels are redirected to other files or devices.

On Unix, there exists a class of programs called filters.

A filter is a program that reads from its standard input, possibly modifies the input, and writes the filtered result to its standard output.

Filters on Unix are very common utilities, such as cat, wc, sort, grep, perl, head, and tail.

Filters allow Unix to easily sidestep design and code complexity. How?

Let's take the sort filter as a quick example. Okay, we'll need some data to sort. Let's say we run the following commands:

$ cat fruit.txt
orange
banana
apple
pear
grape
pineapple
lemon
cherry
papaya
mango
$

Now we consider four scenarios of using sort; based on the parameter(s) we pass, we are actually performing explicit or implicit input-, output-, and/or error-redirection!

Scenario 1: Sort a file alphabetically (one parameter, input implicitly redirected to file):

$ sort fruit.txt
 apple
 banana
 cherry
 grape
 lemon
 mango
 orange
 papaya
 pear
 pineapple
$

All right!

Hang on a second, though. If sort is a filter (and it is), it should read from its stdin (the keyboard) and write to its stdout (the terminal). It is indeed writing to the terminal device, but it's reading from a file, fruit.txt.

This is deliberate; if a parameter is provided, the sort program treats it as standard input, as clearly seen.

Also, note that sort fruit.txt is identical to sort < fruit.txt.

Scenario 2: Sort any given input alphabetically (no parameters, input and output from and to stdin/stdout):

$ sort 
mango 
apple
pear
^D 
apple
mango
pear
$

Once you type sort and press the Enter key, and the sort process comes alive and just waits. Why? It's waiting for you, the user, to type something. Why? Recall, every process by default reads its input from standard input or stdin – the keyboard device! So, we type in some fruit names. When we're done, press Ctrl + D. This is the default character sequence that signifies end-of-file (EOF), or in cases such as this, end-of-input. Voila! The input is sorted and written. To where? To the sort process's stdout – the terminal device, hence we see it.

Scenario 3: Sort any given input alphabetically and save the output to a file (explicit output redirection):

$ sort > sorted.fruit.txt
mango 
apple
pear
^D
$

Similar to Scenario 2, we type in some fruit names and then Ctrl + D to tell sort we're done. This time, though, note that the output is redirected (via the > meta-character) to the sorted.fruits.txt file!

So, as expected is the following output:

$ cat sorted.fruit.txt
apple
mango 
pear
$

Scenario 4: Sort a file alphabetically and save the output and errors to a file (explicit input-, output-, and error-redirection):

$ sort < fruit.txt > sorted.fruit.txt 2> /dev/null
$

Interestingly, the end result is the same as in the preceding scenario, with the added advantage of redirecting any error output to the error channel. Here, we redirect the error output (recall that file descriptor 2 always refers to stderr) to the /dev/null special device file; /dev/null is a device file whose job is to act as a sink (a black hole). Anything written to the null device just disappears forever! (Who said there isn't magic on Unix?) Also, its complement is /dev/zero; the zero device is a source – an infinite source of zeros. Reading from it returns zeroes (the first ASCII character, not numeric 0); it has no end-of-file!

One tool to do one task

In the Unix design, one tries to avoid creating a Swiss Army knife; instead, one creates a tool for a very specific, designated purpose and for that one purpose only. No ifs, no buts; no cruft, no clutter. This is design simplicity at its best.

"Simplicity is the ultimate sophistication."

- Leonardo da Vinci

Take a common example: when working on the Linux CLI (command-line interface), you would like to figure out which of your locally mounted filesystems has the most available (disk) space.

We can get the list of locally mounted filesystems by an appropriate switch (just df would do as well):

$ df --local
Filesystem                1K-blocks    Used Available Use% Mounted on
rootfs                     20640636 1155492  18436728   6% /
udev                          10240       0     10240   0% /dev
tmpfs                         51444     160     51284   1% /run
tmpfs                          5120       0      5120   0% /run/lock
tmpfs                        102880       0    102880   0% /run/shm
$

To sort the output, one would need to first save it to a file; one could use a temporary file for this purpose, tmp, and then sort it, using the sort utility, of course. Finally, we delete the offending temporary file. (Yes, there's a better way, piping; refer to the, Combine tools seamlessly section)

Note that the available space is the fourth column, so we sort accordingly:

$ df --local > tmp
$ sort -k4nr tmp
rootfs                     20640636 1155484  18436736   6% /
tmpfs                        102880       0    102880   0% /run/shm
tmpfs                         51444     160     51284   1% /run
udev                          10240       0     10240   0% /dev
tmpfs                          5120       0      5120   0% /run/lock
Filesystem                1K-blocks    Used Available Use% Mounted on
$

Whoops! The output includes the heading line. Let's first use the versatile sed utility – a powerful non-interactive editor tool – to eliminate the first line, the header, from the output of df:

$ df --local > tmp
$ sed --in-place '1d' tmp
$ sort -k4nr tmp
rootfs                     20640636 1155484  18436736   6% /
tmpfs                        102880       0    102880   0% /run/shm
tmpfs                         51444     160     51284   1% /run
udev                          10240       0     10240   0% /dev
tmpfs                          5120       0      5120   0% /run/lock
$ rm -f tmp

So what? The point is, on Unix, there is no one utility to list mounted filesystems and sort them by available space simultaneously.

Instead, there is a utility to list mounted filesystems: df. It does a great job of it, with option switches to choose from. (How does one know which options? Learn to use the man pages, they're extremely useful.)

There is a utility to sort text: sort. Again, it's the last word in sorting text, with plenty of option switches to choose from for pretty much every conceivable sort one might require.

The Linux man pages: man is short for manual; on a Terminal window, type man man to get help on using man. Notice the manual is divided into 9 sections. For example, to get the manual page on the stat system call, type man 2 stat as all system calls are in section 2 of the manual. The convention used is cmd or API; thus, we refer to it as stat(2).

As expected, we obtain the results. So what exactly is the point? It's this: we used three utilities, not one. df , to list the mounted filesystems (and their related metadata), sed, to eliminate the header line, and sort, to sort whatever input its given (in any conceivable manner).

df can query and list mounted filesystems, but it cannot sort them. sort can sort text; it cannot list mounted filesystems.

Think about that for a moment.

Combine them all, and you get more than the sum of its parts! Unix tools typically do one task and they do it to its logical conclusion; no one does it better!

Having said this, I would like to point out – a tiny bit sheepishly – the highly renowned tool Busybox. Busybox (http://busybox.net) is billed as The Swiss Army Knife of Embedded Linux. It is indeed a very versatile tool; it has its place in the embedded Linux ecosystem – precisely because it would be too expensive on an embedded box to have separate binary executables for each and every utility (and it would consume more RAM). Busybox solves this problem by having a single binary executable (along with symbolic links to it from each of its applets, such as ls, ps, df, and sort).
So, nevertheless, besides the embedded scenario and all the resource limitations it implies, do follow the One tool to do one task rule!

Three standard I/O channels

Several popular Unix tools (technically, filters) are, again, deliberately designed to read their input from a standard file descriptor called standard input (stdin) – possibly modify it, and write their resultant output to a standard file descriptor standard output (stdout). Any error output can be written to a separate error channel called standard error (stderr).

In conjunction with the shell's redirection operators (> for output-redirection and < for input-redirection, 2> for stderr redirection), and even more importantly with piping (refer section, Combine tools seamlessly), this enables a program designer to highly simplify. There's no need to hardcode (or even softcode, for that matter) input and output sources or sinks. It just works, as expected.

Let's review a couple of quick examples to illustrate this important point.

Word count

How many lines of source code are there in the C netcat.c source file I downloaded? (Here, we use a small part of the popular open source netcat utility code base.) We use the wc utility. Before we go further, what's wc? word count (wc) is a filter: it reads input from stdin, counts the number of lines, words, and characters in the input stream, and writes this result to its stdout. Further, as a convenience, one can pass filenames as parameters to it; passing the -l option switch has wc only print the number of lines:

$ wc -l src/netcat.c
618 src/netcat.c
$

Here, the input is a filename passed as a parameter to wc.

Interestingly, we should by now realize that if we do not pass it any parameters, wc would read its input from stdin, which by default is the keyboard device. For example is shown as follows:

$ wc -l
hey, a small
quick test
 of reading from stdin
by wc!
^D
4
$

Yes, we typed in 4 lines to stdin; thus the result is 4, written to stdout – the terminal device by default.

Here is the beauty of it:

$ wc -l < src/netcat.c > num
$ cat num
618
$

As we can see, wc is a great example of a Unix filter.

cat

Unix, and of course Linux, users learn to quickly get familiar with the daily-use cat utility. At first glance, all cat does is spit out the contents of a file to the terminal.

For example, say we have two plain text files, myfile1.txt and myfile2.txt:

$ cat myfile1.txt
Hello,
Linux System Programming,
World.
$ cat myfile2.txt
Okey dokey,
bye now.
$

Okay. Now check this out:

$ cat myfile1.txt myfile2.txt
Hello,
Linux System Programming,
World.
Okey dokey,
bye now.
$

Instead of needing to run cat twice, we ran it just once, by passing the two filenames to it as parameters.

In theory, one can pass any number of parameters to cat: it will use them all, one by one!

Not just that, one can use shell wildcards too (* and ?; in reality, the shell will first expand the wildcards, and pass on the resultant path names to the program being invoked as parameters):

$ cat myfile?.txt
Hello,
Linux System Programming,
World.
Okey dokey,
bye now.
$

This, in fact, illustrates another key point: any number of parameters or none is considered the right way to design a program. Of course, there are exceptions to every rule: some programs demand mandatory parameters.

Wait, there's more. cat too, is an excellent example of a Unix filter (recall: a filter is a program that reads from its standard input, modifies its input in some manner, and writes the result to its standard output).

So, quick quiz, if we just run cat with no parameters, what would happen?
Well, let's try it out and see:

$ cat
hello,   
hello,   
oh cool
oh cool
it reads from stdin,
it reads from stdin,
and echoes whatever it reads to stdout!
and echoes whatever it reads to stdout!
ok bye
ok bye
^D
$

Wow, look at that: cat blocks (waits) at its stdin, the user types in a string and presses the Enter key, cat responds by copying its stdin to its stdout – no surprise there, as that's the job of cat in a nutshell!

One realizes the commands shown as follows:

cat fname is the same as cat < fname
cat > fname creates or overwrites the fname file

There's no reason we can't use cat to append several files together:

$ cat fname1 fname2 fname3 > final_fname
$

There's no reason this must be done with only plain text files; one can join together binary files too.

In fact, that's what the utility does – it concatenates files. Thus its name; as is the norm on Unix, is highly abbreviated – from concatenate to just cat. Again, clean and elegant – the Unix way.

cat shunts out file contents to stdout, in order. What if one wants to display a file's contents in reverse order (last line first)? Use the Unix tac utility – yes, that's cat spelled backward!

Also, FYI, we saw that cat can be used to efficiently join files. Guess what: the split (1) utility can be used to break a file up into pieces.

Combine tools seamlessly

We just saw that common Unix utilities are often designed as filters, giving them the ability to read from their standard input and write to their standard output. This concept is elegantly extended to seamlessly combine together multiple utilities, using an IPC mechanism called a pipe.

Also, we recall that the Unix philosophy embraces the do one task only design. What if we have one program that does task A and another that does task B and we want to combine them? Ah, that's exactly what pipes do! Refer to the following code:

prg_does_taskA | prg_does_taskB

A pipe essentially is redirection performed twice: the output of the left-hand program becomes the input to the right-hand program. Of course, this implies that the program on the left must write to stdout, and the program on the read must read from stdin.

An example: sort the list of mounted filesystems by space available (in reverse order).

As we have already discussed this example in the One tool to do one task section, we shall not repeat the same information.

Option 1: Perform the following code using a temporary file (refer section, One tool to do one task):

$ df --local | sed '1d' > tmp
$ sed --in-place '1d' tmp
$ sort -k4nr tmp
rootfs 20640636 1155484 18436736 6% /
tmpfs 102880 0 102880 0% /run/shm
tmpfs 51444 160 51284 1% /run
udev 10240 0 10240 0% /dev
tmpfs 5120 0 5120 0% /run/lock
$ rm -f tmp

Option 2 : Using pipes—clean and elegant:

$ df --local | sed '1d' | sort -k4nr
rootfs                     20640636 1155492  18436728   6% /
tmpfs                        102880       0    102880   0% /run/shm
tmpfs                         51444     160     51284   1% /run
udev                          10240       0     10240   0% /dev
tmpfs                          5120       0      5120   0% /run/lock
$

Not only is this elegant, it is also far superior performance-wise, as writing to memory (the pipe is a memory object) is much faster than writing to disk.

One can extend this notion and combine multiple tools over multiple pipes; in effect, one can build a super tool from several regular tools by combining them.

As an example: display the three processes taking the most (physical) memory; only display their PID, virtual size (VSZ), resident set size (RSS) (RSS is a fairly accurate measure of physical memory usage), and the name:

$ ps au | sed '1d' | awk '{printf("%6d %10d %10d %-32s\n", $2, $5, $6, $11)}' | sort -k3n | tail -n3
 10746    3219556     665252 /usr/lib64/firefox/firefox      
 10840    3444456    1105088 /usr/lib64/firefox/firefox      
  1465    5119800    1354280 /usr/bin/gnome-shell            
$

Here, we've combined five utilities, ps, sed, awk, sort, and tail, over four pipes. Nice!

Another example: display the process, not including daemons*, taking up the most memory (RSS):

ps aux | awk '{if ($7 != "?") print $0}' | sort -k6n | tail -n1

A daemon is a system background process; we'll cover this concept in Daemon Process here: https://www.packtpub.com/sites/default/files/downloads/Daemon_Processes.pdf.

Plain text preferred

Unix programs are generally designed to work with text as it's a universal interface. Of course, there are several utilities that do indeed operate on binary objects (such as object and executable files); we aren't referring to them here. The point is this: Unix programs are designed to work on text as it simplifies the design and architecture of the program.

A common example: an application, on startup, parses a configuration file. The configuration file could be formatted as a binary blob. On the other hand, having it as a plain text file renders it easily readable (invaluable!) and therefore easier to understand and maintain. One might argue that parsing binary would be faster. Perhaps to some extent this is so, but consider the following:

With modern hardware, the difference is probably not significant
A standardized plain text format (such as XML) would have optimized code to parse it, yielding both benefits

Remember, simplicity is key!

CLI, not GUI

The Unix OS, and all its applications, utilities, and tools, were always built to be used from a command-line-interface (CLI), typically, the shell. From the 1980s onward, the need for a Graphical User Interface (GUI) became apparent.

Robert Scheifler of MIT, considered the chief design architect behind the X Window System, built an exceedingly clean and elegant architecture, a key component of which is this: the GUI forms a layer (well, actually, several layers) above the OS, providing libraries for GUI clients, that is, applications.

The GUI was never designed to be intrinsic to applications or the OS—it's always optional.

This architecture still holds up today. Having said that, especially on embedded Linux, performance reasons are seeing the advent of newer architectures, such as the frame buffer and Wayland. Also, though Android, which uses the Linux kernel, necessitates a GUI for the end user, the system developer's interface to Android, ADB, is a CLI.

A huge number of production-embedded and server Linux systems run purely on CLI interfaces. The GUI is almost like an add-on feature, for the end user's ease of operation.

Wherever appropriate, design your tools to work in the CLI environment; adapting it into a GUI at a later point is then straightforward.
Cleanly and carefully separating the business logic of the project or product from its GUI is a key to good design.

Modular, designed to be repurposed by others

From its very early days, the Unix OS was deliberately designed and coded with the tacit assumption that multiple programmers would work on the system. Thus, the culture of writing clean, elegant, and understandable code, to be read and worked upon by other competent programmers, was ingrained.

Later, with the advent of the Unix wars, proprietary and legal concerns overrode this sharing model. Interestingly, history shows that the Unix's were fading in relevance and industry use, until the timely advent of none other than the Linux OS – an open source ecosystem at its very best! Today, the Linux OS is widely acknowledged as the most successful GNU project. Ironic indeed!

Provide mechanisms, not policies

Let's understand this principle with a simple example.

When designing an application, you need to have the user enter a login name and password. The function that performs the work of getting and checking the password is called, let's say, mygetpass(). It's invoked by the mylogin() function: mylogin() → mygetpass().

Now, the protocol to be followed is this: if the user gets the password wrong three times in a row, the program should not allow access (and should log the case). Fine, but where do we check this?

The Unix philosophy: do not implement the logic, if the password is specified wrongly three times, abort in the mygetpass() function. Instead, just have mygetpass() return a Boolean (true when the password is right, false when the password is wrong), and have the mylogin() calling function implement whatever logic is required.

Pseudocode

The following is the wrong approach:

mygetpass()
{
    numtries=1

    <get the password>
  
    if (password-is-wrong) {
         numtries ++
         if (numtries >= 3) {
            <write and log failure message>
            <abort>
         }
 }
 <password correct, continue>
} 
mylogin()
{
    mygetpass()
}

Now let's take a look at the right approach: the Unix way! Refer to the following code:

mygetpass()
{
   <get the password>

   if (password-is-wrong)
      return false;

   return true;
}
mylogin()
{
    maxtries = 3

    while (maxtries--) {
       if (mygetpass() == true)
           <move along, call other routines>
    }

    // If we're here, we've failed to provide the 
    // correct password
    <write and log failure message>
    <abort>
}

The job of mygetpass() is to get a password from the user and check whether it's correct; it returns success or failure to the caller – that's it. That's the mechanism. It is not its job to decide what to do if the password is wrong – that's the policy, and left to the caller.

Now that we've covered the Unix philosophy in a nutshell, what are the important takeaways for you, the system developer on Linux?

Learning from, and following, the Unix philosophy when designing and implementing your applications on the Linux OS will provide a huge payoff. Your application will do the following:

Be a natural fit on the system; this is very important
Have greatly reduced complexity
Have a modular design that is clean and elegant
Be far more maintainable

Hands-On System Programming with Linux

By : Kaiwan N. Billimoria, Tigran Aivazian

Hands-On System Programming with Linux

By: Kaiwan N. Billimoria, Tigran Aivazian

Overview of this book

Related Content you might be interested in

Current Title:

Hands-On System Programming with Linux

Linux Kernel Programming

Linux Kernel Programming

Linux Kernel Programming Part 2 - Char Device Drivers and Kernel Synchronization