The Software Developer's Guide to Linux

By: David Cohen, Christian Sturm

Overview of this book

Developers are always looking to raise their game to the next level, yet most are completely lost when it comes to the Linux command line. This book is the bridge that will take you to the next level in your software development career. Most of the skills in the book can be immediately put to work to make you a more efficient developer. It’s written specifically for software engineers, not Linux system administrators, so each chapter will equip you with just enough theory to understand what you’re doing before diving into practical commands that you can use in your day-to-day work as a software developer. As you work through the book, you’ll quickly absorb the basics of how Linux works while you get comfortable moving around the command line. Once you’ve got the core skills, you’ll see how to apply them in different contexts that you’ll come across as a software developer: building and working with Docker images, automating boring build tasks with shell scripts, and troubleshooting issues in production environments. By the end of the book, you’ll be able to use Linux and the command line comfortably and apply your newfound skills in your day-to-day work to save time, troubleshoot issues, and be the command-line wizard that your team turns to.

Process basics

When we refer to a “process” in Linux, we’re referring to the operating system’s internal model of what exactly a running program is. Linux needs a general abstraction that works for all programs and encapsulates the things the operating system cares about. A process is that abstraction, and it enables the OS to track some of the important context around programs that are executing, namely:

Memory usage
Processor time used
Other system resource usage (disk access, network usage)
Communication between processes
Related processes that a program starts, for example, firing off a shell command

You can get a listing of all system processes (at least the ones your user is allowed to see) by running the ps program with the aux flags:

Figure 2.1: List of system processes

We’ll cover the attributes most relevant to your work as a developer in this chapter.
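If you want the same listing inside a program, the sketch below runs ps aux from Python and turns each row into a dictionary keyed by the header columns. The function name list_processes is made up for this illustration, and the code assumes that a ps program is on your PATH and prints a header row first, which is how the listing described above is laid out.

```python
import shutil
import subprocess


def list_processes():
    """Run `ps aux` and return one dict per process, keyed by the header columns."""
    if shutil.which("ps") is None:
        raise RuntimeError("the ps program was not found on PATH")
    result = subprocess.run(
        ["ps", "aux"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate odd bytes in command lines
        check=True,
    )
    lines = result.stdout.splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Split on whitespace, but keep the final column (the command) intact.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows
```

Calling list_processes() and printing the PID, USER, and COMMAND entries of the first few rows gives you a compact version of the figure above.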

What is a Linux process made of?

From the perspective of the operating system, a “process” is simply a data structure that makes it easy to access information like:

Process ID (PID in the ps output above). When a process is created, it gets the next available process ID, in sequential order. PID 1 is the init system: the original parent of all other processes, which bootstraps the system. The kernel starts it as one of the first things it does after it begins executing. Because init is so important to the normal functioning of the operating system, it cannot be killed, even by the root user. Different Unix operating systems use different init systems: most Linux distributions use systemd, macOS uses launchd, and many other Unixes use SysV init. Regardless of the specific implementation, we’ll refer to this process by the name of the role it fills: “init.”

Note

In containers, processes are namespaced. Inside a container, processes are numbered in their own PID namespace, starting from PID 1, while in the “real” environment on the host, each of those processes has a different PID (3210, for example). You can see this mapping from outside but not inside the container.

Parent Process PID (PPID). Each process is spawned by a parent. If the parent process dies while the child is alive, the child becomes an “orphan.” Orphaned processes are re-parented to init (PID 1).
Status (STAT in the ps output above). man ps will show you an overview:
- D – uninterruptible sleep (usually IO)
- I – idle kernel thread
- R – running or runnable (on run queue)
- S – interruptible sleep (waiting for an event to complete)
- T – stopped by job control signal
- t – stopped by debugger during tracing
- X – dead (should never be seen)
- Z – defunct (“zombie”) process, terminated but not reaped by its parent
Priority status (“niceness” – does this process allow other processes to take priority over it?).
Process owner (USER in the ps output above): the effective user ID (EUID).
Effective Group ID (EGID), which is used together with the EUID to determine what the process is allowed to do.
An address map of the process’s memory space.
Resource usage – open files, network ports, and other resources the process is using (VSZ and RSS for memory usage in the ps output above).

(Citation: from the Unix and Linux System Administration Handbook, 5th edition, p.91.)
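A running program can ask the operating system for several of the attributes listed above. The following is a minimal sketch that uses Python's standard os and resource modules on a Unix-like system; the function name describe_current_process and the dictionary keys are chosen for this illustration only.

```python
import os
import resource


def describe_current_process():
    """Collect a few of the attributes listed above for the running process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "pid": os.getpid(),
        "ppid": os.getppid(),
        # os.nice(0) adds 0 to the niceness and returns the current value.
        "niceness": os.nice(0),
        # CPU time spent in user mode plus time spent in the kernel.
        "cpu_seconds": usage.ru_utime + usage.ru_stime,
        # Peak resident set size (reported in kilobytes on Linux).
        "max_rss": usage.ru_maxrss,
    }


for name, value in describe_current_process().items():
    print(name, "=", value)
```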

Let’s take a closer look at a few of the process attributes that are most important for developers and occasional troubleshooters to understand.

Process ID (PID)

Each running process is identified by its process ID, an integer that is assigned to the process when it starts and is unique among the processes running at that moment. Much like a relational database with IDs that uniquely identify each row of data, the Linux operating system keeps track of each process by its PID.

A PID is by far the most useful label for you to use when interacting with processes.
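To see PIDs from both sides, this sketch starts a short-lived child process and compares the PID the parent was given for it with the PID the child reports for itself. It assumes a Unix-like system where sys.executable points at a working Python interpreter; the variable names are illustrative.

```python
import os
import subprocess
import sys

# Start a short-lived child that reports its own PID and its parent's PID.
child_code = "import os; print(os.getpid(), os.getppid())"
child = subprocess.Popen(
    [sys.executable, "-c", child_code], stdout=subprocess.PIPE, text=True
)
output, _ = child.communicate()
reported_pid, reported_ppid = (int(field) for field in output.split())

print("parent PID:", os.getpid())
print("child PID as seen by the parent:", child.pid)
print("child PID as seen by the child:", reported_pid)
print("parent PID as seen by the child:", reported_ppid)
```

The parent and the child get different PIDs, and the PPID the child reports is the PID of the process that started it.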

Effective User ID (EUID) and Effective Group ID (EGID)

These determine which system user and group your process is running as. Together, user and group permissions determine what a process is allowed to do on the system.

As you’ll see in Chapter 5, Introducing Files, files have user and group ownership set on them, which determines who their permissions apply to. If a file’s ownership and permissions are essentially a lock, then a process with the right user/group permissions is like a key that opens the lock and allows access to the file. We’ll dive deeper into this later, when we talk about permissions.
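As a small illustration of the lock-and-key idea, the sketch below prints the effective IDs of the current process and then checks that a file it creates is owned by that same effective user. It assumes a Unix-like system and an ordinary local temporary directory; on unusual setups (some network filesystems, for instance) the owner of a new file may differ.

```python
import os
import stat
import tempfile

euid = os.geteuid()
egid = os.getegid()
print("running as effective UID", euid, "and effective GID", egid)

# A file created by this process is normally owned by its effective user.
with tempfile.NamedTemporaryFile() as tmp:
    info = os.stat(tmp.name)
    owner_matches = info.st_uid == euid
    # The owner permission bits apply to us only if we are the owner.
    owner_can_write = bool(info.st_mode & stat.S_IWUSR)
    print("file owner UID:", info.st_uid, "matches EUID:", owner_matches)
    print("owner write bit set:", owner_can_write)
```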

Environment variables

You’ve probably used environment variables in your applications – they’re a way for the operating system environment that launches your process to pass in data that the process needs. This commonly includes things like configuration directives (LOG_DEBUG=1) and secret keys (AWS_SECRET_KEY), and every programming language has some way to read them out from the context of the program.

For example, this Python script gets the user’s home directory from the HOME environment variable, and then prints it:

import os
# Look up HOME in the environment this process was started with.
home_dir = os.environ['HOME']
print("The home directory for this user is", home_dir)

In my case, running this program in the python3 REPL on a Linux machine results in the following output:

The home directory for this user is /home/dcohen
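The passing-in works in the other direction too: a process that launches another one decides which environment the new process receives. Here is a hedged sketch that reuses the LOG_DEBUG name from the example above as the variable to pass; everything else (the variable names, the one-line child program) is made up for illustration.

```python
import os
import subprocess
import sys

# Copy the current environment and add one variable for the child only.
child_env = dict(os.environ, LOG_DEBUG="1")

child_code = "import os; print(os.environ.get('LOG_DEBUG', 'unset'))"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
print("child sees LOG_DEBUG =", result.stdout.strip())

# .get() with a default avoids a KeyError when a variable is not set.
print("parent sees LOG_DEBUG =", os.environ.get("LOG_DEBUG", "unset"))
```

Because only the copy was modified, the parent's own environment is left untouched.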

Working directory

Every process has a “current working directory,” just like your shell (which is just a process, anyway). Typing pwd in your shell prints the shell’s current working directory. The working directory of a process can change while it runs, so don’t write code that depends on it staying the same.
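The following sketch shows a process changing its own working directory and a child process starting out in whatever directory the parent is in at that moment. It assumes a Unix-like system with a writable temporary directory; the variable names are illustrative.

```python
import os
import subprocess
import sys
import tempfile

original_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)
    try:
        changed_dir = os.getcwd()
        # A child process starts in the parent's current working directory.
        child = subprocess.run(
            [sys.executable, "-c", "import os; print(os.getcwd())"],
            capture_output=True,
            text=True,
            check=True,
        )
        child_dir = child.stdout.strip()
    finally:
        # Restore the working directory before the temporary one is removed.
        os.chdir(original_dir)

print("started in:", original_dir)
print("changed to:", changed_dir)
print("child started in:", child_dir)
```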

This concludes our overview of the process attributes that you should know about. In the next section, we’ll step away from theory and look at some commands you can use to start working with processes right away.
