Book Image

Hands-On Data Science with the Command Line

By : Jason Morris, Chris McCubbin, Raymond Page
Book Image

Hands-On Data Science with the Command Line

By: Jason Morris, Chris McCubbin, Raymond Page

Overview of this book

The Command Line has been in existence on UNIX-based OSes in the form of Bash shell for over 3 decades. However, very little is known to developers as to how command-line tools can be OSEMN (pronounced as awesome and standing for Obtaining, Scrubbing, Exploring, Modeling, and iNterpreting data) for carrying out simple-to-advanced data science tasks at speed. This book will start with the requisite concepts and installation steps for carrying out data science tasks using the command line. You will learn to create a data pipeline to solve the problem of working with small-to medium-sized files on a single machine. You will understand the power of the command line, learn how to edit files using a text-based and an. You will not only learn how to automate jobs and scripts, but also learn how to visualize data using the command line. By the end of this book, you will learn how to speed up the process and perform automated tasks using command-line tools.
Table of Contents (8 chapters)

We don't want to BaSH other shells, but...

In this book, we decided to focus on using the Bourne-again shell (bash) for multiple reasons. First, it's the most popular shell and you can find it everywhere. In fact, for the majority of Linux distributions, bash is the default shell. It's a great first shell to learn and very easy to work with. There's a number of examples and resources available to help you with bash if you ever get stuck. It's also safe to say that since it's so popular, you can find it on almost any system available today. From a bare-metal installation in a data center to an instance running in the cloud, bash is there, installed, and waiting for input.

There are a number of other shells you can choose from, such as the Z shell (zsh). The Z shell is fairly new (and by new I mean released in 1990, which is new in shell land) and provides a number of powerful features. Other notable shells are tcsh, ksh, and fish. The C Shell (tcsh), the Korn Shell (ksh), and the Friendly Interactive Shell (fish) are still widely used today. FreeBSD has made tcsh its default shell for the root user and ksh is still used for a lot of Solaris operating systems. Fish is also a great starter shell with a lot of features to help the user navigate the shell without feeling lost.

While these shells are still very powerful and stable, we will be focusing on using bash, as we want to focus on consistency across multiple platforms and help you learn a very active and popular shell that's been around for 30 years.