Practical Data Science Cookbook, Second Edition

By: Prabhanjan Narayanachar Tattar, Bhushan Purushottam Joshi, Sean Patrick Murphy, Abhijit Dasgupta, Anthony Ojeda
Overview of this book

As increasing amounts of data are generated each year, the need to analyze that data and create value from it is more important than ever. Companies that know what to do with their data and how to do it well will have a competitive advantage over companies that don't. Because of this, there will be an increasing demand for people who possess both the analytical and technical abilities to extract valuable insights from data and build solutions that put those insights to use. Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By working through the steps in each chapter in sequence, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations, with examples in the two most popular programming languages for data analysis, R and Python.

Installing extra Python packages


There are a few additional Python libraries that you will need throughout this book. Just as R provides a central repository for community-built packages, so does Python in the form of the Python Package Index (PyPI). As of August 28, 2014, there were 48,054 packages in PyPI.

Getting ready

A reasonable Internet connection is all that is needed for this recipe. Unless otherwise specified, these directions assume that you are using the default Python distribution that came with your system, and not Anaconda.

How to do it...

The following steps will show you how to download a Python package and install it from the command line:

  1. Download the source code for the package to the place where you like to keep your downloads.
  2. Unzip the package.
  3. Open a terminal window.
  4. Navigate to the base directory of the source code.
  5. Type in the following command:
python setup.py install
  6. If you need root access, type in the following command instead:
sudo python setup.py install
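
The manual steps above can also be sketched in Python. This is only an illustration under stated assumptions: the archive name demo-0.1.zip and the helper names unpack_source, find_setup_dir, and setup_install_command are hypothetical, and the demonstration builds a placeholder archive instead of downloading a real package. Nothing is installed; the sketch only unpacks the archive, locates the base directory, and prints the command you would run there.

```python
import shutil
import sys
import tempfile
import zipfile
from pathlib import Path


def unpack_source(archive_path, dest_dir):
    """Unpack a source archive (.zip, .tar.gz, ...) into dest_dir."""
    shutil.unpack_archive(str(archive_path), str(dest_dir))
    return Path(dest_dir)


def find_setup_dir(root):
    """Return the first directory under root that holds a setup.py, else None."""
    matches = sorted(Path(root).rglob("setup.py"))
    return matches[0].parent if matches else None


def setup_install_command(use_sudo=False):
    """Build the setup.py install command as an argument list."""
    command = [sys.executable, "setup.py", "install"]
    return ["sudo"] + command if use_sudo else command


# Demonstration with a placeholder archive; nothing is actually installed.
workdir = Path(tempfile.mkdtemp())
archive = workdir / "demo-0.1.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo-0.1/setup.py", "# placeholder setup script\n")

source_root = unpack_source(archive, workdir / "src")
setup_dir = find_setup_dir(source_root)
print("Base directory:", setup_dir)
print("Command to run there:", " ".join(setup_install_command()))
```

Using sys.executable instead of a bare python is a design choice of this sketch: it targets the interpreter that is running the script, which may differ from the python found first on your PATH.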

To use pip, the contemporary and easiest way to install Python packages, follow these steps:

  1. First, let's check whether you already have pip installed by opening a terminal and launching the Python interpreter. At the interpreter, type:
>>> import pip
  2. If you don't get an error, you have pip installed and can move on to step 5. If you see an error, let's quickly install pip.
  3. Download the get-pip.py file from https://raw.github.com/pypa/pip/master/contrib/get-pip.py onto your machine.
  4. Open a terminal window, navigate to the downloaded file, and type:
python get-pip.py

If you need root access, type in the following command instead:

sudo python get-pip.py
  5. Once pip is installed, exit the Python interpreter if it is still running and make sure you are at the system command prompt.
  6. If you are using the default system distribution of Python, type in the following:
pip install networkx

If you need root access, type in the following command instead:

sudo pip install networkx
  7. If you are using the Anaconda distribution, type in the following command:
conda install networkx
  8. Now, let's try to install another package, ggplot. Regardless of your distribution, type in the following command:
pip install ggplot

If you need root access, type in the following command instead:

sudo pip install ggplot
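
Once the steps above are done, a quick way to confirm the result is to ask Python whether it can locate each package. The following is a minimal sketch, not part of the original recipe: the helper name is_installed is hypothetical, and the check relies on the standard library's importlib.util.find_spec, which looks a module up without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of pip and the two packages installed in this recipe.
for name in ("pip", "networkx", "ggplot"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

The output depends on your environment, so a package reported as missing simply means that the interpreter running this script cannot see it.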

How it works...

You have at least two options to install Python packages. In the first, old-fashioned approach, you download the source code and unpack it on your local computer. Next, you run the included setup.py script with the install command. If you want, you can open the setup.py script in a text editor and take a more detailed look at exactly what the script is doing. You might need the sudo command, depending on the current user's system privileges.

As the second option, we leverage the pip installer, which automatically grabs the package from the remote repository and installs it to your local machine for use by the system-level Python installation. This is the preferred method, when available.
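
If several Python installations live on the same machine, it can be unclear which one a bare pip command installs into. One common way around this, shown here as a hedged sketch and not as the recipe's own method, is to run pip as a module of a specific interpreter with python -m pip. The helper names pip_install_command and pip_install are hypothetical, and the example defaults to a dry run so that nothing is installed unless you ask for it.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package, dry_run=True):
    """Print the command (dry run) or actually run it; return the command."""
    command = pip_install_command(package)
    if dry_run:
        print("Would run:", " ".join(command))
        return command
    # check_call raises CalledProcessError if pip exits with an error.
    subprocess.check_call(command)
    return command


pip_install("networkx")  # dry run: prints the command only
```

Calling pip_install("networkx", dry_run=False) would perform the installation, which requires a network connection and suitable permissions.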

There's more...

pip is a capable tool, so we suggest taking a look at its user guide online. Pay special attention to the very useful pip freeze > requirements.txt functionality, which records the installed packages and their versions in a file so that you can communicate external dependencies to your colleagues.
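
To make the idea behind a requirements file concrete, here is a rough approximation of what pip freeze reports, written with the standard library's importlib.metadata (Python 3.8 or later). It is a sketch under assumptions, not a replacement for pip freeze: the helper names freeze_lines and write_requirements are hypothetical, and special cases that pip handles, such as editable installs, are ignored.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata lacks a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the lines to a file, one requirement per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(freeze_lines()) + "\n")


# Show a few entries; the actual list depends on your environment.
for line in freeze_lines()[:5]:
    print(line)
```

A colleague can then recreate the same set of packages with pip install -r requirements.txt.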

Finally, conda is the package manager and pip replacement for the Anaconda Python distribution or, in the words of its home page, "a cross-platform, Python-agnostic binary package manager." Conda has some very lofty aspirations that transcend the Python language. If you are using Anaconda, we encourage you to read further on what conda can do and to use it, and not pip, as your default package manager.
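
The choice between conda and pip can be sketched in code as well. The example below is illustrative only: it assumes that a conda-managed environment can be recognized by a conda-meta directory under the interpreter's prefix, which is a widely used heuristic and not an official API, and the helper names is_conda_environment and suggested_install_command are hypothetical.

```python
import os
import sys
import tempfile


def is_conda_environment(prefix=None):
    """Heuristic: conda environments contain a 'conda-meta' directory."""
    prefix = sys.prefix if prefix is None else prefix
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def suggested_install_command(package, prefix=None):
    """Suggest conda inside a conda environment, and pip otherwise."""
    if is_conda_environment(prefix):
        return ["conda", "install", package]
    return [sys.executable, "-m", "pip", "install", package]


# Demonstration with two placeholder prefixes, so the result does not
# depend on how the current interpreter was installed.
plain_prefix = tempfile.mkdtemp()
conda_prefix = tempfile.mkdtemp()
os.makedirs(os.path.join(conda_prefix, "conda-meta"))

print(suggested_install_command("networkx", prefix=plain_prefix))
print(suggested_install_command("networkx", prefix=conda_prefix))
```

Calling suggested_install_command("networkx") without a prefix applies the same check to the interpreter that is running the script.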

See also

You can also refer to the following: