Practical Data Science Cookbook

By: Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, Abhijit Dasgupta

Overview of this book

As increasing amounts of data are generated each year, the need to analyze and operationalize that data is more important than ever. Companies that know what to do with their data will have a competitive advantage over companies that don't, and this will drive higher demand for knowledgeable and competent data professionals.

Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline (an iterative process by which data science projects are completed), and guides you through several data projects in a step-by-step format. By working through the steps in each chapter in sequence, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations, with examples in the two most popular programming languages for data analysis: R and Python.

Installing Python on Linux and Mac OS X


Luckily for us, Python comes preinstalled on most versions of Mac OS X and many flavors of Linux (the latest versions of both Ubuntu and Fedora ship with Python 2.7 or later out of the box). Thus, we don't have much to do for this recipe except check that everything is installed.

For this book, we will work with Python 2.7.x and not version 3. Thus, if Python 3 is your default installed Python, you will have to make sure that you use Python 2.7 instead.

Getting ready

Make sure you have a good Internet connection in case we need to install anything.

How to do it...

Perform the following steps in the command prompt:

  1. Open a new terminal window and type the following command:

    which python
    
  2. If you have Python installed, you should see something like this:

    /usr/bin/python
    
  3. Next, check which version you are running with the following command:

    python --version
    

    On my MacBook Air, I see the following:

    Python 2.7.5
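The same check can also be made from inside Python itself. The following is a minimal sketch, not part of the original recipe: the helper name `is_supported` is hypothetical, and it simply compares the major and minor version numbers reported by the standard `sys` module against the 2.7.x series used in this book.

```python
import sys


def is_supported(version_info=None):
    """Return True if the version tuple belongs to the Python 2.7.x series.

    `is_supported` is a hypothetical helper introduced for illustration.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (2, 7)


# Single-argument print calls behave the same on Python 2.7 and 3.x.
print("Interpreter: %s" % sys.executable)
print("Version: %d.%d.%d" % tuple(sys.version_info[:3]))
print("Matches the 2.7.x series used in this book: %s" % is_supported())
```

If the last line reports `False`, the `python` command on your path is not a 2.7.x interpreter, and you will need to find or install one before continuing.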
    

How it works...

If you are planning on using OS X, you might want to set up a separate Python distribution on your machine for a few reasons. First, each time Apple upgrades your OS, it can and will obliterate your installed Python packages, forcing you to reinstall all of them. Second, new versions of Python are released more frequently than Apple updates the Python distribution included with OS X. Thus, if you want to stay on the bleeding edge of Python releases, it is best to install your own distribution. Finally, Apple's Python release is slightly different from the official Python release and is located in a nonstandard location on the hard drive.

A number of tutorials available online walk you through the installation and setup of a separate Python distribution on your Mac. We recommend the excellent guide available at http://docs.python-guide.org/en/latest/starting/install/osx/.
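To see which distribution a given interpreter belongs to, you can inspect its installation prefix. The sketch below rests on an assumption that is not stated in the recipe: that Apple's bundled Python lives under `/System/Library/Frameworks/Python.framework`. The helper name `looks_like_apple_python` is likewise hypothetical.

```python
import sys

# Assumption: Apple's bundled Python is installed under this framework path.
APPLE_PREFIX = "/System/Library/Frameworks/Python.framework"


def looks_like_apple_python(prefix):
    """Return True if an installation prefix points into Apple's framework.

    Hypothetical helper; it only compares path prefixes and inspects nothing else.
    """
    return prefix.startswith(APPLE_PREFIX)


print("Executable: %s" % sys.executable)
print("Prefix:     %s" % sys.prefix)
if looks_like_apple_python(sys.prefix):
    print("This looks like the Python bundled with OS X.")
else:
    print("This does not look like the Python bundled with OS X.")
```

Running this once with the system interpreter and again after installing a separate distribution should show two different prefixes, which confirms that the new installation is the one being picked up.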

There's more...

One of the confusing aspects of Python is that the language currently straddles two versions. Python 3.0 is a fundamentally different version of the language that was released around the same time as Python 2.6. However, because Python is used in many operating systems (hence, it is installed by default on OS X and Linux), the Python Software Foundation decided to gradually upgrade the standard library to version 3 to maintain backwards compatibility. Starting with version 2.6, the Python 2.x versions have become increasingly like version 3. At the time of writing, the latest version is Python 3.4, and many expect a transition to happen in Python 3.5. Don't worry about learning the specific differences between Python 2.x and 3.x; this book focuses primarily on the latest 2.x version, and we have ensured that the code in this book is portable between Python 2.x and 3.x with some minor differences.
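One common way to write code that runs on both Python 2.7 and 3.x is to opt in to version 3 behavior through the standard `__future__` module. The sketch below illustrates that general technique and is not taken from the book's own code; the `mean` function is a hypothetical example.

```python
# These imports must come first in the file. On Python 3 they are no-ops;
# on Python 2.7 they switch on the version 3 behavior of print and division.
from __future__ import division, print_function


def mean(values):
    """Arithmetic mean using true division on both Python 2.7 and 3.x."""
    values = list(values)
    return sum(values) / len(values)


print(mean([1, 2]))  # 1.5 on both versions
print(7 / 2)         # 3.5 on both versions, thanks to the division import
print(7 // 2)        # 3: floor division is spelled the same way in both
```

Without the `division` import, Python 2.7 would evaluate `7 / 2` as `3`, which is one of the minor differences that tends to surface in numerical code.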

See also