Python Data Analysis

By: Ivan Idris

Software used in this book


The software used in this book is based on Python, so you need to have Python installed. Some operating systems ship with Python already installed. Even so, check that the installed Python version is compatible with the versions of the libraries you want to install. There are many implementations of Python, including commercial implementations and distributions. In this book, we will focus on the standard CPython implementation, which is guaranteed to be compatible with NumPy.
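
As a quick check of the points above, the following minimal sketch reports which Python implementation and version are running. The function name describe_interpreter and the variable meets_minimum are illustrative and not part of any library; the sketch only uses the standard sys and platform modules.

```python
import platform
import sys


def describe_interpreter():
    """Return the implementation name and (major, minor, micro) version."""
    implementation = platform.python_implementation()  # for example "CPython"
    version = tuple(sys.version_info[:3])
    return implementation, version


implementation, version = describe_interpreter()
print("Implementation: %s" % implementation)
print("Version: %d.%d.%d" % version)

# The text asks for Python 2.4.x or above; tuples compare element by element.
meets_minimum = version >= (2, 4, 0)
print("Meets the 2.4 minimum: %s" % meets_minimum)
```

If the first line does not report CPython, the compatibility guarantee mentioned above may not apply to your interpreter.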

Note

You can download Python from https://www.python.org/download/. On this website, we can find installers for Windows and Mac OS X as well as source archives for Linux, Unix, and Mac OS X.

The software we will install in this chapter has binary installers for Windows, various Linux distributions, and Mac OS X. Source distributions are also available if you prefer to build the software yourself. You need to have Python 2.4.x or above installed on your system. Python 2.7.x is currently the best Python version to have because most Scientific Python libraries support it. Python 2.7 will be supported and maintained until 2020. After that, we will have to switch to Python 3.

Installing software and setup

We will learn how to install and set up NumPy, SciPy, matplotlib, and IPython on Windows, Linux and Mac OS X. Let's look at the process in detail.

On Windows

Installing on Windows is, fortunately, straightforward. You only need to download an installer, and a wizard will guide you through the installation steps. We give the steps to install NumPy here; the steps to install the other libraries are similar. The actions we will take are as follows:

  1. Download installers for Windows from the SourceForge website (refer to the following table). The latest release versions may change, so just choose the one that fits your setup best.

  2. Choose the appropriate version. In this example, we chose numpy-1.8.1-win32-superpack-python2.7.exe.

  3. Open the EXE installer by double-clicking on it.

  4. Now, we can see a description of NumPy and its features. Click on the Next button.

    If you have Python installed, it should be detected automatically. If it is not detected, your path settings may be wrong.

    Tip

    At the end of this chapter, resources are listed just in case you have problems installing NumPy.

  5. Click on the Next button if Python is found; otherwise, click on the Cancel button and install Python (NumPy cannot be installed without Python). Click on the Next button again. This is more or less the point of no return, so first make sure that you are installing to the proper directory. Now the real installation starts. This may take a while.

    Note

    The situation around installers is rapidly evolving. Other alternatives exist in various stages of maturity (see http://www.scipy.org/install.html). It might be necessary to put the msvcp71.dll file in your system32 directory located at C:\Windows\. You can get it from http://www.dll-files.com/dllindex/dll-files.shtml?msvcp71.
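
If the installer does not detect Python, one thing worth checking is whether an interpreter can be found through the PATH environment variable. The following is a minimal sketch of such a check; the helper name find_on_path is hypothetical, and the file names python.exe and python are assumptions about how the interpreter is named on your system.

```python
import os


def find_on_path(executable_names, search_path=None):
    """Return full paths of the named files found in the PATH directories."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in executable_names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                matches.append(candidate)
    return matches


found = find_on_path(["python.exe", "python"])
if found:
    for path in found:
        print("Found interpreter: %s" % path)
else:
    print("No Python interpreter found on PATH; check your path settings.")
```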

On Linux

Installing the recommended software on Linux depends on the distribution you have. We will discuss how to install NumPy from the command line; depending on your distribution (distro), you can probably use a graphical installer instead. The commands to install matplotlib, SciPy, and IPython are the same; only the package names are different. Installing matplotlib, SciPy, and IPython is recommended but optional.

Most Linux distributions have NumPy packages. We will go through the necessary commands for some of the popular Linux distributions as follows:

  • To install NumPy on Red Hat, run the following command-line instruction:

    $ yum install python-numpy
    
  • To install NumPy on Mandriva, run the following command-line instruction:

    $ urpmi python-numpy
    
  • To install NumPy on Gentoo, run the following command-line instruction:

    $ sudo emerge numpy
    
  • To install NumPy on Debian or Ubuntu, we need to type the following:

    $ sudo apt-get install python-numpy
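
Since only the package names change between libraries, the commands above can be assembled mechanically. The following sketch illustrates the idea; the dictionary INSTALL_PREFIXES and the function install_command are hypothetical names, and the prefixes are copied from the commands listed above. The sketch only builds the command string and does not run it.

```python
# Command prefixes taken from the list above, keyed by distribution.
INSTALL_PREFIXES = {
    "Red Hat": "yum install",
    "Mandriva": "urpmi",
    "Gentoo": "sudo emerge",
    "Debian": "sudo apt-get install",
    "Ubuntu": "sudo apt-get install",
}


def install_command(distribution, packages):
    """Build the install command for a distribution and a list of packages."""
    prefix = INSTALL_PREFIXES[distribution]
    return "%s %s" % (prefix, " ".join(packages))


print(install_command("Debian", ["python-numpy"]))
# prints: sudo apt-get install python-numpy
```

Passing several package names, for example the SciPy and matplotlib package names for your distribution, yields a single command that installs them together.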
    

The following table gives an overview of the Linux distributions and corresponding package names for NumPy, SciPy, matplotlib, and IPython:

Linux distribution   NumPy                              SciPy          matplotlib          IPython
Arch Linux           python-numpy                       python-scipy   python-matplotlib   ipython
Debian               python-numpy                       python-scipy   python-matplotlib   ipython
Fedora               numpy                              python-scipy   python-matplotlib   ipython
Gentoo               dev-python/numpy                   scipy          matplotlib          ipython
openSUSE             python-numpy, python-numpy-devel   python-scipy   python-matplotlib   ipython
Slackware            numpy                              scipy          matplotlib          ipython

On Mac OS X

You can install NumPy, matplotlib, and SciPy on Mac OS X with a graphical installer or from the command line with a port manager, such as MacPorts or Fink, depending on your preference. The prerequisite is to install Xcode, as it is not part of OS X releases. We will install NumPy with a GUI installer using the following steps:

  1. We can get a NumPy installer from the SourceForge website at http://sourceforge.net/projects/numpy/files/. Similar files exist for matplotlib and SciPy.

  2. Change numpy in the previous URL to scipy or matplotlib to get installers of the respective libraries. IPython didn't have a GUI installer at the time of writing.

  3. Download the appropriate DMG file; usually the latest one is the best.

    Another alternative is SciPy Superpack (https://github.com/fonnesbeck/ScipySuperpack).

Whichever option you choose, avoid building against the Python library provided by Apple, so that updates affecting the system Python library do not break software you have already installed. Install NumPy, matplotlib, and SciPy using the following steps:

  1. Open the DMG file (in this example, numpy-1.8.1-py2.7-python.org-macosx10.6.dmg).

  2. Double-click on the open-box icon, the one whose label ends with .mpkg. We will be presented with the welcome screen of the installer.

  3. Click on the Continue button to go to the Read Me screen, where we will be presented with a short description of NumPy.

  4. Click on the Continue button to go to the License screen.

  5. Read the license, click on the Continue button, and then click on the Accept button when prompted to accept the license. Continue through the screens that follow from there, and click on the Finish button at the end.
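
To see which interpreter the installed libraries will be used from, you can inspect the running Python. The following is a minimal sketch; the function name uses_apple_python is hypothetical, and the prefix /System/Library/Frameworks/Python.framework is an assumption about where the Apple-provided Python lives on your OS X release.

```python
import sys

# Assumption: the Python bundled with OS X is installed under this prefix.
APPLE_SYSTEM_PREFIX = "/System/Library/Frameworks/Python.framework"


def uses_apple_python(prefix=None):
    """Return True if the given (or current) prefix looks like Apple's Python."""
    if prefix is None:
        prefix = sys.prefix
    return prefix.startswith(APPLE_SYSTEM_PREFIX)


print("Executable: %s" % sys.executable)
print("Prefix: %s" % sys.prefix)
if uses_apple_python():
    print("This appears to be the Python provided by Apple.")
else:
    print("This does not appear to be the Python provided by Apple.")
```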

Alternatively, we can install the libraries with MacPorts, Fink, or Homebrew. The following installation commands install all these packages. We only need NumPy for all the tutorials in this book, so please omit the packages you are not interested in.

  • To install with MacPorts, type in the following command:

    $ sudo port install py-numpy py-scipy py-matplotlib py-ipython
    
  • Fink also has packages for NumPy, such as scipy-core-py24, scipy-core-py25, and scipy-core-py26. The SciPy packages are scipy-py24, scipy-py25, and scipy-py26. We can install NumPy and other recommended packages that we will be using in this book for Python 2.6 with the following command:

    $ fink install scipy-core-py26 scipy-py26 matplotlib-py26
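
Whichever platform and installation route you used, a simple way to confirm the result is to try importing each library and print its version. The following is a minimal sketch; the function name installed_versions is illustrative, and the sketch assumes each library exposes a __version__ attribute, falling back to "unknown" when it does not.

```python
import importlib

# Import names of the libraries discussed in this section.
MODULES = ["numpy", "scipy", "matplotlib", "IPython"]


def installed_versions(module_names):
    """Map each module name to its version string, or None if not importable."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in sorted(installed_versions(MODULES).items()):
    if version is None:
        print("%s: not installed" % name)
    else:
        print("%s: %s" % (name, version))
```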