Book Image

Python: Data Analytics and Visualization

By : Martin Czygan, Phuong Vo.T.H, Ashish Kumar, Kirthi Raman
Book Image

Python: Data Analytics and Visualization

By: Martin Czygan, Phuong Vo.T.H, Ashish Kumar, Kirthi Raman

Overview of this book

You will start the course with an introduction to the principles of data analysis and supported libraries, along with NumPy basics for statistics and data processing. Next, you will overview the Pandas package and use its powerful features to solve data-processing problems. Moving on, you will get a brief overview of the Matplotlib API .Next, you will learn to manipulate time and data structures, and load and store data in a file or database using Python packages. You will learn how to apply powerful packages in Python to process raw data into pure and helpful data using examples. You will also get a brief overview of machine learning algorithms, that is, applying data analysis results to make decisions or building helpful products such as recommendations and predictions using Scikit-learn. After this, you will move on to a data analytics specialization—predictive analytics. Social media and IOT have resulted in an avalanche of data. You will get started with predictive analytics using Python. You will see how to create predictive models from data. You will get balanced information on statistical and mathematical concepts, and implement them in Python using libraries such as Pandas, scikit-learn, and NumPy. You’ll learn more about the best predictive modeling algorithms such as Linear Regression, Decision Tree, and Logistic Regression. Finally, you will master best practices in predictive modeling. After this, you will get all the practical guidance you need to help you on the journey to effective data visualization. Starting with a chapter on data frameworks, which explains the transformation of data into information and eventually knowledge, this path subsequently cover the complete visualization process using the most popular Python libraries with working examples This Learning Path combines some of the best that Packt has to offer in one complete, curated package. It includes content from the following Packt products: ? Getting Started with Python Data Analysis, Phuong Vo.T.H &Martin Czygan •Learning Predictive Analytics with Python, Ashish Kumar •Mastering Python Data Visualization, Kirthi Raman
Table of Contents (6 chapters)

What you need for this learning path

You will need a Python programming environment installed on your system. The first module uses a recent Python 2, but many examples will work with Python 3 as well.b The versions of the libraries used in the first module are: NumPy 1.9.2, Pandas 0.16.2, matplotlib 1.4.3, tables 3.2.2, pymongo 3.0.3, redis 2.10.3, and scikit-learn 0.16.1. As these packages are all hosted on PyPI, the Python package index, they can be easily installed with pip. To install NumPy, you would write:

$ pip install numpy If you are not using them already, we suggest you take a look at virtual environments for managing isolating Python environment on your computer. For Python 2, there are two packages of interest there: virtualenv and virtualenvwrapper. Since Python 3.3, there is a tool in the standard library called pyvenv (https://docs.python.org/3/library/venv.html), which serves the same purpose. Most libraries will have an attribute for the version, so if you already have a library installed, you can quickly check its version:

>>> import redis

>>> redis.__version__

‘2.10.3’).

While all the examples in second module can be run interactively in a Python shell. We used IPython 4.0.0 with Python 2.7.10.

For the third module, you need Python 2.7.6 or a later version installed on your operating system. For the examples in this module, Mac OS X 10.10.5’s Python default version (2.7.6) has been used. Install the prepackaged scientific Python distributions, such as Anaconda from Continuum or Enthought Python Distribution if possible