Book Image

Python Data Analysis - Second Edition

By : Ivan Idris
Book Image

Python Data Analysis - Second Edition

By: Ivan Idris

Overview of this book

Data analysis techniques generate useful insights from small and large volumes of data. Python, with its strong set of libraries, has become a popular platform to conduct various data analysis and predictive modeling tasks. With this book, you will learn how to process and manipulate data with Python for complex analysis and modeling. We learn data manipulations such as aggregating, concatenating, appending, cleaning, and handling missing values, with NumPy and Pandas. The book covers how to store and retrieve data from various data sources such as SQL and NoSQL, CSV fies, and HDF5. We learn how to visualize data using visualization libraries, along with advanced topics such as signal processing, time series, textual data analysis, machine learning, and social media analysis. The book covers a plethora of Python modules, such as matplotlib, statsmodels, scikit-learn, and NLTK. It also covers using Python with external environments such as R, Fortran, C/C++, and Boost libraries.
Table of Contents (22 chapters)
Python Data Analysis - Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface
Key Concepts
Online Resources

Reading and writing to Excel with Pandas


Excel files contain a lot of important data. Of course, we can export that data in other more portable formats, such as CSV. However, it is more convenient to read and write Excel files with Python. As is common in the Python world, there is currently more than one project working towards the goal of providing Excel I/O capabilities. The modules that we will need to install to get Excel I/O to work with pandas are somewhat obscurely documented. The reason is that the projects that pandas depends on are independent and rapidly developing. The pandas package is picky about the files it accepts as Excel files. These files must have the .xls or .xlsx suffix, otherwise, we get the following error:

ValueError: No engine for filetype: ''

This is easy to fix. For instance, if we create a temporary file, we just give it the proper suffix. If you don't install anything, you will get the following error message:

ImportError: No module named openpyxl.workbook...