Python Data Science Essentials - Third Edition

By: Alberto Boschetti, Luca Massaron

Overview of this book

Fully expanded and upgraded, the latest edition of Python Data Science Essentials will help you succeed in data science operations using the most common Python libraries. This book offers up-to-date insight into the core Python data science tools, including the latest versions of Jupyter Notebook, NumPy, pandas, and scikit-learn. It works through detailed examples and large hybrid datasets to help you grasp essential statistical techniques for data collection, data munging and analysis, visualization, and reporting activities. You will also gain an understanding of advanced data science topics such as machine learning algorithms, distributed computing, tuning predictive models, and natural language processing. Furthermore, you’ll be introduced to deep learning and gradient boosting solutions such as XGBoost, LightGBM, and CatBoost. By the end of the book, you will have gained a complete overview of the principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.
Table of Contents (11 chapters)

First Steps

Whether you are an eager learner of data science or a well-grounded data science practitioner, you can take advantage of this essential introduction to Python for data science. You can use it to the fullest if you already have at least some previous experience in basic coding, in writing general-purpose computer programs in Python, or in some other data-analysis-specific language such as MATLAB or R.

This book will delve directly into Python for data science, providing you with a straight and fast route to solving various data science problems using Python and its powerful data analysis and machine learning packages. The code examples that are provided in this book don't require you to be a master of Python. However, they do assume that you know at least the basics of Python scripting, including data structures such as lists and dictionaries, and the workings of class objects. If you don't feel confident about these subjects or have minimal knowledge of the Python language, we suggest that you take an online tutorial before reading this book. There are good options available, such as the Codecademy course at https://www.codecademy.com/learn/learn-python, Google's Python class at https://developers.google.com/edu/python/, or even A Whirlwind Tour of Python by Jake VanderPlas (https://github.com/jakevdp/WhirlwindTourOfPython). All of these courses are free, and, in a matter of a few hours of study, they should provide you with all the building blocks that will ensure you enjoy this book to the fullest. To complement these free courses, we have also prepared a tutorial of our own, which can be found in the appendix of this book.
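As a rough self-check of the basics mentioned above (lists, dictionaries, and class objects), the following minimal sketch may help. It is only an illustration and is not taken from the book: the variable names, the sample values, and the `RunningMean` class are all hypothetical. If every line reads naturally to you, you probably have the background that the examples assume.

```python
# Illustrative self-check; all names and values here are hypothetical.

# Lists: ordered, mutable sequences.
measurements = [2.5, 3.1, 4.8]
measurements.append(5.0)

# Dictionaries: key-value mappings.
sample = {"name": "iris-001", "petal_length": 1.4}
sample["species"] = "setosa"


class RunningMean:
    """Keeps a running mean of the values added so far."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def mean(self):
        # Return 0.0 instead of dividing by zero when nothing was added.
        return self.total / self.count if self.count else 0.0


tracker = RunningMean()
for value in measurements:
    tracker.add(value)

print(len(measurements), sorted(sample), tracker.mean())
```

If any part of this snippet is unclear, one of the free tutorials listed above is likely a good place to start.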

In any case, don't be intimidated by our starting requirements; mastering Python enough for data science applications isn't as arduous as you may think. It's just that we have to assume some basic knowledge on the reader's part because our intention is to go straight to the point of doing data science without having to explain too much about the general aspects of the Python language that we will be using.

Are you ready, then? Let's get started!

In this short introductory chapter, we will work through the basics so that you can set off in full swing. We will go through the following topics:

  • How to set up a Python data science toolbox
  • Using Jupyter
  • An overview of the data that we are going to study in this book
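Before setting up the toolbox, it can be useful to see which of the libraries named in the overview are already available in your environment. The sketch below is one possible way to do that using only the standard library. The `check_toolbox` helper and the `PACKAGES` list are hypothetical names introduced for illustration, and the import names (`sklearn` for scikit-learn, `notebook` for the Jupyter Notebook) are assumptions about how those packages are installed on your system.

```python
import importlib.util

# Import names for the libraries mentioned in the book overview.
# Assumption: scikit-learn is imported as "sklearn" and the
# Jupyter Notebook package is importable as "notebook".
PACKAGES = ["numpy", "pandas", "sklearn", "notebook"]


def check_toolbox(names):
    """Return a dict mapping each import name to True if it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in names
    }


if __name__ == "__main__":
    for name, found in check_toolbox(PACKAGES).items():
        status = "installed" if found else "missing"
        print(f"{name:10s} {status}")
```

Using `importlib.util.find_spec` only locates a top-level package without importing it, so the check stays fast and has no side effects. It does not verify package versions.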