The Applied Data Science Workshop - Second Edition

By: Alex Galea

Overview of this book

From banking and manufacturing through to education and entertainment, using data science for business has revolutionized almost every sector in the modern world. It has an important role to play in everything from app development to network security. Taking an interactive approach to learning the fundamentals, this book is ideal for beginners. You’ll learn best practices and techniques for applying data science in the context of real-world scenarios and examples. After an introduction to data science and machine learning, you’ll get to grips with Jupyter functionality and features. You’ll use Python libraries like scikit-learn, pandas, Matplotlib, and Seaborn to perform data analysis and data preprocessing on real-world datasets from within your own Jupyter environment. Progressing through the chapters, you’ll train classification models using scikit-learn and assess model performance using advanced validation techniques. Towards the end, you’ll use Jupyter Notebooks to document your research, build stakeholder reports, and even analyze web performance data. By the end of The Applied Data Science Workshop, you’ll be ready to move beyond the beginner level and confidently apply data science techniques and tools to real-world projects.
Table of Contents (8 chapters)

Installing Libraries

pip comes pre-installed with Anaconda. Once Anaconda is installed on your machine, each required library can be installed using pip, for example, pip install numpy. Alternatively, you can install all the required libraries at once using pip install -r requirements.txt. You can find the requirements.txt file at https://packt.live/2YBPK5y.
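
After installation, a quick check from Python can confirm which libraries are actually present. The following is a minimal sketch rather than part of the book's code: the helper name installed_versions is hypothetical, and the package list is an assumption based on the libraries named in the overview. The authoritative list remains requirements.txt.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed package list (taken from the overview, not from requirements.txt).
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "scikit-learn", "jupyter"]

for name, version in installed_versions(PACKAGES).items():
    status = version if version is not None else "NOT INSTALLED (try: pip install " + name + ")"
    print(f"{name}: {status}")
```

Any library reported as missing can then be installed individually with pip, as described above.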

The exercises and activities will be executed in Jupyter Notebooks. Jupyter is a Python library and can be installed in the same way as the other Python libraries – that is, with pip install jupyter, but fortunately, it comes pre-installed with Anaconda. To open a notebook, simply run the command jupyter notebook in the Terminal or Command Prompt.

Working with JupyterLab and Jupyter Notebook

You'll be working on different exercises and activities using either the JupyterLab or Jupyter Notebook platforms. These exercises and activities can be downloaded from the associated GitHub repository.

Download the repository from https://packt.live/2zwhfom.

You can either clone it using git or download it as a zipped folder by clicking on the green Clone or download button in the upper-right corner.

To launch a Jupyter notebook, first use the Terminal to navigate to the directory that contains the source code. For example:

cd The-Applied-Data-Science-Workshop

Once you are in the project directory, simply run jupyter lab to start up JupyterLab. Similarly, for Jupyter Notebook, run jupyter notebook.
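
If you prefer to start Jupyter from a Python script instead of typing the commands by hand, the two steps above (change to the project directory, then run jupyter lab or jupyter notebook) can be sketched as follows. This is an illustrative sketch only: the helper names build_launch_command and launch are hypothetical, and it assumes the jupyter executable is on your PATH, as it normally is with Anaconda.

```python
import shutil
import subprocess
from pathlib import Path

VALID_APPS = ("lab", "notebook")


def build_launch_command(app="lab"):
    """Return the argument list for `jupyter lab` or `jupyter notebook`."""
    if app not in VALID_APPS:
        raise ValueError(f"app must be one of {VALID_APPS}, got {app!r}")
    return ["jupyter", app]


def launch(project_dir, app="lab"):
    """Start Jupyter with project_dir as the working directory.

    This call blocks until the Jupyter server is stopped.
    """
    project_dir = Path(project_dir)
    if not project_dir.is_dir():
        raise FileNotFoundError(f"No such directory: {project_dir}")
    if shutil.which("jupyter") is None:
        raise RuntimeError("jupyter was not found on PATH")
    # cwd= plays the role of the `cd` step shown above.
    return subprocess.run(build_launch_command(app), cwd=project_dir, check=True)


# Example (commented out because it starts a long-running server):
# launch("The-Applied-Data-Science-Workshop", app="lab")
```

Running the commands directly in the Terminal remains the simplest route. The script form is only useful if you want to automate the launch.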

Accessing the Code Files

You can find the complete code files of this book at https://packt.live/2zwhfom. You can also run many activities and exercises directly in your web browser by using the interactive lab environment at https://packt.live/3d6yr1A.

We've tried to support interactive versions of all activities and exercises, but we recommend a local installation as well for instances where this support isn't available.

If you have any issues or questions about installation, please email us at [email protected].