Comet for Data Science

By: Angelica Lo Duca

Overview of this book

This book provides concepts and practical use cases which can be used to quickly build, monitor, and optimize data science projects. Using Comet, you will learn how to manage almost every step of the data science process from data collection through to creating, deploying, and monitoring a machine learning model. The book starts by explaining the features of Comet, along with exploratory data analysis and model evaluation in Comet. You’ll see how Comet gives you the freedom to choose from a selection of programming languages, depending on which is best suited to your needs. Next, you will focus on workspaces, projects, experiments, and models. You will also learn how to build a narrative from your data, using the features provided by Comet. Later, you will review the basic concepts behind DevOps and how to extend the GitLab DevOps platform with Comet, further enhancing your ability to deploy your data science projects. Finally, you will cover various use cases of Comet in machine learning, NLP, deep learning, and time series analysis, gaining hands-on experience with some of the most interesting and valuable data science techniques available. By the end of this book, you will be able to confidently build data science pipelines according to bespoke specifications and manage them through Comet.

Table of Contents (16 chapters)

  • Section 1 – Getting Started with Comet
  • Section 2 – A Deep Dive into Comet
  • Section 3 – Examples and Use Cases

Introducing EDA

Exploratory Data Analysis (EDA) is one of the preliminary steps in a data science project life cycle. It helps us understand our data and its underlying structure so that we can extract meaningful information from it.

We can think about the EDA phase as a small data science project, in which the real data analysis part (model definition and evaluation) is missing. Therefore, a typical EDA process is composed of the steps shown in the following figure:

Figure 2.1 – The main steps of an EDA process

The previous figure shows that an EDA process is composed of the following steps:

  • Problem setting
  • Data preparation
  • Preliminary data analysis
  • Preliminary results
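
To make the four steps concrete, here is a minimal sketch in plain Python. It is not taken from the book and does not use Comet: the toy records, the `QUESTION` string, and the helper names `prepare`, `analyze`, and `run_eda` are all hypothetical, chosen only to show how the steps follow one another.

```python
import statistics

# Hypothetical toy records; a real project would load data from a file or database.
RAW_ROWS = [
    {"city": "Pisa", "temperature": "21.5"},
    {"city": "Rome", "temperature": "25.0"},
    {"city": "Milan", "temperature": ""},  # missing value
    {"city": "Turin", "temperature": "19.5"},
]

# 1. Problem setting: state the question the dataset should answer.
QUESTION = "What is the typical temperature across the recorded cities?"


def prepare(rows):
    """2. Data preparation: drop rows with missing values and convert types."""
    return [
        {"city": row["city"], "temperature": float(row["temperature"])}
        for row in rows
        if row["temperature"] != ""
    ]


def analyze(rows):
    """3. Preliminary data analysis: compute simple summary statistics."""
    temperatures = [row["temperature"] for row in rows]
    return {
        "count": len(temperatures),
        "mean": statistics.mean(temperatures),
        "min": min(temperatures),
        "max": max(temperatures),
    }


def run_eda(rows):
    """4. Preliminary results: return the question together with the summary."""
    clean_rows = prepare(rows)
    return {"question": QUESTION, "summary": analyze(clean_rows)}


if __name__ == "__main__":
    print(run_eda(RAW_ROWS))
```

Each function maps to one step, so a different dataset or a richer analysis can be swapped in without touching the others. In practice, the preparation and analysis steps would usually rely on a library such as pandas rather than hand-written loops.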

Let's investigate each step separately, starting from the first step – problem setting.

Problem setting

Problem setting is the ability to define which kinds of questions our dataset can answer...