#### Overview of this book

Data Science with Python begins by introducing you to data science and teaches you to install the packages you need to create a data science coding environment. You will learn three major techniques in machine learning: unsupervised learning, supervised learning, and reinforcement learning. You will also explore basic classification and regression techniques, such as support vector machines, decision trees, and logistic regression.

As you make your way through the book, you will understand the basic functions, data structures, and syntax of the Python language that are used to handle large datasets with ease. You will learn about the NumPy and pandas libraries for matrix calculations and data manipulation, discover how to use Matplotlib to create highly customizable visualizations, and apply the boosting algorithm XGBoost to make predictions.

In the concluding chapters, you will explore convolutional neural networks (CNNs), deep learning algorithms used to predict what is in an image. You will also learn how to feed human sentences to a neural network, make the model process contextual information, and create human-language-processing systems to predict the outcome. By the end of this book, you will be able to understand and implement any new data science algorithm and have the confidence to experiment with tools or libraries other than those covered in the book.
1. Introduction to Data Science and Data Pre-Processing
2. Data Visualization
3. Introduction to Machine Learning via Scikit-Learn
4. Dimensionality Reduction and Unsupervised Learning
5. Mastering Structured Data
6. Decoding Images
7. Processing Human Language
8. Tips and Tricks of the Trade

## Cross-entropy Loss

Cross-entropy loss is used when we are working with a classification problem where the output for each class is a probability value between 0 and 1. The loss increases as the predicted probability of the true class moves away from the actual value, following a negative log curve. This heavily penalizes the model when it assigns a low probability to the true label. For example, if the predicted probability of the true label is 0.05, we penalize the model with a huge loss; if it is 0.40, we penalize it with a smaller loss.

###### Figure 6.9: Graph of log loss versus probability

The preceding graph shows that the loss grows steeply as the predictions get further from the true label, approaching infinity as the predicted probability of the true label approaches zero. The formula that the cross-entropy loss follows is as follows:
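The steep growth of the penalty can be checked numerically. A minimal sketch using Python's standard library and the two example probabilities from the text (the function name is illustrative, not from the book):

```python
import math

def log_loss_for_true_label(p):
    """Log-loss contribution -log(p) for the predicted probability of the true label."""
    return -math.log(p)

confident_wrong = log_loss_for_true_label(0.05)  # true label given low probability
less_wrong = log_loss_for_true_label(0.40)       # true label given higher probability

print(f"{confident_wrong:.3f}")  # ~2.996, a huge loss
print(f"{less_wrong:.3f}")       # ~0.916, a smaller loss
```

Halving the probability of the true label adds a constant amount of loss, which is why the curve shoots upward near zero.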

###### Figure 6.10: Cross entropy loss formula
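The figure itself is not reproduced here. Based on the symbol definitions that follow (M classes, true label y, predicted probability p), the standard per-sample categorical cross-entropy, assuming one-hot encoded labels, is:

```latex
L = -\sum_{c=1}^{M} y_c \log(p_c)
```

Because y_c is 1 only for the true class, the sum reduces to the negative log of the probability the model assigned to that class.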

Here, M is the number of classes in the dataset (10 in the case of MNIST), y is the true label, and p is the predicted probability of the class. We prefer cross-entropy...
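The per-sample formula can be sketched in a few lines of NumPy. This is a minimal illustration assuming one-hot encoded labels; the function name and the epsilon clipping (which avoids log(0)) are assumptions, not from the book:

```python
import numpy as np

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Per-sample cross-entropy: -sum over the M classes of y * log(p)."""
    p_pred = np.clip(p_pred, eps, 1.0)  # guard against log(0)
    return -np.sum(y_true * np.log(p_pred))

# Example with M = 3 classes; the true class is index 1.
y_true = np.array([0.0, 1.0, 0.0])
good = cross_entropy(y_true, np.array([0.1, 0.8, 0.1]))  # ~0.223
bad = cross_entropy(y_true, np.array([0.8, 0.1, 0.1]))   # ~2.303
print(good, bad)
```

Averaging this quantity over a batch of samples gives the loss value that training frameworks report.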