Different Types of Data Science Problems
Much of your time as a data scientist is likely to be spent wrangling data: figuring out how to get it, getting it, examining it, making sure it's correct and complete, and joining it with other types of data. pandas is a widely used tool for data analysis in Python, and it can facilitate the data exploration process, as we will see in this chapter. However, one of the key goals of this book is to start you on your journey to becoming a data scientist who practices machine learning, for which you will need to master the art and science of predictive modeling. Predictive modeling means using a mathematical model, or idealized mathematical formulation, to learn relationships within the data, in the hope of making accurate and useful predictions when new data comes in.
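To make the wrangling steps above concrete, here is a minimal sketch of examining, checking, and joining data with pandas. The two small tables, their column names (customer_id, age, amount), and all of their values are hypothetical, invented purely for illustration; real projects would load data from files or databases instead.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "age": [34, 51, None, 28],  # one age is deliberately missing
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 4],
    "amount": [20.0, 35.5, 12.0, 8.25],
})

# Examine the data: first rows and column types.
print(customers.head())
print(customers.dtypes)

# Check completeness: count missing values in each column.
missing_per_column = customers.isna().sum()
print(missing_per_column)

# Check correctness: each customer should appear only once.
ids_are_unique = customers["customer_id"].is_unique
print(ids_are_unique)

# Join with other data: total order amount per customer,
# attached to the customer table with a left join so that
# customers without orders are kept (with a missing amount).
order_totals = orders.groupby("customer_id", as_index=False)["amount"].sum()
combined = customers.merge(order_totals, on="customer_id", how="left")
print(combined)
```

A left join is used here so that no customer silently disappears from the result; whether that is the right choice depends on the question being asked of the data.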
For predictive modeling use cases, data is typically organized in a tabular structure, with features and a response variable. For example, if you want to predict the price of a house based on...