Obviously, no machine learning system can be built without data, so data collection should be our first focus.
Best practices in the data preparation stage
Best practice 1 - completely understand the project goal
Before starting to collect data, we should make sure that the goal of the project, that is, the business problem, is completely understood, as it will guide us on which data sources to look into. This is also where sufficient domain knowledge and expertise are required. For example, in the previous chapter, our goal was to predict future prices of the DJIA index, so we collected data on its past performance rather than the past performance of a European stock; in Chapter 5, Click-Through Prediction with Tree-Based Algorithms and Chapter 6, Click-Through Prediction with Logistic...