Measuring data forecastability
So far, we have learned about the importance of analyzing data by inspecting its consistency and purity, looking for monitoring drifts, and checking for any adversarial attacks to explain the working of ML models. But some datasets are extremely complex and, hence, training accurate models even with complex algorithms is not feasible. If the trained model is not accurate, it is prone to make incorrect predictions. Now the question is how do we gain the trust of our end users if we know that the trained model is not extremely accurate in making the correct predictions?
I would say that the best way to gain trust is by being transparent and clearly communicating what is feasible. So, measuring data forecastability and communicating the model's efficiency to end users helps to set the right expectation.
Data forecastability is an estimation of the model's performance using the underlying data. For example, let's suppose we have a...