The role of data audits and quality checks in fairness
Even before digging deep into predictive algorithms and evaluating fairness metrics, we must check whether the training data itself is skewed or biased against a part of the population. A common source of bias is under-representation: there is simply not enough data for a disadvantaged or minority group. Bias also emerges when we fail to apply any techniques to deal with data imbalance. In such scenarios, it is essential to integrate explainability tools that expose the variability and skewness of the data.
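Before reaching for a dedicated fairness library, a quick representation audit can be done with plain pandas. The following is a minimal sketch: the file name and the ethnicity column are placeholders for your own dataset, and the 5% threshold is an illustrative choice, not a standard:

```python
import pandas as pd

# Hypothetical training data with a sensitive attribute; the file name
# and column name are placeholders, not from a specific dataset
df = pd.read_csv("training_data.csv")

# Share of each sensitive group in the data
group_share = df["ethnicity"].value_counts(normalize=True)
print(group_share)

# Flag groups that fall below a chosen representation threshold
THRESHOLD = 0.05  # e.g., less than 5% of all rows
under_represented = group_share[group_share < THRESHOLD]
if not under_represented.empty:
    print("Under-represented groups:\n", under_represented)
```

If any group falls below the threshold, that is an early warning that a model trained on this data may generalize poorly for that group, and that resampling or data collection should be considered before modeling.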
Let's now investigate how to measure data imbalance and explain variability with the help of certain tools. The first tool we are going to use is Fairlens, an open source Python library that aids in fairness assessment of datasets (for example, computing fairness scores and statistical distance metrics across sensitive groups, and plotting their distributions). Some examples with code snippets are given here, using the COMPAS dataset.
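As a sketch of how such an assessment can look, the snippet below follows the usage pattern from the FairLens documentation: a `FairnessScorer` is pointed at a target column and a set of sensitive attributes, and a demographic report summarizes how the target distribution differs across groups. The file path and column names (`RawScore`, `Sex`, `Ethnicity`, `MaritalStatus`) assume a COMPAS-style CSV as used in the FairLens examples and may differ in your copy of the data:

```python
import pandas as pd
import fairlens as fl

# Load a local copy of the COMPAS data; path and column names are
# assumptions based on the FairLens documentation examples
df = pd.read_csv("compas.csv")

# Score how the target distribution varies across sensitive groups
fscorer = fl.FairnessScorer(
    df,
    target_attr="RawScore",
    sensitive_attrs=["Sex", "Ethnicity", "MaritalStatus"],
)

# Print a summary highlighting the most biased demographic subgroups
fscorer.demographic_report()
```

The report ranks subgroups by the statistical distance between their target distribution and the overall population, which makes it easy to spot where the data is most skewed before any model is trained.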