Understanding TCAV intuitively
The idea of TCAV was first introduced by Kim et al. in their work, Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) (https://arxiv.org/pdf/1711.11279.pdf). The framework was designed to provide interpretability beyond feature attribution, particularly for deep learning models that operate on low-level, transformed features that are not human-interpretable. TCAV aims to explain the opaque internal state of a deep learning model in terms of abstract, high-level, human-friendly concepts. In this section, I will give you an intuitive understanding of TCAV and explain how it works to provide human-friendly explanations.
What is TCAV?
So far, we have covered many methods and frameworks for explaining ML models through feature-based approaches. But it might occur to you that, since most ML models operate on low-level features, feature-based explanation approaches may end up highlighting features that are important to the model but are not human-interpretable.
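To make this concrete, the following is a minimal, from-scratch sketch of the core idea behind TCAV, not the authors' reference implementation. It assumes we already have a trained network, a layer whose activations we can probe, example inputs for a human-friendly concept (say, striped), random counterexamples, and examples of the class we want to explain; here, all of these are stood in for by a toy network and random tensors. A Concept Activation Vector (CAV) is learned as the normal to a linear boundary separating concept activations from random activations, and the TCAV score is the fraction of class examples whose class prediction would increase if the layer activations moved in the CAV's direction.

# Conceptual sketch of TCAV; the model, layer, concept data, and class index
# below are toy placeholders, not part of the original paper's code.
import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

torch.manual_seed(0)

# Toy network standing in for a deep classifier (placeholder).
model = nn.Sequential(
    nn.Linear(64, 32), nn.ReLU(),   # layer whose activations we probe
    nn.Linear(32, 10),              # class logits
)
bottleneck, head = model[:2], model[2:]

def layer_activations(x):
    """Activations of the probed layer for a batch of inputs."""
    return bottleneck(x)

# Placeholder inputs: examples of a human-friendly concept (for example,
# 'striped'), random counterexamples, and examples of the class to explain.
concept_x = torch.randn(50, 64)
random_x = torch.randn(50, 64)
class_x = torch.randn(50, 64)

# 1. Learn the CAV: the normal to a linear boundary that separates concept
#    activations from random activations in the layer's activation space.
acts = torch.cat([layer_activations(concept_x),
                  layer_activations(random_x)]).detach().numpy()
labels = np.array([1] * len(concept_x) + [0] * len(random_x))
clf = LogisticRegression(max_iter=1000).fit(acts, labels)
cav = torch.tensor(clf.coef_[0], dtype=torch.float32)

# 2. TCAV score: the fraction of class examples whose directional derivative
#    of the class logit along the CAV is positive, i.e. for which nudging the
#    activations toward the concept would raise the class prediction.
target_class = 3  # placeholder class index
a = layer_activations(class_x)
a.retain_grad()
logits = head(a)
logits[:, target_class].sum().backward()
directional_derivatives = (a.grad * cav).sum(dim=1)
tcav_score = (directional_derivatives > 0).float().mean().item()
print(f"TCAV score for the concept on class {target_class}: {tcav_score:.2f}")

A score close to 1 would suggest that, at this layer, the concept pushes predictions toward the target class, while a score close to 0 would suggest it pushes them away; this is what lets TCAV report importance in terms of concepts rather than raw input features.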