References
For additional information, please refer to the following resources:
- Kim, Been, et al., "Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)": https://arxiv.org/pdf/1711.11279.pdf
- TCAV Python framework: https://github.com/tensorflow/tcav
- Koh, Pang Wei, et al., "Concept Bottleneck Models": https://arxiv.org/abs/2007.04612
- Alain, Guillaume, and Yoshua Bengio, "Understanding intermediate layers using linear classifier probes": https://arxiv.org/abs/1610.01644
- Ghorbani, Amirata, James Wexler, James Zou, and Been Kim, "Towards automatic concept-based explanations": https://arxiv.org/abs/1902.03129
- Molnar, Christoph (2022), "Detecting Concepts", Chapter 10.3 in Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (2nd ed.): https://christophm.github.io/interpretable-ml-book/detecting-concepts.html