Abstracting Deep Neural Networks into Concept Graphs for Concept Level Interpretability
Avinash Kori, Parth Natekar, Ganapathy Krishnamurthi, Balaji Srinivasan
The black-box nature of deep learning models prevents them from being
completely trusted in domains like biomedicine. Most explainability techniques
do not capture the concept-based reasoning that human beings follow. In this
work, we attempt to understand the behavior of trained models that perform
image processing tasks in the medical domain by building a graphical
representation of the concepts they learn. Extracting such a graphical
representation of the model's behavior on an abstract, higher conceptual level
would reveal what these models learn and help us evaluate the steps they take to arrive at their predictions. We demonstrate our proposed implementation on two biomedical problems: brain tumor segmentation and fundus image classification. We provide an alternative graphical representation of the model by formulating the concept-level graph described above, which makes the problem of intervening to find active inference trails more tractable. Understanding these trails would give insight into the hierarchy of the model's decision-making process, as well as its overall nature. Our framework is available at
this https URL
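To make the idea of a concept-level graph and its inference trails concrete, below is a minimal sketch (not the authors' implementation) of how such a graph might be represented and traversed. The concept names, edge weights, and the use of networkx are all illustrative assumptions; the actual framework defines its own concept extraction and trail-scoring procedure.

```python
import networkx as nx

# Toy concept graph: nodes stand for concepts hypothetically extracted from
# successive layers of a trained network; edge weights are made-up link
# strengths between concepts.
G = nx.DiGraph()
G.add_edge("layer1/edge_texture", "layer2/tumor_boundary", weight=0.8)
G.add_edge("layer1/intensity_blob", "layer2/tumor_core", weight=0.6)
G.add_edge("layer2/tumor_boundary", "output/whole_tumor", weight=0.9)
G.add_edge("layer2/tumor_core", "output/whole_tumor", weight=0.7)

# An "inference trail" can be read off as a path from an early concept to the
# output; here we simply enumerate simple paths and multiply edge weights as a
# crude proxy for trail strength.
for path in nx.all_simple_paths(G, "layer1/edge_texture", "output/whole_tumor"):
    strength = 1.0
    for u, v in zip(path, path[1:]):
        strength *= G[u][v]["weight"]
    print(" -> ".join(path), f"(strength ~ {strength:.2f})")
```

In this toy setting, intervening on a concept would amount to removing its node (or rescaling its outgoing edges) and observing how the set of high-strength trails to the output changes.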