Interpreting Deep Neural Networks by Explaining their Predictions
Fraunhofer Heinrich Hertz Institute in Berlin
Deep neural networks (DNNs) are reaching or even exceeding human-level performance on an increasing number of complex tasks. However, due to their complex non-linear structure, these models are usually applied in a black-box manner, i.e., no information is provided about what exactly makes them arrive at their predictions. This lack of transparency can be a major drawback in practice. In my talk, I will present a general technique, Layer-wise Relevance Propagation (LRP), for interpreting DNNs by explaining their predictions. I will demonstrate the effectiveness of LRP when applied to various data types (images, text, audio, video, EEG/fMRI signals) and neural architectures (ConvNets, LSTMs), and will summarize what we have learned so far by peering inside these black boxes.
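To give a rough sense of what explaining a prediction with LRP can look like, here is a minimal sketch on a toy fully connected ReLU network. The abstract does not say which propagation rule the talk uses, so the sketch assumes the commonly used epsilon rule; the function name `lrp_epsilon`, the weights, and the input are hypothetical and chosen only for illustration.

```python
import numpy as np

def lrp_epsilon(weights, biases, x, target, eps=1e-6):
    """Redistribute the score of output `target` back onto the input features.

    Hypothetical helper for illustration: assumes a fully connected network
    with ReLU hidden layers and a linear output layer, and the LRP epsilon rule.
    Each weight matrix has shape (n_inputs, n_outputs).
    """
    # Forward pass, keeping the input activation of every layer.
    activations = [np.asarray(x, dtype=float)]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))

    # Start with all relevance placed on the chosen output neuron.
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]

    # Backward pass: each neuron passes its relevance to its inputs in
    # proportion to their contribution a_i * w_ij to its pre-activation z_j.
    layers = zip(reversed(weights), reversed(biases), reversed(activations[:-1]))
    for W, b, a in layers:
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer against division by zero
        s = relevance / z
        relevance = a * (W @ s)
    return relevance

# Toy two-layer network with zero biases (assumed values, not from the talk).
weights = [np.array([[1.0, -1.0], [2.0, 1.0]]),
           np.array([[1.0, 0.0], [-1.0, 2.0]])]
biases = [np.zeros(2), np.zeros(2)]
x = np.array([1.0, 2.0])

relevance = lrp_epsilon(weights, biases, x, target=0)
print(relevance)  # approximately [2. 2.]
```

In this toy case the output score for class 0 is 4, and the two input relevances sum to roughly that value. This conservation of relevance across layers holds here, up to the small stabilizer, because the biases are zero.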