A learning machine that a user can trust and easily operate needs to be fashioned with the ability of explanation. 

While deep neural networks lead to impressive successes, e.g. they can now reliably identify 1000 object classes, argue about their interactions through natural language, answer questions about their attributes through interactive dialogues, integrated interpretability is still in its early stages. In other words, we do not know why these deep learning based visual classifications systems work when they are accurate and why they do not work when they make mistakes. Enabling such transparency requires the interplay of different modalities such as images and text, whereas current deep networks are designed as a combination of different tools each optimising a different learning objective with extremely weak and uninterpretable communication channels. However, deep neural networks draw their power from their ability to process large amounts of data in an end-to-end manner through a feedback loop with forward and backward processing. Although interventions on the feedback loop have been implemented by removing neurons and back propagating gradients, a generalizable multi-purpose interpretability is still far from reach. 

Deep neural networks require a large amount of labeled training data to reach reliable conclusions. For instance, the system needs to observe the driver’s behavior at the red light to be able to learn to stop at red light both in a sunny and rainy weather, both in daylight and in night, both in fog and in snow, and so on. This causes a significant overhead in labelling every possible situation. Hence, our aim is to build an explainable machine learning system that can learn the meaning of “red light” and use this knowledge to identify many other related situations, e.g. although red light may look different in darkness vs daylight, the most important aspect in such a situation is to identify that the vehicle needs to stop. In other words, we would like to transfer the explainable behaviour of a decision maker to novel situations.

In summary, we would like to develop an end-to-end trainable decision maker operating in sparse data regime with an integrated interpretability module. Our main research direction to build such a system is two folds: learning representations with weak supervision and generating multimodal explanations of classification decisions.