Explain Yourself - A Primer on ML Interpretability & Explainability
On the intersection of causality and interpretability, two exciting and promising research areas in Machine Learning
What is intelligence? The project of defining what the late Marvin Minsky called a suitcase word — a word with so much packed inside it that it is difficult for us to unpack and understand its embedded intricacy in its entirety — has not been without its fair share of challenges. The term has no single agreed-upon definition; the dimensions of description shift from optimization or efficient search-space exploration to rationality and the ability to adapt to uncertain environments, depending on which expert you ask. The confusion becomes more salient when one hears of machines achieving super-human performance in activities like chess or Go — traditional stand-ins for high intellectual aptitude — while failing miserably at tasks like grasping objects or moving across uneven terrain, which most of us do without thinking.
But several themes do emerge when we try to corner the concept. Our ability to explain why we do what we do appears repeatedly in the definitions proposed across disciplines. If we are to extend this to algorithms — and there is no reason why we cannot or should not — we should expect intelligent machines to explain themselves in ways that make sense to us. That is precisely where interpretability and explainability fit into the picture.