April 25, 2017

Network Dissection

One of the principal challenges facing humanity is transparency: how to maintain human comprehension and control as we build ever more complex systems and societies.

Doing a PhD at MIT has allowed me to pour my efforts into one corner of this problem that I think is important: cracking open the black box of deep neural networks. Deep neural networks are self-programming systems that use numerical methods to create layered models that are incomprehensibly complex. And yet despite our ignorance of how deep nets work, we engineers have been busy stitching these opaque systems into our everyday lives.

I have just posted my first paper in this area from MIT, and I am proud of the work. It was done together with Bolei Zhou and Aditya Khosla and our advisers Aude Oliva and Antonio Torralba. Motivated by the notion that the first step towards improving something is to measure it, we lay out a way to quantify the human interpretability of deep networks. We show how interpretability varies over a range of different types of neural networks.

We also use our measurement to discover a fact that contradicts prevailing wisdom. We find that interpretability is not isotropic in representation space. That is, neural networks align their representations with individual variables that are much more interpretable than random linear combinations of those variables. This behavior is a hint that networks may be decomposing problems along human-understandable lines. Networks may be rediscovering human common sense.
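The "not isotropic" claim can be illustrated with a toy sketch. This is not the paper's code or data: the synthetic concept masks, the noise level, the quantile threshold, and every function name below are assumptions made up for illustration. The sketch builds units that each respond to one concept, scores each unit by the best intersection-over-union (IoU) between its thresholded activation and any concept mask, and then repeats the scoring after mixing the units with a random orthonormal change of basis.

```python
import math
import random


def iou(a, b):
    """Intersection over union of two equal-length boolean lists."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def top_quantile_mask(values, quantile):
    """Mark the locations whose value exceeds the given quantile."""
    cutoff = sorted(values)[int(quantile * len(values))]
    return [v > cutoff for v in values]


def random_orthonormal_basis(dim, rng):
    """Random orthonormal basis via Gram-Schmidt on Gaussian vectors."""
    basis = []
    for _ in range(dim):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis


def mean_best_iou(units, concepts, quantile):
    """Average, over units, of the best IoU against any concept mask."""
    scores = []
    for unit in units:
        mask = top_quantile_mask(unit, quantile)
        scores.append(max(iou(mask, c) for c in concepts))
    return sum(scores) / len(scores)


def run_demo(dim=16, locations=2000, prevalence=0.1, noise=0.1, seed=0):
    """Return (axis-aligned score, rotated score) on synthetic data."""
    rng = random.Random(seed)
    # Hypothetical concepts: each is present at a random 10% of locations.
    concepts = [[rng.random() < prevalence for _ in range(locations)]
                for _ in range(dim)]
    # Axis-aligned units: unit i responds to concept i, plus Gaussian noise.
    units = [[(1.0 if m else 0.0) + rng.gauss(0.0, noise) for m in c]
             for c in concepts]
    # Rotated units: each is a random linear combination of the originals.
    basis = random_orthonormal_basis(dim, rng)
    rotated = [[sum(w * units[j][n] for j, w in enumerate(row))
                for n in range(locations)]
               for row in basis]
    quantile = 1.0 - prevalence  # assumed threshold, tuned to this toy data
    return (mean_best_iou(units, concepts, quantile),
            mean_best_iou(rotated, concepts, quantile))


if __name__ == "__main__":
    axis_score, rotated_score = run_demo()
    print("axis-aligned units:", round(axis_score, 3))
    print("rotated units:     ", round(rotated_score, 3))
```

In this toy setup the axis-aligned units are interpretable by construction, so the sketch only shows what the measurement looks like when the effect is present. The surprising part of the finding is that trained networks behave this way without being told to.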

It is just a first step. But it is a step in a longer research program I want to pursue.

Read about Network Dissection here; code and data are available. We have posted a preprint of our paper on arXiv, and we will be presenting it at CVPR 2017 this summer.

Posted by David at April 25, 2017 10:01 PM