Senior Research Fellowship

Royal Academy of Engineering

Royal Academy of Engineering/The Leverhulme Trust Senior Research Fellowship

From October 2019 to September 2020, I am working full time on a research fellowship in the area of model-based deep learning.

Broadly, there are two ways in which we can create machines with visual capabilities:

  1. We can “teach” them what we know about the visual world. This knowledge is encapsulated in models that have been developed by scientists and engineers over centuries to explain processes such as the reflection of light from a surface or the projection of the 3D world to a 2D image via a camera. These mathematical models are fitted to data to provide an explanation of observed appearance. I refer to this as the “model-based” approach to computer vision.

  2. They can “learn” from data. Here, a black-box machine learning algorithm (most successfully a convolutional neural network or CNN) is trained to map inputs to desired outputs. The black box knows nothing about the nature of the data or the problem at hand and learns to solve the entire problem from scratch. To achieve this remarkable feat, CNNs require a very large number of trainable parameters and hence, to avoid overfitting, a very large training set. Given thousands or millions of images, each paired with its desired output label, CNNs are capable of achieving state-of-the-art performance on a wide range of vision tasks. I refer to this as the “learning-based” approach.
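To make the “model-based” approach in the first item concrete, here is a minimal sketch of fitting a mathematical model to data. It assumes an idealised pinhole camera with a single unknown focal length and known 3D points. The function names, the synthetic data and the noise level are all illustrative assumptions, not part of the fellowship's actual methods.

```python
import random

def project(point, focal_length):
    """Pinhole projection of a 3D point (X, Y, Z) to 2D image coordinates."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def fit_focal_length(points_3d, observations_2d):
    """Closed-form least-squares estimate of the focal length.

    The model is linear in f (u = f * X / Z, v = f * Y / Z), so the
    minimiser of the squared error is sum(r * obs) / sum(r * r),
    where r is X/Z or Y/Z for each observed coordinate.
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y, z), (u, v) in zip(points_3d, observations_2d):
        for ratio, observed in ((x / z, u), (y / z, v)):
            numerator += ratio * observed
            denominator += ratio * ratio
    return numerator / denominator

# Synthetic data (hypothetical values): 50 points in front of the camera,
# observed with a small amount of Gaussian pixel noise.
rng = random.Random(0)
true_f = 800.0
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(2, 5))
          for _ in range(50)]
observations = [tuple(c + rng.gauss(0, 0.5) for c in project(p, true_f))
                for p in points]

estimated_f = fit_focal_length(points, observations)
print(f"estimated focal length: {estimated_f:.1f}")
```

Here the model has a single, physically meaningful parameter, so a few dozen observations are enough to recover it. A learning-based system, by contrast, would have to discover the projection relationship itself from many labelled examples.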

Over the past five years, the learning-based approach has come to dominate computer vision completely, delivering remarkable performance breakthroughs. However, this step change in performance has come at a cost: we have gone from having a very good understanding of approaches that don’t work very well to having methods that work extremely well for reasons we don’t understand. This project seeks to unify these two divergent approaches: to “model what we know and learn the rest”. Specifically, I will develop new architectures that allow CNNs to learn from explicit models.