Reverse engineering visual intelligence - James DiCarlo

Stanford Neurosciences Institute, James DiCarlo
October 11, 2018 - 11:30am to 12:15pm
Li Ka Shing Center, Berg Hall

James DiCarlo

Peter de Florez Professor of Neuroscience
Head, Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology


The brain and cognitive sciences are hard at work on a great quest — to reverse engineer the human mind and its intelligent behavior. Success in this quest will be achieved by intersecting the efforts of brain and cognitive scientists with forward engineering aimed at emulating intelligence (machine learning and AI). I will illustrate this using one aspect of human intelligence — visual object perception — and I will tell the story of how the brain sciences and computer sciences converged to create artificial deep neural networks that can support such tasks. These networks not only approach human performance, but their internal workings are modeled after — and largely explain and predict — the internal workings of the primate visual system. Yet, the primate visual system still outperforms current artificial networks, and I will show some new clues that the brain sciences can offer.


James DiCarlo is a professor of neuroscience and head of the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology. He has been a Sloan Research Fellow, a Pew Biomedical Scholar, and a McKnight Scholar. His research goal is a computational understanding of the brain mechanisms that underlie primate visual intelligence. Over the last 15 years, his group has helped reveal how population image transformations carried out by a deep stack of neocortical processing stages, called the primate ventral visual stream, are effortlessly able to extract object identity and other latent variables such as object position, scale, and pose from visual images. His group is currently using a combination of large-scale neurophysiology, brain imaging, direct neural perturbation methods, and machine learning methods to build neurally mechanistic computational models of the ventral visual stream and its support of cognition and behavior. They aim to use this model-based understanding to inspire and develop new machine vision approaches, new neural prosthetics (brain-machine interfaces) to restore or augment lost senses, and a new foundation from which to attack human conditions such as agnosia, dyslexia, and autism.