The sparse coding model has been shown to provide a good account of neural response properties at early stages of sensory processing. However, despite several promising efforts, it remains unclear how to exploit the structure in a sparse code for learning higher-order structure at later stages of processing. Here I shall argue that the key lies in understanding how continuous transformations in the signal space (manifolds) are expressed in the elements of a sparse code, and in deriving the proper computations that disentangle these transformations from the underlying invariances (persistence). I shall present a new signal representation framework, called the sparse manifold transform (SMT), which exploits temporally persistent structure in the input (similar to slow feature analysis) in order to turn non-linear transformations in the signal space into linear interpolations in a representational embedding space. The SMT thus provides a way to progressively flatten manifolds, allowing higher-order structure to be learned at each subsequent stage of processing. The SMT also provides a principled way to derive the pooling layers commonly used in deep networks, and since the transform is approximately invertible, dictionary elements learned at any level in the hierarchy may be directly visualized. Possible neural substrates and mechanisms of the SMT shall be discussed. Joint work with Yubei Chen and Dylan Paiton.
Chen Y, Paiton DM, Olshausen BA (2018). The Sparse Manifold Transform. Advances in Neural Information Processing Systems (NeurIPS) 31.
Cadieu CF, Olshausen BA (2012). Learning intermediate-level representations of form and motion from natural movies. Neural Computation, 24(4):827-866.