Simon Schug
Postdoctoral Researcher
Princeton University
Email: sschug [ät] princeton.edu
GitHub ∙ Google Scholar ∙ Bluesky
In my research, I seek to understand the principles that allow large-scale neural networks (including the human brain) to rapidly adapt and systematically generalize.
In some of my recent work, we identify conditions under which modular neural networks will compositionally generalize, find that simply scaling neural networks can lead to compositional generalization, and discover how transformers' multi-head attention can capture compositional structure in abstract reasoning tasks.
I am a postdoctoral researcher in the Department of Computer Science at Princeton University, where I work with Brenden Lake at the intersection of machine learning and cognitive science.
I completed my Ph.D. in Computer Science at ETH Zurich in February 2025, where I was supervised by João Sacramento and Angelika Steger and worked on meta-learning and compositional generalization in neural networks. During the final year of my Ph.D., I spent six months in the Foundational Research Team at Google DeepMind, researching efficient inference in large-scale mixture-of-experts architectures.
Prior to my doctoral studies, I completed my Master's thesis at the University of Cambridge with Máté Lengyel and received an M.Sc. in Neural Systems & Computation from ETH Zurich and UZH in 2020. Before that, I simultaneously pursued and obtained B.Sc. degrees in Electrical Engineering and Psychology from RWTH Aachen in 2017.
A small utility to add Fully-Sharded Data Parallelism (FSDP) with minimal code changes in JAX.
A minimal but highly flexible hypernetwork implementation in JAX using Flax.
An extensible meta-learning library in JAX for research. It bundles various meta-learning algorithms and architectures that can be flexibly combined.
An Optax implementation of the shrink-and-perturb algorithm proposed by Ash & Adams to address pathologies of warm starting.
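The shrink-and-perturb idea behind that last project is simple: before warm-starting training on new data, each parameter is shrunk toward zero and perturbed with fresh Gaussian noise. A minimal sketch in plain JAX (not the library's actual API; the function name and default values here are illustrative assumptions) could look like this:

```python
# Hypothetical sketch of the shrink-and-perturb update (Ash & Adams, 2020).
import jax
import jax.numpy as jnp

def shrink_and_perturb(params, key, shrink=0.4, sigma=0.01):
    """Scale every parameter leaf by `shrink` and add Gaussian noise of scale `sigma`."""
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    new_leaves = [
        shrink * leaf + sigma * jax.random.normal(k, leaf.shape, leaf.dtype)
        for leaf, k in zip(leaves, keys)
    ]
    return jax.tree_util.tree_unflatten(treedef, new_leaves)
```

Applied between training rounds, this keeps the learned parameter directions while restoring enough gradient noise for the warm-started network to match the generalization of training from scratch.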
Lecture on bilevel optimization problems in neuroscience and machine learning at the MLSS^N 2022 summer school in Krakow.
A one-hour workshop on models of decision making in neuroscience taught for the Swiss Study Foundation.
A two-hour introductory workshop on cryptography, taught online for hebbian and adaptable for students aged 10-18.
A PyTorch implementation of the Equilibrium Propagation algorithm.
A four-hour workshop on git taught at the Goethe-University Frankfurt for the Frankfurt Open Science Initiative.