Richard Zemel - Professor of Computer Science at Columbia University

Richard Zemel

Trianthe Dakolias Professor of Engineering and Applied Science
Professor, Computer Science
Columbia University

619 Schapiro CEPSR
530 West 120th St, New York, NY 10027
zemel at cs dot columbia dot edu

Brief Bio

I am a professor of computer science at Columbia. I am also the director of the NSF AI Institute for ARtificial and Natural Intelligence (ARNI).

I previously co-founded the Vector Institute for Artificial Intelligence and served as its inaugural research director. I have been a visiting researcher at Amazon, Google, Meta and Spotify, and I founded and ran a startup company, Smartfinance.

I am a Canadian Institute for Advanced Research AI Chair and on the Advisory Board of the Neural Information Processing Society. I received an AI Lifetime Achievement Award (CAIA) and a Pioneer of AI Award (NVIDIA).

Research & Teaching

Our research develops learning algorithms that flexibly adapt across tasks and environments, with the aim of creating AI systems that are reliable, controllable, and trustworthy. Zgroup investigates how machine learning models can integrate diverse modalities, continually acquire new skills, quantify their own uncertainty, and remain robust in unfamiliar settings. Other interests include algorithmic fairness, interpretability, computational neuroscience, and applications of machine learning to high-stakes scientific and societal decisions.

I typically recruit one or two PhD students to join Zgroup each year. Prospective PhD students should apply to the PhD program. Due to the volume of email we receive, we unfortunately cannot respond to emails about applications.

In Fall 2026 I will teach two courses: Neural Networks & Deep Learning (COMS 4776), and Continual Learning & Memory Models (COMS 6998).

Current Zgroup PhD Students and Postdocs

Graduated PhD Students and Former Postdocs

Representative Papers

The papers below are representative examples drawn from our main research directions. Some develop systems that flexibly span language, vision, and real-world reasoning. Others tackle how to acquire new skills efficiently and stably as the world changes. Another strand develops models that know what they know — and what they don't — and provides reliable performance guarantees. A final strand builds systems that handle scenarios beyond their training data, producing trustworthy responses to ambiguous inputs.

2026

Level Up: Defining and Exploiting Transitional Problems for Curriculum Learning
Zhenwei Tang, Amogh Inamdar, Ashton Anderson, Richard Zemel
Paper BibTeX

Whom to Query for What: Adaptive Group Elicitation via Multi-Turn LLM Interactions
Ruomeng Ding, Tianwei Gao, Thomas P. Zollo, Eitan Bachmat, Richard Zemel, Zhun Deng
Paper BibTeX

2025

Adaptive Elicitation of Latent Information Using Natural Language
Jimmy Wang, Thomas Zollo, Richard Zemel, Hongseok Namkoong
Paper BibTeX

Guiding LLM Decision-Making with Fairness Reward Models
Zara Hall, Melanie Subbiah, Thomas P Zollo, Kathleen McKeown, Richard Zemel
Paper BibTeX

Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads
Todd Morrill, Aahlad Puli, Murad Megjhani, Soojin Park, Richard Zemel
Paper BibTeX

QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions
Zhun Deng, Thomas P Zollo, Benjamin Eyre, Amogh Inamdar, David Madras, Richard Zemel
Paper BibTeX

Replay Can Provably Increase Forgetting
Yasaman Mahdaviyeh, James Lucas, Mengye Ren, Andreas S. Tolias, Richard Zemel, Toniann Pitassi
Paper BibTeX

Schemex: Discovering Structural Abstractions from Examples
Sitong Wang, Samia Menon, Dingzeyu Li, Xiaojuan Ma, Richard Zemel, Lydia B. Chilton
Paper BibTeX

Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stańczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barnes, Jason Stanley, Jessica Montgomery, Richard Zemel, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P. Lillicrap, Ana Marasović, Sylvie Delacroix, Gillian K. Hadfield, Siva Reddy
Paper BibTeX

Test-Time Warmup for Multimodal Large Language Models
Nikita Rajaneesh, Thomas Zollo, Richard Zemel
Paper BibTeX

Towards Effective Discrimination Testing for Generative AI
Thomas P. Zollo, Nikita Rajaneesh, Richard Zemel, Talia B. Gillis, Emily Black
Paper BibTeX

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation
Tharindu Kumarage, Ninareh Mehrabi, Anil Ramakrishna, Xinyan Zhao, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, Charith Peris
Paper BibTeX

2024

Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
Tao Meng, Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Aram Galstyan, Richard Zemel, Kai-Wei Chang, Rahul Gupta, Charith Peris
Paper BibTeX

Controlling the World by Sleight of Hand
Sruthi Sudhakar, Ruoshi Liu, Basile Van Hoorick, Carl Vondrick, Richard Zemel
Paper BibTeX

Improving Predictor Reliability with Selective Recalibration
Thomas P. Zollo, Zhun Deng, Jake C. Snell, Toniann Pitassi, Richard Zemel
Paper BibTeX

Integrating Present and Past in Unsupervised Continual Learning
Yipeng Zhang, Laurent Charlin, Richard Zemel, Mengye Ren
Paper BibTeX

Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift
Benjamin Eyre, Elliot Creager, David Madras, Vardan Papyan, Richard Zemel
Paper BibTeX

Partial Federated Learning
Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta
Paper BibTeX

Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions
Jake C. Snell, Thomas P. Zollo, Zhun Deng, Toniann Pitassi, Richard Zemel
Paper BibTeX

2023

"I'm fully who I am": Towards centering transgender and non-binary voices to measure biases in open language generation
Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
Paper BibTeX

Coordinated replay sample selection for continual federated learning
Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta
Paper BibTeX

Differentially private decoding in large language models
Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel
Paper BibTeX

Distribution-free statistical dispersion control for societal applications
Zhun Deng, Thomas Zollo, Jake Snell, Toniann Pitassi, Richard Zemel
Paper BibTeX

FLIRT: Feedback Loop In-context Red Teaming
Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
Paper BibTeX

ICL Markup: Structuring in-context learning using soft-token tags
Marc-Etienne Brunet, Ashton Anderson, Richard Zemel
Paper BibTeX

JAB: Joint Adversarial Prompting and Belief Augmentation
Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
Paper BibTeX

On the steerability of large language models toward data-driven personas
Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
Paper BibTeX

Prompt Risk Control: A rigorous framework for responsible deployment of large language models
Thomas Zollo, Todd Morrill, Zhun Deng, Jake Snell, Toniann Pitassi, Richard Zemel
Paper BibTeX

Resolving ambiguities in text-to-image generative models
Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta
Paper BibTeX

Semantically informed slang interpretation
Zhewei Sun, Richard Zemel, Yang Xu
Paper BibTeX

SurfsUp: Learning fluid simulation for novel surfaces
Arjun Mani, Ishaan Preetam Chandratreya, Elliot Creager, Carl Vondrick, Richard Zemel
Paper BibTeX

Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies
Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Yuval Pinter, Rahul Gupta
Paper BibTeX

2022

Disentanglement and generalization under correlation shifts
Christina Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kummerer, Richard Zemel, Matthias Bethge
Paper BibTeX

Deep ensembles work, but are they necessary?
Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John Cunningham
Paper BibTeX

2021

Environment inference for invariant learning
Elliot Creager, Jorn Jacobsen, Richard Zemel
Paper BibTeX

NP-DRAW: A non-parametric structured latent variable model for image generation
Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao
Paper BibTeX

On monotonic linear interpolation of neural network parameters
James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, Roger Grosse
Paper BibTeX

Online Unsupervised Learning of Visual Representations and Categories
Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel
Paper BibTeX

Theoretical bounds on estimation error for meta-learning
James Lucas, Mengye Ren, Irene Kameni, Toni Pitassi, Richard Zemel
Paper BibTeX

Universal template for few-shot dataset generalization
Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin
Paper BibTeX

Variational Model Inversion Attacks
Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani
Paper BibTeX

Wandering within a world: Online contextualized few-shot learning
Mengye Ren, Michael Iuzzolino, Michael Mozer, Richard Zemel
Paper BibTeX

2020

Causal modeling for fairness in dynamical systems
Elliot Creager, David Madras, Toni Pitassi, Richard Zemel
Paper BibTeX

Cutting out the middle-man: Training and evaluating energy-based models
Will Grathwohl, Jackson Wang, Jorn Jacobsen, David Duvenaud, Richard Zemel
Paper BibTeX

Optimizing long-term social welfare in recommender systems: A constrained matching approach
Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier
Paper BibTeX

Probing Few-Shot Generalization with Attributes
Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel
Paper BibTeX

Shortcut learning in deep neural networks
Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix Wichmann
Paper BibTeX

Understanding the limitations of conditional generative models
Ethan Fetaya, Joern-Henrik Jacobsen, Will Grathwohl, Richard Zemel
Paper BibTeX

2019

A divergence minimization perspective on imitation learning methods
Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shane Gu
Paper BibTeX

Aggregated momentum: Stability through passive damping
James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse
Paper BibTeX

Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models
Guangyong Chen, Pengfei Chen, Chang-Yu Hsieh, Chee-Kong Lee, Benben Liao, Renjie Liao, Weiwen Liu, Jiezhong Qiu, Qiming Sun, Jie Tang, Richard Zemel, Shengyu Zhang
Paper BibTeX

Dimensionality reduction for representing the knowledge of probabilistic models
Marc Law, Jake Snell, Amir-massoud Farahmand, Raquel Urtasun, Richard Zemel
Paper BibTeX

Efficient graph generation with graph recurrent attention networks
Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William Hamilton, David Duvenaud, Raquel Urtasun, Richard Zemel
Paper BibTeX

Excessive invariance causes adversarial vulnerability
Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge
Paper BibTeX

Flexibly fair representation learning by disentanglement
Elliot Creager, David Madras, Joern-Henrik Jacobsen, Marissa Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel
Paper BibTeX

Incremental few-shot learning with attention attractor networks
Mengye Ren, Renjie Liao, Ethan Fetaya, Richard Zemel
Paper BibTeX

LanczosNet: Multi-scale deep graph convolutional networks
Renjie Liao, Zhizhen Zhao, Raquel Urtasun, Richard Zemel
Paper BibTeX

Lorentzian distance learning for hyperbolic representations
Marc Law, Renjie Liao, Jake Snell, Richard Zemel
Paper BibTeX

Understanding the origins of bias in word embeddings
Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel
Paper BibTeX

2018

Adversarial distillation of Bayesian neural network posteriors
Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
Paper BibTeX

Graph Partition Neural Networks for Semi-Supervised Classification
Renjie Liao, Marc Brockschmidt, Daniel Tarlow, Alexander L. Gaunt, Raquel Urtasun, Richard Zemel
Paper BibTeX

Inference in Probabilistic Graphical Models by Graph Neural Networks
KiJung Yoon, Renjie Liao, Yuwen Xiong, Lisa Zhang, Ethan Fetaya, Raquel Urtasun, Richard Zemel, Xaq Pitkow
Paper BibTeX

Learning adversarially fair and transferable representations
David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel
Paper BibTeX

Meta-Learning for Semi-Supervised Few-Shot Classification
Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle, Richard S. Zemel
Paper BibTeX

Neural guided constraint logic programming for program synthesis
Lisa Zhang, Gregory Rosenblatt, Ethan Fetaya, Renjie Liao, William Byrd, Matthew Might, Raquel Urtasun, Richard Zemel
Paper BibTeX

Neural relational inference for interacting systems
Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel
Paper BibTeX

Reviving and improving recurrent back-propagation
Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Zachary Pitkow, Raquel Urtasun, Richard Zemel
Paper BibTeX

The elephant in the room
Amir Rosenfeld, Richard Zemel, John K. Tsotsos
Paper BibTeX

2017

Causal effect inference with deep latent-variable models
Christos Louizos, Uri Shalit, Joris Mooij, David Sontag, Richard Zemel, Max Welling
Paper BibTeX

Deep spectral clustering learning
Marc Law, Raquel Urtasun, Richard Zemel
Paper BibTeX

Dualing GANs
Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel
Paper BibTeX

Efficient multiple instance metric learning using weakly supervised data
Marc Law, Yaoliang Yu, Raquel Urtasun, Richard Zemel, Eric Xing
Paper BibTeX

Few-shot learning through an information retrieval lens
Eleni Triantafillou, Richard Zemel, Raquel Urtasun
Paper BibTeX

Learning to generate images with perceptual similarity metrics
Jake Snell, Karl Ridgeway, Renjie Liao, Brett Roads, Michael Mozer, Richard Zemel
Paper BibTeX

Normalizing the normalizers: Comparing and extending network normalization schemes
Mengye Ren, Renjie Liao, Raquel Urtasun, Fabian Sinz, Richard Zemel
Paper BibTeX

Prototypical networks for few-shot learning
Jake Snell, Kevin Swersky, Richard Zemel
Paper BibTeX

Stochastic segmentation trees
Jake Snell, Richard Zemel
Paper BibTeX

2016

Gated graph sequence neural networks
Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel
Paper BibTeX

Learning deep parsimonious representations
Renjie Liao, Alexander Schwing, Richard Zemel, Raquel Urtasun
Paper BibTeX

The variational fair autoencoder
Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel
Paper BibTeX

Towards generalizable sentence embeddings
Eleni Triantafillou, Jamie Ryan Kiros, Raquel Urtasun, Richard Zemel
Paper BibTeX

Training deep neural networks via direct loss minimization
Yang Song, Alex Schwing, Richard Zemel, Raquel Urtasun
Paper BibTeX

2015

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler
Paper BibTeX

Generative Moment Matching Networks
Yujia Li, Kevin Swersky, Richard Zemel
Paper BibTeX

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio
Paper BibTeX

Skip-Thought Vectors
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler
Paper BibTeX

2014

Input Warping for Bayesian Optimization of Non-stationary Functions
Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams
Paper BibTeX

Learning unbiased features
Yujia Li, Kevin Swersky, Richard Zemel
Paper BibTeX

Mean-Field Networks
Yujia Li, Richard Zemel
Paper BibTeX

2013

Learning Fair Representations
Richard Zemel, Yu Wu, Kevin Swersky, Toniann Pitassi, Cynthia Dwork
Paper BibTeX

2012

Active Learning for Matching Problems
Laurent Charlin, Richard Zemel, Craig Boutilier
Paper BibTeX

Fairness Through Awareness
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Richard Zemel
Paper BibTeX

Fast Exact Inference for Recursive Cardinality Models
Daniel Tarlow, Kevin Swersky, Richard S. Zemel, Ryan Prescott Adams, Brendan J. Frey
Paper BibTeX

2011

A Framework for Optimizing Paper Matching
Laurent Charlin, Richard S. Zemel, Craig Boutilier
Paper BibTeX

Interpreting Graph Cuts as a Max-Product Algorithm
Daniel Tarlow, Inmar E. Givoni, Richard S. Zemel, Brendan J. Frey
Paper BibTeX

Loss-sensitive Training of Probabilistic Conditional Random Fields
Maksims N. Volkovs, Hugo Larochelle, Richard S. Zemel
Paper BibTeX

Ranking via Sinkhorn Propagation
Ryan Prescott Adams, Richard S. Zemel
Paper BibTeX

2008

Flexible Priors for Exemplar-based Clustering
Daniel Tarlow, Richard S. Zemel, Brendan J. Frey
Paper BibTeX

2007

Collaborative Filtering and the Missing at Random Assumption
Benjamin Marlin, Richard S. Zemel, Sam Roweis, Malcolm Slaney
Paper BibTeX

2003

Active Collaborative Filtering
Craig Boutilier, Richard S. Zemel, Benjamin Marlin
Paper BibTeX

Efficient Parametric Projection Pursuit Density Estimation
Max Welling, Richard S. Zemel, Geoffrey E. Hinton
Paper BibTeX