Richard Zemel

Trianthe Dakolias Professor of Engineering and Applied Science
Professor, Department of Computer Science
Columbia University


Contact information:
Office: CEPSR/Shapiro 619
Twitter: @zemelgroup
Email: zemel at cs . columbia . edu

Mail: 500 W 120th St (Mudd bldg)
New York, NY, 10027



I am a professor in the Department of Computer Science at Columbia University. I am broadly interested in machine learning, artificial intelligence, statistics, neuroscience, and cognitive science. I am also the Director of the new NSF AI Institute for ARtificial and Natural Intelligence (ARNI).

I am currently recruiting a postdoctoral fellow to work jointly with Ashton Anderson and me on societal aspects of AI. More information about this position can be found here.

My recent research interests include:

Office hours

During the academic year I hold weekly office hours. For the Spring 2024 term these are Thursdays 3-4PM (starting in March). For students and postdocs, coming to my office hours is easier than using email to make an appointment.



Here are some recent talks:

Research Group

Students and Postdocs:

  • Marc-Etienne Brunet
  • Elliot Creager
  • Zhun Deng
  • Ben Eyre
  • Arjun Mani
  • Sruthi Sudhakar
  • Tom Zollo

    Former students and postdocs:

  • Arnold Binas   (Senior Engineering Manager, Google)
  • Sagan Bolliger   (Master of Counselling Psychology, Adler University)
  • Miguel Carreira-Perpinan (Professor, University of California, Merced)
  • Laurent Charlin   (Associate Professor, University of Montreal)
  • Stephen Cowen   (Associate Professor, University of Arizona)
  • Emily Denton   (Senior Research Scientist, Google)
  • Stephen Fung   (Engineer, Google)
  • Ethan Fetaya   (Assistant Professor, Bar-Ilan University)
  • Kamyar Seyed Ghasemipour   (Research Engineer, Google)
  • Will Grathwohl   (Research Scientist, DeepMind)
  • Amit Gruber   (Research Scientist, IBM)
  • Xuming He   (Associate Professor, ShanghaiTech University)
  • Jorn Jacobsen   (Senior Research Scientist, Apple)
  • Nikola Karamanov   (Engineer, Visteon)
  • Jamie Ryan Kiros   (Entrepreneur, Latvia)
  • James Lucas   (Research Scientist, NVIDIA)
  • Gregory Koch  
  • Marc Law   (Senior Research Scientist, NVIDIA)
  • Yujia Li   (Research Scientist, DeepMind)
  • Renjie Liao   (Assistant Professor, University of British Columbia)
  • Jake Snell   (Postdoc, Princeton University)
  • David Madras   (Research Scientist, Google)
  • Benjamin Marlin   (Associate Professor, University of Massachusetts, Amherst)
  • James Martens   (Staff Research Scientist, DeepMind)
  • Rama Natarajan  
  • Jonathan Pillow   (Professor, Princeton University)
  • Mengye Ren   (Visiting Faculty Researcher, Google)
  • Danny Roobaert   (Founder, Lemaitre Capital)
  • David Ross   (Engineering Manager, Google Research)
  • Tanya Schmah   (Associate Professor, University of Ottawa)
  • Liam Stewart   (Software Engineer, Google)
  • Kevin Swersky   (Research Scientist, Google Brain)
  • Danny Tarlow   (Research Scientist, Google Brain)
  • David Towers  
  • Eleni Triantafillou   (Research Scientist, Google Brain)
  • Maks Volkovs   (Co-Founder, Layer 6 AI)
  • Alexander Wang   (PhD student, NYU)
  • Jackson Wang   (Postdoctoral Fellow, Stanford University)
  • Zhiyong Yang  
  • Lisa Zhang   (Assistant Professor, University of Toronto)

Prospective PhD Students

    If you are interested in applying for a PhD in Machine Learning at Columbia, you should apply through the Columbia University Computer Science department.

    Publications (from 2016-2023)


    Distribution-free statistical dispersion control for societal applications
    Zhun Deng, Thomas Zollo, Jake Snell, Toniann Pitassi, Richard Zemel
    NeurIPS, 2023.

    ICL Markup: Structuring in-context learning using soft-token tags
    Marc-Etienne Brunet, Ashton Anderson, Richard Zemel
    NeurIPS: R0-FoMo Workshop, 2023.

    Prompt Risk Control: A flexible framework for bounding the probability of high-loss predictions
    Thomas Zollo, Todd Morrill, Zhun Deng, Jake Snell, Toniann Pitassi, Richard Zemel
    NeurIPS SoLaR Workshop, 2023.

    On the steerability of large language models toward data-driven personas
    Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
    CIKM, 2023.

    Coordinated replay sample selection for continual federated learning
    Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta
    EMNLP, 2023.

    Resolving ambiguities in text-to-image generative models
    Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta
    ACL, 2023.

    SurfsUp: Learning fluid simulation for novel surfaces
    Arjun Mani, Ishaan Preetam Chandratreya, Elliot Creager, Carl Vondrick, Richard Zemel
    ICCV, 2023.

    "I'm fully who I am": Towards centering transgender and non-binary voices to measure biases in open language generation
    Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
    FAccT, 2023.

    Differentially private decoding in large language models
    Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel
    NAACL TrustNLP Workshop, 2023.

    Semantically informed slang interpretation
    Zhewei Sun, Richard Zemel, Yang Xu
    NAACL, 2023.


    Implications of model indeterminacy for explanations of automated decisions
    Marc-Etienne Brunet, Ashton Anderson, Richard Zemel
    NeurIPS, 2022.

    Deep ensembles work, but are they necessary?
    Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John Cunningham
    NeurIPS, 2022.

    Amortized Causal Discovery: Learning to infer causal graphs from time-series data
    Sindy Lowe, David Madras, Richard Zemel, Max Welling
    CLeaR, 2022.

    Disentanglement and generalization under correlation shifts
    Christina Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kummerer, Richard Zemel, Matthias Bethge
    CoLLAs, 2022.


    Identifying and benchmarking natural out-of-context prediction problems
    David Madras, Richard Zemel
    NeurIPS, 2021.

    Directly training joint energy-based models for conditional synthesis and calibrated prediction of multi-attribute data
    Jacob Kelly, Richard Zemel, Will Grathwohl
    ICML UDL, 2021.

    NP-DRAW: A non-parametric structured latent variable model for image generation
    Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao
    UAI, 2021.

    Fairness and robustness in invariant learning: A case study in toxicity classification
    Robert Adragna, Elliot Creager, David Madras, Richard Zemel
    NeurIPS Workshop: Algorithmic Fairness through the Lens of Causality and Interpretability, 2021.

    Environment inference for invariant learning
    Elliot Creager, Jorn Jacobsen, Richard Zemel
    ICML, 2021.

    SketchEmbedNet: Learning novel concepts by imitating drawings
    Alex Wang, Mengye Ren, Richard Zemel
    ICML, 2021.

    Universal template for few-shot dataset generalization
    Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin
    ICML, 2021.

    On monotonic linear interpolation of neural network parameters
    James Lucas, Juhan Bae, Michael Zhang, Stanislav Fort, Richard Zemel, Roger Grosse
    ICML, 2021.

    A computational framework for slang generation
    Zhewei Sun, Richard Zemel, Yang Xu
    Transactions of the Association for Computational Linguistics, 9: 462-478 (2021).

    Wandering within a world: Online contextualized few-shot learning
    Mengye Ren, Michael Iuzzolino, Michael Mozer, Richard Zemel
    ICLR, 2021.

    Bayesian few-shot classification with one-vs-each Polya-Gamma augmented Gaussian Processes
    Jake Snell, Richard Zemel
    ICLR, 2021.

    Theoretical bounds on estimation error for meta-learning
    James Lucas, Mengye Ren, Irene Kameni, Toni Pitassi, Richard Zemel
    ICLR, 2021.

    A PAC-Bayesian approach to generalization bounds for graph neural networks
    Renjie Liao, Raquel Urtasun, Richard Zemel
    ICLR, 2021.


    Shortcut learning in deep neural networks
    Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix Wichmann
    Nature Machine Intelligence: 2, 2020.

    Causal modeling for fairness in dynamical systems
    Elliot Creager, David Madras, Toni Pitassi, Richard Zemel
    ICML, 2020.

    Cutting out the middle-man: Training and evaluating energy-based models
    Will Grathwohl, Jackson Wang, Jorn Jacobsen, David Duvenaud, Richard Zemel
    ICML, 2020.

    Optimizing long-term social welfare in recommender systems: A constrained matching approach
    Martin Mladenov, Elliot Creager, O Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier
    ICML, 2020.

    Understanding the limitations of conditional generative models
    Ethan Fetaya, Joern-Henrik Jacobsen, Will Grathwohl, Richard Zemel
    ICLR, 2020.


    A divergence minimization perspective on imitation learning methods
    Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shane Gu
    CoRL, 2019.

    Efficient graph generation with graph recurrent attention networks
    Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William Hamilton, David Duvenaud, Raquel Urtasun, Richard Zemel
    NeurIPS, 2019.

    SMILe: Scalable meta inverse reinforcement learning through context-conditional policies
    Seyed Kamyar Seyed Ghasemipour, Shane Gu, Richard Zemel
    NeurIPS, 2019.

    Incremental few-shot learning with attention attractor networks
    Mengye Ren, Renjie Liao, Ethan Fetaya, Richard Zemel
    NeurIPS, 2019.

    Understanding the origins of bias in word embedding
    Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel
    ICML, 2019.

    Lorentzian distance learning for hyperbolic representations
    Marc Law, Renjie Liao, Jake Snell, Richard Zemel
    ICML, 2019.

    Flexibly fair representation learning by disentanglement
    Elliot Creager, David Madras, Joern-Henrik Jacobsen, Marissa Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel
    ICML, 2019.

    Dimensionality reduction for representing the knowledge of probabilistic models
    Marc Law, Jake Snell, Amir-massoud Farahmand, Raquel Urtasun, Richard Zemel
    ICLR, 2019.

    Aggregated momentum: Stability through passive damping
    James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse
    ICLR, 2019.

    Excessive invariance causes adversarial vulnerability
    Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge
    ICLR, 2019.

    LanczosNet: Multi-scale deep graph convolutional networks
    Renjie Liao, Zhizhen Zhao, Raquel Urtasun, Richard Zemel
    ICLR, 2019.

    Fairness through causal awareness: Learning causal latent-variable models for biased data
    David Madras, Elliot Creager, Toni Pitassi, Richard Zemel
    FAccT, 2019.


    Neural guided constraint logic programming for program synthesis
    Lisa Zhang, Gregory Rosenblatt, Ethan Fetaya, Renjie Liao, William Byrd, Matthew Might, Raquel Urtasun, Richard Zemel
    NeurIPS, 2018.

    Predict responsibly: improving fairness and accuracy by learning to defer
    David Madras, Toni Pitassi, Richard Zemel
    NeurIPS, 2018.

    Learning latent subspaces in variational autoencoders
    Jack Klys, Jake Snell, Richard Zemel
    NeurIPS, 2018.

    Neural relational inference for interacting systems
    Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel
    ICML, 2018.

    Adversarial distillation of Bayesian neural network posteriors
    Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
    ICML, 2018.

    Learning adversarially fair and transferable representations
    David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel
    ICML, 2018.

    Reviving and improving recurrent back-propagation
    Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Zachary Pitkow, Raquel Urtasun, Richard Zemel
    ICML, 2018.

    The elephant in the room
    Amir Rosenfeld, Richard Zemel, John K. Tsotsos
    arXiv, 2018.


    Few-shot learning through an information retrieval lens
    Eleni Triantafillou, Richard Zemel, Raquel Urtasun
    NeurIPS, 2017.

    Dualing GANs
    Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel
    NeurIPS, 2017.

    Causal effect inference with deep latent-variable models
    Christos Louizos, Uri Shalit, Joris Mooij, David Sontag, Richard Zemel, Max Welling
    NeurIPS, 2017.

    Deep spectral clustering learning
    Marc Law, Raquel Urtasun, Richard Zemel
    ICML, 2017.

    Efficient multiple instance metric learning using weakly supervised data
    Marc Law, Yaoliang Yu, Raquel Urtasun, Richard Zemel, Eric Xing
    CVPR, 2017.

    Prototypical networks for few-shot learning
    Jake Snell, Kevin Swersky, Richard Zemel
    NeurIPS, 2017.

    Stochastic segmentation trees
    Jake Snell, Richard Zemel
    UAI, 2017.

    Learning to generate images with perceptual similarity metrics
    Jake Snell, Karl Ridgeway, Renjie Liao, Brett Roads, Michael Mozer, Richard Zemel
    ICIP, 2017.

    Normalizing the normalizers: Comparing and extending network normalization schemes
    Mengye Ren, Renjie Liao, Raquel Urtasun, Fabian Sinz, Richard Zemel
    ICLR, 2017.

    End-to-end instance segmentation with recurrent attention
    Mengye Ren, Richard Zemel
    CVPR, 2017.

    Towards generalizable sentence embeddings
    Eleni Triantafillou, Jamie Ryan Kiros, Raquel Urtasun, Richard Zemel
    ACL Workshop on Representation Learning for NLP, 2017.


    The variational fair autoencoder
    Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel
    ICLR, 2016.

    Gated graph sequence neural networks
    Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel
    ICLR, 2016.

    Training deep neural networks via direct loss minimization
    Yang Song, Alex Schwing, Richard Zemel, Raquel Urtasun
    ICLR, 2016.

    Classifying NBA offensive plays using neural networks
    Kuan-Chieh Wang, Richard Zemel
    Sloan Sports Analytics Conference, 2016.

    Understanding the effective receptive field in deep convolutional neural networks
    Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel
    NeurIPS, 2016.

    Learning deep parsimonious representations
    Renjie Liao, Alexander Schwing, Richard Zemel, Raquel Urtasun
    NeurIPS, 2016.