Carl Vondrick

Assistant Professor
Department of Computer Science
Columbia University

Office: 618 CEPSR WFH
Address: 530 West 120th St, New York, NY 10027


Our group studies computer vision and machine learning. By training machines to observe and interact with their surroundings, we aim to create robust and versatile models for perception. We often investigate visual models that capitalize on large amounts of unlabeled data and transfer across tasks and modalities. Other interests include scene dynamics; sound, language, and modalities beyond them; interpretable models; and perception for robotics.

Postdoc Opening: We have openings for postdoctoral fellows in self-supervised visual learning. Details here.


  • Our latest ECCV 2020 paper shows that deep networks are vulnerable to adversarial examples partly because they are trained on too few tasks.
  • Our ECCV 2020 paper shows meta-learning improves generalization for vision and language representations.
  • Check out our workshop on Learning from Unlabeled Video.
  • Oops! Our dataset of unintentional action was accepted to CVPR 2020 and is available for download.
  • Thank you to Toyota Research Institute and Amazon for supporting our lab!

PhD Students

Mia Chiquier
Starting Fall 2020
Sachit Menon
Starting Fall 2020
Boyuan Chen
with Hod Lipson
Chengzhi Mao
with Junfeng Yang


Dave Epstein (CRA Honorable Mention, now PhD student @ Berkeley)

All Papers



Multitask Learning Strengthens Adversarial Robustness (New!)
Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang, Carl Vondrick
ECCV 2020 (Oral)

We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
Alex Andonian, Camilo Fosco, Mathew Monfort, Allen Lee, Carl Vondrick, Rogerio Feris, Aude Oliva
ECCV 2020
Coming Soon

Learning to Learn Words from Visual Scenes (New!)
Dídac Surís*, Dave Epstein*, Heng Ji, Shih-Fu Chang, Carl Vondrick
ECCV 2020
Paper Project Page Code Talk

Oops! Predicting Unintentional Action in Video (New!)
Dave Epstein, Boyuan Chen, Carl Vondrick
CVPR 2020
Paper Project Page Data Code Talk


Metric Learning for Adversarial Robustness
Chengzhi Mao, Ziyuan Zhong, Junfeng Yang, Carl Vondrick, Baishakhi Ray
NeurIPS 2019
Paper Code

VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, Cordelia Schmid
ICCV 2019
Paper Blog

Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari, Svebor Karaman, Surabhi Bhargava, Brian Chen, Carl Vondrick, Shih-Fu Chang
CVPR 2019
Paper Code

Relational Action Forecasting
Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid
CVPR 2019 (Oral)


Tracking Emerges by Colorizing Videos
Carl Vondrick, Abhinav Shrivastava, Alireza Fathi, Sergio Guadarrama, Kevin Murphy
ECCV 2018
Paper Blog

The Sound of Pixels
Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba
ECCV 2018
Paper Project Page

Actor-centric Relation Network
Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid
ECCV 2018


Following Gaze in Video
Adria Recasens, Carl Vondrick, Aditya Khosla, Antonio Torralba
ICCV 2017

Generating the Future with Adversarial Transformers
Carl Vondrick, Antonio Torralba
CVPR 2017
Paper Project Page

Cross-Modal Scene Networks
Yusuf Aytar*, Lluis Castrejon*, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
PAMI 2017
Paper Project Page

See, Hear, and Read: Deep Aligned Representations
Yusuf Aytar, Carl Vondrick, Antonio Torralba
arXiv 2017
Paper Project Page


Generating Videos with Scene Dynamics
Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
NeurIPS 2016
Paper Project Page Code NBC Scientific American New Scientist MIT News

Anticipating Visual Representations with Unlabeled Video
Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
CVPR 2016 (Spotlight)
Paper Project Page NPR CNN AP Wired Stephen Colbert MIT News

Predicting Motivations of Actions by Leveraging Text
Carl Vondrick, Deniz Oktay, Hamed Pirsiavash, Antonio Torralba
CVPR 2016
Paper Dataset

Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon*, Yusuf Aytar*, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
CVPR 2016
Paper Project Page Demo

Visualizing Object Detection Features
Carl Vondrick, Aditya Khosla, Hamed Pirsiavash, Tomasz Malisiewicz, Antonio Torralba
IJCV 2016
Paper Project Page Slides MIT News


Do We Need More Training Data?
Xiangxin Zhu, Carl Vondrick, Charless C. Fowlkes, Deva Ramanan
IJCV 2015
Paper Dataset

Learning Visual Biases from Human Imagination
Carl Vondrick, Hamed Pirsiavash, Aude Oliva, Antonio Torralba
NeurIPS 2015
Paper Project Page Technology Review

Where are they looking?
Adria Recasens*, Aditya Khosla*, Carl Vondrick, Antonio Torralba
NeurIPS 2015
Paper Project Page Demo


Assessing the Quality of Actions
Hamed Pirsiavash, Carl Vondrick, Antonio Torralba
ECCV 2014
Paper Project Page


HOGgles: Visualizing Object Detection Features
Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba
ICCV 2013 (Oral)
Paper Project Page Slides MIT News


Do We Need More Training Data or Better Models for Object Detection?
Xiangxin Zhu, Carl Vondrick, Deva Ramanan, Charless C. Fowlkes
BMVC 2012
Paper Dataset

Efficiently Scaling Up Crowdsourced Video Annotation
Carl Vondrick, Donald Patterson, Deva Ramanan
IJCV 2012
Paper Project Page


Video Annotation and Tracking with Active Learning
Carl Vondrick, Deva Ramanan
NeurIPS 2011
Paper Project Page


Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces
Carl Vondrick, Deva Ramanan, Donald Patterson
ECCV 2010
Paper Project Page



Code and Data