Our group studies computer vision and machine learning. By training machines to observe and interact with their surroundings, we aim to create robust and versatile models for perception. We often investigate visual models that capitalize on large amounts of unlabeled data and transfer across tasks and modalities. Other interests include scene dynamics, sound and language, interpretable models, and perception for robotics.
Amogh Gupta (now at Amazon Research), Dave Epstein (CRA Honorable Mention, now PhD student at Berkeley)
Learning the Predictability of the Future
Dídac Surís*, Ruoshi Liu*, Carl Vondrick
CVPR 2021
Paper Project Page Code Models Talk
Generative Interventions for Causal Learning
Chengzhi Mao, Amogh Gupta, Augustine Cha, Hao Wang, Junfeng Yang, Carl Vondrick
CVPR 2021
Paper Code
Learning Goals from Failure
Dave Epstein, Carl Vondrick
CVPR 2021
Paper Project Page Data Code
Visual Behavior Modelling for Robotic Theory of Mind
Boyuan Chen, Carl Vondrick, Hod Lipson
Scientific Reports 2021
Paper Project Page
Globetrotter: Unsupervised Multilingual Translation from Visual Alignment
Dídac Surís, Dave Epstein, Carl Vondrick
arXiv 2020
Paper Project Page Code
Dissecting Image Crops
Basile Van Hoorick, Carl Vondrick
arXiv 2020
Paper Code
Listening to Sounds of Silence for Speech Denoising
Ruilin Xu, Rundi Wu, Yuko Ishiwaka, Carl Vondrick, Changxi Zheng
NeurIPS 2020
Paper Project Page
Multitask Learning Strengthens Adversarial Robustness
Chengzhi Mao, Amogh Gupta, Vikram Nitin, Baishakhi Ray, Shuran Song, Junfeng Yang, Carl Vondrick
ECCV 2020 (Oral)
Paper
We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos
Alex Andonian, Camilo Fosco, Mathew Monfort, Allen Lee, Carl Vondrick, Rogerio Feris, Aude Oliva
ECCV 2020
Paper Project Page
Learning to Learn Words from Visual Scenes
Dídac Surís*, Dave Epstein*, Heng Ji, Shih-Fu Chang, Carl Vondrick
ECCV 2020
Paper Project Page Code Talk
Oops! Predicting Unintentional Action in Video
Dave Epstein, Boyuan Chen, Carl Vondrick
CVPR 2020
Paper Project Page Data Code Talk
Metric Learning for Adversarial Robustness
Chengzhi Mao, Ziyuan Zhong, Junfeng Yang, Carl Vondrick, Baishakhi Ray
NeurIPS 2019
Paper Code
VideoBERT: A Joint Model for Video and Language Representation Learning
Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, Cordelia Schmid
ICCV 2019
Paper Blog
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari, Svebor Karaman, Surabhi Bhargava, Brian Chen, Carl Vondrick, Shih-Fu Chang
CVPR 2019
Paper Code
Relational Action Forecasting
Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid
CVPR 2019 (Oral)
Paper
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort et al.
PAMI 2019
Paper Project Page
Tracking Emerges by Colorizing Videos
Carl Vondrick, Abhinav Shrivastava, Alireza Fathi, Sergio Guadarrama, Kevin Murphy
ECCV 2018
Paper Blog
The Sound of Pixels
Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba
ECCV 2018
Paper Project Page
Actor-centric Relation Network
Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid
ECCV 2018
Paper
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
Chunhui Gu et al.
CVPR 2018 (Spotlight)
Paper Project Page
Following Gaze in Video
Adria Recasens, Carl Vondrick, Aditya Khosla, Antonio Torralba
ICCV 2017
Paper
Generating the Future with Adversarial Transformers
Carl Vondrick, Antonio Torralba
CVPR 2017
Paper Project Page
Cross-Modal Scene Networks
Yusuf Aytar*, Lluis Castrejon*, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
PAMI 2017
Paper Project Page
See, Hear, and Read: Deep Aligned Representations
Yusuf Aytar, Carl Vondrick, Antonio Torralba
arXiv 2017
Paper Project Page
Generating Videos with Scene Dynamics
Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
NeurIPS 2016
Paper Project Page Code NBC Scientific American New Scientist MIT News
SoundNet: Learning Sound Representations from Unlabeled Video
Yusuf Aytar*, Carl Vondrick*, Antonio Torralba
NeurIPS 2016
Paper Project Page Code NPR New Scientist Week Junior MIT News
Anticipating Visual Representations with Unlabeled Video
Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
CVPR 2016 (Spotlight)
Paper Project Page NPR CNN AP Wired Stephen Colbert MIT News
Predicting Motivations of Actions by Leveraging Text
Carl Vondrick, Deniz Oktay, Hamed Pirsiavash, Antonio Torralba
CVPR 2016
Paper Dataset
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon*, Yusuf Aytar*, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
CVPR 2016
Paper Project Page Demo
Visualizing Object Detection Features
Carl Vondrick, Aditya Khosla, Hamed Pirsiavash, Tomasz Malisiewicz, Antonio Torralba
IJCV 2016
Paper Project Page Slides MIT News
Do We Need More Training Data?
Xiangxin Zhu, Carl Vondrick, Charless C. Fowlkes, Deva Ramanan
IJCV 2015
Paper Dataset
Learning Visual Biases from Human Imagination
Carl Vondrick, Hamed Pirsiavash, Aude Oliva, Antonio Torralba
NeurIPS 2015
Paper Project Page Technology Review
Where are they looking?
Adria Recasens*, Aditya Khosla*, Carl Vondrick, Antonio Torralba
NeurIPS 2015
Paper Project Page Demo
Assessing the Quality of Actions
Hamed Pirsiavash, Carl Vondrick, Antonio Torralba
ECCV 2014
Paper Project Page
HOGgles: Visualizing Object Detection Features
Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz, Antonio Torralba
ICCV 2013 (Oral)
Paper Project Page Slides MIT News
Do We Need More Training Data or Better Models for Object Detection?
Xiangxin Zhu, Carl Vondrick, Deva Ramanan, Charless C. Fowlkes
BMVC 2012
Paper Dataset
Efficiently Scaling Up Crowdsourced Video Annotation
Carl Vondrick, Donald Patterson, Deva Ramanan
IJCV 2012
Paper Project Page
Video Annotation and Tracking with Active Learning
Carl Vondrick, Deva Ramanan
NeurIPS 2011
Paper Project Page
A Large-scale Benchmark Dataset for Event Recognition
Sangmin Oh et al.
CVPR 2011
Paper Project Page
Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces
Carl Vondrick, Deva Ramanan, Donald Patterson
ECCV 2010
Paper Project Page