Columbia University

E6998 Advanced Computer Vision


Carl Vondrick
TR 4:10pm-5:25pm
227 Seeley Mudd
Spring 2019

Office Hours

Tue/Thu 5:30pm-6pm, CEPSR 611
Mon 5:30pm-7:30pm, CS TA Room
Tue 2:15pm-3:15pm



  • Class Participation 20%
  • Class Presentation 40%
  • Final Project 40%


This is an advanced seminar course that will focus on the latest research in computer vision and related fields. Students will read, present, and discuss papers, and complete a semester-long project. Topics will include visual recognition, self-supervised learning, cross-modal transfer, neural network interpretation, commonsense reasoning, vision and language, and embodied vision. Experience in deep learning is strongly recommended.


  • Enrollment is capped at 30 students. You must have instructor approval to take the course.
  • The course website is under construction and subject to change.


The syllabus is subject to change as the course evolves.

List of suggested papers

Paper signup

Date | Presenter 1 | Paper 1 | Presenter 2 | Paper 2
Jan 22 | Carl Vondrick | Perception Beyond Measurement
Jan 24 | Carl Vondrick | Ecological Vision
Jan 29 | Justin Chou | What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. SIGGRAPH 2012. | Sagar Lal | What Makes an Image Memorable?
Jan 31 | Sebastian Cueva-Caro | Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick. ICCV 2017. | Simone Fobi | Light-Head R-CNN: In Defense of Two-Stage Object Detector. Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun. arXiv Nov 2017.
Feb 5 | Nathan Silberman | Computer Vision for Healthcare; ExplainGAN: Model Explanation via Decision Boundary Crossing Transformations; Understanding Equivalence and Noninferiority Testing
Feb 7 | Parita Pooj | Finding Tiny Faces | James Shin | PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas. arXiv Dec 2016.
Feb 12 | Project Pitch
Feb 14 | Project Pitch
Feb 19 | Dave Epstein | Low-shot Learning from Imaginary Data. Yu-Xiong Wang, Ross Girshick, Martial Hebert, Bharath Hariharan. CVPR 2018 (Spotlight). | Jessie Liu | Matching Networks for One Shot Learning
Feb 21 | Yicun Liu | Deformable Convolutional Networks | Terence Conlon | ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun. arXiv July 2017.
Feb 26 | Aashish Misraa | BigTime: Learning Intrinsic Images by Watching the World. Zhengqi Li and Noah Snavely. CVPR 2018. | Boyuan Chen | What happens if... Learning to Predict the Effect of Forces in Images
Feb 28 | Ian Huang | From Recognition to Cognition: Visual Commonsense Reasoning | Yiliang Shi | Inferring and Executing Programs for Visual Reasoning. Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick. ICCV 2017.
Mar 5 | Noah Snavely | 3D Computer Vision
Mar 7 | Lauren Arnett | Taskonomy: Disentangling Task Transfer Learning | Dimitri Leggas | Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
Mar 12 | Max Ogryzko | Learning Features by Watching Objects Move. Deepak Pathak, Ross Girshick, Piotr Dollar, Trevor Darrell, Bharath Hariharan. CVPR 2017. | Mayank Saxena | Fighting Fake News: Image Splice Detection via Learned Self-Consistency. Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros. ECCV 2018.
Mar 14 | Yueqi Wang | Cognitive Mapping and Planning for Visual Navigation | Suhyun Kim | Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition
Mar 19 | Spring Break
Mar 21 | Spring Break
Mar 26 | Lahav Lipson | Multimodal Unsupervised Image-to-Image Translation | Chris Alberti | Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input. David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, and James Glass.
Mar 28 | TBD | TBD
Apr 2 | Zheng Shou | Video
Apr 4 | Connor Goggins | Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog | Yiru Chen | Network Dissection: Quantifying Interpretability of Deep Visual Representations. David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, Antonio Torralba. CVPR 2017.
Apr 9 | Suman Mulumudi | Playing Atari with Deep Reinforcement Learning | Niles Christensen | Investigating Human Priors for Playing Video Games
Apr 11 | Roop Pal | Learning to Poke by Poking: Experiential Learning of Intuitive Physics | Vinay Ramesh | Learning to Fly by Crashing. Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta.
Apr 16 | Hassan Akbari | Neural Architecture Search with Reinforcement Learning | Akarsh Zingade | Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar. ICCV 2017.
Apr 18 | Sonam Goenka | Unsupervised Image Captioning (Feng et al.) | TBD
Apr 23 | Presentation
Apr 25 | Presentation
Apr 30 | Presentation
May 2 | Presentation

Course Policy

Class Participation: Discussion of papers will be a large component of this course. As such, class participation is a large part of the course grade (20%). Please come to class prepared to engage with the presenter and your peers in class. Remember, your peers have spent a significant amount of time preparing the lecture for the day, and participating in discussion will help them out. When you present your lecture, you will want your classmates to join in discussion too!

Presentation: During the course, you will have the chance to present your chosen paper to the class and lead discussion on it. In my experience, excellent talks are the result of extensive preparation. This means two things: a) everybody can give a great presentation because all it takes is practice, and b) you should practice your lecture many times before giving it to the class.

Late Policy: Since all assignments are discussion-based and require the whole class to be present, there will be no extensions. If you are unable to present on your scheduled day, it is your responsibility to find a classmate to swap with.

Academic Dishonesty: Plagiarism and cheating will result in a zero for the course. You are allowed to use images, code, slides, and material from papers and websites; however, you must cite the source.

Course Projects: You may complete the course project individually or in groups. You are encouraged to start the course project from the first day. If you do not have access to a GPU, please contact the course instructors and we will help you find one. At the end of the course, you will have the chance to present your project to your classmates.


We gratefully acknowledge several instructors for making available the course material on which this class is based: James Hays, Philipp Krähenbühl, Devi Parikh, Abhinav Gupta, Alyosha Efros, Antonio Torralba.