This is an advanced seminar course that will focus on the latest research in computer vision and related fields. Students will read, present, and discuss papers, and complete a semester-long project. Topics will include visual recognition, self-supervised learning, cross-modal transfer, neural network interpretation, commonsense reasoning, vision and language, and embodied vision. Experience in deep learning is strongly recommended.
The syllabus is subject to change as the course evolves.
| Date | Presenter 1 | Paper 1 | Presenter 2 | Paper 2 |
|---|---|---|---|---|
| Jan 22 | Carl Vondrick | Perception Beyond Measurement | | |
| Jan 24 | Carl Vondrick | Ecological Vision | | |
| Jan 29 | Justin Chou | What makes Paris look like Paris? Carl Doersch, Saurabh Singh, Abhinav Gupta, Josef Sivic, and Alexei A. Efros. SIGGRAPH 2012. | Sagar Lal | What Makes an Image Memorable? |
| Jan 31 | Sebastian Cueva-Caro | Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick. ICCV 2017. | Simone Fobi | Light-Head R-CNN: In Defense of Two-Stage Object Detector. Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun. arXiv, Nov 2017. |
| Feb 5 | Nathan Silberman | Computer Vision for Healthcare | | ExplainGAN: Model Explanation via Decision Boundary Crossing Transformations; Understanding Equivalence and Noninferiority Testing |
| Feb 7 | Parita Pooj | Finding Tiny Faces | James Shin | PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas. arXiv, Dec 2016. |
| Feb 12 | | Project Pitch | | |
| Feb 14 | | Project Pitch | | |
| Feb 19 | Dave Epstein | Low-Shot Learning from Imaginary Data. Yu-Xiong Wang, Ross Girshick, Martial Hebert, Bharath Hariharan. CVPR 2018 (Spotlight). | Jessie Liu | Matching Networks for One Shot Learning |
| Feb 21 | Yicun Liu | Deformable Convolutional Networks | Terence Conlon | ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun. arXiv, July 2017. |
| Feb 26 | Aashish Misraa | BigTime: Learning Intrinsic Images by Watching the World. Zhengqi Li and Noah Snavely. CVPR 2018. | Boyuan Chen | What Happens If... Learning to Predict the Effect of Forces in Images |
| Feb 28 | Ian Huang | From Recognition to Cognition: Visual Commonsense Reasoning | Yiliang Shi | Inferring and Executing Programs for Visual Reasoning. Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, C. Lawrence Zitnick, Ross Girshick. ICCV 2017. |
| Mar 5 | Noah Snavely | 3D Computer Vision | | |
| Mar 7 | Lauren Arnett | Taskonomy: Disentangling Task Transfer Learning | Dimitri Leggas | Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks |
| Mar 12 | Max Ogryzko | Learning Features by Watching Objects Move. Deepak Pathak, Ross Girshick, Piotr Dollar, Trevor Darrell, Bharath Hariharan. CVPR 2017. | Mayank Saxena | Fighting Fake News: Image Splice Detection via Learned Self-Consistency. Minyoung Huh, Andrew Liu, Andrew Owens, Alexei A. Efros. ECCV 2018. |
| Mar 14 | Yueqi Wang | Cognitive Mapping and Planning for Visual Navigation | Suhyun Kim | Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition |
| Mar 19 | | Spring Break | | |
| Mar 21 | | Spring Break | | |
| Mar 26 | Lahav Lipson | Multimodal Unsupervised Image-to-Image Translation | Chris Alberti | Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input. David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, and James Glass. |
| Apr 2 | Zheng Shou | Video | | |
| Apr 4 | Connor Goggins | Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog | Yiru Chen | Network Dissection: Quantifying Interpretability of Deep Visual Representations. David Bau*, Bolei Zhou*, Aditya Khosla, Aude Oliva, Antonio Torralba. CVPR 2017. |
| Apr 9 | Suman Mulumudi | Playing Atari with Deep Reinforcement Learning | Niles Christensen | Investigating Human Priors for Playing Video Games |
| Apr 11 | Roop Pal | Learning to Poke by Poking: Experiential Learning of Intuitive Physics | Vinay Ramesh | Learning to Fly by Crashing. Dhiraj Gandhi, Lerrel Pinto, Abhinav Gupta. |
| Apr 16 | Hassan Akbari | Neural Architecture Search with Reinforcement Learning | Akarsh Zingade | Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar. ICCV 2017. |
| Apr 18 | Sonam Goenka | Unsupervised Image Captioning (Feng et al.) | TBD | |
| Apr 23 | | Project Studio | | |
| Apr 25 | | Project Studio | | |
| Apr 30 | | Project Studio | | |
| May 2 | | Poster Presentation | | |
Class Participation: Discussion of papers is a central component of this course, so class participation makes up a large part of the course grade (20%). Please come to class prepared to engage with the presenter and your peers. Remember, your peers have spent significant time preparing the day's lecture, and joining the discussion helps them out. When you present your lecture, you will want your classmates to join in the discussion too!
Presentation: During the course, you will have the chance to present your chosen paper to the class and lead discussion on it. In my experience, excellent talks are the result of extensive preparation. This means two things: a) everybody can give a great presentation because all it takes is practice, and b) you should practice your lecture many times before giving it to the class.
Late Policy: Since all assignments are discussion-based and require the whole class to be present, there will be no extensions. If you are unable to present on your scheduled day, it is your responsibility to find a classmate to swap with.
Academic Dishonesty: Plagiarism and cheating will result in a zero for the course. You are allowed to use images, code, slides, and material from papers and websites; however, you must cite the source.
Course Projects: You may complete the course project individually or in groups. You are encouraged to start the course project from the first day. If you do not have access to a GPU, please contact the course instructors and we will help you find one. At the end of the course, you will have the chance to present your project to your classmates.
We gladly acknowledge several instructors for making their course material available, which this class is based on: James Hays, Philipp Krähenbühl, Devi Parikh, Abhinav Gupta, Alyosha Efros, Antonio Torralba.