SPRING 2021 TOPICS COURSES

Below are course descriptions for the Spring 2021 Topics Courses. This page will be updated as new information becomes available. The official record of what will be offered is the Directory of Classes; please use this page only as a resource in your course planning. Undergraduates should consult their CS Faculty advisor to see if a course counts for their track. MS students should consult the Topics page (if a course is not listed there, consult your CS Faculty advisor).

COMS W4995.001 Semantic Representations for NLP | Daniel Bauer
COMS W4995.002 Intro to Agile Project Management | Tristian Boutros
COMS W4995.003 Intro to Networks and Crowds | Augustin Chaintreau
COMS W4995.004 Geometric Data Analysis | Andrew Blumberg
COMS W4995.005 Multilingual Language Technologies and Language Diversity | Smaranda Muresan
COMS W4995.006 Design Using C++ | Bjarne Stroustrup
COMS W4995.007 Causal Inference II | Elias Bareinboim
COMS W4995.008 Advanced Algorithms | Alexandr Andoni
COMS W4995.009 Intro to Data Visualization | Agnes Chang
COMS W4995.010 Applied Deep Learning | Joshua Gordon
COMS W4995.011 Causal Inference for Data Science | Adam Kelleher

COMS E6998.001 Virtual Technologies for Cloud Computing | Jason Nieh
COMS E6998.002 Software Engineering for AI Systems | Baishakhi Ray
COMS E6998.003 Security and Robustness of ML Systems | Junfeng Yang
COMS E6998.004 Topics in Robot Learning | Shuran Song
COMS E6998.005 Human-Computer Interaction | Brian Smith
COMS E6998.006 Dialog Systems (Conversational AI) | Zhou Yu
COMS E6998.007 Advanced Topics and Projects in Deep Learning | Peter Belhumeur
COMS E6998.008 Fundamentals of Speech Recognition | Homayoon S. Beigi
COMS E6998.009 Empirical Methods of Data Science | Michelle Levine
COMS E6998.010 Cloud Computing and Big Data | Sambit Sahu

COMS W4995.001 Semantic Representations for NLP | Daniel Bauer

Most NLP tasks and applications require some level of understanding of the semantics (meaning) of linguistic expressions. The question of how to represent semantics and how to map from surface linguistic expressions to semantic representations is therefore an important part of NLP research. This course will explore some of the challenges surrounding semantic representations in various applications. We will compare a variety of approaches to representing the meaning of words and sentences from an NLP perspective, including symbolic/logic-based representations and resources, as well as modern distributional and multi-modal approaches. Requirements include homework assignments, a paper presentation, and a final project.

COMS W4995.002 Intro to Agile Project Management | Tristian Boutros

Project management skills are essential for professionals to meet the ever-growing demands of today’s businesses and to succeed in the global economy. From technology to finance, and construction to healthcare, project management skills are applicable across every industry. The Introduction to Agile Project Management course is tailored both to individuals who have some project management-related experience but aspire to enhance these skills, and to individuals who are just starting out in their careers and wish to gain new skills that will serve them for a lifetime. As a student enrolled in this course, you will gain the critical knowledge and foundation needed to initiate, plan, execute, and manage a successful engineering project using both traditional and agile project management approaches. Upon completion of this course, you will be able to describe the basic values, principles, and practices of Agile project management and Scrum, develop a project or product roadmap, and apply the skills and tools needed to execute projects to completion. This course will also outline the importance of organizational culture in project activities and how to develop and implement a project management framework that works for your company. Course work will explore essential concepts and techniques in project management and how to apply them, including terminology, methodologies, people management, process management, leadership, and enterprise strategy integration. Upon completion of the course, you will possess the knowledge to begin studying for multiple industry certifications offered by the Project Management Institute (PMI), Scrum Alliance, and Scrum.org.

COMS W4995.003 Intro to Networks and Crowds | Augustin Chaintreau

Course Description

This course covers the fundamentals underlying information diffusion and incentives in networked applications. Applications include but are not limited to social networks, crowdsourcing, online advertising, rankings, information networks like the World Wide Web, as well as areas where opinion formation and the aggregate behavior of groups of people play a critical role. Structural concepts introduced and covered in class include random graphs, small worlds, weak ties, structural balance, cluster modularity, preferential attachment, Nash equilibrium, potential games, and bipartite graph matching. The class examines the following dynamics: link prediction, network formation, adoption with network effects, spectral clustering and ranking, spread of epidemics, seeding, social learning, routing games, all-pay contests, and truthful bidding…

COMS W4995.004 Geometric Data Analysis | Andrew Blumberg

Tentative Syllabus

The goal of this class is to introduce approaches to analyzing data presented as finite metric spaces using ideas from algebraic topology and differential geometry. Prerequisites are a grounding in basic probability, statistics, and linear algebra. The class will focus on rigorous mathematical foundations and applications drawn from computational genomics.

COMS W4995.005 Multilingual Language Technologies and Language Diversity | Smaranda Muresan

Course description coming soon…

COMS W4995.006 Design Using C++ | Bjarne Stroustrup
Link to full description

Informal description
Design cannot be understood in the abstract: To discuss design you need concrete examples – preferably examples of both good and bad design. Conversely, you cannot understand a programming language or library – or use it well – by just learning the rules for its individual features. You need to understand the general design ideas behind the language or library: Its philosophy. The ISO C++ language and its standard library provide many concrete examples for the discussion of design. We will look at C++ from its earliest days through the current 2020 ISO standard (C++20). This year’s version of this course will place some emphasis on the C++ Core Guidelines effort to provide tool-and-library supported guidelines for a modern style of C++ providing type and resource safety without loss of generality or performance.

This course involves a fair bit of reading, some programming, and some writing. Specific topics will be chosen from resource management (e.g., constructors and destructors), error handling (e.g., exceptions), generic programming (e.g., templates and concepts), compile-time computation, modularity, concurrency (threads and coroutines), and libraries (e.g., containers, algorithms, ranges, and smart pointers). Topics will be examined from various points of view, including usability, implementation models, teachability, performance, and real-world constraints…

Course Description and Prerequisites
This course explores the interactions among language design, library design, and program design in the context of ISO standard C++. It covers features provided from early C++ to C++20 and the design and programming techniques they support.

Requirements: Senior undergraduate, masters, professional, PhD graduate standing. A basic understanding of C++ and experience with a software development project (in any language) would be an advantage.

COMS W4995.007 Causal Inference II | Elias Bareinboim

Course description coming soon…

COMS W4995.008 Advanced Algorithms | Alexandr Andoni

http://www.cs.columbia.edu/~andoni/advancedS20/index.html

The class covers classic and modern algorithmic ideas that are central to many areas of Computer Science. The focus is on the most powerful paradigms and techniques for designing algorithms and measuring their efficiency. The topics will include hashing, sketching, dimension reduction, spectral graph theory, optimization (linear programming, gradient descent, IPM), multiplicative weights, compressed sensing, and others.

The class is designed as a “grad intro to algorithms” class, and is thus an advanced version of “Analysis of Algorithms” (COMS 4231), both in terms of content as well as pace. You need not have taken 4231, but some algorithmic exposure is expected (see prerequisites below). Hence it is suitable for those of you who have seen some algorithms class (like 4231 or easier), and/or want to take an in-depth algorithms class. The evaluation is based on homework assignments and a final project.

COMS W4995.009 Intro to Data Visualization | Agnes Chang

Recent Syllabus

Students will learn to:

Apply a structured design process to create effective and ethical visualizations
Conceptualize ideas and interaction techniques using sketching and prototyping
Apply principles of color, typography, and layout as well as principles of human perception and cognition in visual design; avoid misrepresentation
Process and analyze a variety of data types: quantitative, text, geospatial, qualitative
Create web-based interactive visualizations using D3 and Observable
Critically evaluate visualizations and suggest improvements and refinements
Work constructively as a member of a team to carry out a complex project

COMS W4995.010 Applied Deep Learning | Joshua Gordon
This is a DSI course; non-DS students should refer to the DSI website for cross-registration instructions.

Link to most recent syllabus

This course provides a practical introduction to Deep Learning. We aim to help you understand the fundamentals of neural networks (DNNs, CNNs, and RNNs), and prepare you to successfully apply them in practice. This course will be taught using open-source software, including TensorFlow. In addition to covering the fundamental methods, we will discuss the rapidly developing space of frameworks and applications, including deep learning on mobile and the web, and applications in healthcare. This course emphasizes fairness, responsibility, and testing, and teaches best practices with these in mind.

COMS W4995.011 Causal Inference for Data Science | Adam Kelleher

This is a DSI course; non-DS students should refer to the DSI website for cross-registration instructions.

Link to recent syllabus

Course description coming soon…

COMS E6998.001 Virtual Technologies for Cloud Computing | Jason Nieh

Prerequisites: COMS 4118 Operating Systems or the equivalent.

This course will cover the underlying technologies that enable major cloud computing providers to deliver computing resources to consumers on-demand over the Internet, focusing primarily on virtualization and Infrastructure as a Service (IaaS) cloud models. Topics to be covered will include many aspects of the design and implementation of hypervisors and containers, ranging from architectural support for virtualization to live migration to cloud computing security. The course will have homework assignments and a final project, both of which will involve systems programming.

COMS E6998.002 Software Engineering for AI Systems | Baishakhi Ray

Course description coming soon…

COMS E6998.003 Security and Robustness of ML Systems | Junfeng Yang

last year’s course website

Over the past few years, Machine Learning (ML) has made tremendous progress, achieving or surpassing human-level performance for a diverse set of tasks including image classification, speech recognition, and playing games such as Go. These advances have led to widespread adoption and deployment of ML in security- and safety-critical systems such as self-driving cars, malware detection, and aircraft collision avoidance systems. This wide adoption of ML techniques presents new challenges as the predictability and correctness of such systems are of crucial importance. Unfortunately, ML systems, despite their impressive capabilities, often demonstrate unexpected or incorrect behaviors in corner cases for several reasons such as biased training data, overfitting, and underfitting of the models. In safety- and security-critical settings, such incorrect behaviors can lead to disastrous consequences such as a fatal collision of a self-driving car. For example, a Google self-driving car recently crashed into a bus because it expected the bus to yield under a set of rare conditions but the bus did not. A Tesla car in autopilot crashed into a trailer because the autopilot system failed to recognize the trailer as an obstacle due to its “white color against a brightly lit sky” and the “high ride height.” Such corner cases were not part of Google’s or Tesla’s test set and thus never showed up during testing. Other examples include Microsoft’s Tay chatbot tweeting racist words because it was misled by malicious Twitter users, and Google removing “gorilla” as an image class after its image classification algorithm incorrectly classified dark-skinned people as gorillas.

These challenges have drawn huge attention from researchers in machine learning, security, systems, and programming language communities. A number of techniques and theories have been proposed to increase the robustness and security of machine learning. In this course, we will study the most practical and most important of these techniques and theories with a focus on deep learning.

COMS E6998.004 Topics in Robot Learning | Shuran Song

This is an advanced seminar course that will focus on the latest research in machine learning for robotics. More specifically, we study how machine learning and data-driven methods can influence a robot’s perception, planning, and control. For example, we will explore how robots can learn to perceive and understand their 3D environment, how they can learn from experience to make reasonable plans, and how they can reliably act upon a complex environment based on their understanding of the world. Students will read, present, and discuss the latest research papers on robot learning as well as obtain experience in developing a learning-based robotic system in the course projects.

Prerequisites: Knowledge of basic machine learning, computer vision, computer graphics, and robotics. Students should have taken any of the following courses or an equivalent: COMS W4160, COMS 4733, or COMS 4731.

COMS E6998.005 Human-Computer Interaction | Brian Smith

This course is a graduate-level seminar in which we meet once a week to discuss several human–computer interaction (HCI) research papers, covering a different research area within HCI each week. The class is open to graduate students and, with instructor permission, undergraduate students.

Students will be expected to read all of the papers, write weekly reflections about the papers, and present one or two of them to the class in a presentation that is roughly 20 minutes long. In our discussions, we will be talking about the papers themselves as well as the research strategy and methodology behind them.

Stanford’s CS 376 is an HCI seminar in a similar vein. You can see the following page for their most recent syllabus of papers:
https://hcicourses.stanford.edu/cs376/2018/syllabus.php
(Note that research seminar courses, including both ours and Stanford’s, tend to change the list of papers covered every year.)

Our syllabus will be similar in format, and I will announce the full list of papers on the first day of class.

COMS E6998.006 Dialog Systems (Conversational AI) | Zhou Yu

Course description coming soon…

COMS E6998.007 Advanced Topics and Projects in Deep Learning | Peter Belhumeur

COMS E6998.008 Fundamentals of Speech Recognition | Homayoon S. Beigi

Instructor Website | Syllabus

Fundamentals of Speech Recognition is a comprehensive course, covering all aspects of automatic speech recognition from theory to practice. Topics covered in some detail include Anatomy of Speech, Signal Representation, Phonetics and Phonology, Signal Processing and Feature Extraction, Probability Theory and Statistics, Information Theory, Metrics and Divergences, Decision Theory, Parameter Estimation, Clustering and Learning, Transformation, Hidden Markov Modeling, Language Modeling, Neural Networks (specifically TDNN, LSTM, RNN, and CNN architectures), and other recent machine learning techniques used in speech recognition. Also, several open source speech recognition software packages are introduced, with detailed hands-on projects using Kaldi to produce a fully functional speech recognition engine. The lectures cover the theoretical aspects as well as practical coding techniques. The course is graded based on a project. There will be one homework project worth 20% of the grade; a midterm, in the form of a two-page proposal for the project, worth 20%; and a final, worth 60%, consisting of an oral presentation of the project plus a 6-page conference-style paper describing the results of the research project. The instructor uses his own textbook for the course: Homayoon Beigi, “Fundamentals of Speaker Recognition,” Springer-Verlag, New York, 2011. Every week, the slides of the lecture are made available to the students.

COMS E6998.009 Empirical Methods of Data Science | Michelle Levine

Empirical Methods of Data Science is a seminar for students seeking an in-depth understanding of how to conduct empirical research in computer science. In the first part of the seminar, we will discuss how to critically examine previous research, build and test hypotheses, and collect data in the most ethical and robust manner. As we explore different means of data collection, we will dive into ethical concerns in research. Next, we will explore how to most effectively analyze different data sets and how to present the data in engaging and exciting ways. In the last part of the seminar, we will hear from different researchers on the methods they use to conduct research, leading to further conversations about when and how to use particular research methods. The focus will be primarily on relatively small data sets but we will also address big data. Students will complete homework assignments and a group research project (paper and presentation).

COMS E6998.010 Cloud Computing and Big Data | Sambit Sahu

This is a graduate-level course on Cloud Computing and Big Data with emphasis on hands-on design and implementation. In addition to Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) cloud technologies and concepts, Big Data technologies and platforms (Hadoop ecosystem, Spark) will be covered in this course. By the end of the course, you should have a fair amount of knowledge about how to use a cloud, write applications on a cloud, and manage and build applications on a Big Data platform. The first part of the course covers basic building blocks such as virtualization technologies, virtual appliances, automated provisioning, elasticity, and cloud monitoring. We shall learn these concepts by using and extending capabilities available in real clouds such as Amazon AWS, Google Cloud, and OpenStack. The second part of the course will focus on Big Data computing platforms (Hadoop MR and Spark), where students will learn HDFS, MapReduce, and Spark (RDD, DataFrame) concepts and programming. We will also learn various stacks used in building an extremely large-scale system, such as (i) Kafka for event logging and handling, (ii) Spark and Spark Streaming for large-scale compute, (iii) Elasticsearch for extremely fast indexing and search, (iv) various database services such as DynamoDB and Cassandra, and (v) MLlib and GraphX for machine learning and deep learning on extremely large-scale data leveraging the cloud. Several real-world applications will be covered to illustrate these concepts and research innovations. Students are expected to participate in class discussions, read research papers, work on three programming assignments, and conduct a significant course project. Given that this is a very hands-on course, students are expected to have a decent programming background.
Topics to include: Cloud Introduction and Programming with AWS; Virtual Machines, Containers, Serverless Lambda; Compute in a cluster: Hadoop, Spark; Cloud-scale data store; Kafka; Elasticsearch; Cloud DevOps: end-to-end cloud-based application design, deployment, and monitoring; OpenStack-based Private Cloud; Intelligent AI Systems design on Cloud

Computer Science at Columbia University
