Fall 2022 Topics Courses

Below are the course descriptions for the Fall 2022 Topics Courses. This page will be updated as new information becomes available. The official record of what will be offered is the Directory of Classes; please use this page only as a resource in your course planning. Undergraduates should consult their CS Faculty advisor to see if a course counts for their track. MS students should consult the Topics page (if a course is not listed there, consult your CS Faculty advisor).

For questions regarding Data Science courses please email: DataScience-Registration@columbia.edu

 

COMS 4995.001 HACKING 4 DEFENSE | Blaer, Paul
COMS 4995.002 PARALLEL FUNCTIONAL PROGR | Edwards, Stephen
COMS 4995.003 EMPIRICAL METHODS DATA SC | Levine, Michelle
COMS 4995.004 NEURAL NETWORKS DEEP LEAR | Zemel, Richard
COMS 4995.005 LOGIC AND COMPUTABILITY | Pitassi, Toniann
COMS 4995.010 APPLIED DEEP LEARNING | Gordon, Joshua
COMS 4995.011 APPLIED MACHINE LEARNING | Pappu, Vijay
COMS 4995.012 ELEMENTS FOR DATA SCIENCE | Gibson, Bryan

 

COMS 6998.001 TOPICS IN ROBOTIC LEARNIN | Song, Shuran
COMS 6998.002 ADV WEB DESIGN STUDIO | Chilton, Lydia
COMS 6998.004 DIALOG SYSTEMS (CONVERSNL | Yu, Zhou
COMS 6998.005 REPRESENTATION LEARNING | Vondrick, Carl
COMS 6998.006 ADV SPOKEN LANGUAGE PROCE | Hirschberg, Julia
COMS 6998.007 TOPICS DATACENTER NETWORK | Misra, Vishal
COMS 6998.008 SECURITY ROBUSTNESS ML SY | Yang, Junfeng
COMS 6998.009 FUND SPEECH RECOGNITION | Beigi, Homayoon
COMS 6998.010 FINE GRAINED COMPLEXITY | Alman, Joshua
COMS 6998.011 NATURAL LANG GEN SUMMARIZ | McKeown, Kathleen
COMS 6998.012 PRACT DEEP LEARNING SYS P | Dube, Parijat
COMS 6998.013 FAIR AND ROBUST ALGORITHM | Zemel, Richard
COMS 6998.014 ANALYSIS OF NETWORKS & CR | Chaintreau, Augustin
COMS 6998.015 CLOUD COMPUTING & BIG DAT | Sahu, Sambit
COMS 6998.016 MACHINE LEARNING &CLIMATE | Kucukelbir, Alp

COMS 4995.001 HACKING 4 DEFENSE | Blaer, Paul

Course Website

Solve complex technology problems critical to our national security with a team of engineers, scientists, MBAs, and policy experts. In a crisis, national security initiatives move at the speed of a startup, yet in peacetime they default to decades-long acquisition and procurement cycles. Startups operate with continual speed and urgency 24/7. Over the last few years they’ve learned how to be not only fast, but extremely efficient with resources and time using lean startup methodologies. In this class, student teams develop technology solutions to help solve important national security problems. Student teams take actual national security problems and learn how to apply Lean LaunchPad and Lean Startup principles (“business model canvas,” “customer development,” and “agile engineering”) to discover and validate customer needs and to continually build iterative prototypes to test whether they understood the problem and solution. Teams take a hands-on approach requiring close engagement with actual military, Department of Defense, and other government agency end-users. Team applications required. Limited enrollment.

Taught by Professor Paul Blaer and Jason Cahill, Hacking for Defense™ is a university-sponsored class that allows students to develop a deep understanding of the problems and needs of government sponsors in the Department of Defense and the Intelligence Community. In a short time, students rapidly iterate prototypes and produce solutions to sponsors’ needs. This course provides students with an experiential opportunity to become more effective in their chosen field, with a body of work to back it up. For government agencies, it allows problem sponsors to increase the speed at which their organization solves specific, mission-critical problems. For more information, see http://www.h4di.org/

 

COMS 4995.002 PARALLEL FUNCTIONAL PROGR | Edwards, Stephen

Prerequisites: COMS 3157 Advanced Programming or the equivalent. Knowledge of at least one programming language and related development tools/environments is required. Functional programming experience is not required.

This course covers functional programming in Haskell, with an emphasis on parallel programs. The goal of this class is to introduce you to the functional programming paradigm. You will learn to code in Haskell; this experience will also prepare you to code in other functional languages. The first half of the class will cover basic (single-threaded) functional programming; the second half will cover how to code parallel programs in a functional setting.

 

COMS 4995.003 EMPIRICAL METHODS DATA SC | Levine, Michelle

Empirical Methods of Data Science is a seminar for students seeking an in-depth understanding of how to conduct empirical research in computer science. In the first part of the seminar, we will discuss how to critically examine previous research, build and test hypotheses, and collect data in the most ethical and robust manner. As we explore different means of data collection, we will dive into ethical concerns in research. Next, we will explore how to most effectively analyze different data sets and how to present the data in engaging and exciting ways. In the last part of the seminar, we will hear from different researchers on the methods they use to conduct research, leading to further conversations and in-class debates about when and how to use particular research methods. The focus will be primarily on relatively small data sets, but we will also address big data.

 

COMS 4995.004 NEURAL NETWORKS DEEP LEAR | Zemel, Richard

 

Course description coming soon…

 

COMS 4995.005 LOGIC AND COMPUTABILITY | Pitassi, Toniann

 

Course description coming soon…

 

COMS 4995.010 APPLIED DEEP LEARNING | Gordon, Joshua

 

Course description coming soon…

 

COMS 4995.011 APPLIED MACHINE LEARNING | Pappu, Vijay

 

This is a DSI course; please refer to the DSI website for cross-registration instructions for non-DS students.

This class offers a hands-on approach to machine learning and data science. The class discusses the application of machine learning methods like SVMs, Random Forests, Gradient Boosting, and neural networks to real-world datasets, including data preparation, model selection, and evaluation. This class complements COMS W4721 in that it relies entirely on available open source implementations in scikit-learn and TensorFlow. Apart from applying models, we will also discuss software development tools and practices relevant to productionizing machine learning models.

 

COMS 4995.012 ELEMENTS FOR DATA SCIENCE | Gibson, Bryan

 

This is a DSI course; please refer to the DSI website for cross-registration instructions for non-DS students.

This course is designed as an introduction to the elements that constitute the skill set of a data scientist. The course will focus on the utility of these elements in common tasks of a data scientist, rather than their theoretical formulation and properties. The course provides a foundation of basic theory and methodology with applied examples to analyze large engineering, business, and social data for data science problems. Hands-on experiments with R or Python will be emphasized.

 

COMS 6998.001 TOPICS IN ROBOTIC LEARNIN | Song, Shuran

 

This is an advanced seminar course that will focus on the latest research in machine learning for robotics. More specifically, we study how machine learning and data-driven methods can influence a robot’s perception, planning, and control. For example, we will explore how robots can learn to perceive and understand their 3D environment, how they can learn from experience to make reasonable plans, and how they can reliably act upon a complex environment based on their understanding of the world. Students will read, present, and discuss the latest research papers on robot learning, as well as gain experience in developing a learning-based robotic system in the course projects.

 

COMS 6998.002 ADV WEB DESIGN STUDIO | Chilton, Lydia

 

This semester, Advanced Web Design Studio is partnering with faculty and students in Journalism and Architecture to design, build, and deploy “public interest technology.” We will introduce interdisciplinary design methods and principles for Human Computer Interaction (mixing Architecture, Urban Planning, Computer Science, and Journalism) that respond to the potentials as well as the adverse effects of computation on society today. Our work will be dedicated to leveraging technology to support public interest organizations. We will move beyond short-term metrics (clicks, daily active users, and private profit) to focus on fostering long-term value and local networks. Students will work together to create and deploy interdisciplinary projects in collaborative teams, with the aim of serving the public and using technology to advance justice, equality, and inclusion in society. We will also be alert to the ways in which the language of “public interest” can sometimes hide or offer alibis for other political or private interests. For the first half of the semester, students will do coding and design exercises and complete short readings on journalism, urban planning, and public interest technology. During the second half of the semester, students will iterate on their work and put their projects into action.

A strict prerequisite is to have taken COMS 4170 UI Design, or an equivalent class with both web-based implementation and design components. You must also fill out the prerequisite form, which will be available through SSOL and is also here: https://forms.gle/v3a22idw9qgubgcG9.

The faculty instructors in each area are Lydia Chilton (School of Engineering, Computer Science), Mark Hansen (School of Journalism), and Laura Kurgan (Graduate School of Architecture, Planning and Preservation).

Here are some examples of Public Interest Technology:

— Ushahidi: Since 2008, thousands have used this crowdsourcing platform in disaster settings, from violence post-election to earthquakes and floods worldwide.

— OneBusAway: Results from Providing Real-Time Arrival Information for Public Transit. (2010)

— Discrimination in Online Ad Delivery: Google ads, black names and white names, racial discrimination, and click advertising (2013)

— Anti Eviction Mapping is a volunteer data-visualization, data analysis, and storytelling collective documenting the dispossession and resistance upon gentrifying landscapes (since 2013)

 

COMS 6998.004 DIALOG SYSTEMS (CONVERSNL | Yu, Zhou

 

Course description coming soon…

 

COMS 6998.005 REPRESENTATION LEARNING | Vondrick, Carl

 

Course description coming soon…

 

COMS 6998.006 ADV SPOKEN LANGUAGE PROCE | Hirschberg, Julia

 

Course Website

This class will introduce students to spoken language processing: basic concepts, analysis approaches, and applications.

 

COMS 6998.007 TOPICS DATACENTER NETWORK | Misra, Vishal

 

Course description coming soon…

 

COMS 6998.008 SECURITY ROBUSTNESS ML SY | Yang, Junfeng

 

Course Website

Over the past few years, Machine Learning (ML) has made tremendous progress, achieving or surpassing human-level performance for a diverse set of tasks including image classification, speech recognition, and playing games such as Go. These advances have led to widespread adoption and deployment of ML in security- and safety-critical systems such as self-driving cars, malware detection, and aircraft collision avoidance systems. This wide adoption of ML techniques presents new challenges, as the predictability and correctness of such systems are of crucial importance. Unfortunately, ML systems, despite their impressive capabilities, often demonstrate unexpected or incorrect behaviors in corner cases for several reasons, such as biased training data, overfitting, and underfitting of the models. In safety- and security-critical settings, such incorrect behaviors can lead to disastrous consequences such as a fatal collision of a self-driving car. For example, a Google self-driving car recently crashed into a bus because it expected the bus to yield under a set of rare conditions but the bus did not. A Tesla car in autopilot crashed into a trailer because the autopilot system failed to recognize the trailer as an obstacle due to its “white color against a brightly lit sky” and the “high ride height.” Such corner cases were not part of Google’s or Tesla’s test set and thus never showed up during testing. Other examples include Microsoft’s Tay chatbot tweeting racist words because it was misled by malicious Twitter users, and Google removing “gorilla” as an image class after its image classification algorithm incorrectly classified dark-skinned people as gorillas.

These challenges have drawn huge attention from researchers in machine learning, security, systems, and programming language communities. A number of techniques and theories have been proposed to increase the robustness and security of machine learning. In this course, we will study the most practical and most important of these techniques and theories with a focus on deep learning. For details on the topics we’ll cover, please go to the Course Syllabus page.

 

COMS 6998.009 FUND SPEECH RECOGNITION | Beigi, Homayoon

 

Syllabus

Fundamentals of Speech Recognition is a comprehensive course, covering all aspects of automatic speech recognition from theory to practice. The course covers, in some detail, such topics as Anatomy of Speech, Signal Representation, Phonetics and Phonology, Signal Processing and Feature Extraction, Probability Theory and Statistics, Information Theory, Metrics and Divergences, Decision Theory, Parameter Estimation, Clustering and Learning, Transformation, Hidden Markov Modeling, Language Modeling and Natural Language Processing, Search Techniques, Neural Networks, Support Vector Machines, and other recent machine learning techniques used in speech recognition. Several open source speech recognition software packages are also introduced, with detailed hands-on projects using Kaldi to produce a fully functional speech recognition engine. The lectures cover the theoretical aspects as well as practical coding techniques. The course is graded based on a project. The midterm (40% of the grade) is a two-page proposal for the project, and the final (60% of the grade) is an oral presentation of the project plus a 6-page conference-style paper describing the results of the research project. The instructor uses his own textbook for the course: Homayoon Beigi, “Fundamentals of Speaker Recognition,” Springer-Verlag, New York, 2011. Every week, the slides of the lecture are made available to the students.

 

COMS 6998.010 FINE GRAINED COMPLEXITY | Alman, Joshua

 

Course description coming soon…

 

COMS 6998.011 NATURAL LANG GEN SUMMARIZ | McKeown, Kathleen

 

Course description coming soon…

 

COMS 6998.012 PRACT DEEP LEARNING SYS P | Dube, Parijat

 

This course will cover several topics in performance evaluation of machine learning and deep learning systems. Major topics covered in the course: an algorithmic and system-level introduction to Deep Learning (DL); DL training algorithms, network architectures, and best practices for performance optimization; the ML/DL system stack on cloud; tools and benchmarks (e.g., DAWNBench) for performance evaluation of ML/DL systems; practical performance analysis using standard DL frameworks (TensorFlow, PyTorch) and resource monitoring tools (e.g., nvidia-smi); performance modeling to characterize scalability with respect to workload and hardware; and performance considerations with special techniques like transfer learning, semi-supervised learning, and neural architecture search. Emphasis will be on gaining working knowledge of tools and techniques to evaluate the performance of ML/DL systems on cloud platforms. The assignments will involve running experiments using standard DL frameworks (TensorFlow, PyTorch) and working with open source DL technologies. Students will gain practical experience working on different stages of the DL life cycle (development and deployment) and understanding and addressing related system performance issues.

 

COMS 6998.013 FAIR AND ROBUST ALGORITHM | Zemel, Richard

 

Course description coming soon…

 

COMS 6998.014 ANALYSIS OF NETWORKS & CR | Chaintreau, Augustin

 

This course covers the fundamentals underlying information diffusion and incentives on networked applications. Applications include but are not limited to social networks, crowdsourcing, online advertising, rankings, information networks like the World Wide Web, as well as areas where opinion formation and the aggregate behavior of groups of people play a critical role. Structural concepts introduced and covered in class include random graphs, small world, weak ties, structural balance, cluster modularity, preferential attachment, Nash equilibrium, potential games, and bipartite graph matching. The class examines the following dynamics: link prediction, network formation, adoption with network effect, spectral clustering and ranking, spread of epidemic, seeding, social learning, routing game, all-pay contest, and truthful bidding.

 

 

COMS 6998.015 CLOUD COMPUTING & BIG DAT | Sahu, Sambit

 

Cloud Computing and Big Data Systems
This is a graduate-level course on Cloud Computing and Big Data with emphasis on hands-on design and implementation. You will learn to design and build extremely large-scale systems and learn the underlying principles and building blocks in the design of such large-scale applications. You will use real cloud platforms and services to learn the concepts and build such applications.

The first part of the course covers basic building blocks such as essential cloud services for web applications, cloud programming, virtualization, containers, Kubernetes, and microservices. We will learn these concepts by using and extending capabilities available in real clouds such as Amazon AWS and Google Cloud.

The second part of the course will focus on the various stacks used in building an extremely large-scale system, such as (i) Kafka for event logging and handling, (ii) Spark and Spark Streaming for large-scale compute, (iii) Elasticsearch for extremely fast indexing and search, (iv) various NoSQL database services such as DynamoDB and Cassandra, (v) cloud native with Kubernetes, and (vi) cloud platforms for Machine Learning and Deep Learning based applications. Several real-world applications will be covered to illustrate these concepts and research innovations.

Students are expected to participate in class discussions, read research papers, work on three programming assignments, and conduct a significant course project. Given that this is a very hands-on course, students are expected to have a solid programming background.

Prerequisite: Good programming experience in any language, Concepts of Web Applications and Systems

Reading Material: Lecture Notes, Research Papers, Reference Textbooks, Lots of Engineering Docs

Grading: 3 Programming Assignments (35%), 2 Quizzes (25%), Project (40%)

 

COMS 6998.016 MACHINE LEARNING &CLIMATE | Kucukelbir, Alp

 

Course Website | Syllabus PDF

In this course, we will study two aspects of how ML interacts with Earth’s climate.
First, we will investigate how ML can be used to tackle climate change. We will focus on use cases from transportation, manufacturing, food and agriculture, waste management, and atmospheric studies. We will ask questions like: what are the requirements for applying ML to such problems? How can we evaluate the effectiveness of our analyses?

Second, we will consider ML’s own impact on the climate. We will focus on the energy and computation that go into designing, training, and deploying modern ML systems. We will ask questions like: how can we accurately track and account for ML’s own energy footprint? What strategies can we employ to minimize it?

By the end of this course, you will have learned about modern statistical and causal ML methods and their applications to the climate. We will model real-world phenomena using probability models, with an emphasis on vision, time series forecasting, uncertainty quantification, and causality. In addition, you will gain a deeper understanding of the carbon footprint of ML itself and explore how to mitigate it.

 

 

 
