Professor Michelle Levine
Email: mlevine@cs.columbia.edu
Lectures: Fridays 12:10pm-2:00pm, 415 Schapiro CEPSR
Office Hours: Fridays 2-3pm at CSB Courtyard nook northwest 452B and By Appointment
Description
Empirical Methods of Data Science is a seminar for students seeking an in-depth understanding of how to conduct empirical research in computer science. In the first part of the seminar, we will discuss how to critically examine previous research, build and test hypotheses, and collect data in the most ethical and robust manner. As we explore different means of data collection, we will dive into ethical concerns in research. Next, we will explore how to analyze different data sets most effectively and how to present the data in engaging and exciting ways. In the last part of the seminar, we will hear from different researchers on the methods they use to conduct research, leading to further conversations about when and how to use particular research methods. The focus will be primarily on relatively small data sets, but we will also address big data. Students will complete homework assignments and a group research project (paper and presentation).
Grade Breakdown
Absence Policy: An unexcused absence or unexcused late arrival will result in a zero for participation for that day. Absences and late arrivals must be cleared with the professor in advance of the class meeting. If excused, there will be no participation penalty. Please make sure you are aware of and follow the Attendance Policies and Missed Classes for this semester, which are found here.
Assignment Submission Policy: All assignments must be submitted through CourseWorks by the deadline. Late assignments will not be accepted and will receive a zero. You are required to check that your file uploaded properly. A corrupted file, or a zip file that does not open, will count as a late assignment and will receive a zero.
Academic Integrity
The SEAS academic integrity policy is found here.
The CS academic integrity policy is found here.
Courseworks and Ed Discussion
Students are responsible for actively checking Courseworks and Piazza.
Use Courseworks for: accessing readings and assignments; participating in instructor-led discussions; receiving email announcements
Use Piazza for: posting questions; forming student discussions
Schedule
Note: Schedule is subject to change. All changes will be posted below and announced through Courseworks.
Date | Topic | Assignments & Due Dates |
---|---|---|
Week 1 (9/10) | Introduction to the Course | |
Week 2 (9/17) | The Scientific Method; Conducting a Literature Review | |
Week 3 (9/24) | Scientific Method & Big Data; Designing a Study | Assignment 1 Due |
Week 4 (10/1) | Data Collection Methods | |
Week 5 (10/8) | Data Analysis Tools (NLP Demo) | Project Proposal Due |
Week 6 (10/15) | Ethics | |
Week 7 (10/22) | Ethics, Part 2 | Project Progress Report #1 Due |
Week 8 (10/29) | Ethics, Part 3 | |
Week 9 (11/5) | Data Analysis Techniques; Guest Speaker/Demo | Assignment 2 Due |
Week 10 (11/12) | Graphing and Reporting Data; Guest Speaker/Demo | Project Progress Report #2 Due |
Week 11 (11/19) | Presenting & Publishing Research; Guest Speaker/Demo | Project Rough Draft Due |
Thanksgiving Holiday (11/26) | NO CLASS | |
Week 12 (12/3) | Research in the Press | Assignment 3 Due |
Week 13 (12/10) | Student Presentations | Submit Presentation Slides by 12/9 |
Finals Week | Final Project Paper Due | |