COMS 4995: Randomized Algorithms
 Sept 4: Welcome to COMS 4995!
Instructor:

Tim
Roughgarden (Office hours: Mondays 11:30am12:30pm in Mudd 410. Email: tr@cs.columbia.edu.)
Course Assistants:

David Cheikhi
(Office hours: Mondays 12:302:30pm and Tuesdays Noon2pm in the TA room, Mudd first floor.
Email: d.cheikhi@columbia.edu.)

Chengyu Lin
(Office hours: Tuesdays 23pm.
Email: chengyu@cs.columbia.edu.)
Time/location: 10:1011:25 AM Mon/Wed in Mudd 545.
Discussion site: We'll be using
Piazza.
Prerequisites: Undergraduate algorithms (COMS 4231) or equivalent.
Course Description:
Randomness pervades the natural processes around us, from the
formation of networks, to genetic recombination, to quantum
physics. Randomness is also a powerful tool that can be leveraged to
create algorithms and data structures which, in many cases, are more
efficient and simpler than their deterministic counterparts. This
course covers the key tools of probabilistic analysis, and
applications of these tools to understand the behaviors of random
processes and algorithms. Emphasis is on theoretical foundations,
though we will apply this theory broadly, discussing applications in
machine learning and data analysis, networking, and systems.
Textbook: There is no required textbook for the course;
lectures are the sole required source of content.
Supplementary reading will be posted as part of the lecture schedule, below.
You might, however,
find one or more of the following books helpful:
 Probability and Computing: Randomized Algorithms and
Probabilistic Analysis by Mizenmacher/Upfal.
 Randomized Algorithms by Motwani/Raghavan.
 The Probabilistic Method by Alon/Spencer.
 Concentration of Measure for the Analysis of Randomized
Algorithms by Dubhashi/Panconesi.
 Exercise Sets (0%):
Exercise sets will be handed out weekly and are meant to help
you keep up with the course material.
Do not turn in your solutions. Think about them and discuss with
fellow students or the course staff any that you cannot answer.
At least half of each exam will consist of
problems on these exercise sets or variations thereof.
 Problem Sets (60%):
There will be 5 problem sets. Here is the schedule:
 Problem Set #1
(Posted Wed Sept 11; due Wed Sept 25.)

Problem Set #2 (Posted Wed Sept 25; due
Wed Oct 9.)

Problem Set #3
(Posted Wed Oct 9; due Wed Oct 30.)

Problem Set #4
(Posted Wed Oct 30; due Wed Nov 13.)

Problem Set #5
(Posted Wed Nov 13; due Wed Dec 4.)
 Problem Set Policies:
 These are challenging are you are strongly encouraged to form
groups, of up to three students. Each group should turn in a single
writeup (all students of the group receive the same score).
 You can form different groups for different problem sets.
 You can discuss problems with students from other groups
verbally and at a high level only.

Except where otherwise noted, you may refer to your course notes, the
textbooks and research papers listed on the course Web
page only. You cannot refer to textbooks, handouts, or
research papers that are not listed on the course home page. If you
do use any approved sources, make you sure you cite them
appropriately, and make sure that all your words are your own.
 You are strongly encouraged to use LaTex to typeset your writeup.
Here's a LaTeX template that you can use to
type up solutions. Here
and here are good
introductions to LaTeX.
 Honor code: We expect you to abide by the computer science department's
policies and procedures regarding academic honesty.
 Submission instructions:
We are using Gradescope for the homework submissions. Go to
www.gradescope.com to either login or create a new
account. Use the
course code 9D6V5E to register for this class. Only one group member
needs to submit the assignment. When submitting, please remember to
add all group member names onto Gradescope.
 Late Days:
 One late day equals a 24hour extension.
 Each student has three free late days.
 At most two late days can be applied to a single assignment;
after Friday (following the original due date) no solutions
will be accepted.
 Each late day used after the first two will result in a 25%
penalty.
 Example: a student had one free late day remaining but his/her
group uses two late days on a Problem Set. If the group's writeup
earns p points, the student receives a final score of .75*p points
for the assignment.
 Exams (40%): There will be two exams.
The first one is during class on Wed Oct 23, the second one
at the regularly scheduled final exam time. The second exam will be a
noncumulative 75minute exam, weighted equally with the first exam.
 At least half of the questions will be exercise set
questions or variations thereof.
 The exam is closedbook/computer; however, you are allowed
to bring one
doublesided sheets (2 pages) of notes, which must be
handwritten and prepared by yourself.
Lecture Schedule
 Lecture 1 (Wed Sept 4): Course goals and deliverables.
Random sampling and abundance.
Randomized primality testing and the MillerRabin test.
Supplementary reading/videos:
 Lecture 2 (Mon Sept 9):
Lightning review of basic probability concepts. Linearity of expectation. A 7/8approximation for MAX 3SAT. Other constraint satisfaction problems (CSPs). The GoemansWilliamson randomized hyperplane rounding algorithm for the Maximum Cut problem.
 Lecture 3 (Wed Sept 11): Karger's randomized
contraction algorithm for the Min Cut problem.
 Lecture 4 (Mon Sept 16): Hash functions as approximate random oracles. Hash table review (chaining and open addressing). Bloom filters: implementation and heuristic analysis.

For a review of the basics of hash tables (including open addressing), see the following videos (or Algorithms Illuminated, Sections 12.112.4):
The bloom filter material is covered in the following videos (or Algorithms Illuminated, Sections 12.512.6):