CS 4252 Final Project
For the final project, you will do independent
research/reading on a topic not covered in COMS 4252 lectures
(or go into significantly greater depth on some topic that we did
touch upon)
and write up a report
in your own words as clearly and thoroughly as you can. Your project should
demonstrate that you have synthesized information from several different
sources (typically research papers).
Projects absolutely do not need to have an experimental component but it
is OK to have an experimental component if it is relevant. For
projects that include an experimental component the report
should include a clear description of the experiments that were performed
and code should be made available.
Ten to fifteen pages
is a reasonable length for your written report, though there is neither
a hard upper limit nor lower limit on the length of the report.
(Note that your report should be no longer than it needs to be -- concision
will be rewarded and excessive wordiness or "padding" will be frowned on).
You may work in teams of up to three people
for the final project, but you do not have to
work in a team.
Expectations for team projects will be calibrated according to the
number of team members -- so a three-person project would need to demonstrate
three times as much work as a one-person project on the same topic in order
to get the same grade.
The timeline for final projects is:
Wed October 19:
initial project proposal due.
Describe the topic you propose for your final project.
This can be brief (no longer than a single paragraph)
but should include some description of the
sources you anticipate using for your project (names of papers, books, etc).
Email this to the course account with subject line "Project
Proposal".
Mon November 21: progress report due.
Send a 1-2 page high-level description of what you have accomplished so far
and what you still have to do for your project.
This should be substantially more detailed than your initial project proposal.
Email this to the course account with subject line "Project
Progress Report."
Mon December 12: final project report due.
Email this to the course account with subject line "Final Project Report".
If you are not sure what to work on for your final project,
here are some sample topics. It is OK to choose from this list
but you are encouraged to spend some time browsing the literature
since you may encounter a great topic that interests you more than
any of these.
Kernels: Kernel functions and large margin classifiers are an
important topic in contemporary machine learning and learning theory
research. A starting point here is
the book by Cristianini and Shawe-Taylor ("An
introduction to support vector machines").
Such a project could include an experimental
component, perhaps testing the performance of the Perceptron algorithm
or other kernel-based algorithms with various kernels.
Random projection: papers by Arriaga/Vempala (FOCS 1999) and
Klivans/Servedio (COLT 2004) (see also papers by Sanjoy Dasgupta on this
topic) use random projection of data from
n-dimensional space to k-dimensional space (with k much less than n)
as an essential step of learning algorithms with provable
performance guarantees.
Read these and/or related papers on learning with random projection
and write a summary/synthesis
explaining what you learned.
Here too experiments are an option.
Learning under the uniform distribution / Fourier analysis:
learning Boolean functions from uniformly distributed random examples on
{0,1}^n is an important topic in learning theory. Much more is learnable
in these frameworks than in the standard PAC model where you must succeed
w.r.t. any distribution. Some good papers to look at are Verbeurgt's 1989
COLT paper on learning DNF in quasipolynomial time under the uniform
distribution; the paper of Kushilevitz and Mansour (SIAM J on Computing,
1993) on learning decision trees using the Fourier spectrum; Mansour's
survey article on "Learning Boolean functions using the Fourier
transform"; Linial, Mansour and Nisan's 1993 JACM article on "Constant
depth circuits, Fourier transform and learnability"; and others.
Boosting: We will only touch the tip of the iceberg w.r.t.
boosting algorithms -- many have been studied and analyzed. Experiments
could play a role in a project here.
Multiclass learning: in the class we only discuss learning
Boolean (i.e. two-valued) functions but there is much work on learning
multivalued functions where there are more than two possible outputs.
Best expert / weighted majority type algorithms: here too
there is a large literature that we just scratched the surface of. The
original paper "The weighted majority algorithm" by Littlestone
and Warmuth is one good source to get started with.
"Follow-the-(perturbed)-leader": this is an alternate approach to
``best-expert'' algorithms (different from multiplicative weight approaches)
but one that also gives good bounds. The paper "Efficient algorithms
for online decision problems" by Kalai and Vempala is a good starting point.
Topics in exact learning: toward the end of the course we will
study (only a little bit) a model in which the learner can make
equivalence queries and membership queries, and must exactly identify the
unknown target concept. We will show that monotone DNF are efficiently
learnable in this framework, but in fact several other interesting and
expressive concept classes are known to be efficiently learnable in this
model; these include finite automata (a result due to Dana Angluin; this
is covered in the Kearns/Vazirani text, and references are given there)
and decision trees (due to Nader Bshouty, Information and Computation,
"Exact learning Boolean functions via the monotone theory"). Study and
give an exposition of these results.
Learning in the presence of noise: In class we will discuss
the models of PAC learning with random classification noise
and PAC learning with malicious noise, but many other noise
models have been studied as well (so called nasty noise; attribute
noise; agnostic learning can be viewed as a noise model; etc.)
There is a lot of interesting work on these topics.
Algorithms for learning linear threshold functions: One possible
project would be to do a detailed study of linear programming based algorithms
for learning linear threshold functions (good if you have a strong
interest in linear programming). Another would be to study simple
iterative algorithms such as Perceptron and Winnow in more detail;
there are many variants of these algorithms for related problems
such as linear regression, modifications/analyses for noise-tolerance,
etc. A project of the latter sort could easily have an experimental
component.
Inductive inference: This is a whole other way to look at
learning from a theoretical CS perspective (more of a recursion theory
flavor). Much work in inductive inference predates the work we've
discussed in 4252. The survey by Angluin and Smith is a good place to
get started here.
Learning rich concept classes: Some learning algorithms which
run in more than polynomial time (but less than exponential 2^n time) are
known for rich concept classes in the PAC learning model. These include
algorithms for decision trees (Ehrenfeucht and Haussler, 1989; see also
Blum, 1992, Information Processing Letters, "Rank-r Decision Trees are a
subclass of r-decision lists") and algorithms for DNF formulas (see
Bshouty, 1994, "A subexponential exact algorithm for learning DNF using
equivalence queries"; also see Klivans and Servedio, STOC 2001). Study and
given an exposition of these results.
Active learning: In active learning, the learner is given a large
data set of unlabelled examples, and is allowed to choose which
of these examples should receive a label; she must "pay" for each
example that she chooses to have labelled. This topic has
received renewed attention recently; papers by Dasgupta (NIPS 2004),
Freund/Seung/Shamir/Tishby (Machine Learning 1997), and
Dasgupta/Kalai/Monteleoni (COLT 2005) are good initial references, and
there is a lot of work just in the past few years as well.