#
CS 4252 Final Project

For the final project, you will choose **one** research
paper in computational
learning theory, read it, and submit a written report giving a clear
and coherent explanation **in your own words** of what you learned from
the paper. The point of the project is (a) to give you a chance to explore
a CLT topic outside of the lectures/book, and (b) for you to
demonstrate to me
that you can read and understand (at least some portion of) a research
paper in computational learning theory.
Note that your report does not have to provide detailed coverage of
the entire paper that you choose (some papers are quite long), but it
is a good idea to go into at least some level of detail
about some portion of the paper. Ten pages
is a reasonable length for your written report, though there is neither
a hard upper limit nor lower limit on the length of the report.
(Note that your report should be no longer than it needs to be -- concision
will be rewarded and excessive wordiness or "padding" will be frowned on).

You may work in teams of up to three people
for the final project, but you do not have to
work in a team.
**
Expectations for team projects will be calibrated according to the
number of team members -- so a three-person project would need to demonstrate
three times as much work/understanding as a one-person project on the
same topic in order to get the same grade.
**

The rough timeline for final projects is:

###

**
**

**November 6:
initial project proposal due.**
Specify the paper you plan to cover for your final project.
Email this to the course account with subject line "Project Proposal".
If you don't plan to cover the entire paper you should specify
(approximately) the portion
of the paper that you are planning to cover.
###

**
December 16, 11:59pm EST: final project report due.
**

Email your final project to **coms4252columbia.projects@gmail.com
** (not the usual course account)
with subject line "Final Project Report".
Your project should be submitted in the form of a pdf file.

If you are not sure what to do for your final project,
here are some sample topics and papers to consider as possibilities.
There is nothing particularly special about the items listed below,
many other great topics/papers could have been given.
It is OK to choose from this list
but you are encouraged to spend some time browsing the literature
since you may encounter a great topic that interests you more than
any of these.

Note that if you decide to choose your own paper, it is
**not** an option to choose a survey paper -- you should choose a
paper that actually presents original research results.
You also may not be allowed to pick such a "classic" paper that its contents
have already been extensively covered in surveys, since in such cases the
survey papers have already by and large
done the work that you are expected to do
for this project, that of reading, understanding, and giving a nice exposition
of (at least a portion of) the paper.

Learning under the uniform distribution / Fourier analysis:
Learning Boolean functions from uniformly distributed random examples on
{0,1}^n is an important topic in learning theory. Much more is learnable
in these frameworks than in the standard PAC model where you must succeed
w.r.t. any distribution. For this project you could choose from the
1989 COLT paper of Karsten Verbeurgt on learning DNF in quasipolynomial time
under the uniform distribution;
the paper of Kushilevitz and Mansour (SIAM J on Computing,
1993) on learning decision trees using the Fourier spectrum; or
Linial, Mansour and Nisan's 1993 JACM article on "Constant
depth circuits, Fourier transform and learnability".

Multiclass learning: in the class we only discuss learning
Boolean (i.e. two-valued) functions but there is much work on learning
multivalued functions where there are more than two possible outputs.
Some good possible papers on this topic would be the 1999 Guruswami/Sahai
paper or the 1995 JAIR multiclass learning paper of Dietterich/Bakiri.

Best expert / weighted majority type algorithms: here too
there is a large literature that we just scratched the surface of. The
original 1994 Information and Computation paper
"The weighted majority algorithm" by Littlestone
and Warmuth is a good paper here. (If you choose this of course you need to
go into depth on parts of the paper that we did not cover in class.)

"Follow-the-(perturbed)-leader": this is an alternate approach to
``best-expert'' algorithms (different from multiplicative weight approaches)
but one that also gives good bounds. The 2005 JCSS paper "Efficient algorithms
for online decision problems" by Kalai and Vempala is a good choice
for this topic.

Topics in exact learning: toward the end of the course we will
study (only a little bit) a model in which the learner can make
equivalence queries and membership queries, and must exactly identify the
unknown target concept. We will show that monotone DNF are efficiently
learnable in this framework, but in fact several other interesting and
expressive concept classes are known to be efficiently learnable in this
model; these include finite automata (a result due to Dana Angluin; this
is covered in the Kearns/Vazirani text, and references are given there)
and decision trees. Good papers to choose for this topic are
Dana Angluin's 1987 Information and Computation
paper "Learning Regular Sets from Queries and Counterexamples",
or her 1990 Machine Learning paper "Negative results for Equivalence Queries",
or Nader Bshouty's 1995 Information and Computation paper
"Exact learning Boolean functions via the monotone theory".

Complexity-theoretic aspects of learning: There are many interesting
connections between complexity theory (complexity classes) and learning theory.
Good papers in this area are the 1996 JCSS paper of
Bshouty, Cleve, Gavalda, Kannan, Tamon "Oracles and queries that are
sufficient for exact learning" and the 2009 Fortnow/Klivans JCSS
paper "Efficient learning algorithms yield circuit lower bounds."

Random projection: A useful technique (which we did not cover in
class) is to perform a 'random projection' of data
from n-dimensional space to k-dimensional space (with k much less than n).
If you are interested in this topic you may want to choose the 2006 Machine
Learning paper "An algorithmic theory of learning: Robust concepts and random projection" of Arriaga and Vempala, or the 2008 Klivans/Servedio
paper "Learning intersections of halfspaces with a margin".

Statistical Query learning:
The STOC 1994 paper of Blum, Furst, Jackson, Kearns, Mansour and Rudich
on "Weakly learning DNF and characterizing statistical query learning using
Fourier analysis" covers aspects of the SQ learning model that we won't
do in class. Another good paper here is Vitaly Feldman's 2012 JCSS
paper "A complete characterization of statistical query learning with applications to evolvability."

Evolvability: Leslie Valiant has introduced a computational model
of evolution which views the evolution process as a restricted
form of learning. See his 2009 JACM paper "Evolvability."

Noise-tolerant learning of linear threshold functions: The 1998 Algorithmica
paper of Blum, Frieze, Kannan and Vempala gives a polynomial-time
algorithm to learn arbitrary linear threshold functions in the
presence of random classification noise.

Learning rich concept classes: Some learning algorithms which
run in more than polynomial time (but less than exponential 2^n time) are
known for rich concept classes in the PAC learning model. These include
algorithms for decision trees (Ehrenfeucht and Haussler, 1989,
Information and Computation, "Learning Decision Trees from Random Examples")
and algorithms for DNF formulas (Bshouty, 1994, Information Processing
Letters, "A subexponential exact algorithm for learning DNF using
equivalence queries", or the 2004 Klivans/Servedio JCSS paper on DNF.).