Pick a research paper on a topic related to this course, read it, and write a report that coherently explains the main ideas and results **in your own words**.
This is your opportunity to read about a topic that goes beyond what we
cover in lecture, and also to demonstrate your ability to read and
understand a research article in this area.

You will also give an oral presentation towards the end of the semester, where you essentially give a lecture on the main ideas from the paper. You should take into account what we have already covered in class when preparing the presentation.

You should work in groups of two or three. If this is a problem, please let me know as soon as possible.

Some papers are very long; you can check with me if it is okay to only partially cover such a paper. The report should not simply restate or regurgitate the text already in paper. It does not have to go into detail about the entire paper, but it should at least describe some key parts of the paper in moderate detail. Although there are no strict page count limits on the report, the report should not be any longer than it needs to be, so avoid extraneous "fluff". I suspect most good reports will be around 8–10 pages.

An example of what an okay report might look like is the notes on the J-L lemma, based on the paper An Elementary Proof of a Theorem of Johnson and Lindenstrauss. (The papers you will read are likely to be longer.) These notes fall short of being a good report because they lack details about the Chernoff bound actually used in the paper, and also lack a detailed comparison of the different linear maps.

A list of possible papers is given on this page. Once a group selects a paper, it will no longer be available for other groups to pick. (Some of the longer papers could be split up.) You can also propose a paper not on the list, but you have to clear this with me well in advance.

If you are inclined to do a research project of a theoretical nature, please talk to me first. You should still do a project proposal (where you state the problem and point to any prior/related work), a project report (essentially a research paper, though results can be preliminary and a bit more informal), and also an oral presentation.

Anytime before the end of **October 14, 2015**, one member of your group should send a message to the course e-mail, cc'ing the group members, stating (i) your project group, and (ii) the paper you plan to cover.
Use the subject line "Project proposal".

If you pick a paper that has already been chosen, I will let you know, and you will have to pick another paper before the end of **October 19, 2015**.
If there are still conflicts after this, I will pick a paper for you.

Feel free to discuss your choice of the paper with me in office hours. If you are picking a long paper and want to only cover a portion of it, please clear this with me first. Otherwise I will expect you to cover the entire paper. If you are picking a paper not on this list, you should clear it with me well before the deadline.

Anytime before the end of **December 10, 2015**, one member of each group should send the final report as a PDF file to the course e-mail.
Use the subject line "Project report".

Each member of the group should also privately send me an assessment of their own contribution to the project (report and oral presentation).

You must adhere to the Academic Honesty policy of the Computer Science Department.

The list is incomplete; I will add more papers later.

- Random linear maps
- Klartag and Mendelson, Empirical processes and random projections
- Clarkson, Tighter bounds for random projections of manifolds
- Matousek, On variants of the Johnsonâ€“Lindenstrauss lemma
- Dasgupta, Kumar, and Sarlos, A sparse Johnson-Lindenstrauss transform
- Kane and Nelson, Sparser Johnson-Lindenstrauss transforms
- Indyk and Naor, Nearest neighbor preserving embeddings
- Dasgupta and Freund, Random projection trees and low dimensional manifolds

- Learning mixture models
- Dasgupta and Schulman, A probabilistic analysis of EM for mixtures of separated, spherical Gaussians
- Achlioptas and McSherry, On spectral learning of mixtures of distributions
- Hsu and Kakade, Learning mixtures of spherical Gaussians: moment methods and spectral decompositions

- Low rank approximations and matrix sketching
- Halko, Martinsson, and Tropp, Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions
- Liberty, Simple and deterministic matrix sketching and Ghashami and Phillips, Relative errors for deterministic low-rank matrix approximations
- Balsubramani, Dasgupta, and Freund, The fast convergence of incremental PCA
- Boutsidis, Garber, Karnin, and Liberty, Online principal component analysis

- k-means clustering
- Telgarsky and Dasgupta, Moment-based uniform deviation bounds for k-means and friends
- Kumar, Sabharwal, and Sen, Linear-time approximation schemes for clustering problems in any dimensions
- Kanungo, Mount, Netanyahu, Piatko, Silverman, and Wu, A local search approximation algorithm for k-means clustering
- Ostrovsky, Rabini, Schulman, and Swamy, The effectiveness of Lloyd-type methods for the k-means problem
- Feldman and Langberg, A unified framework for approximating and clustering data

- Spectral clustering
- McSherry, Spectral partitioning of random graphs
- Bshouty and Long, Finding planted partitions in nearly linear time using arrested spectral clustering
- Chaudhuri, Chung, Tsiatas, Spectral clustering of graphs with general degrees in the extended planted partition model
- Oymak and Hassibi, Finding dense clusters via "low rank + sparse" decomposition
- Cole, Friedland, and Reyzin, A simple spectral algorithm for recovering planted partitions

- Sparse recovery
- Bickel, Ritov, and Tsybakov, Simultaneous analysis of Lasso and Dantzig selector
- Donoho and Stark, Uncertainty principles and signal recovery
- Zhang, Adaptive forward-backward greedy algorithm for learning sparse representations
- Zhang, Sparse recovery with orthogonal matching pursuit under RIP

- Dictionary learning
- Spielman, Wang, and Wright, Exact recovery of sparsely-used dictionaries
- Luh and Vu, Dictionary learning with few samples and matrix concentration
- Agarwal, Anandkumar, Jain, Netrapali, and Tandon, Learning sparsely used overcomplete dictionaries
- Arora, Ge, and Moitra, New algorithms for learning incoherent and overcomplete dictionaries