ADVANCED MACHINE LEARNING & PERCEPTION
October 5, 2010

COMS 6772 CLASS PROJECT
PROF. TONY JEBARA

 

ABSTRACT WRITEUP

SEND BY EMAIL TO TAS AND INSTRUCTOR BY 9AM OCT 19, 2010.

 

PRESENTATIONS ON

NOV 16 AND NOV 23, 2010 (CVN students just email their PowerPoint)

 

WRITE UP DUE ON

DEC 9th 2010 BY MIDNIGHT

 

1. These are either individual 1-person projects or 2-person projects. We expect double the amount of work and research depth for a 2-person effort.

 

2. Send us a one-paragraph abstract (no more than 200 words) that describes your project and includes the project title and your name (or both names for a two-person effort). Email both the TA and the instructor with the subject line "ABSTRACT". The abstract should broadly describe what you plan to do and are doing, what results you expect, etc.

 

3. The final presentations should be PowerPoint or PDF files and should be under 3 minutes long. Groups of two can take about double that time (under 6 minutes). You can either email me your PowerPoint or PDF file or bring your own laptop. CVN students need only send a PowerPoint file and do not need to present it. Single-student projects are allowed 3 slides total (3 minutes) and two-student projects are allowed 6 slides total (6 minutes).

 

4. After presentations, submit a write-up in a two-column conference paper-style document as a PostScript file project.ps or a Portable Document Format file project.pdf, whichever is more appropriate and convenient for you to produce. Please do not send your work as a Microsoft Office document, LaTeX source code, or something more exotic. Include images within your document as figures. Keep your total write-up no longer than 5 pages (two-column), although 2-person projects can write up to 8 pages. To see how to write a good paper and present it, check out this link:
http://www.cs.iastate.edu/~honavar/grad-advice.html
In particular see Simon Peyton Jones on "How to Write a Good Research Paper".
We recommend using LaTeX to write up your report: http://www.latex-project.org

 

Submit your work via Courseworks. If you are unable to, please email it to both the TAs and the instructor. Please tar.gz everything in your current directory and send it to us. Make sure you send us a write-up of your results as a PostScript or PDF file containing any figures, tables and equations, along with your Matlab or C code and scripts as separate files.

 

For examples of previous years' projects, take a look at:

http://www1.cs.columbia.edu/~jebara/6772/proj/

http://www1.cs.columbia.edu/~jebara/6998-01/projects/

 (Some links may be broken; just follow the ones that work.)

 

PROJECT DESCRIPTION

 

Unlike the assignments, for the projects there is no fixed recipe to follow. Rather, you are free to pick a topic and direction that you find motivating and to leverage the tools covered in class. Here are a few themes we suggest as well as a few papers to look into.


    Combine discriminative and generative learning. Consider new ways to fuse the two, either via the tools we have discussed or new ideas of your own.

 

Maximum margin Markov Networks

B. Taskar, C. Guestrin, D. Koller http://books.nips.cc/papers/files/nips16/NIPS2003_AA04.pdf

 

Maximum entropy discrimination

T. Jaakkola, M. Meila and T. Jebara http://www1.cs.columbia.edu/~jebara/papers/maxent.pdf

 

Machine Learning: Discriminative and Generative

T. Jebara

 

Exploiting generative models in discriminative classifiers

T. Jaakkola and D. Haussler. http://www.ai.mit.edu/~tommi/papers/gendisc.ps

 

 


    Graph based learning.

 

Graph Transduction via Alternating Minimization

J. Wang, T. Jebara and S.F. Chang http://www.cs.columbia.edu/~jebara/papers/icml08.pdf


Graph Construction and b-Matching for Semi-Supervised Learning

T. Jebara, J. Wang and S.F. Chang http://www.cs.columbia.edu/~jebara/papers/JebWanCha09.pdf


Graph reconstruction with degree-constrained subgraphs

S. Andrews and T. Jebara http://www1.cs.columbia.edu/~jebara/papers/stu-andrews-workshop-submission-nips2007.pdf

 


    Manifold learning. Consider ways to constrain or represent data that lives on a non-linear manifold.

 

Neighbourhood Components Analysis

J. Goldberger, S. Roweis, G. Hinton and R. Salakhutdinov,

http://www.cs.toronto.edu/~hinton/absps/nca.pdf


Minimum Volume Embedding

B. Shaw and T. Jebara,

http://www1.cs.columbia.edu/~jebara/papers/aistatsMVE07.pdf

 

Nonlinear Dimensionality Reduction by Semidefinite Programming and Kernel Matrix Factorization

K. Weinberger, B. Packer and L. Saul,

http://www.seas.upenn.edu/~kilianw/publications/PDFs/kfactor_aistats05.pdf

 

Action Respecting Embedding 

M. Bowling, A. Ghodsi and D. Wilkinson, http://www.machinelearning.org/proceedings/icml2005/papers/009_Action_BowlingEtAl.pdf

 

GTM: The generative topographic mapping 

C. Bishop, http://www.ncrg.aston.ac.uk/Papers/postscript/NCRG_96_015.ps.Z

 

Nonlinear dimensionality reduction by locally linear embedding

S. Roweis and L. Saul, http://www.sciencemag.org/cgi/reprint/290/5500/2323.pdf

 

Kernel PCA and de-noising in feature spaces

S. Mika et al., http://www.kernelmachines.org/papers/MikSchSmoMueRaeSch99.ps.gz

 

A generalization of principal component analysis to the exponential family

M. Collins et al, http://www.research.att.com/~dasgupta/pca.pdf

 


    Feature selection. Aggressively discard irrelevant features in a classification problem.

 

Feature selection for SVMs

J. Weston et al., http://www.ai.mit.edu/people/sayan/webPub/feature.ps

 

Feature selection and dualities in maximum entropy discrimination

T. Jebara and T. Jaakkola, http://www.cs.columbia.edu/~jebara/papers/uai.pdf

 


    Novel Kernels. Try building kernels on unusual spaces (not just standard vectors).

 

String matching kernels for text classification

H. Lodhi et al, http://www.support-vector.net/papers/string.ps

 

A kernel between sets of vectors

R. Kondor and T. Jebara

http://www.cs.columbia.edu/~jebara/papers/Kondor,Jebara_point_set.pdf

 

Probability Product Kernels

T. Jebara, R. Kondor and A. Howard

http://www1.cs.columbia.edu/~jebara/papers/jebara04a.pdf

 

Density Estimation under Independent Similarly Distributed Sampling Assumptions

T. Jebara, Y. Song and K. Thadani

http://www1.cs.columbia.edu/~jebara/papers/nips07isd.pdf

 


    Meta-Learning, Multi-Class and Multi-Task Learning. Can learning from one task help with other tasks?

 

Multitask Learning

R. Caruana,  http://citeseer.nj.nec.com/10214.html

 

Multitask Feature and Kernel Selection for SVMs

T. Jebara,  http://www1.cs.columbia.edu/~jebara/papers/metalearn.pdf

 

Learning Internal Representations

J. Baxter,  http://citeseer.nj.nec.com/baxter95learning.html

 

Solving multiclass learning problems via error-correcting output codes

T. Dietterich and T. Bakiri, ftp.cs.orst.edu/pub/tgd/papers/jair-ecoc.ps.gz

 


    Temporal Modeling. How to model complicated dynamic systems, particularly if they have interactions, couplings and hierarchy.

 

Learning switching linear models of human motion

V. Pavlovic et al., http://www.cc.gatech.edu/~rehg/Papers/SLDS-NIPS00.pdf

 

Dynamical Systems Trees

A. Howard and T. Jebara http://www1.cs.columbia.edu/~jebara/papers/uai04.pdf

 

Nonlinear prediction of chaotic time series using support vector machines

S. Mukherjee et al, http://www.ai.mit.edu/people/girosi/home-page/nnsp97.pdf

 

Coupled hidden markov models for modeling interacting processes

M. Brand, http://www.media.mit.edu/people/brand/papers/brand-chmm.ps.gz

 


    Approximate Methods for Bayesian models.

 

Variational Bayes for mixture models

H. Attias, http://research.microsoft.com/~hagaia/uai99.ps

W. Penny, http://www.fil.ion.ucl.ac.uk/~wpenny/publications/vgbmm.ps

 

Expectation-propagation for approximate inference in dynamic Bayesian nets  

Heskes & Zoeter ftp://ftp.mbfys.kun.nl/pub/snn/pub/reports/Heskes.uai2002.ps.gz

 

 


    SVMs and variants, transduction, universum, etc.

 

Inference with the Universum

J. Weston et al., http://www.icml2006.org/icml_documents/camera-ready/127_Inference_with_the_U.pdf

 

Learning with Local and Global Consistency

D. Zhou et al., http://research.microsoft.com/~denzho/papers/LLGC.pdf

 

Transductive inference for text classification using SVMs 

T. Joachims, http://www-ai.cs.uni-dortmund.de/DOKUMENTE/Joachims_99c.ps.gz

 

Relative margin machines 

P. Shivaswamy and T. Jebara, http://www.cs.columbia.edu/~jebara/papers/nips08.pdf

 

The relevance vector machine 

M. Tipping, ftp.research.microsoft.com/users/mtipping/rvm_nips.ps.gz

 

Estimating the Support of a High-Dimensional Distribution.

B. Schölkopf et al., Microsoft Technical Report MSR-TR-99-87, 1999.

 


    Invariance, learning a model despite some nuisance source of variation that must be separated away.

 

Rotation and Translation Invariance for Images

Come see me for the hardcopy of the paper.

 

Orbit Learning using Convex Optimization

T. Jebara and Y. Bengio, http://www1.cs.columbia.edu/~jebara/papers/snowbird3.pdf

 

Separating style and content with bilinear models

J. Tenenbaum and W. Freeman, http://www.merl.com/reports/docs/TR99-04.pdf

 

Estimating mixture models of images and inferring spatial transformations using EM

B. Frey and N. Jojic, http://www.psi.toronto.edu/~frey/papers/tmg-cvpr99.ps.Z

 

Kernelizing Sorting, Permutation and Alignment for Minimum Volume PCA

T. Jebara, http://www.cs.columbia.edu/~jebara/papers/permkern.pdf

 

Transformation Invariance in Pattern Recognition

Simard, et al http://yann.lecun.com/exdb/publis/psgz/simard-00.ps.gz

 


    Information Theoretic Learning. Using information theory in learning.

 

Multivariate information bottleneck

N. Friedman et al, http://www.cs.huji.ac.il/~noamm/publications/UAI2001.ps.gz

 


    Clustering and Learning mixtures without EM.

 

On Spectral Clustering: Analysis and an Algorithm

A. Ng, M. Jordan and Y. Weiss. http://ai.stanford.edu/~ang/papers/nips01-spectral.pdf

 

B-Matching for Spectral Clustering

T. Jebara and V. Shchogolev. http://www1.cs.columbia.edu/~jebara/papers/bmatching.pdf

 

Expander Flows, Geometric Embeddings and Graph Partitioning

S. Arora, S. Rao, U. Vazirani. http://www.cs.princeton.edu/~arora/pubs/arvstoc.pdf

 

 


    General application areas (vision, text, audio, compbio)

 

Kernel Independent Component Analysis

F. Bach and M. Jordan, http://cmm.ensmp.fr/~bach/kernelICA-jmlr.pdf

 

Bayesian Out-Trees (applied to images)

T. Jebara, http://www.cs.columbia.edu/~jebara/papers/uai08tree.pdf

 

Anomaly Detection in Behavior

Contact M.B. Salem, malek@cs.columbia.edu.
I have collected a data set covering tens of users, consisting of the actions they perform on their personal computers. The amount of data per user varies between 1 and 9 days' worth. The objective of the project is to monitor how computer user behavior changes over time and to measure how consistent it is. The approach is to build one user micro-model per epoch (e.g. per day) and compare the micro-models to measure how consistent they are. Users should be ranked by the consistency of their behavior. Any modeling technique can be used as long as it lends itself to measuring the consistency of the micro-models and ranking the users efficiently. NOTE: Only the best students will be considered for the project. Any student who does exceptionally well on this project is likely to be appointed as an MS GRA if/when funds become available.

 

Probabilistic latent semantic analysis

T. Hofmann, http://www.cs.brown.edu/people/th/papers/Hofmann-UAI99.pdf

 


    Any topic you can convince us about in your abstract!
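
For the Anomaly Detection in Behavior project listed above, the "one micro-model per epoch, compare, then rank" approach can be sketched as follows. This is only an illustration under stated assumptions, not the method the project requires: the choice of a smoothed histogram over action types as the micro-model, the Jensen-Shannon divergence as the comparison, all function names, and the toy logs are hypothetical and do not come from the actual data set.

```python
import math
from collections import Counter
from itertools import combinations


def micro_model(actions, vocabulary):
    """Smoothed distribution over action types for one epoch (e.g. one day)."""
    counts = Counter(actions)
    total = len(actions) + len(vocabulary)  # add-one (Laplace) smoothing
    return [(counts[a] + 1) / total for a in vocabulary]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical models, at most 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def inconsistency(epochs, vocabulary):
    """Mean pairwise divergence between one user's per-epoch micro-models."""
    models = [micro_model(e, vocabulary) for e in epochs]
    pairs = list(combinations(models, 2))
    if not pairs:  # a user with a single epoch cannot be compared
        return float("nan")
    return sum(js_divergence(p, q) for p, q in pairs) / len(pairs)


def rank_users(logs):
    """Return (user ids from most to least consistent, per-user scores)."""
    vocabulary = sorted({a for epochs in logs.values() for e in epochs for a in e})
    scores = {u: inconsistency(epochs, vocabulary) for u, epochs in logs.items()}
    comparable = [u for u in scores if not math.isnan(scores[u])]
    return sorted(comparable, key=lambda u: scores[u]), scores


# Toy, made-up logs: user -> list of epochs -> list of action names.
toy_logs = {
    "alice": [["open", "edit", "save"], ["open", "edit", "save"]],
    "bob": [["open", "open", "open"], ["delete", "delete", "print"]],
}
ranking, scores = rank_users(toy_logs)
print(ranking, scores)
```

A lower score means more consistent behavior across epochs. Any other micro-model (for example an HMM or a mixture model per day) could be substituted, provided a divergence or distance between two fitted models can be computed cheaply enough to rank all users.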

 

Potential datasets on which to try some of your learning algorithms:

http://www1.ics.uci.edu/~mlearn/MLRepository.html

Stanford Large Network Dataset Collection: http://snap.stanford.edu/data/

http://mldata.org

http://www.cs.toronto.edu/~delve/

http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/

 

Feel free to suggest new papers to add to the list above. Places to look for papers include recent machine learning conferences such as:

Neural Information Processing Systems, NIPS

Uncertainty in Artificial Intelligence, UAI

International Conference on Machine Learning, ICML

Computer Vision and Pattern Recognition, CVPR

Conference on Learning Theory, COLT

and some machine learning journals such as the Journal of Machine Learning Research, Journal of Artificial Intelligence Research, Machine Learning, Pattern Recognition, Neural Computation, IEEE Transactions on Pattern Analysis and Machine Intelligence and so forth. Many recent articles from these venues are available online or in the library. You can find copies of the papers (PostScript and PDF) through CiteSeer, a popular search engine for computer science publications: http://citeseer.nj.nec.com/cs