ADVANCED MACHINE LEARNING &     October 14, 2008

PERCEPTION            

COMS 4995/6772 HOMEWORK #2
PROF. TONY JEBARA

 

DUE

FRIDAY, OCTOBER 31th, 2008 BY NOON

 

Write all code in Matlab. Submit your work via Courseworks. If unable to, contact the TA via email. Make sure to tar.gz everything into a directory and then send it. Also send us a write up of your results as a postscript or pdf file containing any figures, tables and equations as well as your Matlab code and scripts as separate files. We recommend using Latex to write up your report: http://www.latex-project.org

 

Clearly denote the various components and the function calls or scripts that execute your Matlab functions. Shorter code is better. Do not submit copied code from others. You can use any draw package or Matlab to make illustrations and figures. Note, to save the current figure in Matlab as a postscript file you can type:

print –depsc filename.eps

Sample code is available on the “Tutorials” web page. Datasets are available from the “Handouts” web page.

 

 

Kalman Filters For Recognizing Human Activity

 

Build a Kalman Filter to recognize two classes of motion capture data: walking and running. Motion capture data consists of coordinates of the various limbs of a human over time while he is walking or running. The data set is contained in the file mocap.mat here:

http://www.cs.columbia.edu/~jebara/6772/hw/mocap.mat

 

To load the data, type “load mocap.mat” in matlab which creates the following cell array structures:

trainw: 4 training examples of walking sequences

trainr: 4 training examples of running sequences

testw: 4 testing examples of walking sequences

testr: 4 testing examples of running sequences

 

Each cell arrays contains 4 sequences which are the 123 scalar numbers describing the human’s configuration at each instant in time over a variable number of time steps. These range from 100-500 frames for a full motion sequence (a few seconds of walking where the figure takes a few steps). Each 123 element vector contains the X,Y,Z of a limb of the human’s body and there are a total of 41 tracked points on the body (41*3=123). Each of these sequences is essentially a matrix of size 123x200 if there are 200 frames in the sequence. To get this matrix out of one of the cell arrays, just type “M=trainw{1}”. Thus, the matrix is a bunch of 123-element vectors. To see a single vector type “M(:,1)” for the single vector at time t=1 (for t=1…200 or so, depending on how long the sequence is). The i dimensions are arranged as follows for all 41 markers:

x1,y1,z1,x2,y2,z2,...x41,y41,z41

 

 

Students who are unfamiliar with cells can look at:

http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_prog/ch_da37a.html#31678

You can see what each sequence looks like by using the animate function (this is sample code available by following the tutorials link), i.e. type: “animate(trainw{1})” to animate the first walking sequence in the training set. You should what looks like a person walking (but staying in place, almost on a treadmill).

 

Follow the recipe in Jordan and in the Ghahramani paper to build a Kalman Filter or Linear Dynamical System with Gaussian transitions and emissions. You are to estimate A,C,Q,R and (P0) like the Ghahramani paper but it is ok to just assume that Q and R are spherical matrices (or diagonal if you like). Try Kalman Filters with 1 dimensional and 2 dimensional hidden states. The emissions are vectors (with a scalar covariance) over the 123-dimensional space for the continuous features we are modeling. You are welcome to try more than two dimensions for the hidden states and see the resulting log-likelihoods. However, a two-dimensional hidden state Kalman Filter will be sufficient for this application.

 

Once you’ve trained an Kalman filter for walking and a Kalman filter for running, compute the likelihood of the test sequences under each KF. You should get very good classification accuracy and the walking KF should get higher likelihood than the running KF on the walking test sequences and vice versa.

 

Be careful with numerical problems! The probabilities may be numerically very close to zero. A way to fix this is to evaluate and work with log-probabilities. That way, a probability value of 1e-200 would just be stored as –200.0 and can shrink to even smaller amounts without going to zero and giving us underflow errors.

 

Also, you might get over-fitting problems with the diagonal covariance cases and the spherical case might have less numerical issues. These types of issues do crop up with Kalman Filters and even regular mixture models so it is important to deal with them. On possible solution is to add a small regularizer to the covariances, in other words change the covariance matrix by adding the identity matrix times a small scalar amount.

 

 

You can download more motion capture data from the MOCAP website and use it to build more Kalman Filters. Ask us how and we can help you with it! A more ambitious version of this homework could turn into a project.

Also, if you are having trouble writing the udpate rules for Q and R (and P0), just set them manually but at least have your code learn the A and C matrices as a starting point on the homework


ADDENDUM: You will get full grades if you get the model to work with either the recursive Kalman filter/smooth equations (as in the Ghahramani paper) but you will get a bonus grade if you get it to work in the junction tree algorithm form (which is trickier but more general). The E-step recursions are in page 5 of Ghaharamani and are similar to the class notes but not exactly identical because of slight notation changes. Also, note that we have 4 sequences for training each kalman filter. So, the derivatives and the update rules in the Ghahramani paper are slightly different and involve summing over the sequences before doing the matrix multiply and inverse step.

ADDENDUM: COMPUTE THE LOG-LIKELIHOOD OF A SINGLE SEQUENCE DURING FILTERING AS: