MACHINE LEARNING

MACHINE LEARNING January 27, 2011

COMS 4771 HOMEWORK #1
PROF. TONY JEBARA

DUE	FEBRUARY 10^th, 2011 BY 2:00pm EST

PLEASE NOTE SUBMISSION INSTRUCTIONS.

Instructions

Please submit separate files for each of the following:

write-up;
Matlab source files;
figures, if you choose to include them separately.

Do not include any other files.

Your write-up should be in ASCII plain text format (.txt) or Postscipt (.ps) or the Adobe Portable Document Format (.pdf). Please do not submit Microsoft Office documents, LaTeX source code, or something more exotic. We will not be able to read it. LaTeX is preferred and highly recommended, but it is not mandatory. You can use any document editing software you wish, as long as the final product is in .ps or .pdf. Even if you do not use LaTeX to prepare your document, you can use LaTeX notation to mark up complicated mathematical expressions, for example, in comments in your Matlab code. See the “Tutorials” page for more information on LaTeX.

All your code should be written in Matlab. Please submit all your souce files, each function in a separate file. Clearly denote what each function does, which problem it belongs to, and what the inputs and the outputs are. Do not resubmit code or data provided to you. Do not submit code written by others. Identical submissions will be detected and both parties will get zero credit. Shorter code is better. Sample code is available on the “Tutorials” web page. Datasets are available from the “Handouts” web page.

You may include figures directly in your write-up or include them separately as .jpg, .gif, .ps or .eps files, and refer to them by filename.

When you are ready to submit your work, submit it via courseworks.columbia.edu. If something goes wrong, ask the TA's for help and if worst comes to worst, you can send work to the TA's by email.

Thanks.

Problems

1. (10 points) Crossvalidation for Polynomial Fitting: Download the Matlab code in “polyreg.m” (on the tutorial web page) to do polynomial curve fitting. Also download the dataset “dataset1.mat”. Type “load dataset1” in Matlab and you will have the variables X (scalar inputs) and Y (scalar outputs) in your Matlab environment space. Split the data into training (the first 100 points) and testing (the second 100 points). Compute the training and testing error for various polynomial order fits, i.e. for 0^th order, 1^st order all the way to 20^th order and plot both error curves graphically. Which polynomial order fits the best and why? Graphically show the fit of the data with the best polynomial order as an attached image.

2. (15 points) RBF Basis Regression: Modify the Matlab code for “polyreg.m” such that it does RBF basis curve fitting instead of polynomial regression. Fit the data in “dataset1.mat” with the first 100 points as training and the second 100 points as testing and set the RBF’s sigma parameter equal to 1.0. Compute and show the training and testing error for this model and show the fit graphically by saving the Matlab plot of f(x) overlayed on the data.

3. (15 points) Perceptron: Implement the linear perceptron in Matlab (using stochastic or gradient descent). Train it on “dataset2.mat”. Type “load dataset2” and you will have the variables X (inputs) and Y (+/- 1 labels) in your Matlab environment which contain the dataset. Use the whole data set as training. Show with figures the resulting linear decision boundary on the 2D X data. Show the binary classification error and the perceptron error you obtain throughout the run from random initialization until convergence on a successful run (some random inits may not converge or may require many iterations). Note the number of iterations needed. If using the non-stochastic algorithm, discuss the convergence behavior as you vary the step size (h).

4. (10 points) Multi-Class Discrimination: There are several possible ways in which to generalize the concept of a linear discriminant function from two classes to c classes. One possibility would be to use (c-1) linear discriminant functions, such that for inputsin class and for inputs not in class . By drawing a simple example in two dimensions for , show that this approach can lead to regions of x-space for which the classification is ambiguous. Another approach would be to use one discriminant function for each possible pair of classes and such that for patterns in class and for patterns in class . For c classes we would need discriminant functions. Again, by drawing a specific example in two dimensions for , show that this approach can also lead to ambiguous regions.