Machine Learning for Statistics: SDGB 7847
                                                      'What one fool could understand, another can.'
                                                                                                                                                  -- R.P. Feynman


Description

The course will give participants an opportunity to implement statistical models. We will cover numerical optimization techniques, including gradient descent, Newton's method, and quadratic programming solvers, and use them to fit linear and logistic regression, discriminant analysis, support vector machines, and neural networks. The second part of the course will focus on advanced methods for computing posterior distributions and motivate their appeal in Bayesian inference. We will survey importance and rejection sampling, the Metropolis algorithm, Gibbs sampling, and Sequential Monte Carlo. Students will be exposed to convex duality, constrained optimization, bias/variance decompositions, entropy, mutual information, KL divergence, maximum likelihood and maximum a posteriori estimation, Fisher scoring, the Laplace approximation, Markov chains, and saddle point methods, each of which will be reemphasized from a computational perspective.
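For a sense of the kind of implementation the course expects, here is a minimal sketch (not part of the official course materials) that fits logistic regression by plain gradient descent on synthetic data; the data, variable names, step size, and iteration count are illustrative assumptions.

    # Minimal sketch: logistic regression fit by gradient descent (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 3
    X = rng.normal(size=(n, d))                    # design matrix
    w_true = np.array([1.5, -2.0, 0.5])            # assumed "true" weights for the synthetic data
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w = np.zeros(d)
    step = 0.5                                     # fixed step size (assumption)
    for _ in range(2000):
        p = sigmoid(X @ w)                         # predicted probabilities
        grad = X.T @ (p - y) / n                   # gradient of the average negative log-likelihood
        w -= step * grad

    print("estimated weights:", w)                 # should land near w_true

Newton's method, covered early in the course, replaces the fixed step with the inverse of the Hessian X.T @ np.diag(p * (1 - p)) @ X / n applied to the same gradient, and typically converges in far fewer iterations.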

Prerequisites

Multivariate Calculus, Linear Algebra, Probability, and Statistical Computing; that is, you should be able to program in, and have regular access to, Matlab, Python, or R.
Textbooks

The Elements of Statistical Learning [ESL] by Hastie, Tibshirani, and Friedman
Pattern Recognition and Machine Learning [PRML] by Bishop
Convex Optimization [CVX] by Boyd and Vandenberghe (not required, but a nice reference)
Grade Distribution

  • Homework 45%
  • Participation 5%
  • Midterm 20%
  • Final or Project 30%
Homework

Please typeset your homework using LaTeX, which is the standard for technical and scientific documents. You may visit www.latex-project.org to download a distribution and read the following tutorial to get started. A template for the homework is available here (cls, tex, pdf). A word of advice: start early on the graded homework! The instructor has worked through all of the problems and some are challenging. Please use Matlab, Python, or R. Each language has its own pros and cons, although if you know one you can probably learn the others easily. The standard academic honesty policy applies.
Tentative Course Outline

  • Overview, Least Squares and MLE
  • Constrained Optimization, Ridge Regression
  • Logistic Regression, Gradient Descent and Newton's Method
  • Lasso, Subgradient Methods, QP Solvers
  • SVMs, Primal and Dual forms, KKT conditions
  • Feed Forward Neural Networks, Backpropagation
  • Midterm
  • K-means, Expectation Maximization (EM)
  • Markov Chains, State Space Models
  • Message Passing, Kalman Filters
  • Quadrature, Laplace Approximation
  • Importance/Rejection sampling, Metropolis
  • Variational Inference
  • Final or Projects


Schedule

  • 1.17
    Topics: Method of Least Squares; Maximum Likelihood Estimation; Gaussian Integrals; Determinants
    Reading: PRML and ESL (1-3)
    Notes: Least Squares; The Gaussian Integral

  • 1.24
    Topics: Review Problem Session; Maximum Likelihood Examples; Constrained Optimization; Ridge Regression
    Reading: PRML and ESL (3)
    Notes: MLE; Lagrange Multipliers; Ridge Regression

  • 1.31
    Topics: Logistic Regression; Gradient Descent and Newton's Method; Taylor Expansions and Hessian Matrices
    Reading: PRML and ESL (4)
    Notes: Logistic Regression; Finding Roots
    Assignments: Homework 1 (data, Matlab, R, Python)

  • 2.7
    Topics: Discriminant Analysis; Eigenvalues and Eigenvectors; Lab Session
    Reading: PRML and ESL (4)
    Notes: LDA; Eigenvectors; Eigenfaces vs. Fisherfaces

  • 2.14
    Topics: Discriminant Analysis; Spectral Decompositions and PCA; Support Vector Machines
    Reading: PRML (6) and ESL (12)
    Notes: PCA, Spectral Methods; SVM Notes; Slides; Practical Tutorial

  • 2.21
    Topics: Support Vector Machines; Primal and Dual Forms; Linear and Quadratic Programs
    Reading: PRML (7) and ESL (12)
    Notes: SVM Tutorial; Duality in Optimization
    Assignments: Homework 2 (Python)

  • 2.28
    Topics: Kernels and Reproducing Kernel Hilbert Spaces; Interior Point Methods; Backpropagation
    Reading: PRML (7) and ESL (5.8, 12)
    Notes: Kernels, RKHS; Interior Point Methods; Kernel Demo; SVM Demo

  • 3.7
    Topics: Feed Forward Neural Networks; Backpropagation; Lab Session
    Reading: PRML (5) and ESL (11)
    Notes: Neural Networks; BackProp Algorithm; Efficient BackProp; Neural Net Demo; Practical Tutorial; MNIST Demo

  • 3.14
    Spring Break
    Solutions: Midterm Solutions

  • 3.21
    Topics: K-Means; Expectation Maximization; Jensen's Inequality; Entropy, KL Divergence, and Free Energy; Applying EM to Mixture Models
    Reading: PRML (9) and ESL (14.3, 8.5)
    Notes: EM Notes; Slides; EM and Thermodynamics; Clustering Demo

  • 3.28
    Topics: Review Midterm; EM Algorithm
    Reading: PRML (9, 13)
    Notes: Applying EM; Hidden Markov Models
    Assignments: Homework 3 (data, Matlab)

  • 4.4
    Topics: Deriving Update Equations for EM; Hidden Markov Models; Forward-Backward Algorithm; Gamma Algorithm
    Reading: PRML (13)
    Notes: EM for Bernoulli Tutorial; Forward-Backward Notes; Practical Tutorial; Implementation

  • 4.11
    Topics: Viterbi Algorithm; Baum-Welch Algorithm; Lab Session
    Reading: PRML (13)
    Notes: State Space Models; Baum-Welch
    Assignments: Project

  • 4.18
    Topics: Kalman Filtering; Numerical Integration; Stochastic Processes
    Reading: PRML (11)
    Notes: Kalman Filtering; Gaussian Convolutions; Ergodic Theorem; Interactive Demo; Filtering Demo

  • 4.25
    Topics: Importance and Rejection Sampling; Metropolis-Hastings
    Reading: PRML (11)
    Notes: Markov Chain Monte Carlo; Quantum Monte Carlo
    Assignments: Homework 4 (Lab)

  • 5.2
    Topics: Brownian Motion and the Heat Equation; Path Integration; Sequential Monte Carlo and Particle Filtering
    Reading: PRML (11)
    Notes: Einstein's Theory; Sequential Monte Carlo; Particle Filtering Tutorial; MCMC Examples