## Lecture Schedule

CBMF 4761

Department of Computer Science

Spring Semester, 2004

See last year's lecture page for previous topics. Tentative schedule is below.

## Lecture 1 (Mon Jan 26)

- Basic background in molecular biology -- Slides on biology background
- Models for cis-regulatory elements

Reading: Section 1.3 in Durbin. There is also good but terse material on probabilistic methods in Chapter 11 of the text -- see in particular Section 11.3 on inference.

## Lecture 2 (Mon Feb 2)

- Basics on gene expression data (microarrays) -- Slides on gene expression data
- Clustering expression profiles for class discovery, regulatory element discovery
- More background resources (optional reading):
- See Courseworks for scanned lecture notes
## Lecture 3 (Mon Feb 9)

- Clustering algorithms
- REDUCE model -- Reduce paper by Bussemaker et al.
- See Courseworks for scanned lecture notes
## Lecture 4 (Mon Feb 16)

- Optimal sequence alignment
- Heuristic alignment and BLAST statistics

Reading: Durbin, Sections 2.1 and 2.2, and Section 2.3 through the end of the subsections on global alignment (Needleman-Wunsch algorithm) and local alignment (Smith-Waterman). Also take a look at the affine gap penalty part of Section 2.4. We won't do every variant of pairwise alignment in class, but it's useful to see how many different versions there are.

## Lecture 5 (Mon Feb 23)

- Statistical significance of alignment scores
- Hidden Markov models
- Viterbi algorithm, posterior decoding

Reading: Durbin, Section 2.5 on heuristic alignment algorithms, Section 2.7 on significance of scores (the "classical approach" subsection is most important), and Section 2.8 on deriving score parameters from data. Also read Section 3.1 and the start of Section 3.2 on Markov chains and hidden Markov models for CpG island detection.

## Lecture 6 (Mon Mar 1)

- Viterbi algorithm, posterior decoding
- Baum-Welch algorithm (if we get this far)

Reading: Durbin, Section 3.2 on the Viterbi algorithm for hidden Markov models and on posterior decoding (the forward and backward algorithms). Also take a look at Section 3.3 on the Baum-Welch algorithm (expectation maximization) and the last section of Chapter 3 on scaling probabilities for the forward/backward algorithms.

## Lecture 7 (Mon Mar 8)

- Baum-Welch algorithm (EM for HMMs)
- First midterm

Spring Break

## Lecture 8 (Mon Mar 22)

- Profile HMMs
- Introduction to classification, linear classifiers

Reading: Durbin, Chapter 5 on profile HMMs for modeling protein families.

## Lecture 9 (Mon Mar 29)

- Classification versus clustering
- Linear classifiers: e.g. Fisher's linear discriminant, linear discriminant analysis, SVMs
- Support vector machines (SVMs) -- hard margin optimization problem

Reading: Tutorial on SVMs by Chris Burges (pdf, ps) -- some reference material for the SVM optimization problems that we'll outline in class; the text by Cristianini and Shawe-Taylor is a very good reference for this material. Golub paper -- example of a linear classifier used to discriminate between the gene expression profiles of two types of leukemia.

## Lecture 10 (Mon Apr 5)

- SVMs -- soft margin optimization problem
- Feature selection, kernel methods
## Lecture 11 (Mon Apr 12)

We'll present an overview of Bayes nets (probabilistic graphical models) for inferring regulatory networks and start discussing the papers listed below.

- Here are some pictures to illustrate binding of transcription factors to promoter regions and the effect on transcription: general picture of TFs and RNA polymerase; model of human β-globin gene regulation; tryptophan-regulated repressor and picture in E. coli
- Statistical validation of network models [Hartemink et al.]
- Inferring regulatory subnetworks [Pe'er et al.]
- Module networks: identifying regulatory modules [Segal et al.]
## Lecture 12 (Mon Apr 19)

We'll finish our discussion of Bayes nets for modeling gene regulation with the "Module networks" paper. See also the tech report listed below. In the last part of the class, we'll talk about MEME, a popular motif discovery algorithm based on EM.

## Lecture 13 (Mon Apr 26)

We'll present an overview of computational gene finding, including the GENSCAN and TWINSCAN models.

## Lecture 14 (Mon May 3)

For the first half of the class (before the test), we'll give an overview of papers from the special issue of Genome Research on the rat genome.

- Second midterm