CBMF W 4761 Course Site
Computational Genomics

 

Administration

When: Tue, Thu, 10:35am to 11:50am; Spring 2012
Where: Mudd 1127
By who: Itsik Pe'er, office hours: Thu 12:00-13:00
Teaching assistant: Arthi Ramachandran
What for: 3 credit points
For who: Graduate/advanced undergraduate students of relevant fields. Although the course is listed in Computer Science, cross enrollment by students with a biomedical background is encouraged.

Pre-requisite: Each student is expected to be an independent programmer


Abstract

Technology for obtaining DNA sequences has been consistently improving faster than Moore's law. This has opened a wealth of computational challenges in weaving the heaps of straw of DNA sequence data into the gold of biological insight. The class serves as an introduction to computational genomics, explaining the basic challenges and teaching the general computer-science tools to tackle them. This course is intended to introduce students of both computational and biomedical skill sets to the current quantitative understanding of genomics and to prepare them for computational research or industrial development in the field. Questions we'll touch on include:

  • How to get the sequence of your genome?
  • How to model different but similar genes?
  • How to model the same but mutated gene?
  • How to infer the tree of life?
  • What do we learn from comparing genomes?
  • How to find genes and signals in DNA?
  • Why is there variation within a species?
  • Do genes determine traits?
  • How does natural selection work?

The computational toolbox discussed includes parameter inference, likelihood analysis, hidden Markov and other graphical models, eigenvalue decompositions, and classification problems.


Tentative Program - under construction

  • Week 1 Introduction to genomics, statistics
  • Week 2 Alignment of high-throughput sequence reads with exact string-matching
  • Week 3 Homology, repeats and gene families by similarity searches
  • Week 4 Neutral sequence evolution by Markov models
  • Week 5 Phylogenetics and tree reconstruction
  • Week 6 Speed of evolution (Markov models)
  • Week 7 Coalescent models (MCMC)
  • Week 8 Midterm
  • Week 9 Projects Outline
  • Week 10 Ancestral recombination graphs
  • Week 11 Genetic mapping (hypothesis testing)
  • Week 12 Projects midpoint
  • Week 13 Negative selection
  • Week 14 Projects final presentation

FAQ

  1. Q: I'm a Computer Science student with no background in biology. Can I take the course?
    A: Sure. The necessary biological background will be given in condensed form at the first meeting, and then supplemented when necessary. The course is designed for a small, interactive class, with each student having enough weight to affect the level of background given.
  2. Q: I'm a student in Medical Informatics/Biological Sciences with no background in computing. Can I take the course?
    A: Probably not. You will need to be able to write your own programs. Some background in probability/statistics/biometry is an advantage.
  3. Q: How is the course graded?
    A: A combination of homework assignments, a late midterm, and a final project.
  4. Q: What are the projects like?
    A: For example, you may be given a 5-year-old paper and asked to read and understand it, implement its computational method so that it works on today's data (which is often orders of magnitude larger and more complex), analyse the results, and write up what you find. Submission will include:
    • Any code you've developed
    • A detailed written report
    • An executive summary of 5 slides.
  5. Q: What are the office hours?
    A: Wed morning. See my contacts.