Some recent research papers in Computational Biology
Below is a selection of Computational Biology conference and journal papers from the past few years. Most of these papers focus on interesting computational and learning techniques for biological data rather than specific biological problems. For your project, you can lean more towards interesting biological applications or towards developing computational methods, as you prefer. In a few of the papers below, fairly elaborate (and non-public) software is used to obtain the results -- you will not be able to reproduce these efforts in the time given for the project! However, you may be able to extract smaller projects from ideas in these papers or develop a simpler prototype for some of the methods.
Clustering Gene Expression Data
- Cluster analysis and display of genome-wide expression patterns. Eisen, Spellman, Brown and Botstein. (pdf, ps)
- Principal component analysis for clustering gene expression data. Yeung and Ruzzo. (pdf, ps)
- Context-specific Bayesian clustering for gene expression data. Yoseph Barash, Nir Friedman. (pdf, ps)
- Class discovery in gene expression data. Amir Ben-Dor, Nir Friedman, Zohar Yakhini. (pdf, ps)
Classification of Gene Expression Data
- Support vector machine classification and validation of cancer tissue samples using microarray expression data. Furey, Cristianini, Duffy, Bednarski, Schummer and Haussler. (pdf, ps)
- Analysis of molecular profile data using generative and discriminative models. Moler, Chow and Mian. (pdf, ps)
Computational Signal Finding
- Finding motifs using random projections. Jeremy Buhler, Martin Tompa. (pdf, ps)
- An algorithm for finding signals of unknown length in DNA sequences. Giulio Pavesi, Giancarlo Mauri, Graziano Pesole. ISMB 2001 Proceedings.
- Finding Composite Regulatory Patterns in DNA Sequences. Eskin and Pevzner. (pdf, ps)
Secondary Structure Prediction
- Towards predicting coiled-coil protein interactions. Mona Singh, Peter S. Kim. (pdf, ps)
- Predicting the beta-helix fold from protein sequence data. Phil Bradley, Lenore Cowen, Matthew Menke, Jonathan King, Bonnie Berger. (pdf, ps)
Inferring Networks from Gene Expression Data
- Inferring subnetworks from perturbed expression profiles. Dana Pe'er, Aviv Regev, Gal Elidan, Nir Friedman. (pdf, ps)
- Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Hartemink, Gifford, Jaakkola and Young. (pdf, ps)
Gene Finding, Gene Structure, and Splicing
- Computational identification of promoters and first exons in the human genome. Davuluri, Grosse and Zhang. (pdf, ps)
- Integrating genomic homology into gene structure prediction. Ian Korf, Paul Flicek, Daniel Duan, Michael R. Brent. (pdf, ps)
- Prediction of complete gene structures in human genomic DNA. Burge and Karlin. (pdf, ps)
- Engineering support vector machine kernels that recognize translation initiation sites. Zien, Rätsch, Mika, Schölkopf, Lengauer, and Müller. (pdf, ps)
Protein Classification