Abstract
Each person's genome contains two copies of each chromosome, one inherited from the father and the other from the mother. A person's genotype specifies the pair of bases at each site, but does not specify which base occurs on which chromosome. The sequence of each chromosome taken separately is called a haplotype. Determining the haplotypes within a population is essential for understanding genetic variation and the inheritance of complex diseases. The haplotype mapping (HapMap) project, a successor to the Human Genome Project, seeks to determine the common haplotypes in the human population.
Experimental determination of a person's component haplotypes is an expensive and time-consuming process, so it is more attractive to first determine genotypes experimentally (which is relatively simple) and then use them to compute haplotypes. This computation is not simple, and it is further complicated by the fact that current sequencing technology often yields DNA sequences in which the nucleotide bases at some positions are missing. Consequently, it is important to find efficient algorithms for reconstructing haplotypes from noisy data. In this talk I will introduce a system that accurately reconstructs haplotypes from genotype data with missing or noisy entries. I will also present the application of this system to various biological data sets and to disease association studies.
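
To illustrate why computing haplotypes from genotypes is not simple, the following minimal Python sketch enumerates every pair of haplotypes consistent with a single genotype. It is not the system presented in the talk. The encoding is an assumption made for illustration: each site has two alleles, written 0 and 1, and a genotype is a string in which "0" and "1" mark homozygous sites, "2" marks a heterozygous site, and "?" marks a missing site. The function name compatible_haplotype_pairs is hypothetical.

```python
from itertools import product

# Possible (chromosome A, chromosome B) alleles at one site,
# under the assumed 0/1/2/? genotype encoding.
SITE_OPTIONS = {
    "0": [("0", "0")],                          # homozygous for allele 0
    "1": [("1", "1")],                          # homozygous for allele 1
    "2": [("0", "1"), ("1", "0")],              # heterozygous: phase unknown
    "?": [("0", "0"), ("0", "1"),
          ("1", "0"), ("1", "1")],              # missing: anything is possible
}


def compatible_haplotype_pairs(genotype):
    """Return all unordered haplotype pairs consistent with a genotype."""
    for symbol in genotype:
        if symbol not in SITE_OPTIONS:
            raise ValueError(f"unknown genotype symbol: {symbol!r}")

    pairs = set()
    # Pick one allele assignment per site and read off the two haplotypes.
    for choice in product(*(SITE_OPTIONS[s] for s in genotype)):
        hap_a = "".join(a for a, _ in choice)
        hap_b = "".join(b for _, b in choice)
        # The two chromosomes are unordered, so store each pair sorted.
        pairs.add(tuple(sorted((hap_a, hap_b))))
    return sorted(pairs)


if __name__ == "__main__":
    for g in ["0212", "222", "2?"]:
        print(g, compatible_haplotype_pairs(g))
```

With k heterozygous sites and no missing data, a genotype is consistent with 2^(k-1) haplotype pairs for k >= 1, and each missing site enlarges the set further. The genotype of one individual therefore cannot determine its haplotypes, which is why reconstruction methods draw on information shared across a population.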