Haplotype Inference Constrained by Plausible Haplotype Data
Venue
CPM '09: Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching, Springer-Verlag, Berlin, Heidelberg (2009), pp. 339-352
Publication Year
2009
Authors
Michael R. Fellows, Tzvika Hartman, Danny Hermelin, Gad M. Landau, Frances Rosamond, Liat Rozenberg
BibTeX
Abstract
The haplotype inference problem (HIP) asks to find a set of haplotypes which
resolve a given set of genotypes. This problem is of enormous importance in many
practical fields, such as the investigation of diseases, or other types of genetic
mutations. In order to find the haplotypes that are as close as possible to the
real set of haplotypes that comprise the genotypes, two models have been suggested
which by now have become widely accepted: The perfect phylogeny model and the pure
parsimony model. All known algorithms up till now for the above problem may find
haplotypes that are not necessarily plausible, i.e. very rare haplotypes or
haplotypes that were never observed in the population. In order to overcome this
disadvantage we study in this paper, for the first time, a new constrained version
of HIP under the above mentioned models. In this new version, a pool of plausible
haplotypes ~H is given together with the set of genotypes G, and the goal is to
find a subset $H \subseteq \widetilde{H}$ that resolves G. For the constrained
perfect phylogeny haplotyping (CPPH) problem we provide initial insights and
polynomial-time algorithms for some restricted cases that help understanding the
complexity of that problem. We also prove that the constrained parsimony
haplotyping (CPH) problem is fixed parameter tractable by providing a parameterized
algorithm that applies an interesting dynamic programming technique for solving the
problem.
