HAPLORE: A Program for Haplotype Reconstruction in General
Pedigrees without Recombination
Kui Zhang, Fengzhu Sun, Hongyu Zhao
Programs
Our algorithms have been implemented in a C++ program. Pre-compiled executable
files are available for the following platforms:
- The executable file for the Linux operating system.
- The executable file for the Unix operating system.
- The executable file for the Windows operating system (Windows 95, NT, 2000, XP).
The manual (PDF format), which explains how to use this program, can also be downloaded.
Examples
HAPLORE can handle genotype data from pedigrees as well as from unrelated individuals.
We used the following data set with different options to
test our program, and you can use it to explore the program yourself. Below we list the test data set,
the command lines we used in the test, and the corresponding output files.
For the detailed format of the input file and the options supported by HAPLORE, please refer to the
manual (PDF format) of the program.
- A genotype file, Test-Data-Set.dat, which contains the genotype
data for 42 individuals at 10 marker loci: 38 individuals from 3 families and 4 unrelated
individuals.
- HAPLORE can infer haplotypes from pedigrees using a set of logic rules under the assumption
of no recombination. After this step, the partial or complete haplotypes of
each individual are determined and listed.
- The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-001.dat.
- The output file Test-Output-001.dat.
- HAPLORE can list all compatible haplotype configurations without recombinants using the
haplotype elimination algorithm. After this step, all compatible haplotype pairs for each individual
are determined and listed.
- The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-002.dat -h1.
- The output file Test-Output-002.dat.
- HAPLORE can try to find a minimum number of haplotypes that can resolve all individuals. The program
uses all haplotypes from those individuals who have a determined haplotype pair. The program often
fails if there are not enough determined haplotypes. If the program does not fail,
it outputs all compatible haplotype pairs for each individual.
- The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-003.dat -m1.
- HAPLORE can estimate haplotype frequencies and list all compatible haplotype configurations without recombinants
with their posterior probabilities using the EM algorithm and the partition-ligation technique.
The program provides several options to control the process. After this step, all compatible haplotype configurations
with their posterior probabilities are listed.
- The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-004.dat -e1.
- The output file Test-Output-004.dat.
- The command line: haplore_v242_win -iTest-Data_Set.dat -oTest-Output-005.dat -e1 -l1.
- The output file Test-Output-005.dat.
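To make the last two ideas above more concrete (listing compatible haplotype pairs and weighting them by posterior probability with the EM algorithm), here is a minimal Python sketch for the simplest case only: unrelated individuals with complete genotypes. It is not HAPLORE's code and does not reproduce its pedigree handling, its haplotype elimination algorithm, or the partition-ligation technique. The function names (compatible_pairs, em_frequencies, posteriors), the genotype representation, and the assumption of Hardy-Weinberg proportions are all choices made for this illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one unphased genotype.

    genotype: list of (allele, allele) tuples, one per marker locus.
    Illustrative representation only; this is not HAPLORE's input format.
    """
    pairs = set()
    # At every locus, try both ways of assigning the two alleles to the
    # two haplotypes; homozygous loci simply produce duplicates.
    for choice in product(*[((a, b), (b, a)) for a, b in genotype]):
        h1 = tuple(c[0] for c in choice)
        h2 = tuple(c[1] for c in choice)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def pair_weights(pairs, freq):
    """Unnormalized probability of each pair under Hardy-Weinberg proportions."""
    return [(1.0 if h1 == h2 else 2.0) * freq[h1] * freq[h2] for h1, h2 in pairs]


def posteriors(pairs, freq):
    """Posterior probability of each compatible pair given haplotype frequencies."""
    weights = pair_weights(pairs, freq)
    total = sum(weights)
    return {pair: w / total for pair, w in zip(pairs, weights)}


def em_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies with a plain EM algorithm.

    Assumes unrelated individuals and no missing data (assumptions of this
    sketch, not statements about HAPLORE).
    """
    configs = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in configs for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in configs:
            # E-step: spread each individual over its compatible pairs.
            for (h1, h2), p in posteriors(pairs, freq).items():
                counts[h1] += p
                counts[h2] += p
        # M-step: expected haplotype counts divided by the number of chromosomes.
        n_chromosomes = 2.0 * len(genotypes)
        freq = {h: counts[h] / n_chromosomes for h in haps}
    return freq, configs


if __name__ == "__main__":
    # Two fully resolved individuals and one double heterozygote (made-up data).
    data = [
        [(1, 1), (1, 1)],
        [(2, 2), (2, 2)],
        [(1, 2), (1, 2)],
    ]
    freq, configs = em_frequencies(data)
    for pair, prob in posteriors(configs[2], freq).items():
        print(pair, round(prob, 4))
```

In this toy data set the double heterozygote has two compatible pairs, and the EM iterations favor the pair built from the haplotypes already seen in the two resolved individuals. For pedigree data the compatible configurations must also satisfy the no-recombination constraints within each family, which this sketch does not attempt.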
Program History
- April 20, 2006: a new routine is added to find the haplotype configuration with the maximum posterior probability among all compatible haplotype configurations.
- April 12, 2006: the program is modified to allow the output of all compatible haplotype configurations with their corresponding
posterior probabilities and of the single haplotype configuration with the maximum posterior probability.
- April 10, 2006: new routines are added for the likelihood function calculation. The likelihood can be calculated
based on the Elston-Stewart algorithm and the array transformation technique, which potentially allows the program to handle
larger pedigrees.
- December 1, 2005: the program is modified to handle data with alleles that are not coded by consecutive integers.
- January 1, 2005: the paper is published and the program is formally released.
- May 15, 2002: the program is created and tested.
References
- Kui Zhang, Fengzhu Sun, Hongyu Zhao. 2005. HAPLORE: A Program for Haplotype Reconstruction in General Pedigrees without Recombination. Bioinformatics 21: 90-103.
- Kui Zhang, Hongyu Zhao. 2006. A Comparison of Several Methods for Haplotype Frequency Estimation and Haplotype Reconstruction for Tightly Linked Markers from General Pedigrees. Genetic Epidemiology 30: 423-437.
Old Programs:
If you would like to use the old program (which only deals with haplotype data) and browse the supplemental materials of our paper
in Proc. Natl. Acad. Sci. USA 99: 7335-7339, please go
here.
If you would like to use another old program that can deal with genotype data from unrelated individuals
but only has a few options for block partitioning and tag SNP selection, please go
here.
We are planning to update our program regularly. You are welcome to suggest features
that you would like us to implement in this program. We would greatly appreciate it if you could point out any bugs
you find when using our program. Our contact information is:
Created Date: March 20, 2003
Last Updated Date: May 01, 2009