INFOSTIP: INdividuals FOr Sequencing by Total Information Potential

About

INFOSTIP is a tool for the informed selection of individuals to sequence from a population. It takes as input a collection of shared regions for each pair of samples (in the form of .match files) and a sequencing budget. It is implemented in C++.

Installation steps

1. Download the source code.
2. Extract the archive using the command "tar -zxvf ImputationTool.tar.gz".
3. Run the command "make install".

Usage
Command: "./infostip <Location-of-match-files> <Match-file-prefix> <No-of-chromosomes> <Sequencing-budget> > <Output-file>"
Optional Input: "<No-of-SNPs>"

Options
1. Location-of-match-files: Path to the folder containing the .match files (e.g. /Path_to_folder/).
2. Match-file-prefix: Match files should be named in the format prefix.chromosome_no.match (e.g. ABC.1.match). Here the prefix is "ABC." and chromosome_no is "1". Enter the prefix including its trailing dot (here "ABC.") on the command line.
3. No-of-chromosomes: Number of chromosomes on which you want to run the method.
4. Sequencing-budget: Number of individuals you want to select from the population to sequence.
5. Optional Input - No-of-SNPs: Specify this parameter to consider only shared regions that contain more SNPs than the value given here.
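
The naming convention in options 1 to 3 can be illustrated with a short C++ sketch. It is not taken from the INFOSTIP source: the helper name matchFilePaths is hypothetical, and the sketch assumes that chromosomes are numbered 1 to No-of-chromosomes and that the folder path ends with a slash.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of INFOSTIP): lists the file names implied by
// the prefix.chromosome_no.match convention.
// Assumptions: chromosomes are numbered 1..numChromosomes, `dir` ends with '/',
// and `prefix` already carries its trailing dot (e.g. "ABC.").
std::vector<std::string> matchFilePaths(const std::string& dir,
                                        const std::string& prefix,
                                        int numChromosomes) {
    assert(numChromosomes >= 0);
    std::vector<std::string> paths;
    for (int chr = 1; chr <= numChromosomes; ++chr) {
        paths.push_back(dir + prefix + std::to_string(chr) + ".match");
    }
    return paths;
}
```

Under these assumptions, matchFilePaths("/Path_to_folder/", "ABC.", 2) yields /Path_to_folder/ABC.1.match and /Path_to_folder/ABC.2.match.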

Output
1. Individual Picked: Individual ID of the individual picked.
2. Utility: Total length (in bp) of the shared regions that the individual shares with all unsequenced individuals across all chromosomes.
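
One possible reading of the Utility value is sketched below in C++. This is an illustration, not the INFOSTIP implementation: the SharedSegment type and the utility function are hypothetical, and the sketch assumes that segment length is simply end minus start in bp and that overlapping segments are summed without merging.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical minimal view of one shared segment between two individuals.
struct SharedSegment {
    std::string id1;
    std::string id2;
    long startBp;
    long endBp;
};

// Sums the bp length of every segment that `candidate` shares with an
// individual that is not yet in `sequenced`.
// Assumption: overlapping segments are counted independently (no merging).
long utility(const std::string& candidate,
             const std::vector<SharedSegment>& segments,
             const std::set<std::string>& sequenced) {
    long total = 0;
    for (const SharedSegment& s : segments) {
        assert(s.endBp >= s.startBp);
        const std::string* other = nullptr;
        if (s.id1 == candidate) {
            other = &s.id2;
        } else if (s.id2 == candidate) {
            other = &s.id1;
        }
        // Skip segments that do not involve the candidate, self-matches,
        // and segments shared with an already sequenced individual.
        if (other == nullptr || *other == candidate ||
            sequenced.count(*other) > 0) {
            continue;
        }
        total += s.endBp - s.startBp;
    }
    return total;
}
```

Segments from all chromosomes would be passed in a single vector, so the sum runs across chromosomes as the Utility description requires.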

Matchfile Format

Each line in the .match file represents a shared segment for a pair of individuals with the following fields:

1. Family ID 1
2. Individual ID 1
3. Family ID 2
4. Individual ID 2
5. Chromosome
6. Segment start (bp)
7. Segment end (bp)
8. Segment start (SNP)
9. Segment end (SNP)
10. Total SNPs in segment
11. Genetic length of segment
12. Units for genetic length (cM or MB)
13. Mismatching SNPs in segment
14. 1 if Individual 1 is homozygous in match; 0 otherwise
15. 1 if Individual 2 is homozygous in match; 0 otherwise
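
The 15 fields above can be read into a simple record, as in the following C++ sketch. It is not the parser used by INFOSTIP: the MatchSegment type and parseMatchLine function are hypothetical, and the sketch assumes that fields are separated by whitespace, that IDs contain no whitespace, and that the SNP start and end fields are best kept as text.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record holding the 15 fields of one .match line, in file order.
struct MatchSegment {
    std::string familyId1, individualId1;
    std::string familyId2, individualId2;
    std::string chromosome;
    long startBp = 0;
    long endBp = 0;
    std::string startSnp, endSnp;   // kept as text (assumption)
    long totalSnps = 0;
    double geneticLength = 0.0;
    std::string lengthUnits;        // "cM" or "MB"
    long mismatchingSnps = 0;
    int homozygous1 = 0;
    int homozygous2 = 0;
};

// Parses one whitespace-separated line; returns false if any field is
// missing or cannot be converted to its expected type.
bool parseMatchLine(const std::string& line, MatchSegment& seg) {
    assert(line.find('\n') == std::string::npos);  // expects a single line
    std::istringstream in(line);
    in >> seg.familyId1 >> seg.individualId1
       >> seg.familyId2 >> seg.individualId2
       >> seg.chromosome
       >> seg.startBp >> seg.endBp
       >> seg.startSnp >> seg.endSnp
       >> seg.totalSnps
       >> seg.geneticLength >> seg.lengthUnits
       >> seg.mismatchingSnps
       >> seg.homozygous1 >> seg.homozygous2;
    return !in.fail();
}
```

A caller could then apply the optional No-of-SNPs threshold by keeping only records whose totalSnps exceeds the chosen value.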

Match files can be generated using GERMLINE.