QuadGT

From SEQwiki

Application data

Created by: Miklós Csűrös & Eric Bareke
Biological application domain(s): SNPs
Principal bioinformatics method(s): SNP calling, Variant calling
Technology: Any
Created at: DIRO & Sainte-Justine UHC Research Centre, University of Montreal
Maintained? Yes
Input format(s): SAM
Output format(s): VCF
Programming language(s): Java

Summary: QuadGT is a software package for calling single-nucleotide variants in four sequenced genomes: normal-tumor pairs coupled with parents. Genotypes are inferred using a joint model of parental variant frequencies, de novo germline mutations, and somatic mutations. The model quantifies the descent-by-modification relationships between the unknown genotypes by using a set of parameters in a Bayesian inference setting.

Description

QuadGT is a software package for calling single-nucleotide variants in four sequenced genomes: normal-tumor pairs coupled with parents. Genotypes are inferred using a joint model of parental variant frequencies, de novo germline mutations, and somatic mutations. The model quantifies the descent-by-modification relationships between the unknown genotypes by using a set of parameters in a Bayesian inference setting.
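To make the idea of Bayesian genotype inference concrete, here is a minimal sketch for a single sample at a single locus. It is not QuadGT's joint four-genome model: the function names, the uniform prior, the fixed per-read error rate, and the example pileup are all assumptions chosen for illustration.

```python
import math
from itertools import combinations_with_replacement

ALLELES = "ACGT"
# The 10 unordered diploid genotypes: AA, AC, AG, ..., TT.
GENOTYPES = ["".join(g) for g in combinations_with_replacement(ALLELES, 2)]


def read_likelihood(base, genotype, error):
    """P(observed base | genotype) under a simple, assumed read model.

    Each read samples one of the two alleles with probability 1/2 and is
    then miscalled with probability `error`, uniformly to another base.
    """
    total = 0.0
    for allele in genotype:
        p = 1.0 - error if base == allele else error / 3.0
        total += 0.5 * p
    return total


def genotype_posterior(bases, prior, error=0.01):
    """Posterior over diploid genotypes given a pileup of observed bases."""
    log_post = {}
    for g in GENOTYPES:
        ll = math.log(prior[g])
        for b in bases:
            ll += math.log(read_likelihood(b, g, error))
        log_post[g] = ll
    # Normalise in log space for numerical stability.
    top = max(log_post.values())
    weights = {g: math.exp(v - top) for g, v in log_post.items()}
    z = sum(weights.values())
    return {g: w / z for g, w in weights.items()}


# Hypothetical pileup: five reads showing A and three showing G.
uniform_prior = {g: 1.0 / len(GENOTYPES) for g in GENOTYPES}
posterior = genotype_posterior("AAAGAGGA", uniform_prior)
best = max(posterior, key=posterior.get)
```

In a joint model such as the one described above, the uniform prior would be replaced by terms that tie the genotypes of related samples together (parental allele frequencies, germline mutation, somatic mutation).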

Note that you can use QuadGT on any subset of the four related genomes, including parent-offspring trios, and normal-tumor pairs without parental samples.

The software package assumes a thorough probabilistic model of single-nucleotide variants with the following notable features.

Each locus has four possible alleles (A, C, G, T).

Point mutations between related genomes assume standard DNA evolution models. The implemented models include the basic Jukes-Cantor model with a single parameter (the expected number of substitutions per locus and haplotype), and the more parameter-rich Hasegawa-Kishino-Yano (HKY) model (a.k.a. Felsenstein's F84) with purine-pyrimidine balance %(A+G)=%(C+T)=50%, which is automatically satisfied under Chargaff's rules of %A=%T and %C=%G. Four parameters of the HKY model adjust sequence divergence, transition/transversion ratio, and nucleotide composition (GC-content and the amino-keto %(A+C)-%(T+G) ratio).

Parental allele frequencies can be set from a database of known variants such as the University of Washington's Exome Variant Server.

Diploid parental genotypes have an adjustable heterozygous/homozygous ratio (respecting multi-locus alleles).

Various inheritance models span autosomes, sex chromosomes, and mitochondrial DNA.

Tumor purity is considered explicitly, and is estimated from tumor and normal reads.

Basecall quality scores are considered explicitly, and can be automatically recalibrated for mapping to error probabilities.
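Some of the features above rest on standard formulas, sketched here in Python. The Jukes-Cantor substitution probability and the Phred-scale conversion are textbook definitions. The purity mixing function is a hypothetical simplification (it assumes tumor and normal cells contribute reads in proportion to purity, with equal ploidy) and is not taken from QuadGT. All function names are illustrative.

```python
import math

BASES = "ACGT"


def jc69_probability(src, dst, d):
    """Jukes-Cantor probability that allele `src` is seen as `dst`
    after an expected `d` substitutions per locus."""
    decay = math.exp(-4.0 * d / 3.0)
    if src == dst:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def phred_to_error(q):
    """Standard Phred scale: Q = -10 * log10(p_error)."""
    return 10.0 ** (-q / 10.0)


def expected_tumor_allele_fraction(purity, tumor_fraction, normal_fraction):
    """Assumed mixing model: reads from a tumor sample come from tumor
    cells with probability `purity` and from normal cells otherwise."""
    return purity * tumor_fraction + (1.0 - purity) * normal_fraction


# Each row of the Jukes-Cantor transition matrix sums to 1.
row_sum = sum(jc69_probability("A", b, 1e-3) for b in BASES)

# A heterozygous somatic variant (fraction 0.5 in tumor cells, absent in
# normal cells) in a sample of 60% purity.
mixed = expected_tumor_allele_fraction(0.6, 0.5, 0.0)
```

Recalibration, as mentioned in the last feature, would replace the fixed Phred formula with an empirically fitted mapping from reported quality to error probability.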


References

none specified

