How-to/SNP detection
SNP detection
Decision Helper
I want to quickly call SNPs against a reference => Freebayes, samtools
Software Packages
Free Software
Freebayes
Freebayes is the successor of PolyBayes, GigaBayes and BamBayes and should be much faster than these. Like them, it relies on BAM files. It has also been described in more detail by its developer on Biostar.
- Pros
- very easy to run for simple SNP calling
- does not assume any ploidy
- can read BAM files via STDIN
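As a rough illustration of the points above, a simple Freebayes run could be driven from Python along these lines. This is a minimal sketch, not an official recipe: the helper names `freebayes_cmd` and `call_snps` and the file names are hypothetical, and it assumes a `freebayes` binary on the PATH that accepts `-f` (reference FASTA), `-p` (ploidy) and `--stdin` (read BAM from standard input). Check these options against the installed version before relying on them.

```python
import subprocess


def freebayes_cmd(reference, bam=None, ploidy=None):
    """Build a freebayes argument list (hypothetical helper, flags assumed)."""
    cmd = ["freebayes", "-f", reference]
    if ploidy is not None:
        # Only pass a ploidy when the caller asks for one.
        cmd += ["-p", str(ploidy)]
    # With no BAM path given, ask freebayes to read the alignment from STDIN.
    cmd.append(bam if bam is not None else "--stdin")
    return cmd


def call_snps(reference, bam, vcf_out, ploidy=None):
    """Run freebayes on a BAM file and write its VCF output to vcf_out."""
    with open(vcf_out, "w") as out:
        subprocess.run(freebayes_cmd(reference, bam, ploidy),
                       stdout=out, check=True)
```

A call such as `call_snps("ref.fa", "aln.bam", "var.vcf")` would then write the variant calls to `var.vcf`, assuming the BAM file is sorted and the reference matches the one used for alignment.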
GATK
The Genome Analysis Toolkit (GATK) supports variant-calling pipelines made up of multiple steps. The authors used their pipeline for variant calling on the NA12878 exome data set and compared their results to those of Crossbow (which uses SOAPsnp). Based on these results they concluded that Crossbow had a lower specificity.
One easy way to run GATK and other tools might be to use this variant pipeline mentioned on Biostar.
- Pros
- likely relatively specific
- Cons
- relatively complex pipelines
MAQ
samtools
samtools can call SNPs using the pileup or mpileup pipeline: http://samtools.sourceforge.net/mpileup.shtml
This thread describes some potential problems that occur when using the BAQ parameter. In effect, it recommends turning BAQ off if one uses an aligner, e.g. BWA, that finds indels: http://seqanswers.com/forums/showthread.php?t=11965
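The mpileup step with BAQ switched off could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names `mpileup_cmd` and `pileup_to_caller` and all file names are hypothetical, and it assumes a `samtools` version whose `mpileup` accepts `-u` (uncompressed BCF output), `-f` (reference FASTA) and `-B` (disable BAQ). Option sets differ between samtools releases, so the downstream caller command is left to the user rather than hard-coded.

```python
import subprocess


def mpileup_cmd(reference, bams, use_baq=True):
    """Build a samtools mpileup argument list (hypothetical helper)."""
    cmd = ["samtools", "mpileup", "-u", "-f", reference]
    if not use_baq:
        # -B is assumed to disable the BAQ computation.
        cmd.append("-B")
    return cmd + list(bams)


def pileup_to_caller(reference, bams, caller_cmd, out_path, use_baq=False):
    """Pipe mpileup output into a user-supplied caller command."""
    with open(out_path, "wb") as out:
        pileup = subprocess.Popen(mpileup_cmd(reference, bams, use_baq),
                                  stdout=subprocess.PIPE)
        caller = subprocess.Popen(caller_cmd, stdin=pileup.stdout, stdout=out)
        # Close our copy so mpileup sees a broken pipe if the caller exits.
        pileup.stdout.close()
        caller.wait()
        pileup.wait()
    if pileup.returncode or caller.returncode:
        raise RuntimeError("mpileup pipeline failed")
```

Here `caller_cmd` would be whatever consumes the pileup in the installed toolchain, for example a `bcftools` invocation given as an argument list; its exact options depend on the bcftools version and are not assumed here.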
ssahaSNP
ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring k-mer words with high occurrence numbers. It is tuned more for ABI Sanger reads. The developers are Adam Spargo and Zemin Ning from the Sanger Centre. Platforms: Compaq Alpha, Linux-64, Linux-32, Solaris and Mac.