Difference between revisions of "How-to/short read aligners"

From SEQwiki
Line 3:
  Short read mapping refers to the process of alignment of sequencing reads (a.k.a. reads) onto the reference sequence.  The reference sequence is often pre-processed into an indexed form for rapid searching.
  For a technical overview of mapping algorithm, please see reference ([http://www.nature.com/nmeth/journal/v7/n6/full/nmeth0610-479b.html 1]). An updated (as of April 2011) evaluation of short read aligners is available at ([http://www.nature.com/jhg/journal/v56/n6/pdf/jhg201143a.pdf 2]) and one (from August 2011) also targeting SNP discovery ([http://www.nature.com/srep/2011/110805/srep00055/full/srep00055.html 3])
+ In addition here are some updated ROC curves including bowtie2 [http://lh3lh3.users.sourceforge.net/alnROC.shtml here]
  =Decision Helper=

Revision as of 20:51, 2 November 2011

Short read mapping

Short read mapping is the process of aligning sequencing reads (a.k.a. reads) onto a reference sequence. The reference sequence is often pre-processed into an indexed form for rapid searching. For a technical overview of mapping algorithms, see reference (1). An updated (as of April 2011) evaluation of short read aligners is available in (2), and another (from August 2011) that also targets SNP discovery is available in (3). In addition, updated ROC curves that include bowtie2 are available at http://lh3lh3.users.sourceforge.net/alnROC.shtml
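
To make the idea of an indexed reference concrete, here is a minimal Python sketch of seed-and-verify mapping with a k-mer hash index. It is a toy illustration only and is not the method of any aligner listed on this page (BWA and Bowtie, for example, use a compressed Burrows-Wheeler index instead of a hash table). The function names, the seed length k=4 and the example sequences are all assumptions made up for this sketch.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the 0-based positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return sorted 0-based positions where the read aligns without gaps."""
    hits = set()
    # Use the first k bases of the read as the seed (hypothetical design choice).
    for pos in index.get(read[:k], []):
        candidate = reference[pos:pos + len(read)]
        if len(candidate) < len(read):
            continue  # the read would run off the end of the reference
        mismatches = sum(1 for a, b in zip(read, candidate) if a != b)
        if mismatches <= max_mismatches:
            hits.add(pos)
    return sorted(hits)


# Made-up toy data, not real sequence.
reference = "ACGTACGTTAGCTAGGCTA"
index = build_kmer_index(reference, k=4)
print(map_read("TAGCTAGG", reference, index, k=4))
```

The index is built once and then reused for every read, which is why pre-processing pays off when millions of reads are mapped against the same reference. Real aligners also handle indels, reverse-complement strands and base qualities, none of which this sketch attempts.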

Decision Helper

This is based on personal experience, on how widely each tool is used, and on published performance data. It is only meant to give you a quick primer.

  • Genome data
    • If only speed matters, use Bowtie.
    • BWA is a bit slower than Bowtie but more sensitive.
    • If sensitivity and specificity are needed, try Stampy, Novoalign or SHRiMP 2.

Software Packages

Free Software

BWA
Compatible with Illumina, SOLiD and 454 data

  • Pros
    • The SAM/BAM output adheres to the SAM format, contains both mapped and unmapped reads, and is easy to parse
  • Cons
    • Not fully threaded: sampe and samse can only use 1 CPU. bwasw (for longer 454 reads) can be fully threaded, though
    • Not as sensitive as Stampy and Novoalign
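
As an illustration of how easy SAM output is to parse, here is a minimal Python sketch that counts mapped and unmapped reads. It relies only on the SAM format itself: tab-separated fields, header lines starting with "@", and the 0x4 bit of the FLAG field marking an unmapped read. The function names and the example records are made up for this sketch and are not real BWA output.

```python
SAM_UNMAPPED_FLAG = 0x4  # bit set in FLAG when the read is unmapped


def parse_sam_line(line):
    """Split one SAM alignment line into a dict of its core fields."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    return {
        "qname": fields[0],
        "flag": flag,
        "rname": fields[2],
        "pos": int(fields[3]),   # 1-based leftmost position, 0 if unavailable
        "mapq": int(fields[4]),  # 255 means mapping quality is not available
        "cigar": fields[5],
        "mapped": not (flag & SAM_UNMAPPED_FLAG),
    }


def count_mapped(lines):
    """Return (mapped, unmapped) counts, skipping '@' header lines."""
    mapped = unmapped = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        if parse_sam_line(line)["mapped"]:
            mapped += 1
        else:
            unmapped += 1
    return mapped, unmapped


# Made-up records for illustration only.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t101\t37\t8M\t*\t0\t0\tTAGCTAGG\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
]
print(count_mapped(example))
```

For real work a dedicated tool or library is the safer choice, since this sketch ignores optional tags and the binary BAM format entirely.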

Bowtie
Compatible with Illumina and SOLiD data. Bowtie is discussed in the forum.

  • Pros
    • Fast
  • Cons
    • No mapping quality reported
    • Not as sensitive as Stampy and Novoalign

Stampy
Compatible with Illumina data

  • Pros
    • Balance of speed and sensitivity
  • Cons
    • Can be slow even when using BWA as a premapper

SHRiMP2

  • Pros
    • Higher sensitivity than BWA
    • One-step mapping; indexing of the genome is not needed
    • Alignment can take less time than with BWA if the reference sequence is short, e.g. when mapping reads against a targeted region
  • Cons
    • Alignment is slow if mapping is done onto a large genome

TMAP
Aligner specifically tuned for Ion Torrent PGM data

  • Pros
    • Uses a selection of algorithms to balance speed and sensitivity
  • Cons

Commercial Software

CLC workstation

  • Pros
    • GUI, easy to use
  • Cons
    • Expensive
    • Produced spurious alignments on our dataset
    • Alignment speed is not impressive compared to BWA or Bowtie (tested on an i7 860 with 16GB memory, Windows 2008 R2 64-bit)

Further Reading Material and References

  • Comparisons

Appendix

A list of short read aligners

  • Illumina: BWA, SHRiMP2, Bowtie, Stampy, Novoalign
  • 454: GSMapper, SSAHA2, BLAT, Mosaik, BWA-SW
  • SOLiD: Bfast, BWA, NovoalignCS


(1) Brief review of alignment algorithm - Alignment section

(2) Evaluation of next-generation sequencing software in mapping and assembly

(3) Evaluation of short read aligners that also targets SNP discovery (August 2011): http://www.nature.com/srep/2011/110805/srep00055/full/srep00055.html

A list of old short read aligners