Difference between revisions of "BWA"

From SEQwiki

Revision as of 18:51, 2 September 2010

Application data

Created by: Heng Li and Richard Durbin
Biological application domain(s): Read alignment, Mapping
Principal bioinformatics method(s): FM-Index
Technology: Sanger, Illumina, 454, ABI SOLiD
Created at: Sanger Institute
Maintained: Yes
Input format(s): compressed/uncompressed fastq/fasta
Output format(s): SAM
Software features: Gapped alignment, paired-end mapping
Programming language(s): C
Licence: GPLv3, MIT
Operating system(s): Unix

Summary: Fast, accurate, memory-efficient aligner for short and long sequencing reads


References

  1. . 2009. Bioinformatics



Notes

Ambiguous bases in reference sequences

According to the BWA paper, "Non-A/C/G/T bases on the reference genome are converted to random nucleotides."

BWA uses a fixed seed for the random number generator. This means that running bwa index twice on the same FASTA file will result in the same index.

(The seed is set to 11 in bntseq.c.)
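
The effect of a fixed seed can be illustrated with a minimal sketch. This is not BWA's code: BWA is written in C and uses its own random number calls in bntseq.c, whereas the sketch below uses Python's `random.Random`, so the actual replacement bases will differ from BWA's. The function name `mask_ambiguous` and the example sequence are hypothetical; only the seed value 11 is taken from the note above.

```python
import random

def mask_ambiguous(seq, seed=11):
    """Replace every non-A/C/G/T base with a pseudo-random nucleotide.

    Hypothetical helper for illustration only; it does not reproduce
    the exact bases BWA would choose.
    """
    rng = random.Random(seed)  # fixed seed: same replacements on every call
    out = []
    for base in seq.upper():   # note: lower-case input is upper-cased here
        out.append(base if base in "ACGT" else rng.choice("ACGT"))
    return "".join(out)

reference = "ACGTNNNNACGTRYACGT"   # made-up sequence with ambiguous bases
first = mask_ambiguous(reference)
second = mask_ambiguous(reference)
print(first == second)             # identical, because the seed is fixed
```

Because the generator is re-seeded with the same value on every call, the two results are identical, which is the property that makes repeated indexing of the same FASTA file reproducible.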

XT:A tag

An XT:A value of N stands for BWA_TYPE_NO_MATCH (bwtaln.h). If the number of ambiguous bases in the reference (XN:i tag) is greater than 10, this tag is also set to N.

NM, CM tags

If "-c" is given on the command line, the CM tag is written; otherwise the NM tag is written. The two tags are mutually exclusive.
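
Since only one of the two tags is present in a given record, code that reads BWA output may want to accept either. The following is a minimal sketch, assuming a tab-separated SAM record whose optional fields start at column 12 in TAG:TYPE:VALUE form; the function name `mismatch_tag` and the example record are hypothetical.

```python
def mismatch_tag(sam_line):
    """Return (tag, value) for NM or CM, whichever the record carries.

    Returns None if neither tag is present. Hypothetical helper for
    illustration; it assumes well-formed TAG:TYPE:VALUE optional fields.
    """
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:            # the first 11 columns are mandatory
        tag, _type, value = field.split(":", 2)
        if tag in ("NM", "CM"):
            return tag, int(value)
    return None

# Made-up record for demonstration purposes.
record = "read1\t0\tchr1\t100\t37\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tXT:A:U\tNM:i:1"
print(mismatch_tag(record))
```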

XC:i tag

The XC:i tag is output when the clipped length of a read is less than the full read length.

XO, XG tags

The documentation says that XG is the number of gap extensions. The source code, however, seems to indicate that XG is the total number of gaps (opens plus extensions), as in this line from bwase.c:

 printf("\tXM:i:%d\tXO:i:%d\tXG:i:%d", p->n_mm, p->n_gapo, p->n_gapo+p->n_gape);
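
If the quoted line reflects the BWA version that produced a given file, the number of gap extensions can be recovered as XG minus XO. The following is a minimal sketch under that assumption; the function name `gap_counts` and the example record are hypothetical, and the subtraction would be wrong for output in which XG already holds extensions only.

```python
def gap_counts(sam_line):
    """Return (gap_opens, gap_extensions) from the XO and XG tags.

    Assumes XG = opens + extensions, as the quoted printf suggests.
    Returns None if either tag is missing. Hypothetical helper.
    """
    tags = {}
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    if "XO" not in tags or "XG" not in tags:
        return None
    opens = int(tags["XO"])
    total = int(tags["XG"])
    return opens, total - opens

# Made-up record: one gap open, XG of 3, hence two extensions.
record = ("read1\t0\tchr1\t100\t37\t5M3D5M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
          "\tXT:A:U\tNM:i:3\tXM:i:0\tXO:i:1\tXG:i:3")
print(gap_counts(record))
```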
