Short description:
Fast, accurate, memory-efficient aligner for short and long sequencing reads
Software version:
Biological application domain(s) (Phylogenetics, Genomics, ...):
Mapping
Principal bioinformatics method(s) (Assembly, Mapping, ...):
Read mapping
Technology (Sanger, Illumina, 454, SOLiD, Ion Torrent, ...):
Sanger, Illumina, 454, ABI SOLiD
Interface (Command line, Web UI, Desktop GUI, SOAP WS, HTTP WS, API, QL):
Command line
Resource type (Command-line tool, Web application, Desktop application, Script, Suite, Workbench, Database portal, Workflow, Plug-in, Library, Web API, Web service, SPARQL endpoint):
Sanger Institute
BWA (Burrows-Wheeler Aligner) is an aligner that uses the Burrows-Wheeler transform to index the reference genome, which decreases memory usage compared to aligners that use k-mer hashing. BWA includes two read alignment algorithms. The first is usually what is meant when simply the "BWA algorithm" is mentioned; it is invoked with the command <tt>bwa aln</tt>. The second algorithm, "BWA-SW", is invoked with the command <tt>bwa bwasw</tt> and is described in its own article, [[BWA-SW]]. Both algorithms use the same on-disk index, which is created with <tt>bwa index</tt>.

= Implementation notes =
The following notes were obtained by inspecting the source code.

== Ambiguous bases in reference sequences ==
According to the BWA paper, "Non-A/C/G/T bases on the reference genome are converted to random nucleotides." BWA uses a '''fixed seed''' for the random number generator (the seed is set to the value 11 in bntseq.c). This means that running <tt>bwa index</tt> twice on the same FASTA file results in the same index.

== The "XT:A" tag ==
The value "N" stands for <tt>BWA_TYPE_NO_MATCH</tt> (bwtaln.h). If the number of ambiguous bases in the reference (which is stored in the "XN:i" tag) is greater than 10, this tag is also set to "N".

== The NM and CM tags ==
The two tags are mutually exclusive: if "-c" was given on the command line, CM is written; otherwise NM is written.

== The XC:i tag ==
The XC:i tag is output when the clipped length of a read is less than the full read length.

== The XO and XG tags ==
The documentation says that XG is the number of gap extensions, but the source code (bwase.c) seems to indicate that XG is the total number of gaps (opens plus extensions):

 printf("\tXM:i:%d\tXO:i:%d\tXG:i:%d", p->n_mm, p->n_gapo, p->n_gapo+p->n_gape);

== Regular index vs. color space index ==
A color space index (created with the -c option to <tt>bwa index</tt>) and a regular index '''cannot coexist''' in the same directory unless different prefixes are chosen (with the -p option).
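Because two indexes with the same prefix would collide, it can help to check whether a prefix is already in use before building a second index. The following is a minimal sketch, not part of BWA. The function names are hypothetical, and the suffix list is an assumption about which files <tt>bwa index</tt> writes; the exact set may differ between BWA versions.

```python
import os

# Assumption: "bwa index" writes files named <prefix> plus one of these
# suffixes. The exact set may differ between BWA versions.
ASSUMED_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def existing_index_files(directory, prefix):
    """Return the sorted names of files in `directory` that look like
    index files for `prefix` (prefix plus one of the assumed suffixes)."""
    candidates = {prefix + suffix for suffix in ASSUMED_INDEX_SUFFIXES}
    return sorted(name for name in os.listdir(directory) if name in candidates)


def prefix_is_free(directory, prefix):
    """Return True if no assumed index file for `prefix` exists in `directory`."""
    return not existing_index_files(directory, prefix)
```

If <tt>prefix_is_free</tt> returns False for the prefix you intend to use, pass a different prefix to <tt>bwa index</tt> with the -p option.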
== Length of contig names ==
Contig names must not be longer than 1024 characters. If a name is longer, no error message is printed, but mapping does not work.

== Unapplied Patches ==
* [http://sourceforge.net/mailarchive/message.php?msg_id=26175347 SOLiD paired-end patch] plus [http://sourceforge.net/mailarchive/message.php?msg_id=27685167 its correction]
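Since BWA reports no error for overlong contig names (see "Length of contig names" above), a reference can be checked before indexing. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the contig name is taken to be the header text after ">" up to the first whitespace, which may not match exactly how BWA measures the name.

```python
MAX_NAME_LENGTH = 1024  # limit stated in the note on contig names


def overlong_contig_names(fasta_lines, limit=MAX_NAME_LENGTH):
    """Yield (line_number, name_length) for each FASTA header whose name
    is longer than `limit` characters.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    for line_number, line in enumerate(fasta_lines, start=1):
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if len(name) > limit:
                yield line_number, len(name)
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example <tt>list(overlong_contig_names(open("ref.fa")))</tt>, where <tt>ref.fa</tt> is a placeholder file name.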