20147302

This reference describes GSNAP.

PMID 20147302
Title Fast and SNP-tolerant detection of complex variants and splicing in short reads
Year 2010
Journal Bioinformatics
Author Wu TD, Nacu S
Volume
Start page


Full text description

Motivation: Next-generation sequencing captures sequence differences in reads relative to a reference genome or transcriptome, including splicing events and complex variants involving multiple mismatches and long indels. We present computational methods for fast detection of complex variants and splicing in short reads, based on a successively constrained search process of merging and filtering position lists from a genomic index. Our implementation GSNAP can align both single-end and paired-end reads as short as 14 nt and of arbitrarily long length. It can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads using probabilistic models or a database of known splice sites. Our program also permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite treated DNA for the study of methylation state.
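The phrase "merging and filtering position lists from a genomic index" can be illustrated with a toy seed-and-vote search. This is only a minimal sketch of the general idea, not GSNAP's actual algorithm or data structures: the function names (`build_index`, `candidate_starts`), the k-mer size, the use of non-overlapping k-mers, and the `min_hits` threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def build_index(genome, k):
    """Map every k-mer in `genome` to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index

def candidate_starts(read, index, k, min_hits):
    """Merge the position lists of the read's non-overlapping k-mers.

    A hit at genome position p for the k-mer at read offset o votes for
    read start p - o. Starts with fewer than `min_hits` votes are filtered out.
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, k):
        for pos in index.get(read[offset:offset + k], ()):
            votes[pos - offset] += 1
    return sorted(start for start, n in votes.items() if n >= min_hits)

# Hypothetical toy data: the read is genome[8:24] with one mismatch at read offset 8.
genome = "ACGTACGTTAGCCGATAGGCTTAACGGATCCA"
read = "TAGCCGATTGGCTTAA"
index = build_index(genome, k=4)
print(candidate_starts(read, index, k=4, min_hits=3))  # prints [8]
```

Three of the four k-mers agree on start position 8, so the candidate survives the filter even though the k-mer covering the mismatch finds no hit. Raising `min_hits` to 4 would demand an exact match and return no candidates.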

Results: In comparison testing, GSNAP has speeds comparable to existing programs, especially in reads of 70 nucleotides or more, and is fastest in detecting complex variants with 4 or more mismatches or insertions of 1–9 nucleotides and deletions of 1–30 nucleotides. Although SNP tolerance does not increase alignment yield substantially, it affects alignment results in 7–8% of transcriptional reads, typically by revealing alternate genomic mappings for a read. Simulations of bisulfite-converted DNA show a decrease in identifying genomic positions uniquely in 6% of 36-nt reads and 3% of 70-nt reads.
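The loss of unique mappings after bisulfite conversion can be seen in a small in-silico example. The sketch below assumes full conversion of unmethylated cytosines (every C read as T), considers the forward strand only, and uses exact matching. The sequences and the helper names `bisulfite_convert` and `count_exact_hits` are hypothetical and are not taken from the paper's simulations.

```python
def bisulfite_convert(seq):
    """Model full conversion of unmethylated cytosines: every C is read as T."""
    return seq.replace("C", "T")

def count_exact_hits(read, genome):
    """Count start positions where `read` matches `genome` exactly."""
    return sum(genome.startswith(read, i)
               for i in range(len(genome) - len(read) + 1))

# Hypothetical toy sequences chosen so that two distinct loci collapse after conversion.
genome = "GACCAGTTGATTAGAA"
read = "GACCAG"

print(count_exact_hits(read, genome))  # prints 1
print(count_exact_hits(bisulfite_convert(read), bisulfite_convert(genome)))  # prints 2
```

Reducing the alphabet from four letters to three makes the loci at positions 0 and 8 indistinguishable, so a read that mapped uniquely before conversion now has two equally good placements.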

Availability: Source code in C and utility programs in Perl are freely available for download as part of the GMAP package at http://share.gene.com/gmap.