Gnumap

From SEQwiki

Application data

Principal bioinformatics method(s): Read mapping
Technology: Illumina
Created at: Brigham Young University
Maintained: Yes
Input format(s): FASTQ
Programming language(s): C++

Summary: The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically Solexa/Illumina machines) back to a genome of any size. Currently, gnumap is designed to be used with the _int.txt data produced by the Solexa/Illumina machine.


With the emergence of high-throughput next-generation sequencing machines, an enormous amount of data is being produced at a very high rate. A central challenge is mapping this data back to the genome. One significant problem with many genomic mapping programs is the way they handle duplicate regions in genomic DNA. Since it is impossible to know exactly where a read from a duplicate region should be mapped, many programs simply discard these sequences. Often, this results in a loss of nearly 40% of the data.

This project develops GNUMAP, a program capable of handling such repetitive regions. By using the posterior probability of mapping a given read to a specific genomic location, we are able to account for these repetitive reads by distributing them across several regions in the genome. In addition, the output of the program is created in such a way that it can be easily viewed through other free and readily available programs. Several benchmark data sets were created with spiked-in duplicate regions, and GNUMAP was able to more accurately account for these duplicate regions.
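
The idea of distributing a read across several locations by posterior probability can be illustrated with a minimal Python sketch. This is not GNUMAP's actual implementation: the function names, the per-location likelihood values, and the uniform prior over candidate locations are all assumptions made for illustration.

```python
from collections import defaultdict


def posterior_weights(likelihoods):
    """Turn per-location alignment likelihoods into posterior probabilities.

    Assumes a uniform prior over candidate locations (an assumption of this
    sketch), so the posterior is simply the normalized likelihood.
    """
    total = sum(likelihoods.values())
    if total == 0:
        return {}
    return {loc: lik / total for loc, lik in likelihoods.items()}


def distribute_reads(reads):
    """Accumulate fractional read counts per genomic location.

    Each read contributes a total weight of 1, split across its candidate
    locations in proportion to the posterior, instead of being discarded.
    """
    coverage = defaultdict(float)
    for likelihoods in reads:
        for loc, weight in posterior_weights(likelihoods).items():
            coverage[loc] += weight
    return dict(coverage)


# Hypothetical example data: likelihoods keyed by (chromosome, position).
reads = [
    {("chr1", 100): 0.9},                      # read with a single candidate location
    {("chr1", 500): 0.4, ("chr2", 700): 0.4},  # read from a duplicated region
]
coverage = distribute_reads(reads)
```

In this sketch the second read matches two locations equally well, so each location receives half of its weight, while the first read contributes its full weight to its single location.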

References

  1. . 2009. Bioinformatics

