20565853

This reference describes NGS-DesignTools.

PMID	20565853
Title	Designing deep sequencing experiments: structural variation, haplotype assembly, and transcript abundance.
Year	2010
Journal	BMC Genomics
Author	Bashir A, Bansal V, Bafna V.
Volume	11
Start page	385

Full text description

BACKGROUND: Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, poses a number of design challenges regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance.

RESULTS: For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200bp and 2Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect lowly expressed genes with greater than 95% probability.

CONCLUSIONS: Together, our results form a generic framework for many design considerations related to high-throughput sequencing. We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
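
To make the "greater than 95% probability" depth question concrete, the following is a minimal sketch under a deliberately simple model. It assumes every read is an independent draw and that a transcript counts as detected once at least one read comes from it. This is not the corrected estimator derived in the paper and not part of NGS-DesignTools; the function names and the example abundance are hypothetical.

```python
import math


def detection_probability(p: float, n_reads: int) -> float:
    """Probability that at least one of n_reads reads comes from a
    transcript that contributes a fraction p of all reads.

    Assumption: reads are independent draws, so P = 1 - (1 - p)**n_reads.
    """
    # expm1/log1p keep precision when p is very small.
    return -math.expm1(n_reads * math.log1p(-p))


def reads_needed(p: float, target: float = 0.95) -> int:
    """Smallest read count whose detection probability reaches target."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)**n >= target for n.
    n = math.ceil(math.log(1.0 - target) / math.log1p(-p))
    # Guard against floating-point rounding at the boundary.
    while detection_probability(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical transcript contributing one read in a million.
    depth = reads_needed(1e-6, target=0.95)
    print(f"reads needed: {depth}")
    print(f"detection probability: {detection_probability(1e-6, depth):.4f}")
```

Under this model a transcript at one read per million needs roughly three million reads for 95% detection. In practice the abundance p is unknown and must be estimated from pilot data, which is the problem the paper's corrections address.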
