DESeq

From SEQwiki

Application data

Created by Anders S
Biological application domain(s) RNA-Seq quantification, ChIP-seq
Principal bioinformatics method(s) statistical testing, Sequencing quality control
Created at European Molecular Biology Laboratory
Maintained? Yes
Input format(s) table with count data
Output format(s) table
Programming language(s) R
Licence GPLv3
Operating system(s) UNIX, Windows, Mac OS X
Contact: sanders@fs.tum.de

Summary: DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression. The latest version is DESeq2 (released April 2013).

DESeq uses a model based on the negative binomial distribution and offers, in brief, the following features:

Count data is discrete and skewed and is hence not well approximated by a normal distribution. Thus, a test based on the negative binomial distribution, which can reflect these properties, has much higher power to detect differential expression.

Tests for differential expression between experimental conditions should take into account both technical and biological variability. Recently, several authors have claimed that the Poisson distribution can be used for this purpose. However, tests based on the Poisson assumption (including the binomial test and the chi-squared test) ignore the biological sampling variance and therefore yield overly optimistic p-values. The negative binomial distribution is a generalisation of the Poisson model that allows the biological variance to be modelled correctly.
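The overdispersion argument can be made concrete with a minimal sketch in Python (this is not DESeq code). It assumes the common parameterisation of the negative binomial by a mean mu and a dispersion alpha, for which the variance is mu + alpha * mu^2. The function names and example numbers are illustrative assumptions.

```python
def poisson_variance(mu):
    """Variance of a Poisson count with mean mu (sampling noise only)."""
    return mu


def nb_variance(mu, alpha):
    """Variance of a negative binomial count with mean mu and dispersion alpha.

    The extra term alpha * mu**2 stands for variability between biological
    replicates; alpha = 0 recovers the Poisson case.
    """
    return mu + alpha * mu ** 2


# Illustrative values only: a gene with mean count 1000 and dispersion 0.1.
mu, alpha = 1000.0, 0.1
print("Poisson variance:", poisson_variance(mu))
print("Negative binomial variance:", nb_variance(mu, alpha))
```

For strongly expressed genes the quadratic term dominates, so a Poisson-based test treats the counts as far less variable than they are under this model.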

In the two preceding points, DESeq is similar to earlier tools, especially edgeR. In addition, DESeq estimates the variance in a local fashion, using different coefficients of variation for different expression strengths. This removes potential selection biases in the hit list of differentially expressed genes and gives a more balanced and accurate result.
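To illustrate why the coefficient of variation depends on expression strength, here is a rough Python sketch. It is not DESeq's estimator: it uses a simple per-gene method-of-moments estimate, assumes the counts are already normalised for sequencing depth, and assumes the negative binomial relation var = mu + alpha * mu^2. The gene counts and function names are hypothetical.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments dispersion for one gene from replicate counts.

    Solves var = mu + alpha * mu**2 for alpha and clips the result at zero.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # unbiased sample variance
    return max((var - mu) / mu ** 2, 0.0)


def squared_cv(mu, alpha):
    """Squared coefficient of variation under the negative binomial model."""
    return 1.0 / mu + alpha


# Hypothetical replicate counts for a weakly and a strongly expressed gene.
low_gene = [8, 15, 5, 12]
high_gene = [900, 1300, 1100, 700]

for name, counts in (("low", low_gene), ("high", high_gene)):
    mu = mean(counts)
    alpha = moment_dispersion(counts)
    print(name, mu, round(alpha, 4), round(squared_cv(mu, alpha), 4))
```

The 1/mu term makes weakly expressed genes relatively noisier even when the dispersion is similar, which is one reason a single coefficient of variation for all genes can bias the hit list.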

DESeq's applicability is not limited to RNA-Seq. Rather, it may be used for many kinds of count data derived from high-throughput experiments.

Besides the differential testing functionality, DESeq offers two transformations for stabilizing the variance of count data: the variance stabilizing transformation (VST) and the regularized logarithm (rlog). These can be used for visualization and data exploration, for example to calculate sample-to-sample distances.
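The idea of sample-to-sample distances on transformed counts can be sketched in Python as follows. This sketch deliberately uses a plain shifted logarithm, log2(count + 1), as a crude stand-in; it is neither the VST nor the rlog. The sample names and counts are hypothetical.

```python
import math


def shifted_log(counts, pseudocount=1.0):
    """log2(count + pseudocount); a crude stand-in, not VST or rlog."""
    return [math.log2(c + pseudocount) for c in counts]


def euclidean_distance(a, b):
    """Euclidean distance between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same genes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical counts: one list per sample, same gene order in each.
samples = {
    "control_1": [10, 200, 3000, 0],
    "control_2": [12, 180, 3100, 1],
    "treated_1": [50, 190, 800, 0],
}

transformed = {name: shifted_log(c) for name, c in samples.items()}
names = list(transformed)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = euclidean_distance(transformed[first], transformed[second])
        print(first, second, round(d, 3))
```

With a plain logarithm, most of the distance between the two control samples comes from the gene with counts 0 and 1. Variance-stabilizing transformations are meant to dampen this kind of low-count noise.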

References

  1. . 2010. Genome Biology
  2. . 2014. Genome Biology

