Difference between revisions of "DESeq"

From SEQwiki
Line 1: Line 1:
 
{{Bioinformatics application
 
{{Bioinformatics application
 
|sw summary=DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression.
 
|sw summary=DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression.
|bio domain=RNA-Seq (and other high-throughput assays yielding count data)
+
|bio domain=RNA-Seq, other high-throughput assays yielding count data  
 
|bio method=statistical testing, noise estimation
 
|bio method=statistical testing, noise estimation
 
|bio tech=most
 
|bio tech=most
Line 10: Line 10:
 
|output format=table
 
|output format=table
 
|language=R
 
|language=R
|licence=GPLv3,  
+
|licence=GPLv3,
 
|os=UNIX and Windows, MacOS X
 
|os=UNIX and Windows, MacOS X
 
}}
 
}}
Line 22: Line 22:
  
 
DESeq's applicability is not limited to RNA-Seq. Rather, it may be used for many kinds of count data derived from high-throughput experiments.
 
DESeq's applicability is not limited to RNA-Seq. Rather, it may be used for many kinds of count data derived from high-throughput experiments.
 +
{{Links}}
 +
{{References}}
 +
{{Link box}}

Revision as of 15:30, 25 May 2010

Application data

Created by Anders S
Biological application domain(s) RNA-Seq, other high-throughput assays yielding count data
Principal bioinformatics method(s) statistical testing, noise estimation
Technology most
Created at European Molecular Biology Laboratory
Maintained? Yes
Input format(s) table with count data
Output format(s) table
Programming language(s) R
Licence GPLv3
Operating system(s) UNIX and Windows, MacOS X

Summary: DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression.

DESeq uses a model based on the negative binomial distribution and offers, in brief, the following features:

Count data is discrete and skewed and is hence not well approximated by a normal distribution. Thus, a test based on the negative binomial distribution, which can reflect these properties, has much higher power to detect differential expression than a test that assumes normally distributed data.
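As a rough illustration of the discreteness and skew mentioned above, the following sketch evaluates a negative binomial probability mass function parameterised by a mean and a dispersion. The function name `nb_pmf` and the values `mu = 5` and `dispersion = 0.5` are assumptions chosen for illustration; they are not taken from DESeq itself.

```python
import math

def nb_pmf(k, mu, dispersion):
    """Negative binomial probability of observing count k.

    Parameterised by mean `mu` and dispersion `dispersion`, so that the
    variance is mu + dispersion * mu**2 (illustrative parameterisation).
    """
    size = 1.0 / dispersion          # "size" parameter r
    p = size / (size + mu)           # success probability
    log_pmf = (math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
               + size * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

# Hypothetical example values, not taken from any real data set.
mu, dispersion = 5.0, 0.5
pmf = [nb_pmf(k, mu, dispersion) for k in range(500)]

total = sum(pmf)                                   # should be close to 1
mean_count = sum(k * q for k, q in enumerate(pmf))  # should be close to mu
mode = max(range(len(pmf)), key=pmf.__getitem__)    # most probable count

print(f"total={total:.6f} mean={mean_count:.3f} mode={mode}")
```

With these values the most probable count lies below the mean, which is the kind of right skew a symmetric normal distribution cannot represent.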

Tests for differential expression between two experimental conditions should take into account both technical and biological variability. Recently, several authors have claimed that the Poisson distribution can be used for this purpose. However, tests based on the Poisson assumption (this includes the binomial test and the chi-squared test) ignore the biological sampling variance, leading to incorrectly optimistic p values. The negative binomial distribution is a generalisation of the Poisson model that allows biological variance to be modelled correctly.
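To make the point about optimistic p values concrete, here is a minimal sketch that compares the variance implied by a Poisson model with that of a negative binomial model, and then feeds both into a crude normal-approximation test. This is not the test DESeq performs; the helper names, the dispersion of 0.1, and the counts are all hypothetical.

```python
import math

def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson part plus a biological part."""
    return mu + dispersion * mu ** 2

def two_sided_p(diff, variance_per_condition):
    """Crude normal-approximation p value for a difference of two counts.

    Assumes one count per condition and equal variance in both conditions.
    This is a simplification for illustration, not DESeq's test.
    """
    se = math.sqrt(2.0 * variance_per_condition)
    z = abs(diff) / se
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical gene: mean count 1000, dispersion 0.1, observed difference 200.
mu = 1000.0
dispersion = 0.1
diff = 200.0

p_poisson = two_sided_p(diff, mu)                        # variance = mean
p_nb = two_sided_p(diff, nb_variance(mu, dispersion))    # adds biological variance

print(f"Poisson p = {p_poisson:.2e}, negative binomial p = {p_nb:.2f}")
```

Under the Poisson assumption the difference looks highly significant, while the same difference is unremarkable once the extra biological variance is included.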

In the two points above, DESeq is similar to earlier tools, especially edgeR. One of the new features of DESeq is the ability to estimate the variance locally, using different coefficients of variation for different expression strengths. This removes potential selection biases in the hit list of differentially expressed genes and gives a more balanced and accurate result.
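The idea of letting the variance estimate depend on expression strength can be sketched as follows. This is a deliberately simplified stand-in that averages method-of-moments dispersion estimates within bins of mean expression; it is not DESeq's actual estimation procedure, and the function names, bin edge, and counts are hypothetical.

```python
from statistics import mean, variance

def moment_dispersion(counts):
    """Method-of-moments dispersion, (var - mean) / mean**2, floored at 0."""
    m = mean(counts)
    if m == 0:
        return 0.0
    v = variance(counts)  # sample variance across replicates
    return max((v - m) / (m * m), 0.0)

def binned_dispersion(genes, edges):
    """Average per-gene dispersion within bins of mean expression.

    genes: dict mapping gene name -> list of replicate counts
    edges: ascending bin boundaries on the mean count
    Returns one value per bin (None for an empty bin).
    """
    bins = [[] for _ in range(len(edges) + 1)]
    for counts in genes.values():
        idx = sum(mean(counts) >= e for e in edges)  # index of the gene's bin
        bins[idx].append(moment_dispersion(counts))
    return [mean(b) if b else None for b in bins]

# Hypothetical replicate counts for a weakly and a strongly expressed gene.
genes = {
    "low_gene": [1, 9, 5],
    "high_gene": [900, 1100, 1000],
}
per_bin = binned_dispersion(genes, edges=[100])
print(per_bin)
```

In this toy data the weakly expressed gene receives a much larger dispersion than the strongly expressed one, so a single global coefficient of variation would fit neither well.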

DESeq's applicability is not limited to RNA-Seq. Rather, it may be used for many kinds of count data derived from high-throughput experiments.

Links


References

  1. . 2010. Genome Biology
  2. . 2014. Genome Biology

