Software/list

From SEQwiki

Below is one of many possible dynamic tables of software data, created from pages in the wiki.

 




Name | Summary | Bio Tags | Meth Tags | Features | Language | Licence | OS
.NET BIO".NET Bio is an open source library of common bioinformatics functions, intended to simplify the creation of life science applications. The core library implements a range of file parsers and formatters for common file types, connectors to commonly-used web services such as NCBI BLAST, and standard algorithms for the comparison and assembly of DNA, RNA and protein sequences. Sample tools and code snippets are also included."Sequence analysisC#Windows
Linux
4peaksAllows viewing sequencing trace files, motif searching, trimming, BLAST and exporting sequences.SequencingSequence analysisFreewareMac OS X
A5A5 is an integrative pipeline for genome assembly that automates sequence data cleaning, error correction, assembly, and quality control by chaining a number of programs together with additional custom algorithms.Sequence assemblyDe-novo assemblyGPLv3Linux
Mac OS X
AB Large Indel ToolIdentifies deviations in clone insert size that indicate intra-chromosomal structural variations compared to a reference genome.Indel detection
Sequencing
MappingPerlGPLLinux 64
AB Small Indel ToolThe SOLiD™ Small Indel Tool processes the indel evidence found in the pairing step of the SOLiD™ System Analysis pipeline Tool (Corona Lite).Indel detection
Sequencing
Read mapping
Sequence alignment
Perl
C++
GPLLinux 64
ABBAAssembly Boosted By Amino acid sequence is a comparative gene assembler, which uses amino acid sequences from predicted proteins to help build a better assemblySequence assemblySequence assembly
Scaffolding
Artistic LicenseLinux
ABMapperMaps RNA-Seq reads to target genome considering possible multiple mapping locations and splice junctionsGenomics
Transcriptomics
Read mapping
Sequence alignment
C++
Perl
GPLv3Linux
ABySSABySS is a de novo sequence assembler designed for short reads and large genomes.Sequence assembly (de novo assembly)Sequence assembly
Sequence assembly (de-novo assembly)
MPI
OpenMP
C++Commercial
Freeware
POSIX
Linux
Mac OS X
Adapter Removal (software)Removes adaptor fragments from raw short read sequence data and outputs data to FASTA format.WorkflowsAdapter removalSequence trimmingJavaCustom LicenceLinux 64
Windows
Mac OS X
ADTExAberration Detection in Tumour Exome (ADTEx) is a tool for copy number variation (CNV) detection for whole-exome data from paired tumour/matched normal samples.Copy number estimation
Exome analysis
Cancer biology
Next Generation Sequencing
Statistical calculationCopy number analysisPython
R
GPLv3GNU/Linux
AGEAGE is a tool that implements an algorithm for optimal alignment of sequences with SVs.Structural variationSequence alignmentCreative Commons license (Attribution-NonCommercial).
AGILEA hash table based high throughput sequence mapping algorithm for longer 454 reads that uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach among other heuristics to optimize every step of the mapping processMappingC
Agp2amosmissingFormattingWindows
Linux
AlcovnaALgorithms for COmparing and Visualizing Non Assembled dataSNP detectionJava
ALEXA-SeqAlternative Expression Analysis by massively parallel RNA sequencingRNA-Seq quantification
Alternative splicing
PerlGPLv3
ALLPATHSDe novo assembly of whole-genome shotgun microreads.Sequence assembly (de novo assembly)Sequence assembly
Sequence assembly (de-novo assembly)
Alta-CyclicAlta-Cyclic is an Illumina Genome-Analyzer (Solexa) base caller.Base-calling
AMOSAMOS is a Modular, Open-Source whole genome assembler.Sequence assembly
Sequence assembly validation
Integrated solution
Formatting
Sequence assembly visualisation
C
Perl
Linux
ANCHORPost-processing tools for de novo assembliesDe-novo assemblySequence assemblyC++
Python
BCCA (academic use)Linux
AnCorrWebserver/tool for evaluating coordinate and ordinal correlations between genomic tracks and/or expression or protein binding profiles.GenomicsStatistical calculation
Correlation
Easy-to-use point-and-click web interface with context helpPerl
C
CGI
PHP
JavaScript
free for non-commercial academic researchCross-Platform
Anno-JAnnotation Browsing 2.0SequencingVisualisationCreative Commons - Attribution-NonCommercial-ShareAlike
ANNOVARANNOVAR: Functional annotation of genetic variants from high-throughput sequencing dataGenomics
Genetics
Annotation
Variant prioritisation
Gene-based annotation
region-based annotation
filter-based annotation
PerlCommercial
Freeware
Linux
Windows
Mac OS X
ArachneARACHNE is a program for assembling data from whole genome shotgun sequencing experiments.Sequence assembly
AREMAREM: Aligning Short Reads from ChIP-sequencing by Expectation MaximisationChIP-seqRead mapping
Peak calling
PythonLinux
Arfarf is a genetic analysis program for sequencing data.
Array Suite (Array Studio/Server)Array Studio is a complete analysis and visualization package for NextGen sequencing data, as well as other -OMIC data types. Array Server is a backend enterprise server for storage and analysis of -OMIC and NextGen sequencing data.Genomics
SNP detection
Indel detection
Read mapping
Gene expression profiling
Data Visualisation
Variant annotation and analysis
coverage analysis
Read mapping
C#CommercialWindows
ArrayExpressHTSR-based pipeline for RNA-Seq data analysis.RNA-Seq
RNA-Seq quantification
R
ArrayStarArrayStar is an easy-to-use gene expression analysis software package that offers powerful visualization and statistical tools to help you analyze your microarray data.Gene expression analysisStatistical calculation
Differential expression analysis
Ontology comparison
Genetic variation analysis
CommercialWindows 7 64-bit or Higher
Mac OS X 10.7, 10.8, or 10.9 with Parallels Desktop
ASCEmpirical Bayes method to detect differential expression.RNA-Seq quantificationStatistical calculation and probability
ATACATAC is a computational process for comparative mapping between two genome assemblies, or between two different genomes.Sequence alignment
Sequence assembly validation
Linux
Atlas SuiteAtlas is a suite of variant analysis tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in Whole Exome Capture Sequencing (WECS) data. SNPs may be called using the Atlas-SNP2 application and indels may be called using the Atlas-Indel2 application.SNP detection
Indel detection
Variant callingRuby
C
BSDPOSIX
Atlas-SNP2Atlas-SNP2 is a SNP detection tool developed for next generation sequencing platformsSNP detectionRubyFreewareUNIX
Avadis NGSStrand NGS (formerly Avadis NGS) is a desktop software platform for alignment, analysis, visualization, and management of data generated by next-generation sequencing (NGS) platforms. It supports workflows for RNA-Seq, DNA-Seq, small RNA-Seq, ChIP-seq, and Methyl-Seq data analysis. Strand NGS is designed with the biologist in mind.ChIP-seq
RNA-Seq
Sequencing
DNA-Seq
Small RNA-Seq
RNA
Methyl-Seq
MeDIP-Seq
Pathway or network analysis
Sequence alignment
Visualisation
Sequencing quality control
Sequence analysis
Analysis
Biological interpretation
Rich Visualisation
Identify effects of SNPs on transcripts
Identify Structural Variants from Paired Reads (Insertions, Deletions, Translocations, Inversions)
Identify binding site peaks in ChIP-seq data
Identify motifs around binding sites
Determine gene expression levels and identify differentially expressed genes
De-convolve transcript expression levels and identify differential splice variants
Identify Novel Exons
Identify Novel Splice Junctions
Identify Fusion Genes
Perform QC on Reads, determine on- and off-target reads, and filter anomalous reads
Determine Enriched GO Terms
Determine Significant Pathways
Java
R
Python
CommercialWindows
Linux
Mac OS X
Baa.pluse transcripts to assess a de novo assemblySequence assembly (de novo assembly)Sequence assembly validation
Sequence alignment analysis
PerlGPLany
BambinoVariant detector and graphical alignment viewer for SAM/BAM format data.SNP detection
Genetic variation
Java
BambusBambus is a general purpose scaffolderScaffolding
BAMseekBAMseek is a large file viewer for BAM and SAM alignment files.Genomics
Transcriptomics
Sequence alignment visualisationJavaGPLv3Cross-Platform
BamToolsBamTools provides a fast, flexible C++ API & toolkit for reading, writing, and managing BAM files.Sequence alignment analysisC++MITCross-Platform
BamViewInteractive Java application for visualising the large amounts of data stored for sequence reads which are aligned against a reference genome sequenceVisualisationJavaGPLMac OS X
UNIX
Windows
Barcode generatorGenerator of sequence barcodes suitable for Illumina sequencing.DNA barcodingPython
Barcrawl BartabBarcrawl facilitates the design of barcoded primers, for multiplexed high-throughput sequencing.DNA barcodingGPL
BarraCUDABarracuda is a high-speed sequence aligner based on BWA and uses the latest Nvidia CUDA architecture for accelerating alignments of sequence reads generated by the next-generation sequencers.Sequence analysisRead mapping
Sequence alignment
Gapped and ungapped alignment
paired-end mapping
GPGPU
parallel execution
C
C++
CUDA
GPLv3
MIT
Linux
BatmanBayesian tool for methylation analysis (Batman) for analyzing methylated DNA immunoprecipitation (MeDIP) profilesDNA methylationJavaLGPL
BayesCallBayesian basecallerSequencingBase-callingC++
Python
GPLv3
BayesPeakA Bayesian hidden Markov model to detect enriched locations in ChIP-seq data.ChIP-seq
Simulation experiment
Statistical calculationMulticoreRGPL
BaySeqIdentifies differentially expressed genesRNA-Seq quantificationDifferential expression analysisR
BBMapBBMap is a fast splice-aware aligner for RNA and DNA. It is faster than almost all short-read aligners, yet retains unrivaled sensitivity and specificity, particularly for reads with many errors and indels.Sequence alignment
Whole genome resequencing
RNA-Seq alignment
SNP detection
Metagenomics
Phylogenetics
Alternative splicing
Resequencing
Quality control
Read binning
Read mapping
Sequence alignment
Sequence contamination filtering
Sequence trimming
RNA-Seq analysis
Multithreaded. Faster and more accurate than competing aligners. Splice-aware.Java 7BSDWindows
*NIX
Mac OS X
all supporting JVM
BBSeqTool for analyzing gene expression from RNA-Seq dataRNA-Seq quantificationR
Bcbio-nextgenPython scripts and modules for automated next gen sequencing analysis. These provide a fully automated pipeline for taking sequencing results from an Illumina sequencer, converting them to standard Fastq format, aligning to a reference genome, doing SNP calling, and producing a summary PDF of results.WorkflowsRead mapping
Sequence alignment
Peak calling
Sequence motif recognition
Genotyping
Sequencing quality control
Differential expression analysis
Sequence trimming
Filtering
Genomic region matching
PythonMITplatform-independent
BEADSChIP-seq data normalization for IlluminaChIP-seqStandardisation and normalisation
BEAPThe Blast Extension and Assembly Program (BEAP) uses a short starting DNA fragment to recursively blast nucleotide databases to obtain all overlapping sequences and construct a "full length" sequence.Read mapping
BEDToolsBEDTools is an extensive suite of utilities for comparing genomic features in BED format.GenomicsMappingFeature overlaps
UNIX pipes
coverage
split-alignments
BAM support
C++GPLv2Linux
Mac OS X
BedutilsNGSUtils is a suite of software tools for working with next-generation sequencing datasets. Starting in 2009, we (Liu Lab @ Indiana University School of Medicine) started working with next-generation sequencing data. We initially started doing custom coding for each project in a one-off manner. It quickly became apparent that this was an inefficient manner to work, so we started assembling smaller utilities that could be adapted into larger, more complicated, workflows. We have used them for Illumina, SOLiD and 454 sequencing data. We have used them for DNA and RNA resequencing, ChIP-seq, CLIP-Seq, and targeted resequencing (Agilent exome capture and PCR targeting). These tools are also used heavily in our in-house DNA and RNA mapping pipelines.

These tools have been of great use within our lab group, and so we are happy to make them available to the greater community.

NGSUtils is made up of 50+ programs, mainly written in Python. These are separated into modules based on the type of file that is to be analyzed. There are four modules:
BelvuAn X-windows viewer for multiple sequence alignmentsSequence alignment visualisationLinux
BFASTBlat-like Fast Accurate Search Tool.Whole genome resequencingRead mapping
Sequence alignment
Genome indexing
parallel execution
command line
CGPLSolaris
UNIX
BFCounterBFCounter is a program for counting k-mers in DNA sequence data.K-mer countingC++GPLv3
BigBWATool to run the Burrows-Wheeler Aligner (BWA) on a Hadoop cluster. It supports the algorithms BWA-MEM, BWA-ALN, and BWA-SW, working with paired and single reads. It substantially reduces computation time when run on a Hadoop cluster, adding scalability and fault tolerance.Whole genome resequencing
Genomics
Exome capture
Sequencing
Resequencing
Exome and whole genome variant detection
Exome analysis
Exome
Read mapping
Sequence alignment
Mapping
Gapped alignment
paired-end mapping
JavaGPLv3Linux 64
BINGbiomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to 1. image alignment, 2. signal correlation, compensation, separation, and pixel-based cluster registration, 3. signal measurement and base calling, 4. quality control and accuracy measurement.Base-calling
Sequencing quality control
BioJava"BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines, parsers for common file formats and allows the manipulation of sequences and 3D structures. The goal of the biojava project is to facilitate rapid application development for bioinformatics. "GenomicsJavaLGPL 2.1
BionimbusCloud environment for analysis of microarray and second generation sequencing data.Linux
Amazon EC2
cloud
BioNumericsBioNumerics can be used for the analysis of all major applications in bioinformaticsSequence assembly
Whole genome resequencing
Genotyping
Sequence analysis
Comparative genomics
Quality control
Workflows
Data handling
Microbial Surveillance
Epidemiology
Sequence assembly
Read mapping
Sequence alignment
Annotation
Genome visualisation
Variant calling
Comparative genomics
Workflows
CommercialWindows 7 or Higher
BioPerl"BioPerl, a community effort to produce Perl code which is useful in biology. "GenomicsPerlCross-Platform
BioPHPbiology tools for php.GenomicsPHPGPLv2
BiopiecesThe Biopieces are a collection of bioinformatics tools that can be pieced together in a very easy and flexible manner to perform both simple and complex tasks. The Biopieces work on a data stream in such a way that the data stream can be passed through several different Biopieces, each performing one specific task: modifying or adding records to the data stream, creating plots, or uploading data to databases and web services.GenomicsSequence alignment
Visualisation
Sequencing quality control
Sequence analysis
Perl
Python
Ruby
C
GPLv2
BiopythonBiopython provides a tool kit for writing bioinformatics and computational molecular biology software in Python.Sequence analysis
Phylogenetics
Population genetics
Protein structure analysis
Sequence parsingVariousPythonBiopython License (MIT/BSD style)Linux
Windows
Mac OS X
BioRuby"BioRuby comes with a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, for the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO."GenomicsRubyCross-Platform
BioSmalltalkBioSmalltalk provides an environment to build bioinformatics scripts and applications using the most powerful object technology as of today, the Smalltalk programming environmentSequence analysis
Phylogenetics
Population genetics
Protein structure analysis
Sequence parsingVariousSmalltalkLinux
Windows
Mac OS X
BiQ AnalyzerBiQ Analyzer is a software tool for easy visualization and quality control of DNA methylation data. With more than 2,000 downloads so far, BiQ Analyzer has become a standard tool for processing DNA methylation data from bisulfite sequencing.DNA methylation
Epigenomics
JavaWindows
Linux
Mac OS X
Solaris
BiQ Analyzer HTBiQ Analyzer HT is an enhanced version of BiQ Analyzer that provides extensive support for high-throughput bisulfite sequencing. BiQ Analyzer HT facilitates the processing, quality control and initial analysis of single-basepair resolution DNA methylation data. It was developed for deep bisulfite sequencing of one or more loci using the Roche 454 platform, but it easily extends to other sequencing platforms. BiQ Analyzer HT features a biologist-friendly graphical user interface, a fast alignment algorithm and a variety of ways to visualize DNA methylation data.DNA methylation
Sequencing
Epigenetics
JavaWindows
Linux
Mac OS X
Solaris
Bis-SNPBisSNP is a package based on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping in bisulfite treated massively parallel sequencing (Bisulfite-seq, NOMe-seq and RRBS) on the Illumina platform. It uses Bayesian inference with either manually specified or automatically estimated methylation probabilities of different cytosine contexts (not only CpG, CHH, CHG in Bisulfite-seq, but also GCH etc. in other bisulfite treated sequencing) to determine genotypes and methylation levels simultaneously.DNA methylation
SNP detection
Sequencing
Genotyping
Epigenetics
Bisulfite mapping
SNP calling
Methylation calling
Accurate SNP and methylation calling in Bisulfite-seq/NOMe-seq/RRBSJava
Perl
MITLinux
Mac OS X
BismarkBismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion.Genomics
DNA methylation
Epigenomics
Read mapping
Bisulfite mapping
Methylation calling
fast and convenient Bisulfite-Seq output
very flexible
PerlGPLv3Linux
Mac OS X
Windows
BisonBison allows users with access to a computer cluster to rapidly align whole-genome bisulfite sequencing or RRBS reads. It can align both directional and non-directional libraries and uses bowtie2.DNA methylation
Sequencing
Epigenetics
Read mapping
Bisulfite mapping
Methylation calling
BAM support
Bisulfite sequencing
MPI
CUnix-like
Linux
Mac OS X
BLASTBLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.Sequence analysisLinux
BLAST Ring Image Generator"BRIG is a cross-platform (Windows/Mac/Unix) application that can display circular comparisons between a large number of genomes, with a focus on handling genome assembly data. "Comparative genomicsVisualisation
Sequence assembly visualisation
Cross-Platform
BLATFast, accurate spliced alignment of DNA sequencesRead mapping
Sequence alignment
Read mappingCFreewareLinux
Mac OS X
Blixema graphical blast viewerSequence analysis
Phylogenetics
Sequence alignment visualisationGPLLinux
BOATCan accurately and efficiently map sequencing reads back to the reference genome.Read mappingGPL
BortBort parses Blast output and quantifies hits by contig and read counts.RNA-Seq quantificationPerlany
BOWBOW - Bioinformatics On Windows is essentially a Windows port of BWA and SAMTOOLS
BowtieBowtie is an ultrafast, memory-efficient short read aligner.Read mapping
Genome indexing (Burrows-Wheeler)
Mac OS X
Linux
Windows
BRATaccurate and efficient tool for mapping short reads obtained from the Illumina Genome Analyzer following sodium bisulfite conversion. Both single and paired ends are supported.DNA methylation
Epigenomics
Bisulfite mapping
Mapping
GPLv3
BRCA-diagnosticComputational screening test for BRCA1/2 mutants in human genomic DNAPersonalised medicinePerl
BreakDancerBreakDancer is an application for detecting structural rearrangements and indels in short read sequencing dataGenomics
Indel detection
Structural variation
Perl
C++
GPLv3
BreakpointerBreakpointer is a fast tool for locating sequence breakpoints from the alignment of single end reads (SE) produced by next generation sequencing (NGS). It adopts a heuristic method in searching for local mapping signatures created by insertion/deletions (indels) or more complex structural variants (SVs). With current NGS single-end sequencing data, the regions output by Breakpointer mainly contain the approximate breakpoints of indels and a limited number of large SVs.Indel detection
Exome and whole genome variant detection
Statistical calculationC++
Perl
GPL
BreakSeqDatabase of known human breakpoint junctions and software to search short reads against them.Structural variationRead mapping
BreakTransBreakTrans is a computer program that maps predicted gene fusions to genomic structural rearrangements so as to validate both types of events.Post-analysis
BreakwayBreakway is a suite of programs that take aligned genomic data and report structural variation breakpoints.Whole genome resequencing
Genomics
Indel detection
Structural variation
Genetic variation
SNP calling
Sequence analysis
Fast
specific
UNIX pipes
PerlGPLLinux
Mac OS X
Windows
BS SeekerMapping tool for bisulfite treated readsEpigenomicsBisulfite mappingPython
BS-SeqThe source code and data for the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA methylation Patterning" Nature paper by Cokus et al. (Steve Jacobsen's lab at UCLA). POSIX.EpigenomicsBisulfite mapping
BSMAPshort reads mapping software for bisulfite sequencingDNA methylationRead mapping
Bisulfite mapping
Bisulfite sequencingC++GPLv3Linux 64
BSSimBSSim: Bisulfite sequencing simulator for next-generation sequencing.DNA methylation
Sequencing
Epigenetics
Modelling and simulationBSSim allows users to mimic various methylation levels.PythonGPLv3UNIX
Linux
Mac OS X
Windows
BtrimBtrim is a fast and lightweight software to trim adapters and low quality regions in reads.Sequence trimmingLinux
BWAFast, accurate, memory-efficient aligner for short and long sequencing readsMappingRead mappingGapped alignment
paired-end mapping
CGPLv3
MIT
UNIX
BWA-SWFast, accurate, memory-efficient aligner for long sequencing readsMappingRead mappingGapped alignment
Local alignment
CGPLv3
MIT
UNIX
CABOGCelera Assembler is scientific software for DNA research.Sequence assembly (de novo assembly)Sequence assemblyRobust to homopolymer run lengthLinux
CANGSCANGS is a flexible and user-friendly utility to trim sequences, filter low quality sequences, and produce input files for further downstream analyses for 454 sequences. CANGS can be used to assign the taxonomic grouping based on similarity with sequences from the NCBI databaseMetagenomics
Phylogenetics
Sequencing quality control
Sequence trimming
Primer removal
Perl
CARPETA web‐based package for the analysis of ChIP‐chip and expression tiling dataChIP-on-chipGenotypingC++
CASHXParse, map, quantify and manage large quantities of short-read sequence data.TranscriptomicsRead mapping
CATCHA tool for exploring patterns in ChIP profiling data.ChIP-seq
ChIP-on-chip
Sequence alignment
Clustering
parallel execution
graphical browsing of results
JavaOpen Source
CatchAllEstimate ecological diversity with both parametric and non-parametric estimators.Metagenomics
Population genetics
CEQerCEQer (Comparative Exome Quantification analyzer) is a graphical, event-driven tool for copy number abnormalities/allelic-imbalance coupled analysis of whole-exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection.Copy number estimation
Genetic variation analysis
Exome analysis
Copy number and allelic imbalance analyses from matched exomes.C#GPLv3Windows
Mac
Linux
CexoRStrand specific peak-pair calling in ChIP-exo dataChIP-exoPeak callingR/Bioconductor package
can run on major computer platforms
RGPL-2 + file LICENSELinux
Mac OS X
Windows
CGA ToolsTools for viewing, manipulating and converting data from Complete GenomicsFile reformattingC++Apache License 2.0Linux
UNIX
Mac OS X
CHiCAGOCHiCAGO (Capture HiC Analysis of Genomic Organization) is a set of tools for calling significant interactions in Capture HiC data, such as Promoter Capture HiC.EpigenomicsPeak callingR
Python
Artistic-2.0Linux
OSX
ChimeraScanIdentifies chimaeric transcripts in RNA-Seq dataGene structure
ChIP-Seq (application)The ChIP-seq web server provides access to a set of useful tools performing common ChIP-seq data analysis tasks, including positional correlation analysis, peak detection, and genome partitioning into signal-rich and signal-poor regions. It is an open system designed to allow interoperability with other resources, in particular the motif discovery programs from the Signal Search Analysis (SSA) server.ChIP-seqRead mapping
Peak calling
C
Perl
GPLLinux
Mac OS X
ChIPmetaCombining data from ChIP-seq and ChIP-chip.ChIP-seq
ChIP-on-chip
Transcription factors and regulatory sites
Peak calling
ChIPMunkChIPMunk is a fast heuristic DNA motif digger based on a greedy approach accompanied by bootstrapping. ChIPMunk identifies the strong motif with the maximum Kullback Discrete Information Content in a given set of DNA sequences.ChIP-seq
Sequence motifs
Sequence motif comparison
Sequence motif discovery
efficient motif discovery for huge datasets up to tens of thousands of sequences; multi-core CPU support; usage of the ChIP-seq base coverage peak dataJavaFreewareplatform-independent
CHiPSeqFrom Science Johnson, 2007ChIP-seqPeak calling
ChIPseqRChIP-seq analysis toolChIP-seqR
ChipsterUser-friendly NGS data analysis software with built-in genome browser and workflow functionality. Chipster includes tools for ChIP-seq, RNA-seq, miRNA-seq and MeDIP-seq analysis, and functionality for exome-seq and CGH-seq will soon be added.DNA methylation
ChIP-seq
RNA-Seq
Immunoprecipitation experiment
Read mapping
Peak calling
Sequence motif recognition
Genome visualisation
Sequencing quality control
Differential expression analysis
Sequence trimming
Pathway or network analysis
Methylation analysis
Java
R
GPLv3platform-independent
ChromaSigAn unsupervised learning method, which finds, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data.ChIP-on-chip
Chromatin
Sequence motif recognitionPerl
C
ChromHMMChromHMM is software for learning and characterizing chromatin states.EpigenomicsStatistical calculationJavaGPLv3
CircosCircos is a tool for visualizing data in a circular format. It was developed for genomic data but can work for many other kinds of data as well.Comparative genomicsVisualisationPerlWindows
Linux
CisGenomeAn integrated tool for tiling array, ChIP-seq, genome and cis-regulatory element analysisChIP-seq
ChIP-on-chip
Sequence motifs
Sequence motif recognition
Gibbs sampling
Integrated solution
Data retrieval
C
C++
UNIX
Windows
CistromeGalaxy-based web service for analysis of ChIP dataChIP-seq
ChIP-on-chip
Python
CLCbio Genomics WorkbenchDe novo and reference assembly, SNP and small indel detection and annotation.Sequence assembly (de novo assembly)
Whole genome resequencing
Genomics
Transcriptomics
ChIP-seq
SNP detection
Indel detection
RNA-Seq
Regulatory RNA
Mapping
Sequence assembly
Sequence assembly (de-novo assembly)
Read mapping
Sequence alignment
Ab-initio gene prediction
Adapter removal
Annotation
Bisulfite mapping
SNP calling
Heat map generation
Sequence assembly validation
Advanced and user-friendly analyses of genomic, transcriptomic, and epigenomic NGS data in a graphical user-interface. Wizard driven tools and a freely available developer toolkit
SIMD implementation
multi-threading
hybrid assembly
Integrated solution
Java
C++
CommercialWindows
Mac OS X
Linux
Clean readsclean_reads cleans NGS (Sanger, 454, Illumina and SOLiD) reads.Sequencing quality control
Sequence trimming
Python
CleaveLandA pipeline for using degradome data to find cleaved small RNA targets.Regulatory RNAPerl
R
Freeware
CLEVERCLEVER is a tool to discover structural variations such as (larger) insertions and deletions in genomes from paired-end sequencing reads.Genomics
Structural variation
Copy number estimation
command lineC++
Python
GPLv3any
ClipCropa new method and implementation named ClipCrop for detecting SVs with single-base resolutionWhole-genome sequencing
CloudAlignerHadoop-based short read alignerRead mapping
Hadoop
JavaGPLcloud
CloudBurstCloudBurst is a parallel read-mapping algorithm optimized for mapping next-generation sequence data to the human genome and other reference genomes.SNP detection
Genotyping
Personalised medicine
Read mappingparallel execution
Hadoop
Academic Cloud Computing Initiative
Java
ClustDBA powerful tool for exact sequence matchingLinux
Cluster FlowA command-line pipeline tool which uses common cluster managers to run bioinformatics analysis pipelines.Analysis pipelinePerlGPLv3Linux
CNANormA normalization method for Copy Number Aberration in cancer samples.Genomics
Copy number estimation
Cancer biology
Peak calling
Standardisation and normalisation
R
Perl
GPLv2Linux
Mac OS X
Windows
CNAsegWe present a novel approach, called CNAseg, to identify CNAs from second-generation sequencing data. It uses depth of coverage to estimate copy number states and flowcell-to-flowcell variability in cancer and normal samples to control the false positive rate.Structural variationR
CNB MetaGenomics toolsA number of tools and meta-tools developed at CNB/CSIC for the analysis of metagenomics data (some rely on QIIME).Metagenomics
Sequencing
Bash
Perl
Python
C
EU-GPLLinux
Unix-like
POSIX
CnDProgram to detect copy number variation in inbred mouse strainsCopy number estimationDGPL
CNVerCNVer is a method for CNV detection that supplements the depth-of-coverage with paired-end mapping information, where matepairs mapping discordantly to the reference serve to indicate the presence of variation. CNVer combines this information within a unified computational framework called the donor graph, allowing it to better mitigate the sequencing biases that cause uneven local coverage. CNVer can also reconstruct the absolute copy counts of segments of the donor genome, and work with low coverage datasets.Structural variation
Copy number estimation
Perl
C++
CnvHMMWashU copy number variant (CNV) detection algorithm for Illumina/Solexa data.Structural variationLinux
CNVkitCNVkit is a software toolkit to infer and visualize copy number from targeted DNA sequencing data.Structural variation
Copy number estimation
Variant callingCopy number analysis
data visualization
PythonBSDGNU/Linux
Mac OS X
CNVnatorCNV discovery and genotyping from read-depth analysis of personal genome sequencingGenotyping
Copy number estimation
CNVseqCNV-seq, a method to detect copy number variation using high-throughput sequencing.Copy number estimationPerl
R
CompreheNGSivecompreheNGSive is an interactive visualization of the end results of the next-generation sequencing pipeline.Next Generation SequencingVisualisationPython
Qt
LGPLMac OS X
Linux
CoNAn-SNVCoNAn-SNV is a probabilistic framework for the discovery of single nucleotide variants in WGSS data. This software explicitly integrates information about copy number state of different genomic segments into the inference of single nucleotide variants.SNP detectionC
ConDeTriConDeTri is a content dependent read trimming software for Illumina/Solexa sequencing dataGenomics
RNA-Seq
Sequencing
Sequence trimmingPerl
ContEstGATK tool to estimate amount of cross-individual contaminating sequence in a datasetMetagenomic sequencingSequencing quality controlJavaBSD
ContraCopy number analysis for exome-sequencing / targeted-resequencing. Two methods of analysis available: Case vs Control, or Case vs Baseline. Function available for creating a baseline from multiple samples.Genomics
Sequencing
Copy number estimation
Cancer biology
Copy number analysis
baseline (pseudo-control) creation
Python
R
GPLv3Linux 64
Linux
ContrailA Hadoop-based genome assembler for assembling large genomes in the cloudSequence assembly (de novo assembly)Sequence assembly
Sequence assembly (de-novo assembly)
Hadoop
CopySeqCopySeq analyzes the depth-of-coverage of whole genome resequencing data to predict CNVs and to infer quantitative locus copy-number genotypes.Structural variation
Genotyping
Personalised medicine
Copy number estimation
Java
R
Mac OS X
Linux
CoralCorrects sequencing errors in short read data via multiple alignmentsSequence error correctionC++
CORAL (Contig Ordering Algorithm)An algorithm has been developed to order fingerprinted clones within contigs.Sequence error correctionJava
CortexCortex is an efficient and low-memory software framework for analysis of genomes using sequence data. Cortex allows de novo assembly of variants without having to do a consensus assembly first. Also allows comparison of genomes without using consensus, and alignment of sequence data to a de Bruijn graphGenomicsSequence assembly
Variant calling
CGPLv3
CPTRAIntegrated transcriptome analysis from Sanger, 454, Solexa, SOLiD, etc readsRNA-Seq alignment
RNA-Seq quantification
Python
CRACCRAC is a mapping software specialized for RNA-Seq data. It detects mutations, indels, splice or fusion junctions in each single read.RNA-Seq alignment
SNP detection
Indel detection
RNA-Seq quantification
Alternative splicing
Gene structure
Read mapping
Genome indexing (Burrows-Wheeler)
C++CeCILLLinux
Linux 64
Mac OS X
CRAVATCRAVAT is a web-based resource for cancer-related analysis of genomic variants (single base substitutions and indels). CRAVAT provides scoring, sorting, filtering, and interactive visualizations to assist with identification of important variants.Genomics
Genetic variation
Genetics
Annotation
Variant prioritisation
SNP annotation
Variant classification
Sequence annotation
Gene-based annotation
Detailed Reports
Easy-to-use point-and-click web interface
GUI
HTML5 canvas graphics
Identify effects of SNPs on transcripts
Interactive
Multiple replicates information used; automated pipeline; easy genomic annotation; finding hotspots;
SNP annotation
User-friendly
User Management
Variant annotation and analysis
Web
Linux
CRISPIdentifies rare and common variants in pooled sequencing dataSNP detectionPython
CrossbowCrossbow is a cloud-computing software tool that combines the aligner BOWTIE and the SNP caller SOAPsnp.SNP detectionRead mapping
CUDA-ECA scalable parallel algorithm for correcting sequencing errors in high-throughput short-read data so that error-free reads can be available before DNA fragment assembly.Sequencing quality controlread error correctionC
CufflinksCufflinks assembles transcripts and estimates their abundances in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.Transcriptomics
RNA-Seq alignment
RNA-Seq
RNA-Seq quantification
Alternative splicing
Read mapping
Transcriptome assembly
Differential expression analysis
Boost
CummeRbundAllows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.RNA-Seq quantificationVisualisation
CurtainCurtain is a Java wrapper around next-generation assemblers such as Velvet which allows the incremental introduction of read-pair information into the assembly process. This enables the assembly of larger genomes than would otherwise be possible within existing memory constraints.Sequence assembly (de novo assembly)Sequence assembly
Sequence assembly (de-novo assembly)
Apache License 2.0
Cutadaptremove adapter sequences from high-throughput sequencing data using alignmentPython
Cython
MIT
DCLIPdCLIP is a Perl program for discovering differential binding regions in two comparative CLIP-Seq (HITS-CLIP, PAR-CLIP or iCLIP) experiments.CLIP-Seq
HITS-CLIP
PAR-CLIP
ICLIP
Sequence alignment analysisPerl
C
UNIX
Unix-like
DecGPUParallel and distributed error correction algorithm for high-throughput short reads.Sequence assembly (de novo assembly)Sequence error correctionC++GPLv3Linux
DeconSeqDeconSeq can be used to automatically detect and efficiently remove any type of sequence contamination from metagenomic datasets, including human or other host sequences. The tool uses a modified version of the BWA-SW aligner and can be applied to longer-read datasets (150+bp read length). DeconSeq is available as both standalone and web-based versions.Genomics
Transcriptomics
Metagenomics
Sequence contamination filteringPerl
C
GPLv3UNIX
Mac OS X
DeepToolsUser-friendly tools for the normalization and visualization of deep-sequencing data.Genomics
ChIP-seq
Visualisation
Conversion
Standardisation and normalisation
Data Visualisation
coverage analysis
File reformatting
normalization
GC plot analysis
PythonGPLv3Linux
Mac OS X
DeFusedeFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired end alignments to inform a split read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters in an attempt to reduce the number of false positives and produces a fully annotated output for each predicted fusionRNA-Seq
Gene structure
DEGseqan R package to identify differentially expressed genes or isoforms for RNA-seq data from different samplesRNA-Seq quantificationDifferentially expressed gene identificationR
DESeqDESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression. The latest version is DESeq2 (released April 2013).ChIP-seq
RNA-Seq quantification
Sequencing quality control
Statistical testing
RGPLv3UNIX
Windows
Mac OS X
DIALA computational pipeline for identifying single-base substitutions between two closely related genomes without the help of a reference genome.SNP detection
Comparative genomics
C
Python
MITLinux
DiBayesBayesian identification of SNPs in color space (SOLiD) dataSNP detectionGPL
DiffBindDifferential Binding Analysis of ChIP-seq peak data Compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data. Also enables occupancy (overlap) analysis and plotting functions.ChIP-seqDifferential binding sitesMultiple replicates information used; automated pipeline; finding hotspots;RArtistic-2.0Linux
Mac OS X
Windows
DiffrepsdiffReps is developed to find different peaks in ChIP-seq. It scans the whole genome using a sliding window, performing millions of statistical tests and reporting the significant hits. diffReps takes into account the biological variations within a group of samples and uses that information to enhance the statistical power. Considering biological variation is of high importance, especially for in vivo brain tissues.ChIP-seqDifferential BindingMultiple replicates information used; automated pipeline; easy genomic annotation; finding hotspots;PerlGPLv3Linux
Windows
Mac OS X
DindelCalls small indels from short-read sequence dataIndel detectionLocalised reassembly
DiscoSnpdiscoSnps: qualitative de-novo SNP caller. Extremely low memory use and time efficient. No reference genome needed. Calls both homozygous and heterozygous SNPs.Sequence assembly (de novo assembly)
Sequencing
Genotyping
Comparative genomics
High-throughput sequencing
Population Genomics
Sequence assembly (de-novo assembly)
Read depth analysis
de novo (reference free) SNP callingC++CeCILLUnix-like
iOS
DNA BaserTool for manual and automatic sequence assembly, analysis, editing, sample processing, metadata integration, file format conversion and mutation detection.SNP detection
Structural variation
Sequence assembly
Sequence analysis
Sequence assembly editing
Portable. Does not require installation. Can run from USB stick. Only 3MB.Commercial
Freeware
Windows
DNA Chromatogram ExplorerDNA Chromatogram Explorer is a Windows Explorer clone dedicated to DNA sequence analysis and manipulation.Sequence analysisFile reformatting
Chromatogram visualisation
Portable. Does not require installation. Can run from USB stick. Only 1MB.FreewareWindows
DNAADNAA (DNA Analysis) software for analysis of Next-Generation Sequencing data.DNA methylation
SNP detection
Structural variation
Sequencing quality control
Modelling and simulation
Statistical calculation
GPL
DNaseRDNase I footprinting analysis of DNase-seq data in RDNase-seqPeak calling
Nucleic acid sequence feature detection
R/Bioconductor package
can run on major computer platforms
RGPL-2 + file LICENSELinux
Mac OS X
Windows
DNAzipA series of techniques that in combination reduce a single genome to a size small enough to be sent as an email attachment.File reformattingC++
DrFASTFast mapper for dibase encoded data.
DSAPAutomated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technologyTranscriptomics
Regulatory RNA
browser based
DSGseqThis program aims to identify differentially spliced genes from two groups of RNA-seq samples.RNA-Seq
Alternative splicing
Gene expression
Statistical calculation
Differential expression analysis
RNA-Seq analysis
C
R
Commercial
Freeware
Linux
Windows
Mac OS X
DSRCCompression algorithm for genomic data in FASTQ formatFile reformatting
E-miRPerl tools for processing miRNA sequencing dataTranscriptomics
Regulatory RNA
Ea-utilsFASTQ processing utilitiesSequencing quality control
Sequence trimming
C++MIT
EagleViewEagleView is an information-rich genome assembler viewer with data integration capability.Sequence assembly visualisation
EagleView genome viewerEagleView is an information-rich genome assembler viewer with data integration capability.VisualisationLinux
Windows
Mac OS X
EaSeqEaSeq is developed for user-friendly exploration, visualization and analysis of genome-wide single-read sequencing data (mainly ChIP-seq).

Both individual genomic loci and populations of loci can be visualized e.g. as plots of average signals, scatter diagrams, or clustered heatmaps. The underlying loci can then be inspected just by selecting them in the plots - or they can be 'gated out' for further analysis.

EaSeq also integrates more than 20 tools for analysis, including peak-finding, quantitation, normalization, clustering, distance analysis, randomization, and scoring. Finally, it automatically generates legends and descriptions of the handling and can store plots together with the underlying data and these descriptions as a single compact session file.
ChIP-SeqVisualisation
Peak calling
Viewer
Clustering
Integrated solution
Genome visualisation
Analysis
Biological interpretation
Regression analysis
Correlation
Filtering
Genomic region matching
Format conversion
Read summarisation
Enrichment
Global test
Visualisation Quality assessment
And more...
Interactive
User-friendly
GUI
WYSIWYG
ChIPseq analysis
Data Visualisation
Data analysis
Genome Viewer
Integrated solution
Memory efficient
Multicore
Integrated tutorials
Automatic documentation
Integrated chat forum
Freeware
Custom Licence
Windows 7 or Higher
EasyfigGenome comparison figure generatorComparative genomicsComparative genomicsPythonGPLv3Windows
Mac OS X
GNU/Linux
EBCallEBCall is a software package for somatic mutation detection (including InDels). EBCall uses not only paired tumor/normal sequence data of a target sample, but also multiple non-paired normal reference samples for evaluating the distribution of sequencing errors, which leads to accurate mutation detection even in cases of low sequencing depths and low allele frequencies.
ECHOReference-free short read error correction from diploid genomes, with explicit modeling of heterozygous sites.SNP detection
Indel detection
Sequence error correctionPython
C++
BSD
EDENAAn assembler dedicated to process the millions of very short reads produced by the Illumina Genome Analyzer.Sequence assembly
EdgeRedgeR is an R/Bioconductor software package for statistical analysis of replicated count data. Methods are designed for assessing differential expression in comparative RNA-Seq experiments, but are generally applicable to count data from other genome-scale platforms (ChIP-seq, MeDIP-Seq, Tag-Seq, SAGE-Seq etc).DNA methylation
ChIP-seq
RNA-Seq
RNA-Seq quantification
Gene expression analysis
Statistical calculationRLGPLWindows
Mac OS X
UNIX
ELANDEfficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.Sequence alignmentCommercial
EMBFFrequency-based, de novo short-read clustering method that organizes erroneous short sequences originating in a single abundant sequence into a tree structure; in this structure, each “child” sequence is considered to be stochastically derived from its more abundant “parent” sequence with one mutation through sequencing errors.Mapping
EpigenomeA bioinformatic pipeline that scores epigenetic alterations according to strength and significance and links them to potentially affected genes.EpigenomicsBisulfite mappingR
Python
EpiGRAPHEpiGRAPH enables biologists to analyze genome and epigenome datasets with powerful statistical and machine learning methods. In a typical workflow, the user uploads a set of genomic regions of interest (e.g. experimentally mapped enhancers, hotspots of epigenetic regulation or sites exhibiting disease-specific alterations), and EpiGRAPH searches a large database of (epi-) genomic attributes for significant overlap and correlation with the regions in the input dataset. Furthermore, EpiGRAPH can predict the status of genomic regions that were not included in the input dataset.Epigenomics
Machine learning
Statistical calculationbrowser based
ERANGEERANGE is a Python package for RNA-seq and ChIP-seq analysis.ChIP-seq
RNA-Seq alignment
RNA-Seq quantification
DNA transcription
RNAseq analysis
Chipseq analysis
Python
ERDSERDS is free, open-source software designed for detection of copy number variants (CNVs) in human genomes from next-generation sequencing data. It uses paired Hidden Markov models (PHMM) based on the expected distribution of read depth of short reads and the presence of heterozygous sites. ERDS is not suitable for whole-exome data.Copy number estimation
ERGO Genome Analysis and Discovery SystemERGO provides a systems-biology informatics toolkit centered on comparative genomics to capture, query and visualize sequenced genomes. Building upon the most comprehensive genomic database available anywhere integrated with the largest collection of microbial metabolic and non-metabolic pathways and using Igenbio's proprietary algorithms, ERGO assigns functions to genes, integrates genes into pathways, and identifies previously unknown or mischaracterized genes, cryptic pathways and gene products.Transcriptomics
Metagenomics
Transcription factors and regulatory sites
Phylogenetics
Comparative genomics
SNP
Functional genomics
Gene structure
Exome analysis
Genome Wide Association Studies
Metabolic pathways
Sequence alignment
SNP detection
SNP annotation
Gene expression analysis
Pathway or network analysis
Sequence annotation
Sequence functional annotation
Metabolic network modelling
CommercialWeb
ERNEExtended Randomized Numerical alignEr for accurate alignment of NGS reads. It can map bisulfite-treated reads.Sequence alignment
Genomics
DNA methylation
Sequencing
Epigenetics
Read mapping
Bisulfite mapping
Bisulfite sequencing
sequence alignment
C++GPLv3Linux
Mac OS X
Windows
Error Correction Evaluation ToolkitEvaluation of error correction resultsData handlingSequencing quality controlPython
Perl
POSIX
Est2assemblyProcesses raw sequence data from Sanger or 454 sequencing into a hybrid de-novo assembly, annotates it and produces GMOD compatible output, including a SeqFeature database suitable for GBrowse.Genomics
RNA-Seq alignment
Sequence assembly
ESTcalcEstimation of project costs for RNA-Seq study.RNA-SeqPerl
EULEREULER-SR is a program for de novo assembly of reads. Contrary to the overlap-layout approach, EULER-SR uses a de Bruijn graph to construct an assembly. The assembly of a genome corresponds to an Eulerian path in the de Bruijn graph. Long (possibly erroneous) reads, and mate-pairs are used to determine parts of the correct Eulerian traversal in the assembly.Sequence assembly
Sequence assembly (de-novo assembly)
C++
Perl
Linux
ExomeCNVIdentifies copy number variation from targeted exome sequencing dataExome capture
Copy number estimation
R
ExomeCopyCNV detection from exome sequencing read depthCopy number estimation
Exome and whole genome variant detection
Exome analysis
simultaneous normalization and segmentationRGPL 2.0+Linux
Windows
Mac OS X
ExomePicksExomePicks is a program that suggests individuals to be sequenced in a large pedigree.
ExonerateVarious forms of alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX.Sequence alignmentCGPLLinux
FAASTFlowspace Assisted Alignment Search ToolRead mappingLinux
FaBoxTools for splitting, joining and otherwise manipulating FASTA format sequence files.Sequence assembly
Phylogenetics
FACSRapid and accurate classification of sequences as belonging or not belonging to a reference sequence.MetagenomicsPerl
f
GPLv2Linux
FalconFALCON: experimental PacBio diploid assemblerSequence assemblySequence assembly
FAST: Fast Analysis of Sequences ToolboxThe FAST Analysis of Sequences Toolbox (FAST) is a set of Unix tools (for example fasgrep, fascut, fashead and fastr) for sequence bioinformatics modeled after the Unix textutils (such as grep, cut, head, tr, etc).Genomics
Sequence analysis
Sequence analysis
Sequence parsing
GNU/Linux
Mac OS X
FastQ ScreenFastQ Screen provides a simple way to screen a library of short reads against a set of reference libraries. Its most common use is as part of a QC pipeline to confirm that a library comes from the expected source, and to help identify any sources of contamination.Genomics
Transcriptomics
Read mapping
Sequencing quality control
Summarises the mapping of a library against a series of reference sequencesPerlGPLv3Linux
Mac OS X
Windows
FastQCFastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines.Sequencing quality controlJavaGPLv3UNIX
Windows
FastQValidatorChecks that FastQ files follow standardsQuality controlSequencing quality controlC++
FDMDetects differential transcription in RNA-Seq dataRNA-Seq quantification
FeatureCountsfeatureCounts is a very efficient read quantifier. It can be used to summarize RNA-seq reads and gDNA-seq reads to a variety of genomic features such as genes, exons, promoters, gene bodies and genomic bins. It is included in the Bioconductor Rsubread package and also in the SourceForge Subread package.Next Generation SequencingRead summarisationread summarisationR
C
GPLv3Linux 64
Mac OS X
Mac OS X; x86 64
FHiTINGS"FHiTINGS is designed for use in rapidly identifying, classifying, and parsing internal transcribed spacer (ITS) DNA sequences after a BLASTn search. This software is useful for fungal ecology studies using next generation sequencing (NGS)."Metagenomics
Comparative genomics
Sequence alignmentPythonCross-Platform
FigaroFigaro is a software tool for identifying and removing the vector from raw DNA sequence data without prior knowledge of the vector sequence.SequencingK-mer countingAMOSC++
Perl
Artistic LicenseUNIX
FilterProduces a filtered version of an sRNA dataset, controlled by several user-defined criteria, including sequence length, abundance, complexity, transfer and ribosomal RNA removal.WorkflowsSequence contamination filteringmulti-threadingJavaCustom LicenceLinux 64
Windows
Mac OS X
FindPeaks 3.1Findpeaks was developed to perform analysis of ChIP-seq experiments.ChIP-seqPeak callingGPLv3
FindPeaks 4.0 (Vancouver Short Read Package)The Vancouver Short Read Analysis Package (VSRAP) contains the FindPeaks application for ChIP-Seq and RNA-Seq analysis, as well as utilities for SNP finding, working with aligned sequence files and a nascent database for storing SNPs across multiple libraries.Genomics
SNP detection
Peak calling
Formatting
Sequence alignment analysis
command lineJavaGPLLinux
Windows
Mac OS X
FLASHIdentifies paired-end reads which overlap in the middle, converting them to single long readsSequence assembly
Read pre-processing
combining forward and reverse readsCOpen SourceLinux 64
FlexbarFlexible barcode and adapter processing for next-generation sequencing platformsGenomicsAdapter removal
Sequence trimming
DNA barcoding
Quality control
Read pre-processing
Paired read support
separate barcode reads
multi-threading
C++GPLv3Linux
Windows
Mac OS X
FlowerTool for reformatting SFF files into other formats or tab-delimitedHaskell
FlowSimTool for simulating errors in 454 sequencing dataSequence error correction
Modelling and simulation
Haskell
FluxFluxCapacitor is a computer program to predict splice form abundances from reads of an RNA-seq experiment. FluxSimulator can generate simulated data for testing RNA-seq pipelines.RNA-SeqModelling and simulation
ForgeDe novo assembly using a combination of next-generation and Sanger readsSequence assembly (de novo assembly)
Genomics
Sequence assembly
FragGeneScanApplication for finding (fragmented) genes in short readsGenomics
Metagenomics
C
Perl
GPL
FrameDPSensitive peptide detection on noisy matured sequences. A self-training integrative pipeline for predicting CDS in transcripts which can adapt itself to different levels of sequence qualities.RNA-Seq
FreClua frequency-based, de novo short-read clustering method that organizes erroneous short sequences originating in a single abundant sequence into a tree structure; in this structure, each “child” sequence is considered to be stochastically derived from its more abundant “parent” sequence with one mutation through sequencing errors. The root node is the most frequently observed sequence that represents all erroneous reads in the entire tree, allowing the alignment of the reliable representative read to the genome without the risk of mapping erroneous reads to false-positive positions.RNA-Seq alignmentRead mapping
FreebayesBayesian genetic variant detector (SNPs, indels, MNPs)GenomicsMIT
FREECA tool for control-free copy number alteration (CNA) detection using deep-sequencing data, particularly useful for cancer studies.Copy number estimationLinux
Linux 64
Windows
FusionAnalyserFusionAnalyser is a new graphical, event-driven tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. Tested on Illumina. Requires short, paired-end sequences.Gene structure
High-throughput sequencing
Advanced and user-friendly analysis of RNA-seq data for fusion discoveryC#GPLv3Windows
Linux
FusionCatcherFusionCatcher searches for novel/known somatic fusion genes, translocations, and chimeras in RNA-seq data (paired-end or single-end reads from Illumina NGS platforms like Solexa/HiSeq/NextSeq/MiSeq) from diseased samples.RNA-Seq
Gene structure
Sequence alignmentPythonGPLv3
*NIX
FusionHunterIdentifies gene fusions in RNA-Seq dataRNA-SeqPerl
C
Linux
Linux 64
FusionMapDetects fusion events in both single- and paired-end datasets from either RNA-Seq or gDNA-Seq studies and characterize fusion junctions at base-pair resolution.Gene structureSplit-read mappingC#Commercial
Freeware
Windows
Linux
Linux 64
FusionSeqIdentifies fusion transcripts from paired end RNA-Seq data.RNA-Seq
Gene structure
Sequence alignment analysisCCreative Commons - Attribution; Non-commercial 2.5Mac OS X
UNIX
Linux
FuzzypathAssemblerGenomicsSequence assembly
Sequence assembly (de-novo assembly)
G-Mo.R-SeqG-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models.RNA-Seq alignmentCeCILLLinux
G-SQZHuffman coding-based sequencing-reads specific representation scheme that compresses data without altering the relative order.File reformattingC++
Galaxy"Galaxy is an open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses. "Sequence assembly
Whole genome resequencing
Genomics
Comparative genomics
Functional genomics
Sequence assembly
Sequence alignment
Visualisation
Sequencing quality control
PythonCross-Platform
GalignIdentifies polymorphisms between sequence reads obtained using Illumina/Solexa technology and a reference genomeSNP detectionRead mappingGPL
GambitA cross-platform GUI for sequence visualization and analysis.VisualisationGPL 2.0+
Commercial
GAMESGAMES (Genomic Analysis of Mutations Extracted by Sequencing) is a tool for mining and prediction of functional effect of mutation.SNPSNP detection
Indel detection
SNP annotation
PerlLinux
GASSSTFast and accurate aligner for short and long readsSequence alignment
Mapping
Gapped alignment
short and long reads
C++CeCILLLinux
GASVSoftware for classification and comparison of structural variants measured via paired-end sequencing and/or array-CGH.Structural variationGPLv3
GATKThe Genome Analysis Toolkit (GATK) is a structured programming framework designed to enable rapid development of efficient and robust analysis tools for next-generation DNA sequencers. The GATK solves the data management challenge by separating data access patterns from analysis algorithms, using the functional programming philosophy of Map/ReduceSNP detectionLocalised reassembly
Statistical calculation
Java
Python
GBrowseGenome ViewerVisualisationGenome ViewerPerlOpen SourceLinux
Mac OS X
Windows
GeeFuDatabase tool for genomic assembly and feature dataGenomicsSequence assemblyRuby
GEMGEM is a Java software tool to analyze transcription factor binding ChIP-seq/ChIP-exo data. It predicts binding events, performs de novo motif discovery and uses the motif to improve the binding event calling. It calls binding events right at (or very close to) the motif positions, deconvolves closely spaced homotypic binding events and accurately discovers binding motifs.ChIP-seqPeak calling
Sequence motif discovery
probabilistic mixture model
motif prior
multi-threading
JavaCommercial
Freeware
Cross-Platform
GEM libraryA set of very optimized tools for indexing/querying huge genomes/files. Provided so far: a very fast exact mapper, and an unconstrained split-mapperRead mappingC
Python
OCaml
GPLv3
GENALICE MAPFrom FASTQ to VCF in 30 min or less. Ultra-fast Next-Generation Sequencing (NGS) read alignment and variant calling solution.GenomicsRead mapping
Variant calling
ultrafast alignment and variant callingCommercialLinux
GENE-CounterGENE-counter is a computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expressionRNA-SeqLinux
Mac OS X
Genedata ExpressionistGenedata Expressionist performs efficient and quality compliant analysis of next generation sequencing (NGS) and other genomic profiling dataGenomics
Transcriptomics
DNA methylation
ChIP-seq
SNP detection
Indel detection
RNA-Seq
Sequencing
Laboratory information management
Copy number estimation
Epigenetics
Gene structure
Read mapping
Sequence alignment
Annotation
Bisulfite mapping
Clustering
Genome visualisation
Sequencing quality control
Sequence analysis
Statistical calculation
Gene expression analysis
RNA-Seq analysis
Data processing
Data management
Data analysis
Data storage
Downstream analysis
Pipeline Management
Workflows
Reporting
Audit Trail
User Management
Data Visualisation
JavaCommercialWindows
Linux
GeneiousSearch, organize and analyze genomic and protein information of any size via desktop program that provides publication ready images to enhance the impact of your research.Sequence assembly (de novo assembly)
Genomics
RNA-Seq
Metagenomics
Epigenomics
Structural variation
Sequence analysis
Phylogenetics
Population genetics
Sequence assembly
Read mapping
Sequence alignment
Visualisation
Annotation
Sequence assembly validation
Genome visualisation
Variant calling
DNA barcoding
Sequence motif discovery
JavaCommercialWindows
Mac OS X
Linux
Solaris
GeneProfGeneProf is a web-based, graphical software suite and database resource for high-throughput-sequencing experiments (RNA-seq and ChIP-seq).ChIP-seq
RNA-Seq
Read mapping
Visualisation
Peak calling
Sequencing quality control
Differential expression analysis
User-friendly
wizards
tutorials
examples
very flexible
reproducible
transparent
extensible
API
Java
JavaScript
Commercial
Freeware
browser based
GeneTalkGeneTalk is a web-based platform that can filter, reduce and prioritize human sequence variants from NGS data and assist in the interpretation of personal variants in a clinical context.Structural variationAnnotation
Sequence contamination filtering
Sequence analysis
Variant calling
Genetic variation analysis
Plotting and rendering
Exome analysis
Variant classification
Easy-to-use point-and-click web interface
data visualization
data filtering
Fast
SNP annotation
SNP calling
Variant annotation and analysis
variant counting
Ruby
JavaScript
Freemium
Genomatix Mining Station (GMS)The Genomatix Mining Station (GMS) offers mapping of NGS reads onto genomes, transcriptomes and splice-junction libraries. It is a client-server based solution and can be controlled through an intuitive GUI or via command-line. It covers different tasks such as genomic positioning, SNP detection, splice analyses and genomic enrichments.ChIP-seq
SNP detection
RNA-Seq
Sequence assembly
Read mapping
SNP calling
Correlation
Client-server based system allows for command-line and web-based access. Grid engine is used for job scheduling and mapping is run on multiple cores. Can be combined with a Genomatix Genome Analyzer (GGA) for a fully integrated NGS solution.C++
Java
Flash
CommercialWindows
Mac OS X
Linux
Genome TraxGenome Trax™ enables you to identify human genome variations of functional significance by mapping your NGS data to known elements such as disease mutations and regulatory sites.SNP detection
Indel detection
Structural variation
Regulatory genomics
Commercial
GenomeBrowseA free genome browser for exploring sequencing pile-up and coverage data with numerous annotation tracks hosted on the cloud.Sequence assembly (de novo assembly)
Sequence alignment
Whole genome resequencing
Genomics
Sequencing
Sequence analysis
Genetics
Exome and whole genome variant detection
Exome analysis
Next Generation Sequencing
Visualisation
Sequence assembly visualisation
Sequence alignment visualisation
Windows
Linux
Mac OS X
GenomedataGenomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint.StorageSignalPython
C
GPLLinux
Mac OS X
GenomeJackGenomeJack is a genome browser specialized in next-generation sequencing data. Its advantages are an intuitive interface and smooth drag-and-drop response.Genomics
Personalised medicine
VisualisationJavaFreewareWindows
Mac OS X
Linux
GenomeMapperGenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference.Read mapping
Sequence alignment
GenometaGenometa is a Java based local bioinformatics program which allows rapid analysis of metagenomic short read datasets. Millions of short reads can be accurately analysed within minutes and visualised in the browser component. A large database of diverse bacteria and archaea has been constructed as a reference sequence.Genomics
Metagenomics
Read mapping
Visualisation
Read mapping
Data Visualisation
JavaLinux
Mac OS X
Windows
GenomeToolsThe GenomeTools genome analysis system is a free collection of bioinformatics tools for genome informatics.1.3.6GenomicsIntegrated solutionCBSDPOSIX
Linux
Mac OS X
OpenBSD
Windows (Cygwin)
UNIX
GenomeViewGenomeView is a next-generation stand-alone genome browser and editor initiated in the BSB group at VIB and currently developed at Broad Institute. It provides interactive visualization of sequences, annotation, multiple alignments, syntenic mappings, short read alignments and more. Many standard file formats are supported and new functionality can be added using a plugin system.Genomics
Transcriptomics
Sequencing
Sequence analysis
Comparative genomics
Quality control
Visualisation
Data retrieval
Genome visualisation
Sequence alignment visualisation
Plotting and rendering
Visualisation of a multitude of genomics dataJavaGPLplatform-independent
GenomicToolsGenomicTools is a platform for the analysis and manipulation of high-throughput sequencing data such as RNA-seq and ChIP-seq.Genomics
ChIP-seq
RNA-Seq
Peak calling
Heat map generation
Genome comparison
create custom pipelines
feature overlaps
identify binding site peaks in ChIP-seq data
create read profiles
create read heatmaps
C
C++
GPLv2
GenoMinerA proprietary NGS analysis solution. Powerful hardware comes with preinstalled software, organized in workflows.Sequence assembly
Sequence assembly (de novo assembly)
ChIP-seq
RNA-Seq
Sequence assembly
Sequence alignment
Peak calling
Sequence error correction
Plotting and rendering
Mutation detection
Gene expression profiling
GenoMiner provides workflows for reference assembly, de novo assembly, ChIPSeq, RNASeq and more. You upload your files at the beginning, and you get the results at the end, while you can choose from various tools to use for analysis.
JavaCommercialLinux
GenoREADGenoREAD is a web-based, sequence verification software that can be used to compare Sanger sequencing trace files against a reference sequence. Users can either submit their sequencing results one clone at a time, or they can submit a series of clones (as a project) to run at once. Results can be viewed online or downloaded.Sequencing
Clone verification
Sequence assembly
Read mapping
Sequence alignment
Perl
PHP
JavaScript
Linux 64
GenoViewerA feature rich NGS assembly viewer/browser.Viewerlarge file loading
multicontig handling
SNP/InDel/Read Error display and search
mutation table generation and export
consensus sequence generation and export
JavaFreewareplatform-independent
GensearchNGSA user friendly framework for re-sequencing in a diagnostics context: searching for mutations/variants, especially on well known genes.Exome captureSequence alignment
Sequence alignment visualisation
Read alignment
Variant prioritisation
Mutation detection
Plugin framework
Cafe Variome submission
JavaCommercialUNIX
Windows
GenVisionGenVision is a genomic visualization software package that is fully integrated with Lasergene and is designed to support easy generation of publication quality graphics and maps.GenomicsVisualisationCommercialWindows
Mac OS X
GeoseqInstead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries.ResequencingRead mapping
GigaBayesA short-read SNP and short-INDEL discovery program.Genomics
SNP detection
SNP calling
GimmeMotifsGimmeMotifs is a de novo motif prediction pipeline, especially suited for ChIP-seq datasets. It incorporates several existing motif prediction algorithms in an ensemble method to predict motifs and clusters these motifs using the WIC similarity scoring metric.ChIP-seq
Epigenomics
Gene regulation
Sequence motif comparisonPythonMITLinux
GirafeThe R/Bioconductor package girafe facilitates the functional exploration of alignments of sequence reads from next-generation sequencing data to a genome. It allows users to investigate the genomic intervals together with the aligned reads and to work with, visualise and export these intervals.Sequence alignmentR
Gk arraysGk-arrays are a data structure to index the k-mers in a collection of reads.Genomics
Transcriptomics
Metagenomics
Sequence assembly
Read mapping
Sequence error correction
programming libraryC++CeCILL-C licenseLinux
Linux 64
Mac OS X
any
GlobalSeqTesting for association between RNA-Seq and high-dimensional dataRNA-Seq
ChIP-Seq
MicroRNA-Seq
Meth-Seq
Omnibus test
Global test
Joint significance test
RGPL-3cross-platform
GMAPGMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences.Read mapping
Sequence alignment
C
Bourne shell
UNIX
GnumapThe Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. Currently, gnumap is designed to be used with the _int.txt data received from the Solexa/Illumina machine.Read mappingC++
Goby frameworkGoby is a next-gen data management framework designed to facilitate the implementation of efficient next-gen data analysis pipelines.RNA-SeqFile reformattingJavaGPLv2
Golden HelixGolden Helix is a bioinformatic software provider and analytic service provider.Genomics
SNP detection
Epigenomics
Sequencing
Copy number estimation
Data handling
Annotation
Genome visualisation
Sequencing quality control
Sequence contamination filtering
Variant calling
Statistical calculation
Collapsing methods
Variant classification
Windows
Linux
Mac OS X
GoseqAn R package to detect Gene Ontology (GO) categories and other categories of genes (such as KEGG pathways) that are over- or under-represented in RNA-seq data.RNA-Seq quantificationGene set testingRLGPLUNIX
Windows
GowindaGowinda: unbiased analysis of gene set enrichment for Genome Wide Association StudiesGenomics
Sequencing
Population genetics
Population Genomics
Ontology comparison
Functional enrichment
Genome-wide association study
MulticoreJavaMozilla Public LicenseMac OS X
Linux
Windows
GPSGPS is a high spatial resolution peak detection algorithm for ChIP-seq data.Genomics
ChIP-seq
Epigenomics
Transcription factors and regulatory sites
Peak callingmulti-threadingJavaCommercial
Freeware
Cross-Platform
GPSeqAnalyze RNA-seq data to estimate gene and exon expression, identify differentially expressed genes, and differentially spliced exonsRNA-Seq quantificationR
C
GRIDSSGRIDSS is a structural variant caller which combines whole genome breakend assembly with read pair and split read support using a probabilistic model.Genomics
Structural variation
Genetic variation analysis
Sequence assembly
Split-read mapping
Read alignment
Assembly
Whole genome breakend assembly
JavaGPLv3Cross-Platform
GRSReference-based data compression for storage of resequencing dataFile reformattingsequence compressionC
Bourne shell
Commercial
Freeware
Linux
Linux 64
GSNAPGSNAP can align both single-end and paired-end reads as short as 14 nt and of arbitrarily long length. It can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads using probabilistic models or a database of known splice sites. Our program also permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite treated DNA for the study of methylation state.DNA methylation
RNA-Seq alignment
Read mapping
Bisulfite mapping
C
Perl
Hairpin AnnotationGenerates a secondary structure from an RNA sequence and highlights regions of interest using RNAplotWorkflowsJavaCustom LicenceLinux 64
Windows
Mac OS X
HaplowserHaplowser: comparative haplotype browser for personal genome and metagenomeVisualisation
Haplotype inference
JavaGPL
HawkEyeAn interactive visual analytics tool for genome assemblies.Sequence assembly visualisationSequence assembly visualisationC++Artistic LicenseLinux
Mac OS X
HeliSphereOpen-source LINUX software package intended for use in analyzing data produced by the HeliScope Single Molecule Sequencer.Whole genome resequencing
Genomics
SNP detection
RNA-Seq
MappingFreewareLinux
HIProgram for haplotype reconstruction from paired-end reads.Haplotype reconstructionJava
HicupA mapping pipeline for HiC interaction data. Performs independent mapping on each end of the interaction pair and removes commonly found artefacts.EpigenomicsRead mappingPerlGPLv3UNIX
Linux
Mac OS X
HINTHMM-based Identification of TF FootprintsEpigenomics
Transcription factors and regulatory sites
Regulatory genomics
Peak calling
Nucleic acid sequence feature detection
Digital Genomic FootprintingPythonGPLv3Unix-like
HiPipeHiPipe aims to make NGS data analysis quick and easy with high-performance pipelines and an intuitive web GUI.GenomicsRead mapping
Variant calling
Workflows
JavaScript
Java
Bash
platform-independent
HiTECAn algorithm which provides a highly accurate, robust, and fully automated method to correct reads produced by high-throughput sequencing methods.Sequence error correctionC++GPLv3Linux
HMMSplicerSplice junction discovery in RNA-Seq dataRNA-Seq alignmentPython
HPeakHidden Markov model (HMM)-based Peak-finding algorithm for analyzing ChIP-seq data to identify protein-interacting genomic regions.ChIP-seq
HTSeqPython framework to process and analyse high-throughput sequencing (HTS) dataPythonGPLv3
Hybrid-SHRECImproves sequence data quality using information from multiple platforms.Sequence error correctionJava
IBD2Our algorithm uses a non-homogeneous hidden Markov model (HMM) that employs local recombination rates to identify chromosomal regions that are identical by descent (IBD=2) in children of consanguineous or non-consanguineous parents solely based on genotype data of siblings derived from high-throughput sequencing platforms.Exome captureR
Java
IbisIbis (Improved base identification system), is an accurate, fast and easy-to-use base caller for the Illumina sequencing system, which significantly reduces the error rate and increases the output of usable reads. Ibis is faster and makes fewer assumptions about chemistry and technologySequencingBase-callingStatistical learning of base calling parameters and calibrated quality scoringPython
C
C++
Non-commercialLinux
Windows (Cygwin)
ICORNIteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy.Sequence assembly
Sequencing quality control
IDBAIDBA (Iterative De Bruijn graph short read Assembler) is a short read assembler based on an iterative De Bruijn graph. It is developed under 64-bit Linux, but should be suitable for all Unix-like systems.Sequence assembly (de novo assembly)Sequence assemblyPOSIX
Linux
Linux 64
IGVThe Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types and formats, including short-read alignments in the SAM/BAM format. Data can be viewed from local files or over the web via HTTP.GenomicsVisualisationJavaLGPLWindows
Mac OS X
Linux
IlluminateAnalytics toolkit in Python for Illumina HiSeq and MiSeq metricsGenomicsSequencing quality controlobject-oriented access to results of binary parsing
some command line support
PythonMITUnix-like
IlluminatorSoftware for machines running Windows to identify variants in Illumina short read data.SNP detection
Indel detection
IMAGE“Iterative Mapping and Assembly for Gap Elimination”. IMAGE closes gaps in a draft assembly using Illumina paired-end reads.Sequence assembly editing
InchwormEmploys the k-mer graph method to reconstruct (in many cases full-length) transcripts from Illumina RNA-Seq (preferably strand-specific) reads.Sequence assembly (de-novo assembly)
RNA-Seq
Transcriptome assembly
InGAPinGAP is an integrated platform for next-generation sequencing projects, the core function of which is to detect SNPs and indels using a Bayesian algorithm.SNP detectionRead mapping
Sequence assembly visualisation
Ingenuity Variant AnalysisIngenuity Variant Analysis is a web application that helps researchers studying human disease to identify causal variantsWhole genome resequencing
Genomics
SNPs
Exome
Variant classification
Integrated Genome BrowserVisualisation software for next-generation genomicsGenomicsVisualisationJavaOpen Sourceplatform-independent
IobioReal-time Genomic Analysis iobio uses immediate visual feedback to make understanding complex genomic datasets more intuitive, and analysis more interactiveWorkflowsVisualisation
IOmicsiOmics is a cloud based workflow analysis framework for managing, analyzing and visualizing NGS data.Genomics
Transcriptomics
ChIP-seq
RNA-Seq
Regulatory RNA
Exome capture
Epigenomics
Sequence assembly
Sequence alignment
Ab-initio gene prediction
Variant calling
MiRNA analysis
Exome analysis
Commercialcloud
IQSeqIntegrated Isoform Quantification Analysis based on A Partial Sampling FrameworkRNA-Seq quantification
Alternative splicing
C++
ISAACISAAC comprises a genome aligner and a variant caller, by Illumina.Runtime SpeedC++Linux 64
IsasFast aligner for color and base space short read data.Sequence alignmentLinux
IsoEMExpectation maximization algorithm for estimating alternative splicing isoform frequenciesAlternative splicingExpectation MaximisationJava
ISSAKEShort Sequence Assembly by K-mer search and 3' read Extension, Immunology version (iSSAKE)MetagenomicsSequence assemblyPerl
Python
GPLv2
JBrowseSlick, speedy genome browser with a responsive and dynamic AJAX interface for visualization of genome data. Being developed by the GMOD project as a successor to GBrowse.VisualisationPerl
JavaScript
Open Sourcebrowser based
JellyfishFast, memory-efficient k-mer counting algorithmC++GPLv3Linux 64
Mac OS X
JointSLMCopy number estimation from read depth informationCopy number estimationR
KARMAK-tuple Alignment with Rapid Matching AlgorithmDNA methylation
Sequencing
Epigenetics
Read mapping
KBASE"KBase provides a computational framework and tools for integrating and analyzing large, diverse datasets generated by the scientific community to advance predictive understanding, manipulation, and design of biological processes in an environmental context. The purpose of KBase is to enable users to integrate a wide spectrum of genomics and systems biology data, models, and information for microbes, microbial communities, and plants. Powerful tools within KBase will allow users to analyze and simulate data to predict biological behavior, generate and test hypotheses, design new biological functions, and propose new experiments. "Comparative genomicsAnnotationLinux
KismethWeb-based tool for bisulfite sequencing analysisDNA methylation
Epigenomics
Bisulfite mapping
KissnpkisSnp compares two sets of NGS raw reads, detecting Single Nucleotide Polymorphism occurring between the two sets. The two sets typically come from the sequencing of two individuals from the same species or from closely related species.Transcriptomics
SNP detection
Indel detection
Comparative genomics
Sequence assembly
Sequence assembly (de-novo assembly)
Data retrieval
SNP callingCCeCILLLinux
KmergenieKmerGenie estimates the best k-mer length for genome de novo assembly. Given a set of reads, KmerGenie first computes the k-mer abundance histogram for many values of k. Then, for each value of k, it predicts the number of distinct genomic k-mers in the dataset, and returns the k-mer length which maximizes this number. Experiments show that KmerGenie's choices lead to assemblies that are close to the best possible over all k-mer lengths.Sequence assemblySequence assemblyC++
Python
R
KNIMESoftware for organizing bioinformatic workflowsWorkflowsGPLv3Windows
Mac OS X
Linux
Knime4Biocustom nodes for the interpretation of Next Generation Sequencing data with KNIME.Genomics
Transcription factors and regulatory sites
Gene regulation
Data retrievalKNIMEJavaGPLv3any
KronaKrona creates interactive HTML5 charts of hierarchical data (such as taxonomic abundance in a metagenome).MetagenomicsVisualisationInteractive
Animation
HTML5 canvas graphics
JavaScript
Perl
Linux
UNIX
Mac OS X
Lab7Data workflow management platform to streamline NGS analysesGenomics
Laboratory information management
WorkflowsPython
JavaScript
CommercialMac OS X
Linux
LasergeneLasergene is a comprehensive DNA and protein sequence analysis software suite comprised of seven applications which include functions ranging from sequence assembly and SNP detection, to automated virtual cloning and primer design.Sequence assembly (de novo assembly)
Genomics
SNP detection
Indel detection
Mapping
Transcription factors and regulatory sites
Sequence analysis
Phylogenetics
Protein structure analysis
Sequence assembly
Read mapping
Sequence alignment
Peak calling
Annotation
Sequence analysis
Scaffolding
Sequence alignment analysis
Chromatogram visualisation
PCR primer design
CommercialWindows XP
Windows 7 or Higher
Mac OS X 10.7, 10.8, or 10.9
LASTShort read alignment program incorporating quality scoresGenomics
Comparative genomics
Sequence alignmentC++GPLv3
LASTZA tool for (1) aligning two DNA sequences, and (2) inferring appropriate scoring parameters automaticallyGenomicsRead mapping
Sequence alignment
Mac OS X
Linux
LobSTRlobSTR is an alignment and genotyping tool for profiling short tandem repeats from next generation sequencing dataSequencingRepeat sequence organisation analysisFast
Scalable
sequence alignment
Gapped alignment
C++
R
Python
FreewareUNIX
LOCASLOCAS low-coverage short-read assemblerSequence assemblyC++Linux
LookSeqAJAX-based browser for deep sequencing dataSequence assembly visualisation
LSKB, Life Science Knowledge BankLSKB Life Science Knowledge Bank, is a comprehensive drug discovery and genomic research workbench, knowledgebase, & data management system.GenomicsAnnotation
SNP calling
Sequence analysis
Differential expression analysis
Pathway or network analysis
Classification
Analysis pipeline
Functional analysis
MACSModel-based Analysis of ChIP-seq data.ChIP-seqPeak callingPythonArtistic Licenseplatform-independent
MagicViewerLarge-scale short reads and sequencing depth visualization.Sequence assembly (de novo assembly)
Exome capture
Genetic variation
Visualisation
SNP calling
Javaplatform-independent
MapDamageIdentifies and quantifies DNA damage patterns in ancient DNASequencing
DNA
Sequencing quality control
Statistical calculation
Python
R
Linux
Mac OS X
MapNextMapNext provides four main analyses: (i) unspliced alignment and clustering of reads, (ii) spliced alignment of transcriptomic reads, (iii) SNP detection and calculation of SNP frequency from population sequences, and (iv) storage of result data in a database to make it available for more flexible queries and further analyses.RNA-Seq alignment
SNP detection
Sequence alignmentC++
Perl
MapsemblerMapsembler is a targeted assembly software. It takes as input a set of NGS raw reads and a set of input sequences (starters). It first determines if each starter is read-coherent, e.g. whether reads confirm the presence of each starter in the original sequence. Then for each read-coherent starter, Mapsembler outputs its sequence neighborhood as a linear sequence or as a graph, depending on the user choice.Transcriptomics
Metagenomics
Exome capture
RNA-Seq quantification
Sequencing
Sequence assembly
Read mapping
De novo assembly
Identify Novel Exons
Remove contaminants
Detect enzymes in metagenomics NGS reads
CCeCILLLinux
MapSpliceWe introduce a second generation splice detection algorithm, MapSplice, whose focus is high sensitivity and specificity in the detection of splices as well as CPU and memory efficiency. MapSplice can be applied to both short (<75 bp) and long reads (≥75 bp). MapSplice is not dependent on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of read alignments of a given splice to increase accuracy.RNA-Seq alignmentMappingC++
Python
Linux
MapViewVisualisation of short reads alignment on desktop computerVisualisationLinux
Windows
MAQMapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina-Solexa 1G Genetic Analyzer, and has preliminary functions to handle ABI SOLiD data.Genomics
SNP detection
Read mappingC++
Perl
GPL
MAQGeneComplete pipeline for mutant discovery, with web front endSNP detectionRead mapping
Integrated solution
MARGARITASNP detection and genotyping from low-coverage sequencing dataSNP detection
Genotyping
MasonA fast, feature-rich and hackable read simulator for the simulation of NGS and Sanger data.GenomicsSequence assembly
Mapping
Modelling and simulation
Empirical or simple model for position dependent errors
can write out sample position and extensive information about the sampled infix
haplotype simulation through mutation of reference sequence.
C++GPLv3UNIX
Windows
MauveMauve Genome Alignment software, for comparing two or more draft or finished genomesGenomics
Transcriptomics
Visualisation
Sequence assembly validation
Sequence alignment comparison
C++
Java
GPLMac OS X
Windows
Linux
MAXIMUSHybrid reference and de novo assembly pipelineGenomicsSequence assembly
MAYDAYExtensible platform for visual data exploration and interactive analysis and provides many methods for dissecting complex transcriptome datasets.RNA-SeqVisualisation
MeerkatMeerkat is designed to identify structural variations (SVs) from paired-end high-throughput sequencing data.Structural variation
MEGANMetagenome Analysis Software - MEGAN (“MEtaGenome ANalyzer”) is a new computer program that allows laptop analysis of large metagenomic datasets. In a preprocessing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or another comparison tool. MEGAN can then be used to compute and interactively explore the taxonomical content of the dataset, employing the NCBI taxonomy to summarize and order the results.Metagenomicsmetagenomic analysis
functional classification
Commercial
Freeware
MegraftMegraft is a software tool to graft ribosomal small subunit (16S/18S) fragments from metagenomes onto full-length SSU sequences, enabling accurate diversity estimates from fragmentary and non-overlapping sequence data.Metagenomics
Sequence analysis
Phylogenetics
Community analysis
Rarefaction
Sequence analysisPerlGPLv3Linux
UNIX
Mac OS X
MeraculousDe novo genome assembler from short readsSequence assemblyDe novo assembly
scaffolding
Perl
C
METAGENassistUser-friendly, web-based analytical pipeline for comparative metagenomic studies. Input can be derived from either 16S rRNA data or NextGen shotgun sequencing.Metagenomics
Machine learning
Visualisation
Statistical calculation
Sequence clustering
Easy-to-use point-and-click web interface; data visualization; publication-quality graphs and charts; wide variety of statistical methods; taxon-to-phenotype mapping; data filtering and normalization; supports many common input formats
Metagenome@KinMetagenome@Kin is an automated high throughput sequencing data analysis tool for the 16S/28S/ITS rRNA genes. It measures the composition and diversity of microbial and fungal species in natural environments.MetagenomicsSpecies frequency estimation
Statistical analysis
MetaSimThe software can be used to generate collections of synthetic reads.Genomics
Metagenomics
Sequence assembly
Mapping
Modelling and simulation
JavaCommercial
Freeware
MetaxaMetaxa uses Hidden Markov Models to identify, extract and classify small-subunit (SSU) rRNA sequences (12S/16S/18S) of bacterial, archaeal, eukaryotic, chloroplast and mitochondrial origin in metagenomes and other large sequence setsMetagenomics
Sequence analysis
Phylogenetics
Community analysis
Sequence analysis
Hidden Markov Model
PerlGPLv3Linux
UNIX
Mac OS X
MethMarkerMethMarker facilitates the design of DNA methylation assays for COBRA, bisulfite SNuPE, bisulfite pyrosequencing, MethyLight and MSP. It also implements a systematic workflow for design, optimization and (computational) validation of DNA methylation biomarkers. This workflow starts from a preselected differentially methylated region (DMR) and results in an optimized DNA methylation assay that is ready to be tested in a large-scale clinical trial.DNA methylation
Epigenomics
JavaWindows
Linux
Mac OS X
Solaris
MethpipeThe MethPipe software package is a computational pipeline for analyzing bisulfite sequencing data (BS-seq, WGBS and RRBS). MethPipe provides tools for mapping bisulfite sequencing read and estimating methylation levels at individual cytosine sites. Additionally, MethPipe includes tools for identifying higher-level methylation features, such as hypo-methylated regions (HMR), partially methylated domains (PMD), hyper-methylated regions (HyperMR), and allele-specific methylated regions (AMR).DNA methylation
Sequencing
Epigenetics
Bisulfite mappingC++GPL (>= 3)Linux
Mac OS X
MethylCoderPipeline for fast, simple processing of BiSulfite-treated reads into methylation data. Includes scripts for analysis and visualization. In addition to a binary output, the direct output of methylcoder is a text file that indicates per-nucleotide methylation context (CG/CHG/CHH) and methylation levels (both coverage and C-T conversions)Genomics
DNA methylation
Epigenomics
Sequencing
Read mapping
Bisulfite mapping
Python
C
BSDLinux
Linux 64
Mac OS X
MetMapProduces corrected site-specific methylation states from MethylSeq experiments and annotates unmethylated islands across the genome.DNA methylation
MeVVisualisation of genomic data, Differential Gene Expression based on DEGseq, DESeq and edgeRRNA-SeqVisualisation
Differential expression analysis
Sequence clustering
Classification
Artistic License
MG-RASTMG-RAST is a fully-automated service for annotating metagenome samples providing analysis tools for comparisonMetagenomics
Phylogenetics
Metabolic reconstruction
AnnotationPerl
C
GO
JavaScript
Python
UNIX
MicroRazerSMicroRazerS is a tool optimized for mapping short RNAs onto a reference genome.Read mappingC++Linux
Microsoft Biology FoundationC#/.NET library for biological applications.C#
MICSACombines positional information with information on motif occurrences to better predict binding sites of transcription factors (TFs)ChIP-seq
Sequence motifs
Sequence motif comparison
MiniaDe novo assembly of human genomes on a desktop computerSequence assembly (de novo assembly)Sequence assemblyMemory efficient and fastC++CeCILLLinux
Mac OS X
MIP ScaffolderMIP Scaffolder is a program for scaffolding contigs produced by fragment assemblers using mate pair data.ScaffoldingC++
Perl
Linux
MIRAMIRA 3 - Whole Genome Shotgun and EST Sequence AssemblerSequence assembly (de novo assembly)
RNA-Seq alignment
SNP detection
Sequence assembly
Read mapping
Local sequence alignment
K-mer counting
Graph reduction
Learning algorithm
C++GPLLinux
Mac OS X
UNIX
MiRanalyzerWeb-server for identifying and analyzing miRNA in next-gen sequencing experimentsRegulatory RNAAnnotation of micro RNA
differential expression
Java
Perl
browser based
MiRCatPredicts mature miRNAs and their precursors from an sRNA dataset and a genome.WorkflowsMicroRNA detectionDetection and prediction of known or novel miRNAs
secondary structure generation
JavaCustom LicenceLinux 64
Windows
Mac OS X
MiRDeepDiscovering known and novel miRNAs from deep sequencing dataRegulatory RNAPerl
MiRNAkeyA software pipeline for the analysis of microRNA Deep Sequencing dataRegulatory RNAJava
Perl
Linux
Mac OS X
MiRProfDetermines normalised expression levels of sRNAs matching known miRNAs in miRBase.WorkflowsMicroRNA detectionJavaCustom LicenceLinux 64
Windows
Mac OS X
MiRspringmissingPerl
JavaScript
GNU
MirToolsWeb server for microRNA profiling and discovery based on high-throughput sequencingTranscriptomics
Regulatory RNA
Perl
PHP
MirTriosmirTrios is a web server to accurately detect de novo mutations (DNMs) based on an expectation-maximization (EM) model. mirTrios also supports identification of rare inherited mutations, known diagnostic variants, as well as the prioritisation of novel and promising candidate genes.Genomics
Sequence analysis
De novo mutation detection
Read mapping
Variant calling
De novo mutation detection
Perl
PHP
JavaScript
Linux
MISOAn alternative to Cufflinks, MISO (Mixture-of-Isoforms) is a probabilistic framework that quantitates the expression level of alternatively spliced genes.RNA-Seq
RNA-Seq quantification
Alternative splicing
MlgtProcessing and analysis of high throughput, long-read (e.g. Roche 454) sequences generated from multiple loci and multiple biological samples. Sequences are assigned to their locus and sample of origin, aligned and trimmed. Where possible, genotypes are called and variants mapped to known alleles.Exome capture
Genotyping
Resequencing
Sequence error correction
Sequence contamination filtering
Sequence analysis
Read alignment
DNA barcoding
Sequence assignment
sequence alignment
alignment error correction
variant counting
genotype calling
allele-matching
RGPL >=2Windows
UNIX
Mac OS X
MMSEQPipeline and methodology for simultaneously estimating isoform expression and allelic imbalance in diploid organisms using RNA-seq data.DNA transcriptionC++Mac OS X
Linux 64
MochiViewHybrid genome browser and motif visualization/analysis/management desktop software.Genomics
ChIP-seq
RNA-Seq
ChIP-on-chip
Sequence motifs
Genome visualisation
Sequence motif comparison
Desktop hybrid genome browser and motif visualization/analysis softwareJavaLinux
Mac OS X
Windows
MoDILProgram to detect small indels in next generation sequencing dataGenomics
Indel detection
Python
MOMShort-read mappingGenomicsRead mapping
MOSAIKReference guided aligner/assembler.Sequence assemblyC++Commercial
GPLv2
Windows
Linux
Mac OS X
MotifLabMotifLab is a general workbench for analysing regulatory sequence regions and discovering transcription factor binding sites and cis-regulatory modules within a rich graphical interface.ChIP-seq
Sequence motifs
Transcription factors and regulatory sites
Sequence analysis
Gene regulation
Visualisation
Peak calling
Genome visualisation
Sequence contamination filtering
Sequence motif discovery
Format conversion
Complete motif discovery workbench: import DNA sequences and download additional data tracks from UCSC Genome Browser
visualize tracks in the integrated genome browser
perform de novo motif discovery or search for instances of known motifs
use operations to pre- and post-process data tracks
explore the data interactively with various analysis tools
automatically record all your steps in protocols that can be executed as workflows
JavaFreewareplatform-independent
MPscanMPscan (multi-pattern scan) is a program for mapping short reads (<30bp) exactly on a set of reference sequences (e.g., a genome) without indexing the reference. MPscan performs only exact mapping (no substitutions or indels), is fast (optimal complexity), and easy to use.Genomics
Transcriptomics
Read mappingC++Linux
Mac OS X
MrBayes"MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters."PhylogeneticsStatistical calculationCross-Platform
MrCaNaVaRmrCaNaVaR is a copy number caller that analyzes the next-generation sequence mapping read depth to discover large segmental duplications and deletions. It also has the capability of predicting absolute copy numbers of genomic intervals.Genomics
Personalised medicine
Copy number estimation
Read depth analysisCCommercial
Freeware
POSIX
MrFASTmrFAST is designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner.Genomics
Read alignment
Read alignmentCBSDUNIX
MrsFASTmrsFAST is a micro-read substitution-only Fast Alignment Search Tool. mrsFAST is a cache-oblivious short read mapper that optimizes cache usage to get higher performance.GenomicsMapping
Read alignment
CBSDUNIX
MTRMetagenomics software for clustering at multiple ranks.MetagenomicsC++
Matlab
MU2AGenomic variant annotation toolSNPSNP annotationJavaApache License 2.0Windows
Linux
Mac OS X
MultiPSQMultiPSQ expedites the analysis and evaluation of multiplex-pyrograms, generated from multiplex pyrosequencingPhylogenetics
Microbial Surveillance
Public health and epidemiology
Peak calling
Classification
Peptide mass fingerprint
C++LGPLv3Windows
Linux
MultiQCMultiQC aggregates results from multiple bioinformatics analyses across many samples into a single report. It searches a given directory for analysis logs and compiles a HTML report. It's a general use tool, perfect for summarising the output from numerous bioinformatics tools.Next Generation SequencingQuality controlData VisualisationPythonGPLv3Windows
Mac OS X
Linux
MUMmerMUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Basically it is an ultra-fast alignment of large-scale DNA and protein sequencesGenomics
Transcriptomics
Read mapping
Sequence alignment
Artistic LicenseLinux
MUMmerGPUMUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by HTS.Genomics
Transcriptomics
Sequence alignment
MuMRescueLiteProbabilistically reincorporates multi-mapping tags into mapped short read data.Genomics
ChIP-seq
MappingPythonMIT
MuSICA 2Assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ~800 human full-length cDNA clones.Clone verificationSequence assemblyPerl
MutascopeMutascope is a software suite designed to analyze data from high throughput sequencing of PCR amplicons, with an emphasis on normal-tumor comparison for the accurate and sensitive identification of low prevalence mutations.Cancer biologyVariant calling
Analysis pipeline
PerlUNIX
MuTectMuTect is a method developed at the Broad Institute for the reliable and accurate identification of somatic point mutations in next generation sequencing data of cancer genomes.SNP calling
MyrialignSoftware to align short reads produced by a short read genome sequencer to a reference genome. Alignments can contain any number of SNPs, insertions and deletions, up to a user specified cutoff. Myrialign can use a Cell Broadband Engine processor to accelerate alignments if available, for example on a PlayStation 3 running GNU/Linux.

Myrialign performs brute force alignment using a variant on the "bitap" algorithm that aligns several thousand reads to a reference in parallel. It uses bit-parallelism, multiple processors, and Cell SPUs if available.

Unlike other reference genome alignment software, heuristics and hashtable lookups are not used. Myrialign will find alignments with any number of errors up to a user specified cutoff. The emphasis is on doing a 100% accurate search as fast as is possible.
Read mapping
Sequence alignment
MyrnaMyrna is a cloud computing tool for calculating differential gene expression in large RNA-seq datasets.RNA-Seq alignment
RNA-Seq quantification
MzipReference-based sequence data compression toolFile reformatting
NarrowPeaksAnalysis of variation in ChIP-seq using functional PCAChIP-seqPeak calling
Differential Binding
R/Bioconductor package
can run on major computer platforms
RArtistic-2.0Linux
Mac OS X
Windows
NCBI Genome Workbench"NCBI Genome Workbench is an integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at NCBI, and mix this data with your own private data."Sequencing
Sequence analysis
Sequence annotation
Next Generation Sequencing
Visualisation
Genome visualisation
Cross-Platform
NepheleNIAID, a cloud based tool for microbiome or metagenomics data analysis. Nephele allows you to process and analyze 16S raw data using pipelines based on QIIME and mothur. It also facilitates WGS functional analyses using bioBakery tools such as MetaPhlAn and HUMAnN, and metagenomic assembly using a5-miseq/IDBA-UD.Metagenomics
Microbial Ecology
Microbiome
Functional analysis
Taxonomic profiling
Metagenomic Assembly
Pipelines
NesoniNesoni is a high-throughput sequencing data analysis toolset.RNA-Seq alignment
SNP detection
Phylogenetics
Sequence alignmentlargely for bacterial genomesPython
NewblerThe assembly/mapping program developed by 454 Life Sciences for 454 dataSequence assembly (de novo assembly)Sequence assembly
Read mapping
C++UnknownLinux 64
NexalignNexalign is a program to align millions of short reads from next-generation sequencing data sets to reference genomesRead mappingC++
R
GPLUNIX
NextGen Utility ScriptsA collection of links to scripts available for working with data generated by new sequencing technologies.A collection of many different scripts
NextGENede novo and reference assembly of Roche/454, Illumina and SOLiD data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly which reduces sequencing errors. Requires Win or MacOS.Sequence assembly (de novo assembly)
SNP detection
Indel detection
Metagenomics
Exome capture
Unique condensation tool
Data Visualisation
very flexible
C++CommercialWindows
Ngs backbonengs_backbone is a bioinformatic application created to work on sequence analysis by using NGS (Next Generation Sequencing) and Sanger sequences. It is capable of cleaning reads, performing de novo assembly or mapping against a reference, and annotating SNPs, SSRs, ORFs, GO terms and sequence descriptions.Genomics
SNP detection
Sequence assembly
Read mapping
AGPLUNIX
NGS-DesignToolsTools to assist in designing deep sequencing experiments for haplotype reconstruction and structural variant breakpoint detectionRNA-Seq quantification
Structural variation
Haplotype inference
Modelling and simulation
Ngs-pipelineComplete solution for human re-sequencing projectsSNP detection
Indel detection
Epigenomics
Structural variation
Personalised medicine
Read mapping
Sequence annotation
PerlGPLv3Linux
Ngs.plotngs.plot is a program that allows you to easily visualize your next-generation sequencing (NGS) samples at functional genomic regions. The signature advantage of ngs.plot is that it collects a large database of functional elements for many genomes. A user can ask for a functionally important region to be displayed in one command. It handles large sequencing data efficiently and has only modest memory requirement. A web-based version (integrated into Galaxy) is also available for the ones who are allergic to terminals.Transcriptomics
Epigenomics
VisualisationData VisualisationR
Python
GPL (>= 3)All
NGSUtilsNGSUtils is a suite of software tools for working with next-generation sequencing datasetsGenomics
Transcriptomics
Formatting
Sequencing quality control
Sequence contamination filtering
Variant calling
Read pre-processing
PythonGPLLinux
Mac OS X
NGSViewHigh-throughput sequencing technologies introduce novel demands on tools available for data analysis. We have developed NGSView, a generally applicable, flexible and extensible next-generation sequence alignment editor. The software allows for visualization and manipulation of millions of sequences simultaneously on a desktop computer, through a graphical interface. NGSView is available under an open source license and can be extended through a well documented API.GenomicsSequence assembly visualisation
NOISeqNext Generation Sequencing (NGS) technologies are increasingly being used for gene expression profiling as a replacement for microarrays. The expression level given by these technologies is the number of reads in the library mapping to a given feature (gene, exon, transcript, etc.), i.e., the read counts. Most of the statistical methods for assessment of differential expression using count data rely on parametric assumptions about the distribution of the counts (Poisson, Negative Binomial, …). Moreover, many of them need replicates to work and tend to have problems evaluating differential expression in features with low counts.

NOISeq is a non-parametric approach for the identification of differentially expressed genes from count data. NOISeq empirically models the noise distribution of count changes by contrasting fold-change differences (M) and absolute expression differences (D) for all the features in samples within the same condition. This reference distribution is then used to assess whether the M-D values computed between two conditions for a given gene is likely to be part of the noise or represent a true differential expression.

There are two variants of the method: NOISeq-real uses replicates, when available, to compute the noise distribution, and NOISeq-sim simulates them in the absence of replication. It should be noted that the NOISeq-sim simulation procedure approximates technical replication and does not reproduce biological variability, which is necessary for population inferential analysis.
Gene expressionDifferential expression analysis
NovelSeqA computational framework to discover the content and location of long novel sequence insertions using paired-end sequencing dataIndel detection
Structural variation
Sequence assembly
Read mapping
Variant calling
CBSDUNIX
NovocraftNovoalign is a program for mapping short reads from the Illumina/SOLiD sequencing platform(s) to a reference genome.Whole genome resequencing
Genomics
ChIP-seq
RNA-Seq alignment
Regulatory RNA
Read mappingBisulfite sequencing
Mate-pair/jumping libraries
parallel execution
insertions/deletions
SAM format output
paired-end
colourspace
MPI
C++Commercial
Freeware
Mac OS X
Linux 64
NPSIdentify nucleosome positions given histone-modification ChIP-seq or nucleosome sequencing at the nucleosome level.ChIP-seq
Epigenomics
Python
NucleRnucleR is an R/Bioconductor package for working with tiling arrays and next generation sequencing. It uses a novel approach in this field which comprises a deep profile cleaning using Fourier Transform and peak scoring for a quick and flexible nucleosome callingChIP-seq
Epigenomics
ChIP-on-chip
DNA packaging
Peak calling
Annotation
Multicore
Integrated solution
RLGPLv3Cross-Platform
OasesDe novo transcriptome assembler for very short readsSequence assembly (de-novo assembly)
Transcriptome assembly
supports strand specific and paired-end RNA-seq data setsCGPLv3
OasisOasis is a web application that allows for the fast and flexible online analysis of small-RNA-seq (sRNA-seq) data. It was designed for the end user in the lab, providing an easy-to-use web frontend including video tutorials, demo data, and best practice step-by-step guidelines on how to analyze sRNA-seq data. Oasis' exclusive selling points are a differential expression module that allows for the multivariate analysis of samples, a classification module for robust biomarker detection, and an API that supports the batch submission of jobs. Both modules include the analysis of novel miRNAs, miRNA targets, and functional analyses including GO and pathway enrichment. Oasis generates downloadable interactive web reports for easy visualization, exploration, and analysis of data on a local system.RNARead mapping
Differential expression analysis
differential expression
classification
API
miRNA target prediction
JavaScript
Perl
Java
Python
Bash
C
all supporting JVM
ODINODIN is an HMM-based approach to detect and analyse differential peaks in pairs of ChIP-seq data.Genomics
Regulatory genomics
Peak callingPythontested for Linux (Ubuntu)
OLegoOLego is a program specifically designed for de novo spliced mapping of mRNA-seq reads. OLego adopts a seeding and extension scheme, and does not rely on a separate external mapper. It achieves high sensitivity of junction detection by using very small seeds (12-14 nt), efficiently mapped using Burrows-Wheeler transform (BWT) and FM-index. This also makes it particularly sensitive for discovering small exons. It is implemented in C++ with full support of multiple threading, to allow fast processing of large-scale data.Genomics
RNA-Seq alignment
RNA-Seq
Read mapping
Sequence alignment
capable of using very small seeds for splice Read mapping
but still fast and accurate
C++GPLv3Linux
Linux 64
Mac OS X
Omixon Variant ToolkitOmixon Target Standard, Target HLA and Target Pro are designed to help clinical, diagnostic and research labs to efficiently get the maximum accuracy and precision from their targeted NGS data.SNP detection
Indel detection
Mapping
Sequence analysis
Comparative genomics
Sequence assembly
Read mapping
Sequence alignment
easy to use parameters
full documentation
also a plugin available in CLCbio and Geneious
Commercial
Freeware
interoperable
Optimus PrimerAutomated primer design for large-scale resequencing by second generation sequencingResequencingPCR primer design
PacBio conversion toolsTools to convert from PacBio HDF5 format to other commonly used formats & libraries to read HDF5 from Java & RFile reformattingJava
R
Python
PaCGeEPaCGeE (Parallel Computational Genomics Engine) is a suite of HPC accelerated sequence data analysis tools for assembly and analysis. The tool set comprises many popular open source and proprietary software packages for high performance, high throughput and high quality data analysis. The PaCGeE family of parallel NGS analysis tools includes Cloud-MAQ, VELVET-P, EULER, ERANGE, BOWTIE, BFAST, MPI-BLAST, ChIP Seq Peak Finder etcRead mappingCommercial
PALMAWe present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm -- called PALMA -- tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from C. elegans and human it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels.RNA-Seq alignmentSequence alignment
PALMapperFast and Accurate Spliced Alignments of Sequence Reads.Read mappingC++GPLv3
PanGEATool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology.SNP detection
RNA-Seq
DNA transcription
PerlMozilla Public License
PARalyzerTool to analyze cross-linking and immunoprecipitation data (CLIP)JavaCommercial
Freeware
Partek Genomics SuiteEasy to use software providing A to Z analysis for all Next Generation Sequencing and Microarray data.Transcriptomics
ChIP-seq
SNP detection
RNA-Seq quantification
DNA transcription
Epigenomics
Alternative splicing
Functional genomics
PASHPash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-seq and methylome mapping by whole-genome bisulfite sequencingDNA methylation
Epigenomics
Sequence alignment
Bisulfite mapping
PASSPASS performs fast gapped and ungapped alignments of short DNA sequences onto a reference DNA, typically a genomic sequence. It is designed to handle a huge amount of reads such as those generated by Solexa, SOLiD or 454 technologies. The algorithm is based on a data structure that holds in RAM the index of the genomic positions of "seed" words (typically 11-12 bases) as well as an index of the precomputed scores of short words (typically 7-8 bases) aligned against each other.Sequence alignmentC++Linux
Windows
PatchworkPatchwork is a bioinformatic tool for analyzing and visualizing allele-specific copy numbers and loss-of-heterozygosity in cancer genomes. The data input is in the format of whole-genome sequencing data which enables characterization of genomic alterations ranging in size from point mutations to entire chromosomes.

High quality results are obtained even if samples have low coverage, ~4x, low tumor cell content or are aneuploid.

Patchwork is available in two formats. The first, named simply patchwork, takes BAM files as input whereas patchworkCG takes input from CompleteGenomics files. Detailed guides and information regarding these can be found in their respective tabs.
Structural variation
Copy number estimation
Allele specific copy numbers.RLinux
Mac OS X
PatMaNPatman searches for short patterns in large DNA databases, allowing for approximate matches. It is optimized for searching for many small patterns at the same time, for example microarray probes.Read mapping
PE-AssemblerA simple 3' extension approach to assembling paired-end reads and capable of parallelizationSequence assembly (de novo assembly)ScaffoldingC++
PeakAnalyzerPeakAnalyzer is a set of applications for processing ChIP signal peaks.ChIP-seq
Functional genomics
Java
C++
R
PeakRangerA multi-purpose, ultrafast ChIP Seq peak callerChIP-seqPeak callingC++Artistic LicenseLinux
Mac OS X
PeakSeqPeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. A methodology for identifying punctate binding sites in ChIP-seq experiments based on their characteristics.ChIP-seqC
Perl
PeakTracePeakTrace is an alternative basecaller for improving the quality and read length of Sanger DNA sequencing traces. The PeakTrace basecaller works with trace files produced by the ABI 310, 3700, 3100, 3130, 3730, and 3500 DNA sequencers. MegaBACE sequencers are also supported.SequencingBase-callingDNA basecallerCCommercialWindows
Mac OS X
Linux 64
PECANAlignment method practical for large genomic sequences.Sequence alignment
PEMerThe package is composed of three modules, PEMer workflow, SV-Simulation and BreakDB. PEMer workflow is a sensitive software for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the ‘novel’ genome. Subsequent analysis with PEMer workflow on the simulated reads can help parameterize PEMer workflow. BreakDB is a web accessible database developed to store, annotate and display SV breakpoint events identified by PEMer and from other sources.Structural variation
PERalignA probabilistic framework is described to predict the alignment to the genome of all paired-end read transcript fragments in a paired-end read dataset. Starting from possible exonic and spliced alignments of all end reads, our method constructs potential splicing paths connecting paired ends. An expectation maximization method assigns likelihood values to all splice junctions and assigns the most probable alignment for each transcript fragment.RNA-Seq alignmentC++Linux
PerMPerM (Periodic Seed Mapping) uses periodic spaced seeds to significantly improve mapping efficiency for large reference genomes when compared to state-of-the-art programs.Genomics
SNP detection
Read mappingC++Apache License 2.0Linux
PhredThe phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.Base-callingCSolaris
IRIX
AIX
Phred Phrap Consed Cross matchThe phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base. Phrap is a program for assembling shotgun DNA sequence data. Cross_match is a general purpose utility for comparing any two DNA sequence sets using a 'banded' version of swat. Consed/Autofinish is a tool for viewing, editing, and finishing sequence assemblies created with phrap.Sequence assembly
Sequence alignment
Base-calling
Local sequence alignment
PhymmA classifier for metagenomic data, that has been trained on 539 complete, curated genomes and can accurately classify reads as short as 100 base pairsMetagenomics
PiCallIdentifies short indel polymorphisms in population sequencing dataIndel detection
Population genetics
C
PICSPICS identifies binding event locations by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model.ChIP-seqR
PileLinePileLine is a flexible command-line toolkit for efficient handling, filtering, and comparison of genomic position (GP) files produced by next-generation sequencing experiments. PileLineGUI adds a graphical interface.Plotting and renderingJavaLGPL
PindelA pattern growth approach to detect break points of large deletions and medium sized insertions from paired end short reads.Indel detection
Structural variation
Read mapping
Split-read mapping
Localised reassembly
C++Linux
Mac OS X
Windows
Pipeline PilotAnalysis and workflow development of Next Generation Sequencing and gene expression.Sequence assembly (de novo assembly)
Whole genome resequencing
Genomics
ChIP-seq
SNP detection
RNA-Seq
Sequence analysis
Gene expression analysis
Next Generation Sequencing
Read mapping
Sequence alignment
Sequence analysis
Variant calling
Comparative genomics
Gene expression analysis
Integrated solution wrapping custom and third party tools for integration
analysis
and reporting
C++
Java
Perl
R
Pilot Script
CommercialLinux
Windows
PIQAPIQA is a quality analysis pipeline designed to examine genomic reads produced by Next Generation Sequencing technology (Illumina G1 Genome Analyzer). It is a set of libraries for R.Sequencing quality controlR
PoissonSeqIdentify differentially expressed genesGene expressionDifferential expression analysis
PolyBayesShortA re-incarnation of the PolyBayes SNP detection tool developed by Gabor Marth at Washington University. This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College.SNP detectionLinux
Linux 64
PoolHapComputational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. In a future version, analysis with unknown haplotypes will be supported.Read mapping
Regression analysis
PoPoolationToolbox specifically designed for the population genetic analysis of sequence data from pooled individuals.Population geneticsPerl
R
PoPoolation2PoPoolation2 allows comparison of allele frequencies for SNPs between two or more populations and identification of significant differences. PoPoolation2 requires next generation sequencing data of pooled genomic DNA (Pool-Seq). It may be used for measuring differentiation between populations, for genome wide association studies and for experimental evolution.Genomics
Population genetics
Perl
R
Post Assembly Genome Improvement Toolkit" Tools to generate automatically high quality sequence by ordering contigs, closing gaps, correcting sequence errors and transferring annotation. With the advent of next generation sequencing a lot of effort was put into developing software for mapping or aligning short reads and performing genome assembly. For genome assembly the problem of generating a draft assembly (i.e. a set of unordered contigs) has now been very well addressed - but for users who need high quality assemblies for their analyses there are still unresolved issues: this is where PAGIT is used. "Sequence assembly (de novo assembly)Sequence assembly validationLinux
PrecrecPrecrec is a CRAN package that provides accurate computations of ROC and Precision-Recall curves.RGPL-3Windows
Mac OS X
Unix-like
PRICEPRICE uses paired-read information to iteratively increase the size of existing contigs.Sequence assemblyC++
PRINSEQPRINSEQ is a sequence processing tool that can be used to filter, reformat and trim genomic and metagenomic sequence data. It generates summary statistics of the input in graphical and tabular formats that can be used for quality control steps. PRINSEQ is available as both standalone and web-based versions.Genomics
Transcriptomics
Metagenomics
Sequence contamination filtering
Sequence trimming
Read pre-processing
PerlGPLv3UNIX
Mac OS X
Windows
ProbeMatchMatches a large set of oligonucleotide sequences against a genome database using gapped alignmentsMappingLinux
Mac OS X
ProbHDWe present a new strategy for identifying heterozygous sites in a single individual by using a machine learning approach that generates a heterozygosity score for each chromosomal position. Our approach also facilitates the identification of regions with unequal representation of two alleles and other poorly sequenced regions. The availability of confidence scores allows for a principled combination of sequencing results from multiple samples.SNP detection
Population genetics
Perl
R
Python
Promisec Endpoint ManagerPromisec Endpoint Manager 2.41 uses patented, agentless technology to quickly and remotely inspect an organisation's endpoint environment to discover, analyze, and remediate any abnormalities that in turn lead to failed audits and incorrect business intelligence.Security
Endpoint Management
Remediation
ProxygenesWe introduce a clustering method which significantly reduces the size of a metagenome dataset while maintaining a faithful representation of its functional and taxonomic content.MetagenomicsRead mapping
Annotation
PybedtoolsPython extension to BEDTools that allows use of all BEDTools programs directly from Python, as well as feature-by-feature manipulation, automatic handling of temporary files, and more.GenomicsRead mappingSee full descriptionPythonGPLv2Windows (Cygwin)
Linux
Linux 64
Mac OS X
PyroBayesPyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines.SNP detectionBase-calling
PyroMapPyroMap accurately maps pyrosequencing reads onto reference sequences using a selectively weighted Smith-Waterman (SW^2) algorithm to incorporate quality scores into alignment.Read mappingPython
PyroNoiseClustering of pyrosequencing (454) data with noise model (AmpliconNoise) and chimaera removal (Perseus) for sequence diversity analysis.Metagenomics
Phylogenetics
QCALLSNP detection and genotyping from low-coverage sequencing data on multiple diploid samplesSNP detection
QpalmaQPalma is an alignment tool targeted to align spliced reads produced by Next Generation sequencing platformsRNA-Seq alignmentSequence alignmentPython
C++
QRidgeGiven a set of RNA-Seq data, QRidge assembles the short reads into long full-length transcripts.Sequence assembly (de-novo assembly)
Transcriptome assembly
Sequence assembly
QSeqQSeq is DNASTAR's Next-Gen application for RNA-Seq, ChIP-seq, and miRNA alignment and analysis.ChIP-seq
RNA-Seq
Regulatory RNA
Sequence alignment
Visualisation
Peak calling
CommercialWindows 7 64-bit or Higher
Mac OS X 10.7
10.8
or 10.9 with Parallels Desktop
QSRAQuality-value guided Short Read Assembler, created to take advantage of quality-value scores as a further method of dealing with error. Compared to previous published algorithms, our assembler shows significant improvements not only in speed but also in output quality.Sequence assembly (de novo assembly)Sequence assembly
QuadGTQuadGT is a software package for calling single-nucleotide variants in four sequenced genomes: normal-tumor pairs coupled with parents. Genotypes are inferred using a joint model of parental variant frequencies, de novo germline mutations, and somatic mutations. The model quantifies the descent-by-modification relationships between the unknown genotypes by using a set of parameters in a Bayesian inference setting.SNPsSNP calling
Variant calling
Java
QuakeProgram to detect and correct errors in DNA sequencing reads. Using a maximum likelihood approach incorporating quality values and nucleotide specific miscall rates,Sequence error correction
QualiMapQualimap is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data.Sequencing quality controlJava
R
QUASTQUAST stands for QUality ASsessment Tool. It evaluates the quality of genome assemblies by computing various metrics and providing nice reports.Sequence assembly
Data handling
Visualisation
Sequence assembly validation
Data Visualisation
Assembly Quality Evaluation
Detailed Reports
Python
C
Perl
GPLv2Linux
Mac OS X
QuESTQuEST is a Kernel Density Estimator-based package for analysis of massively parallel sequencing data from chromatin immunoprecipitations (ChIP-seq or ChIPseq).ChIP-seqC++GPLv2
QuipAggressive compression of FASTQ and SAM/BAM files.File reformattingCBSD (3-clause)any
R2RR2R is a simple to use package for very sensitive analysis of short read sequence data obtained by NextGen sequencing techniques. R2R was developed in conjunction with data obtained on the Illumina GA platforms. R2R is written as a simple Perl script and runs equally well under MS Windows, Mac OS and Linux/Unix operating systems.SNP detectionSequence alignmentPerl
R453Plus1ToolboxFacilitates analysis of data from 454 sequencer in R/Bioconductor.R
RACAReference-Assisted Chromosome Assembly (RACA)
RApiDTools for processing restriction site associated DNA sequencing.SNP detectionPerl
C++
GPLv3
RAPSearchFast protein similarity search tool for short reads that utilizes a reduced amino acid alphabet and suffix array to detect seeds of flexible length.MetagenomicsSequence alignmentC++GPLv3
RAST"RAST (Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating complete or nearly complete bacterial and archaeal genomes. It provides high quality genome annotations for these genomes across the whole phylogenetic tree."Genomics
Phylogenetics
Genomics
Annotation
Rayde novo genome assembly is now a challenge because of the overwhelming amount of data produced by sequencers. Ray assembles reads obtained with new sequencing technologies (Illumina, 454, SOLiD) using MPI 2.2 -- a message passing interface standard.Sequence assembly (de novo assembly)Sequence assemblyMPI 2.2
de Bruijn graphs
parallel
Illumina data
C++GPLLinux
POSIX
RazerSRazerS allows the user to align sequencing reads of arbitrary length using either the Hamming distance or the edit distance. The tool can work either lossless or with a user-defined loss rate at higher speeds.MappingRead mapping
Local sequence alignment
Gapped alignment
paired-end mapping
C++GPLv3UNIX
Mac OS X
Windows
RDiffrDiff is an open source tool for accurate detection of differential RNA processing from RNA-Seq data. It implements two statistical tests to detect changes of the RNA processing between two samples. rDiff.parametric is a powerful test, which can be applied for well annotated organisms to detect changes in the relative abundance of isoforms. rDiff.nonparametric is an alternative when the annotation is incomplete or missing.Transcriptomics
RNA-Seq
Alternative splicing
Sequence alignment
Differential expression analysis
Python
Matlab
Open SourceLinux
Mac OS X
RDP Pyrosequencing PipelineThe Ribosomal Database Project's Pyrosequencing Pipeline aims to simplify the processing of large 16s rRNA sequence libraries obtained through pyrosequencing. This site processes and converts the data to formats suitable for common ecological and statistical packages such as SPADE, EstimateS, and R.MetagenomicsSequence alignment
Formatting
browser based
ReadalignerA tool for mapping (short) DNA reads into reference sequences.Read mapping
ReadDepthDetects copy number aberrations in deep sequencing dataCopy number estimationRApache License 2.0
REALREad ALigner for Next-Generation sequencing readsMappingC++GPLv3Linux
ReaperReaper is a program for demultiplexing, trimming and filtering short read sequencing data.Next Generation SequencingAdapter removal
Sequencing quality control
Sequence contamination filtering
Sequence trimming
DNA barcoding
Memory efficient and fast.CGPLv3Linux
UNIX
Mac OS X