This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Real-time quantitative polymerase-chain-reaction (qPCR) is a standard technique in most laboratories used for various applications in basic research. Analysis of qPCR data is a crucial part of the entire experiment, which has led to the development of a plethora of methods. The released tools either cover specific parts of the workflow or provide complete analysis solutions.
Here, we surveyed 27 open-access software packages and tools for the analysis of qPCR data. The survey includes 8 Microsoft Windows, 5 web-based, 9 R-based and 5 tools from other platforms. Reviewed packages and tools support the analysis of different qPCR applications, such as RNA quantification, DNA methylation, genotyping, identification of copy number variations, and digital PCR. We report an overview of the functionality, features and specific requirements of the individual software tools, such as data exchange formats, availability of a graphical user interface, included procedures for graphical data presentation, and offered statistical methods. In addition, we provide an overview about quantification strategies, and report various applications of qPCR.
Our comprehensive survey showed that most tools use their own file format and only a fraction of the currently existing tools support the standardized data exchange format RDML. To allow a more streamlined and comparable analysis of qPCR data, more vendors and tools need to adopt the standardized format to encourage the exchange of data between instrument software, analysis tools, and researchers.
Keywords: qPCR, Data analysis, MIQE, RDML, Software, Tools
Since its commercial introduction almost 2 decades ago, real-time quantitative polymerase-chain-reaction (qPCR) has come to play a prominent role in the life sciences. It provides the base for a plethora of applications in basic research, pathogen detection, and biomedical diagnostics. Furthermore, it is widely accepted as the gold standard for the analysis of gene expression. qPCR is a molecular biology technique, which allows amplification and simultaneous quantification of a targeted DNA molecule. The advancement compared to the original PCR method is the ability to measure the amplification of DNA as the reaction progresses in real time [1]. This allows quantifying initial amounts of template molecules by comparing the number of amplification cycles required for the response curves to reach a particular quantification threshold fluorescence signal level [2]. The more copies of a DNA template are present at the beginning of an experiment, the fewer PCR cycles are needed to make enough material for detection. In the past years, isothermal amplification strategies emerged as alternatives to PCR. Examples include the helicase-dependent isothermal DNA amplification (HDA), and the recombinase polymerase amplification (RPA). In contrast to traditional PCR, these methods do not require changing the reaction temperature and therefore use time-based measurements instead of cycle-based measurements. However, isothermal amplification methods have less clinical use than conventional PCR [3], [4].
Since the commercialization of qPCR in 1996, the number of publications referring to qPCR has increased exponentially (see Fig. 1). To ensure quality of results and allow potential reproduction, the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines have been published [5]. They provide a basic set of quality criteria and detail 85 parameters ranging from experimental design through sample processing and assay validation to data analysis.
Number of publications in PubMed related to qPCR. This plot shows the number of publications in PubMed related to specific qPCR applications.
After successfully generating a high-quality qPCR run, the data needs to be correctly analyzed to get biologically meaningful results. To facilitate sharing and exchanging experimental data, the Real-time PCR Data Markup Language (RDML) has been developed [6]. The data standard is based on XML and contains details about the experimental setup, information about the samples and targets, and all measured data. As the data analysis step is an essential part of the qPCR workflow, it should be performed in a standardized and reproducible way [7]. As a consequence, dedicated software packages and data analysis suites were created targeting different aspects of the qPCR analysis workflow. Usually, commercial qPCR systems are equipped with software for data analysis and visualization. However, in most cases these closed source software modules are black boxes and do not provide full control over the entire process. Recently, Nature [8] and Science [9] reminded the scientific community about the importance of open computer programs.
In this survey we reviewed 27 software packages and tools for the analysis of qPCR data. We targeted all available open-access tools capable of analyzing raw fluorescence or Cq (quantification cycle) values. First, we provide an overview of the main application fields for qPCR. Next, we report an overview of the functionality, features and specific requirements of the individual tools. Finally, we briefly discuss the surveyed applications and report on their functionality.
The design of a qPCR experiment depends, among other factors, on biological (e.g., studied organism, given biological question) and technical (e.g., available platforms, choice of chemistry, primers) parameters. A thorough guideline and detailed procedures covering the technical setup of a qPCR experiment have been published elsewhere [10].
The quantification of target DNA in each cycle of a qPCR experiment is based on measuring the emission of a fluorescent reporter dye. Dyes that bind to double-stranded DNA and upon excitation emit light (e.g., SYBR Green) are the most widely used DNA dyes due to ease of use, cost efficiency, and generic detection [11]. Their disadvantage is that they bind to any double-stranded DNA, including non-specific reaction products, which might result in an overestimation of the target concentration. Probe-based methods, such as TaqMan probes, molecular beacons, or scorpion primers rely on the sequence-specific detection of a desired PCR product resulting in increased specificity and sensitivity [12], [13].
Furthermore, different probes can be used for multiplex PCR, in which multiple targets are amplified and detected in a single reaction tube. Unique probes or specific melting temperatures distinguish each PCR target amplified by a different set of primers [14]. Multiplex PCR reduces reagent consumption and allows studying more combinations of samples and target genes on one chip, but might reduce sensitivity as targets compete with each other for reaction resources. Currently, up to five different reactions can be detected simultaneously in one tube.
Based on the detected fluorescent signal and the chosen experimental setup, several analysis steps are needed to obtain biologically meaningful results (see Fig. 2). First, a baseline is subtracted from the raw fluorescence values, which is a crucial step in qPCR data analysis [15]. Next, the quantification cycle (Cq) value (previously known as the threshold cycle (Ct), crossing point (Cp), or take-off point (TOP)) can be calculated. In general, the Cq value represents the number of cycles needed to reach a set threshold fluorescence signal level. In addition, the raw fluorescence values can also be used for inferring the amplification efficiencies. Using the determined Cq value, quantification of nucleic acids can be performed by absolute quantification (via standard curve or digital PCR) or relative quantification (delta Cq) [16]. Finally, the quantification results can be tested for statistically significant differences and presented in a graphical way.
Flowchart of qPCR data analysis. This flowchart displays the different steps of qPCR data analysis. After a successful qPCR run, the raw fluorescence values can be used to calculate Cq and amplification efficiency values. Next either absolute or relative quantification is performed. Finally, statistical analysis can be performed on the generated quantification results and data can be displayed graphically.
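The first two steps of this workflow (baseline subtraction and threshold-based Cq determination) can be illustrated with a minimal Python sketch. It is not the procedure of any surveyed tool: the function names, the constant baseline taken from cycles 3 to 8, and the linear interpolation between the two cycles that bracket the threshold are all assumptions made for illustration.

```python
def baseline_subtract(fluorescence, baseline_cycles=(3, 8)):
    """Subtract a constant baseline: the mean of a window of early cycles.

    `baseline_cycles` is a hypothetical (first, last) window in 1-based cycles.
    """
    first, last = baseline_cycles
    window = fluorescence[first - 1:last]
    baseline = sum(window) / len(window)
    return [value - baseline for value in fluorescence]


def cq_from_threshold(corrected, threshold):
    """Return the fractional cycle at which the curve first crosses `threshold`.

    corrected[i - 1] is the signal at cycle i; the crossing point is found by
    linear interpolation between the two neighbouring cycles (an assumption).
    Returns None if the threshold is never reached.
    """
    for i in range(1, len(corrected)):
        prev, curr = corrected[i - 1], corrected[i]
        if prev < threshold <= curr:
            return i + (threshold - prev) / (curr - prev)
    return None


# Toy amplification curve: flat background followed by doubling signal.
raw = [1.0] * 10 + [2.0, 4.0, 8.0, 16.0, 32.0]
corrected = baseline_subtract(raw)
cq = cq_from_threshold(corrected, threshold=5.0)
```

In practice the baseline window and the threshold strongly influence the resulting Cq, which is why the choice of these settings should be reported with the results.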
Absolute quantification yields the exact number of target DNA molecules by comparison with DNA standards using a calibration curve. The curve is generated by using serially diluted standards of known concentrations and produces a linear relationship between Cq and the logarithm of the initial amount of total template DNA [17]. The reliability of the absolute quantification method depends on the amplification efficiencies for the target and the calibration curve, which needs to be considered in the analysis workflow.
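The linear relationship between Cq and the logarithm of the starting amount can be sketched as an ordinary least-squares fit. The example below is a simplified illustration with made-up dilution data; the function names are hypothetical, and the efficiency formula E = 10^(-1/slope) - 1 is the commonly used conversion from the slope of a standard curve rather than something prescribed by the surveyed tools.

```python
import math


def fit_standard_curve(quantities, cqs):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency_from_slope(slope):
    """Amplification efficiency as a fraction (1.0 corresponds to 100%)."""
    return 10 ** (-1.0 / slope) - 1.0


def quantity_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the starting quantity."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical ten-fold dilution series (copies per reaction) and measured Cq.
standards = [1e5, 1e4, 1e3, 1e2]
measured_cq = [15.0, 18.32, 21.64, 24.96]
slope, intercept = fit_standard_curve(standards, measured_cq)
efficiency = efficiency_from_slope(slope)
unknown = quantity_from_cq(21.64, slope, intercept)
```

A slope close to -3.32 corresponds to an efficiency close to 100%, i.e., a doubling of product in every cycle.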
Digital PCR (dPCR) is a novel method for precise quantification of nucleic acids [18]. Prior to PCR, the reaction mixture is divided into a very large number of separate tiny volumes, such that ideally either zero or one target molecule is present in any individual reaction. Currently available commercial systems can generate up to tens of thousands (Bio-Rad, Life Technologies), fifty thousand (Fluidigm), and ten million (RainDance) droplets per experiment [19]. In effect, each reaction becomes binary and the discrete signals can be counted. After applying a Poisson correction to account for wells with more than one copy, the counts can be used to quantify the absolute number of DNA molecules in the sample [20]. dPCR can be used for absolute quantification without the need for standard curves [21], and has been reported to have a higher accuracy [21] as well as robustness to amplification efficiency [22] compared to qPCR.
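The Poisson correction mentioned above can be written down in a few lines. Assuming that molecules are distributed randomly and independently over equally sized partitions, the mean number of copies per partition is λ = -ln(1 - k/n), where k is the number of positive partitions and n the total number of partitions. The following sketch uses hypothetical function names and a made-up example count.

```python
import math


def copies_per_partition(positive, total):
    """Poisson-corrected mean number of target copies per partition."""
    if not 0 <= positive < total:
        # With all partitions positive the estimate is undefined (log of zero).
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def total_copies(positive, total):
    """Estimated number of target molecules across all analysed partitions."""
    return copies_per_partition(positive, total) * total


# Hypothetical run: 10,000 positive partitions out of 20,000.
lam = copies_per_partition(10_000, 20_000)
copies = total_copies(10_000, 20_000)
```

In this example a naive count would report 10,000 molecules, whereas the corrected estimate is about 13,863, because a fraction of the positive partitions contained more than one copy.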
Due to the rising interest in dPCR, separate digital MIQE guidelines have been published to assist researchers in designing, conducting, and analyzing their experiments [23]. Recently, a study systematically compared the performance of qPCR with that of dPCR. The authors could show by using a variety of targets in different sample backgrounds, that dPCR with droplets showed greater precision and more reproducible findings [24] than qPCR [25].
Relative quantification is based on internal reference genes to determine fold-differences in expression of the target gene. It avoids the need for a standard curve as the amount of the studied gene is compared to the amount of a reference gene. Various mathematical models are established to calculate the expression of a target gene in relation to an adequate reference gene [26]. Based on the method, they can perform normalization using single or multiple reference genes, include PCR efficiency, and allow calibration across multiple plates. The final calculated relative expression level relates the target gene in a sample relative to a calibrator sample. The generated relative quantities can be compared across multiple real-time RT-PCR experiments. Appropriate normalization strategies and selection of reference genes have been reviewed elsewhere [27], [28]. For large scale expression profiling studies it has been shown that mean expression value normalization outperforms the normalization strategy based on a few reference genes [29], [30]. As relative quantities are determined by computations based on several observables (e.g., replicates, amplification efficiency), it has been shown that the random error of the final result is influenced by error propagation. Consequently, methods have been developed to include error propagation into relative quantification calculations [31].
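Two widely used forms of such models can be sketched as follows: the delta-delta-Cq calculation, which assumes a doubling of product per cycle for both genes, and an efficiency-corrected ratio, in which each gene has its own amplification factor. This is an illustration with hypothetical function names and example values, restricted to a single reference gene and without error propagation.

```python
def ddcq_ratio(target_sample, ref_sample, target_cal, ref_cal):
    """Relative expression by the delta-delta-Cq model (assumes 100% efficiency).

    Arguments are Cq values of the target and reference gene in the sample
    of interest and in the calibrator sample.
    """
    delta_sample = target_sample - ref_sample
    delta_calibrator = target_cal - ref_cal
    return 2 ** -(delta_sample - delta_calibrator)


def efficiency_corrected_ratio(target_sample, ref_sample, target_cal, ref_cal,
                               e_target=2.0, e_ref=2.0):
    """Relative expression with gene-specific amplification factors.

    e_target and e_ref are amplification factors per cycle
    (2.0 corresponds to 100% efficiency).
    """
    target_change = e_target ** (target_cal - target_sample)
    ref_change = e_ref ** (ref_cal - ref_sample)
    return target_change / ref_change


# Hypothetical Cq values: target gene shifts by two cycles, reference is stable.
ratio_simple = ddcq_ratio(23.0, 20.0, 25.0, 20.0)
ratio_corrected = efficiency_corrected_ratio(23.0, 20.0, 25.0, 20.0, e_target=1.9)
```

With identical amplification factors of 2.0 both functions return the same value; lowering the target efficiency to 1.9 reduces the estimated fold change from 4 to 3.61, which shows why efficiency should be taken into account.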
To calculate the Cq value from raw fluorescence data, selected software packages and tools perform preprocessing steps. These may include methods for the reduction of noise (e.g., introduced by technical components), curve smoothing, removal of outliers, normalization, curve fitting, and background subtraction [32]. In contrast to closed source software, open source software platforms provide full control over such tasks. A further important aspect is the handling of missing values (NA). In principle, NAs can be handled at two levels.
The first level concerns the raw fluorescence data of the amplification curves, where missing values may occur randomly in developmental qPCR technologies, such as the VideoScan platform [33], but also in existing data sets. A common approach to handling missing values is imputation. Methods include, for example, cubic spline interpolation or imputation of location parameters such as the mean or median [34].
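As a simple illustration of imputation on the level of raw fluorescence data, the sketch below fills gaps in an amplification curve by linear interpolation between the nearest observed neighbours. Linear interpolation is used here only as a simpler stand-in for the cubic spline interpolation mentioned above; the function name is hypothetical, missing values are represented by `None`, and gaps at the very beginning or end of a curve are left untouched.

```python
def interpolate_missing(values):
    """Fill interior gaps (None) by linear interpolation between neighbours."""
    filled = list(values)
    known = [i for i, value in enumerate(filled) if value is not None]
    # Walk over consecutive pairs of observed positions and fill the gap.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = filled[left] * (1 - weight) + filled[right] * weight
    return filled


# Hypothetical curve segment with three missing readings.
curve = [1.0, None, 3.0, None, None, 6.0]
repaired = interpolate_missing(curve)
```

Because amplification curves are sigmoidal, a linear fill is only reasonable for short gaps; spline-based approaches follow the curvature more closely.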
The second level of NA handling is after initial Cq calculation. Certain reactions may lack a corresponding Cq value. This might be due to different factors, such as technical difficulties, detector problems, or low quantity genes. Currently, several methods have been presented to handle missing values. First, they can be simply excluded from any downstream analysis. As long as enough technical replicates are available and not too many reactions are marked as excluded, this is a legitimate approach. Second, they can be assigned the Cq value of the maximum number of cycles (often 40), which might skew the mean of the analyzed samples by creating outliers. Third, they can be imputed by using the Cq values of technical or biological replicates. Methods range from very simple median calculations to more sophisticated statistical models (e.g., hierarchical models) [35], [36]. These methods generally yield good results, but may lead to inflated stability values for reference genes [37].
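The three strategies described above can be compared side by side in a short sketch. The function and strategy names are hypothetical, missing Cq values are represented by `None`, and the imputation variant uses the simple median of the observed replicates rather than one of the statistical models cited above.

```python
from statistics import mean, median


def handle_missing_cq(replicates, strategy="exclude", max_cycles=40):
    """Return replicate Cq values after applying one missing-value strategy."""
    observed = [cq for cq in replicates if cq is not None]
    if strategy == "exclude":
        # Drop missing reactions from downstream analysis.
        return observed
    if strategy == "max_cycles":
        # Replace missing values by the maximum number of cycles.
        return [max_cycles if cq is None else cq for cq in replicates]
    if strategy == "median":
        # Impute from the remaining replicates.
        if not observed:
            raise ValueError("no observed replicates to impute from")
        fill = median(observed)
        return [fill if cq is None else cq for cq in replicates]
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical technical triplicate with one failed reaction.
triplicate = [30.0, 31.0, None]
means = {name: mean(handle_missing_cq(triplicate, name))
         for name in ("exclude", "max_cycles", "median")}
```

For this triplicate the mean Cq is 30.5 after exclusion or median imputation, but rises to about 33.7 when the missing value is set to 40 cycles, illustrating the outlier effect of the second strategy.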
After performing relative or absolute quantification, usually different groups (e.g., healthy vs. control, time series) are tested for statistically significant differences. To allow generation of meaningful results, an experiment ideally should encompass at least three independent biological replicates of each treatment [38]. As budgetary considerations often hamper the use of multiple replicates, more focus should be given to biological replicates than technical replicates [39]. Various statistical methods can be used to test for significant differences. The analysis of variance (ANOVA) can be used to compare treatments using the previously calculated quantification results. If only two conditions are compared, a standard t-test can be applied [40].
Final results are often presented in a graphical way allowing easier interpretation and understanding of the obtained findings. The data may be presented as a heatmap and clustered similar to the presentation of microarray data [41]. Instead of creating the widely popular bar plots for reporting findings, we recommend the use of box plots as they represent both the summary statistics and the distribution of the primary data [42]. Bar plots with error bars are difficult to compare and encourage the perception that the mean is related to the height of the bar rather than the position of its top. They are best used for visualizing counts or proportions. Box plots require a sample size of only 5 and provide more information about samples than conventional error bars. Results obtained from very small sample sizes should be displayed using traditional mean-and-error or column scatter plots [43].
qPCR supports a wide range of applications. The most commonly used application is gene expression analysis. In addition to quantification experiments, qPCR can be used for the analysis of copy number variations, DNA methylation, and genomic variants.
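For the two-condition case, the test statistic of a t-test can be computed directly from the quantification results. The sketch below implements the unequal-variance (Welch) variant as one possible choice; the function name and the example values are hypothetical. It returns the t statistic and the approximate degrees of freedom only. Converting these into a p-value requires the t distribution, which is available in statistical environments such as R or in libraries like SciPy.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard errors of the two group means (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical log2 relative quantities from three biological replicates each.
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(control, treated)
```

Applying the test to log-transformed relative quantities (or directly to ΔCq values) is a common choice, because fold changes are multiplicative and therefore not symmetrically distributed on the linear scale.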
qPCR was originally designed for DNA quantification and has been extensively developed in clinical microbiology laboratories for routine diagnosis of infectious diseases. It is applied for detecting bacterial, fungal, and viral pathogens, especially when testing to discover the source of an unknown infection [44].
In addition to DNA quantification, it has been widely used to quantify the expression of messenger RNA (mRNA) [45]. First, RNA is reverse transcribed into a cDNA template, which is then used in qPCR reactions to detect and quantify gene expression products. Amongst others, it has been successfully applied for the analysis of transcriptional biomarkers [46], [47], cancer [48], and Mendelian diseases [49].
Since their discovery in 1993, microRNAs (miRNAs) have been a valuable research target for scientists worldwide. miRNAs are small, regulatory, noncoding single-stranded RNAs, which are usually 20–25 nucleotides long [50]. The first qPCR assay to quantify miRNAs was developed almost a decade ago [51], and subsequent protocols for using relative and absolute quantification have been developed [52]. miRNAs have been successfully identified as possible biomarkers for cancer [53], autologous transfusion [54], diabetes [51] and cardiomyopathies [55]. In addition to miRNAs, a large number of different non-coding RNAs have been described, many of which show regulatory functions. Among others, long non-coding RNAs (lncRNAs), small interfering RNA (siRNA), and small nuclear RNA (snRNA) can be quantified by qPCR methods.
DNA methylation is a heritable epigenetic modification in genomes which is known to be involved in biological processes such as regulation of gene expression, cell differentiation, DNA structure and disease states [58]. To date, several protocols for determining the methylation state have been developed such as real-time Quantitative Methylation-Specific PCR (QMSP) [59], methylation-sensitive restriction enzyme digestion PCR (MSRE-PCR) [60], MethyLight [61], and methylation-sensitive high resolution melting (MS-HRM) [62]. The analysis of methylation qPCR data depends on the selected protocol and involves absolute or relative quantification as well as high resolution melting (HRM) analysis.
The gene copy number is the number of copies of a particular gene in the genotype of an individual, e.g., diploid genomes usually possess two copies of each gene. Copy number variations (CNVs) in the genome are genetic variations where segments of DNA are present in a variable number of copies in comparison to the reference (e.g., diploid human) genome. CNVs can arise by a variety of processes, such as deletions, duplications and translocations during meiosis. They may influence both mRNA and protein levels and have recently been associated with several complex and common diseases [56]. The experimental setup and statistical methods to calculate copy number variations have been made accessible and were widely applied for studying CNVs in various diseases [57], [58], [59]. By using qPCR, the Cq values of the target gene can be compared to unrelated reference sequences. It has been shown that including multiple reference samples results in more accurate results. Furthermore, it allows combining references with a normal copy number and references with a known CNV, which can serve as reference point and as positive control for the detection of CNVs [60]. The generated ΔCq values are then used for CNV calculation. As a consequence, methods that support quantification analysis are applicable for CNV calculations [60].
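The ΔCq-based copy number calculation can be sketched as follows. The example assumes 100% amplification efficiency for both assays, a single reference sequence, and a calibrator sample with a known copy number of two; the function name and all Cq values are hypothetical.

```python
def copy_number(target_cq, reference_cq, cal_target_cq, cal_reference_cq,
                calibrator_copies=2):
    """Estimate the copy number of a target from Cq values.

    delta Cq is the target Cq minus the reference Cq within one sample;
    the calibrator is a sample whose copy number is known.
    """
    delta_sample = target_cq - reference_cq
    delta_calibrator = cal_target_cq - cal_reference_cq
    return calibrator_copies * 2 ** -(delta_sample - delta_calibrator)


# Hypothetical samples measured against a diploid calibrator.
duplication = copy_number(24.0, 25.0, 25.0, 25.0)
deletion = copy_number(26.0, 25.0, 25.0, 25.0)
copy_call = round(duplication)
```

A target that crosses the threshold one cycle earlier than in the calibrator yields four copies, one cycle later yields one copy. Since measured values are rarely integers, the estimate is usually rounded to the nearest copy number and reported together with a confidence measure.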
Several methods have been developed that involve different PCR preparation techniques to discover and genotype small DNA sequence variations such as single nucleotide polymorphisms (SNPs) [61]. The created assays vary in sample preparation techniques, turnaround time, costs, and multiplexing capabilities [62]. Depending on the used protocol, different normalization and quantification strategies need to be applied after Cq value calculation.
In this review, we comprehensively surveyed 27 open-access packages, software tools, and web applications performing analysis on either raw fluorescence values (amplification curve data either before or after baseline subtraction) or predetermined Cq values (see Table 1). Tools and packages had to be accessible or available at the time of the survey. We excluded methods for designing PCR primers and determining the most stable reference genes. Furthermore, tools for advanced statistical calculation of quantified qPCR data were not included in this survey. Commercial tools such as GenEX (MultiD, Gothenburg, Sweden), Genevestigator (Nebion, Zürich, Switzerland), qbase+ (Biogazelle, Zwijnaarde, Belgium), and the various software suites distributed by the vendors of thermocycler machines were not part of this review.
qPCR data analysis software packages and tools. Software packages and tools for the analysis of qPCR data are listed. For each tool its corresponding application area is specified, divided into: Cq calculation, normalization, quantification, CNV, and dPCR. The input type can either be precalculated Cq values (Cq) or raw fluorescence values (Raw). For each tool the supported operating system or the underlying framework is specified. Frameworks are often available on different operating systems allowing the package to run on several platforms. GUI specifies the existence of a graphical user interface for data input and output. ABI, Applied Biosystems format; ABT, Lightcycler export format; CSV, comma-separated values; FLO, Lightcycler export format; REX, Rotor Gene export format; R format, encompasses all import and export formats provided by the default R installation and auxiliary R packages (e.g., PDF, SVG, HTML, and XLS).
Tool | Web | Feature | Cq/Raw | Input | Output | OS/Framework | GUI | Last update | Ref |
---|---|---|---|---|---|---|---|---|---|
CAmpER | [76] | Cq calculation, Normalization, Quantification | Raw | FLO, ABT, CSV, REX, TXT | CSV, TXT | Web based | Yes | 2009-06-01 | [77] |
chipPCR | [34] | Cq calculation | Raw | Native R format | Native R format | R based | Yes | 2014-06-25 | [34] |
CopyCaller | [78] | CNV | Cq | ABI | CSV, TXT, XLS | Windows | Yes | 2009-02-01 | [79] |
Cy0 Method | [80] | Cq calculation | Raw | XLS, TXT, DOC | XLS | Web based | Yes | 2010-01-01 | [81], [82] |
DART-PCR | [83] | Cq calculation, Normalization, Quantification | Raw | XLS | XLS | Windows, Excel based | Yes | 2002-12-16 | [84] |
ddCT | [85] | Normalization, Quantification | Cq | TXT, native R format | TXT, PDF, native R format | R based | No | 2013-10-14 | [86] |
Deconvolution | [87] | Quantification | Raw | TXT | TXT | Perl based | No | 2010-04-29 | [88] |
dpcR | [89] | dPCR, Quantification, CNV, Genotyping | Cq, Raw | TXT, CSV, native R format | TXT, native R format | R based | No | 2013-09-08 | [90] |
EasyqpcR | [91] | Normalization, Quantification | Cq | TXT, CSV | TXT | R based | Yes | 2013-11-24 | [92] |
FPK-PCR | [93] | Cq calculation | Raw | CSV, TXT | TXT | R based | No | 2012-01-20 | [94] |
HTqPCR | [95] | Normalization, Quantification, Statistics | Cq | TXT, native R format | TXT, PDF, native R format | R based | No | 2013-10-14 | [96] |
LinRegPCR | [97] | Cq calculation, Quantification | Raw | XLS, RDML | XLS, RDML | Windows | Yes | 2014-02-19 | [98] |
LRE Analysis | [99] | Quantification | Raw | XLS | XLS | MATLAB based | Yes | 2012-02-21 | [100] |
LRE Analyzer | [101] | Quantification | Raw | XLS | XLS | Java based | Yes | 2014-01-07 | [102] |
MAKERGAUL | [103] | Cq calculation, Quantification | Raw | CSV | HTML | Web based | Yes | 2013-08-27 | [104] |
NormqPCR | [105] | Normalization, Quantification | Cq | TXT | TXT | R based | No | 2013-03-23 | [73] |
PCR-Miner | [106] | Cq calculation | Raw | TXT | TXT | Web based | Yes | 2011-10-21 | [107] |
pyQPCR | [108] | Normalization, Quantification | Cq | TXT, CSV | TXT, PDF | Python based | Yes | 2012-01-03 | [109] |
qBase | [110] | Normalization, Quantification | Cq | XLS, RDML | XLS, RDML | Windows, Excel based | Yes | 2007 | [26] |
qCalculator | [111] | Normalization, Quantification | Cq | XLS | XLS | Windows, Excel based | Yes | 2004-01-26 | [112] |
QPCR | [113] | Cq calculation, Normalization, Quantification, Statistics | Raw | CSV, RDML | CSV, RDML, XLS, SVG, PNG | Web based | Yes | 2013-06-10 | [114] |
qpcR | [115] | Cq calculation, Normalization, Quantification, Melting curve analysis | Cq, Raw | CSV, native R format | TXT, PDF, native R format | R based | No | 2014-06-02 | [116] |
qPCR-DAMS | [117] | Normalization, Quantification | Cq | XLS | XLS | Windows | Yes | 2006-02-18 | [118] |
qpcrNorm | [119] | Normalization, Statistics | Cq | CSV | TXT | R based | No | 2013-10-14 | [120] |
REST | [121] | Normalization, Quantification, Statistics | Cq | TXT | TXT | Windows 32 Bit | Yes | 2009 | [122] |
SARS | [123] | Normalization, Statistics | Cq | XLS, TXT | TXT | Windows | Yes | 2011-05-01 | [124] |
SASqPCR | [125] | Normalization, Quantification, Statistics | Cq | XLS, CSV | TXT | SAS based | No | 2011-06-01 | [126] |
The evaluated tools for qPCR data analysis were divided into five different types (see Fig. 3 ): quantification analysis (20 tools), Cq calculation (4 tools), normalization analysis (1 tool), CNV analysis (1 tool), and dPCR analysis (1 tool). Several tools assigned to quantification analysis are also capable of calculating Cq values and allow performing CNV analysis. In addition, we grouped the surveyed tools according to their platform: Microsoft Windows (8 tools), web-based (5 tools), R-based (9 tools), and other platforms (5 tools).
Quantification tools. This figure displays the relationship between analysis type (left side) and platform (right side). It can, for instance, be seen that tools for Cq calculation analysis are only available as R packages or Web-based applications.
We provide an overview of the supported steps of the analysis workflow (see Fig. 2), accepted input files, file formats used, availability of a graphical user interface (GUI), date of last update, and number of citations (see Table 1). Based on the identified analysis workflow for qPCR data, six of the surveyed tools support the calculation of Cq values and eight applications can determine amplification efficiency values in addition to either absolute or relative quantification. Among the reviewed tools, eight support absolute and 15 support relative quantification. Only two tools are able to carry out relative as well as absolute quantification. Error propagation for relative quantification is not widely applied and only available in five tools. Many applications (15 tools) support some kind of graphical presentation of quantification results; bar plots are the most common format. If downstream statistical analysis is provided (9 tools), it usually includes testing for significant quantification differences between groups.
In the next section, we briefly describe the tools and report on their functionality.
The standalone software CopyCaller performs relative quantification analysis of genomic DNA targets from copy number assay experiments. It can be applied to detect and measure copy number variations of specific genomic sequences without a known calibrator sample. It returns the calculated copy number and estimates the confidence of copy number calls.
DART-PCR, released over a decade ago, is a Microsoft Excel based tool for analyzing raw fluorescence data (before baseline subtraction). It calculates the Cq values and performs subsequent analysis for relative quantification and assay variability. Amplification efficiency is determined by the application and tested for outliers. Next, a user set efficiency value is used for normalization and calculation of relative quantities, which can be graphically displayed as bar plots. DART-PCR does not support normalization to reference genes.
LinRegPCR is a standalone program for analyzing raw fluorescence data (before baseline subtraction) using a graphical user interface. The recently updated application is RDML compliant and additionally accepts Excel as input format. LinRegPCR determines and subtracts baseline fluorescence, sets a Window-of-Linearity and calculates PCR efficiencies per sample. Next, the Cq value and the starting concentration per sample (reported in an arbitrary unit) are calculated. The program provides methods to visualize raw fluorescence value curves and is able to plot graphs for comparing efficiency values of different samples. The determined Cq and efficiency values can later be used for quantification analysis.
qBase is a Microsoft Excel based program for the relative quantification of qPCR data. Its analysis is based on Cq values and it supports the direct import of export files from various systems (ABI, Bio-Rad, Rotor-Gene, Stratagene). Furthermore, qBase supports the RDML format, and applies a normalization strategy that features error propagation and multiple reference genes. The standard curve, used for efficiency calculation, can be graphically displayed and quantification results can be visualized as bar plots. Since February 2008 the application has been superseded by qbase+, a commercially available application, and no updates for the Excel based version have been published.
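The general idea of deriving a per-sample efficiency from the log-linear phase of a single amplification curve can be sketched as a linear regression of log10 fluorescence on cycle number. The example below is a simplified illustration and not LinRegPCR's implementation: the window of cycles is chosen by hand, the data are synthetic, and the function name is hypothetical.

```python
import math


def efficiency_and_n0(cycles, fluorescence):
    """Fit log10(F) = log10(N0) + cycle * log10(E) within a chosen window.

    `fluorescence` must be baseline-corrected and strictly positive.
    Returns the amplification factor E per cycle (2.0 means doubling) and
    the starting signal N0 in arbitrary fluorescence units.
    """
    ys = [math.log10(value) for value in fluorescence]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** slope, 10 ** intercept


# Synthetic exponential-phase readings with an amplification factor of 1.9.
window = [10, 11, 12, 13, 14]
signal = [0.001 * 1.9 ** cycle for cycle in window]
amplification_factor, n0 = efficiency_and_n0(window, signal)
```

The quality of such an estimate depends on selecting cycles that lie above the noise of the ground phase and below the onset of the plateau, which is the purpose of an automatically determined window.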
qCalculator is a Microsoft Excel based Visual Basic application for the calculation of relative quantification values. The tool takes Cq values as input and uses multiple sheets for data manipulation and result presentation. qCalculator is able to calculate amplification efficiency from standard curves, which is used in subsequent analysis steps. It supports normalization to reference genes, where multiple reference genes are not combined but treated separately.
qPCR-DAMS is a database tool for qPCR based on Microsoft Access 2003. It is designed to analyze, manage, and store relative as well as absolute quantitative real-time PCR data. The analysis module of the system allows users to perform absolute and relative quantification including normalization to (multiple) reference genes. qPCR-DAMS takes predetermined Cq values as input and provides three quality control steps and a system to monitor data variation.
The Relative Expression Software Tool (REST, version 2009) is a GUI application for the relative quantification of qPCR data. It operates on Cq values (direct support for Rotor-Gene), includes different PCR efficiencies of the genes, and uses multiple reference genes for normalization. The tool provides methods to visualize raw fluorescence value curves and displays quantification results as box plots. Furthermore, expression ratios are calculated based on a pairwise reallocation (resampling) approach and can be tested for statistical significance.
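The principle of a reallocation (randomization) test on expression ratios can be illustrated with a small sketch. It is not REST's implementation: it assumes 100% efficiency for both genes, uses a single reference gene, keeps the target and reference Cq of each sample together as a pair, and enumerates all possible reallocations of samples to the two groups instead of drawing random ones, which is only feasible for small designs. All names and values are hypothetical.

```python
from itertools import combinations


def log2_ratio(control, sample):
    """Normalized log2 expression ratio (sample vs. control).

    Each group is a list of (target_cq, reference_cq) pairs, one per sample.
    """
    def mean(values):
        return sum(values) / len(values)

    d_target = mean([t for t, _ in control]) - mean([t for t, _ in sample])
    d_ref = mean([r for _, r in control]) - mean([r for _, r in sample])
    return d_target - d_ref


def reallocation_test(control, sample):
    """Return (expression ratio, two-sided p-value) from all reallocations."""
    pooled = control + sample
    observed = log2_ratio(control, sample)
    extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), len(control)):
        members = set(chosen)
        new_control = [pooled[i] for i in chosen]
        new_sample = [pooled[i] for i in range(len(pooled)) if i not in members]
        # Count reallocations at least as extreme as the observed grouping
        # (small tolerance guards against floating-point rounding).
        if abs(log2_ratio(new_control, new_sample)) >= abs(observed) - 1e-12:
            extreme += 1
        total += 1
    return 2 ** observed, extreme / total


# Hypothetical (target Cq, reference Cq) pairs for three samples per group.
control_group = [(25.0, 20.0), (25.1, 20.1), (24.9, 19.9)]
treated_group = [(23.0, 20.0), (23.1, 20.0), (22.9, 20.1)]
ratio, p_value = reallocation_test(control_group, treated_group)
```

With three samples per group there are only 20 possible allocations, so the smallest attainable two-sided p-value is 0.1. This illustrates why randomization tests need sufficient replicates to reach conventional significance levels.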
SARS is a Python and R-based Windows application for relative quantification of qPCR data providing a graphical user interface. The standalone software suite needs to be installed and takes Cq and efficiency values in Excel or TXT format. In addition to normalization and quantification, SARS is able to run statistical tests on the generated results.
CAmpER is a web-based tool for the basic analysis of qPCR data. The system was last updated in 2008 and consequently its user interface does not meet current standards. CAmpER allows uploading raw fluorescence data in a generic CSV format and supports file formats from different vendors (ABI, Bio-Rad, Mastercycler, Opticon, Roche, Rotor-Gene, Stratagene). Reaction efficiency and Cq value can be calculated using two different methods: DART and FPLM. After relative quantification, which does not offer normalization to reference genes, data can be exported in CSV format. Furthermore, the tool supports the display of quantification results as bar plots.
The Cy0 method is provided as a free web interface (requiring registration) that takes raw fluorescence data (before baseline subtraction) as input. The method uses nonlinear regression to obtain the best fit estimators of reaction parameters. Similar to traditional Cq methods it is a threshold-based method, but the returned Cy0 value depends on the threshold and the amplification kinetics and thus compensates for small variations among samples. Calculated results are not directly returned to the user, but sent to a specified e-mail address.
The web-application MAKERGAUL uses raw fluorescence data (before baseline subtraction) to calculate quantification data based on MAK2 [63]. Using a simple web interface, quantification based on the MAK2 method can be performed without the requirement for standard curves or normalization with reference genes. As the tool is not available online, it needs to be installed and configured on a local server, which is described as challenging.
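To illustrate why a mechanistic model can dispense with standard curves, the following sketch simulates an amplification curve with the two-parameter MAK2 recurrence as we understand it from the original publication, D_n = D_(n-1) + k ln(1 + D_(n-1)/k). The function name and the parameter values are assumptions for illustration; MAKERGAUL itself fits this kind of model to measured fluorescence, which is not shown here.

```python
from math import log1p


def simulate_mak2(d0, k, cycles):
    """Simulate amplicon amounts with the MAK2 recurrence (assumed form).

    d0: initial target amount (in fluorescence units)
    k:  characteristic PCR constant of the model
    """
    amounts = [d0]
    for _ in range(cycles):
        previous = amounts[-1]
        # Growth is close to doubling while previous << k and slows down later.
        amounts.append(previous + k * log1p(previous / k))
    return amounts


# Hypothetical parameters chosen only to produce a plausible curve.
curve = simulate_mak2(d0=1e-6, k=1.0, cycles=40)
print(curve[1] / curve[0], curve[-1] / curve[-2])
```

In this model the initial amount d0 is itself a parameter of the curve, so fitting the recurrence to the observed fluorescence yields an estimate of the starting quantity directly.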
PCR-Miner is a web-based application for calculating Cq and efficiency values. It takes raw fluorescence data (before baseline subtraction) as input and requires converting the values into a specific text input format. The tool can be used without registration and either returns the calculated values directly on the website or sends them to a provided e-mail address.
QPCR is a web-based application supporting storage, management, and analysis of qPCR data. It comprises a parser to import data from qPCR instruments (ABI, Bio-Rad, Roche) and supports the RDML format. QPCR incorporates a variety of analysis methods to calculate Cq and amplification efficiency values. The analysis pipeline includes replicate handling, normalization using multiple reference genes, inter-run calibration, error propagation, and fold change calculation. Calculated quantification results can be displayed as bar plots. Installation and configuration require several steps when the application is used on a private server.
R is a free, open source cross-platform (e.g., UNIX platforms, Windows, and macOS) software environment for statistical computing, visualization, and report generation. This scripting language is considered to be the lingua franca of statistical computing in the academic sector and in business applications. R is distributed with many statistical routines, but most of the functionality is provided by “R packages”. R packages are collections of functions, data, documentation, and occasionally compiled code in a structured format. Most of the packages can be installed from the Comprehensive R Archive Network (CRAN) or Bioconductor [64]. By default R is command-line driven, but tools have been published to provide a graphical user interface (GUI) or Integrated Development Environment (IDE) on top of the underlying framework [65], [66].
ddCT is an R package that collects, analyzes, and visualizes qPCR data using an improved ddCT [67] method. It takes predetermined Cq and optional efficiency values to perform relative quantification using (multiple) reference genes. Results are returned in text format and the package offers methods to generate quantification results as bar plots. The methods can be either invoked by a script or through a provided application programming interface (API). The package is well documented and sample R-code is provided.
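For orientation, the classic ΔΔCq calculation that such packages build on can be sketched in a few lines. The sketch assumes perfect amplification (a factor of 2 per cycle) and a single reference gene; it is not the improved method implemented in the ddCT package, and the function and argument names are hypothetical.

```python
def delta_delta_cq_fold_change(cq_target_sample, cq_ref_sample,
                               cq_target_control, cq_ref_control):
    """Fold change of a target gene by the classic 2^(-ddCq) calculation."""
    # Normalize the target to the reference gene within each condition.
    delta_sample = cq_target_sample - cq_ref_sample
    delta_control = cq_target_control - cq_ref_control
    # Compare the treated sample against the control condition.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Cq values: the target appears three cycles earlier in the sample.
print(delta_delta_cq_fold_change(22.0, 18.0, 25.0, 18.0))
```

When efficiency values are available, the base 2 would be replaced by the gene-specific amplification factor.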
The R package dpcR is the first open source package for the statistical analysis of digital PCR experiments. It is a comprehensive collection of functions for dPCR data analysis and simulation based on Poisson statistics. The package can be used for chamber-based real-time digital PCR systems as well as droplet-based digital PCR systems and contains methods derived from peer-reviewed publications for quantification and zygosity determination. In addition, novel uncertainty calculations have been introduced for the analysis of dPCR experiments. Absolute quantification is performed by counting the number of positive compartments and relating it to the total number of compartments by means of Poisson statistics. Furthermore, dpcR is able to perform analysis of CNVs and rare mutations. The package includes many published statistical approaches, summary functions, and data sets for dPCR. In addition, it supports the generation of bar plots for quantification results, amplitude plots, and density plots. dpcR uses the shiny [68] R package to provide an interactive web or off-line graphical user interface application, which can be used by both R novices and experts.
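The Poisson step mentioned above can be sketched as follows: if a fraction k/n of the compartments is positive, the mean number of template copies per compartment is estimated as λ = -ln(1 - k/n). This is a generic illustration in Python rather than dpcR code; the function names and the compartment volume used in the example are assumptions.

```python
from math import log


def copies_per_compartment(positive, total):
    """Poisson estimate of the mean number of template copies per compartment."""
    if not 0 <= positive < total:
        # With all compartments positive the estimate is undefined (saturation).
        raise ValueError("positive must satisfy 0 <= positive < total")
    return -log(1.0 - positive / total)


def copies_per_microlitre(positive, total, compartment_volume_ul):
    """Convert the per-compartment estimate into a concentration."""
    return copies_per_compartment(positive, total) / compartment_volume_ul


# Hypothetical run: 5000 of 20000 compartments positive, 0.001 µL each (assumed).
print(copies_per_compartment(5000, 20000))
print(copies_per_microlitre(5000, 20000, 0.001))
```

The correction matters because a positive compartment may contain more than one template molecule, so simply counting positives would underestimate the concentration.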
The recently released R package EasyqpcR is based on the qBase algorithms [26] and calculates relative quantities using Cq values. It is able to determine amplification efficiencies using dilution series, and calculates normalization factors based on multiple reference genes. In general, the provided functionality is very similar to qBase. A graphical user interface is built upon R GUI Studio and the gWidgets package.
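Determining the amplification efficiency from a dilution series usually amounts to a standard-curve fit: Cq is regressed on the logarithm of the input quantity, and the amplification factor follows from the slope as E = 10^(-1/slope). The sketch below shows this general calculation in plain Python; it is not taken from EasyqpcR, and the function name and example data are hypothetical.

```python
from math import log10


def efficiency_from_dilution_series(quantities, cq_values):
    """Return (slope, amplification factor) of a standard curve.

    quantities: relative or absolute input amounts of the dilution steps
    cq_values:  the corresponding (mean) Cq values
    """
    xs = [log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    # Ordinary least-squares slope of Cq versus log10(quantity).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, 10.0 ** (-1.0 / slope)


# Hypothetical ten-fold dilution series with perfect doubling per cycle.
step = 1.0 / log10(2.0)
slope, factor = efficiency_from_dilution_series(
    [1000.0, 100.0, 10.0, 1.0],
    [20.0 + i * step for i in range(4)],
)
print(slope, factor)
```

An amplification factor of 2 corresponds to an efficiency of 100 %; a slope of about -3.32 per ten-fold dilution is therefore the ideal case.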
FPK-PCR is an R-based tool for the analysis of raw fluorescence data (before baseline subtraction). It calculates reaction efficiencies and Cq values by reconstructing the entire chain of cycle efficiencies. The R code is released with brief documentation illustrating a general example.
HTqPCR is a newly published R package designed for the analysis of Cq values. It accepts tab-delimited text files and performs quality assessment, normalization (quantile normalization, rank-invariant, scale rank-invariant, housekeeping – deltaCq, geometric mean), and testing for statistical significance. Individual samples can be flagged as undetermined or unreliable, and these flags are used throughout the analysis. The package supports principal component analysis (PCA) and clustering of analysis results. It offers multiple methods for data visualization, such as bar/box plots of quantification results, or Cq values as heatmap/scatter plot. HTqPCR is released with extensive documentation illustrating several use cases.
The R package NormqPCR provides methods for the normalization and quantification of qPCR data. Its documentation contains several examples that help novice R users become familiar with the package. It supports technical replicate handling and offers methods for selecting the best genes for normalization. Furthermore, it uses the R package ReadqPCR for data import and supports normalization using multiple reference genes. Relative quantification can be performed by assuming perfect amplification or by including predetermined reaction efficiencies.
The R-based package qpcrNorm includes three algorithms for qPCR data normalization (quantile normalization, rank-invariant, housekeeping – deltaCq). It uses Cq values as input but can also deal with any measure of transcript abundance. The included normalization methods are especially applicable to high-throughput qPCR setups or experiments where the stability of reference genes has not been validated. In addition to textual output, the package provides functionality to generate diagnostic plots (e.g., scatter plot for comparing the effects of normalization methods). Calculated quantification results can be displayed as box plots. The documentation explains the basic functionality but does not include sample code.
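The idea behind quantile normalization, which does not rely on reference genes, can be sketched as follows: the Cq values of every sample are forced onto a common distribution by replacing each value with the mean of all values that share its rank. This is a generic, simplified illustration (ties are broken by position and missing values are not handled), not the exact procedure of qpcrNorm or HTqPCR; the function name and example data are assumptions.

```python
def quantile_normalize(samples):
    """Quantile-normalize Cq values across samples.

    samples: dict mapping sample name -> list of Cq values, one per gene,
             with the genes in the same order for every sample.
    """
    names = list(samples)
    n_genes = len(samples[names[0]])
    # Mean of the r-th smallest value across all samples, for every rank r.
    sorted_values = [sorted(samples[name]) for name in names]
    rank_means = [
        sum(column[rank] for column in sorted_values) / len(names)
        for rank in range(n_genes)
    ]
    normalized = {}
    for name in names:
        values = samples[name]
        # Gene indices ordered from lowest to highest Cq in this sample.
        order = sorted(range(n_genes), key=lambda i: values[i])
        result = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            result[gene_index] = rank_means[rank]
        normalized[name] = result
    return normalized


# Hypothetical Cq values for three genes in two samples.
print(quantile_normalize({"A": [20.0, 25.0, 30.0], "B": [22.0, 31.0, 27.0]}))
```

After normalization every sample has the same set of values, and only the assignment of values to genes differs between samples.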
The qpcR library is an R-based package that assists researchers in the modeling and analysis of quantitative real-time PCR data. It includes many published methods to perform a variety of qPCR data analysis steps including different methods for replicate handling, Cq value calculation, normalization, and relative quantification. qpcR is able to perform melting curve analysis and allows visualization of the raw fluorescence curve (before as well as after baseline subtraction). Furthermore, it features customizable import functionality and plotting quantification results as bar plots, box plots, or Cq heatmaps. The package is well documented but does not provide much sample code for novice R-users. To provide a graphical user interface for the qpcR library, a plug-in for the RKWard [66], [69] system has been created. RKWard is a GUI and IDE for statistical analysis with R. The RKWard qPCR plugin uses core features (e.g., Cq value calculation) of the qpcR package while other functionality for preprocessing (e.g., removal of missing values, data transformation) and melting curve analysis is derived from the chipPCR [34] and the MBmca [70] packages. The chipPCR package contains functions to perform background subtraction, to simulate qPCRs, and to detect the start of an amplification reaction. Furthermore, it includes a Cq value quantification method based on the first and second derivative maximum method and supports calculation of the amplification efficiency of dilution experiments.
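A second derivative maximum calculation of the kind mentioned above can be illustrated with a small sketch that locates the cycle at which the curvature of the amplification curve is largest. The sketch works on discrete differences at integer-cycle resolution and uses a synthetic logistic curve; implementations such as those in qpcR or chipPCR typically operate on fitted or smoothed curves and return fractional cycles. All names and parameter values are hypothetical.

```python
from math import exp


def second_derivative_maximum(fluorescence):
    """Return the 1-based cycle at which the discrete second derivative peaks.

    fluorescence: baseline-corrected values, one per cycle, starting at cycle 1.
    """
    best_cycle, best_value = None, float("-inf")
    for i in range(1, len(fluorescence) - 1):
        # Central second difference approximates the curvature at cycle i + 1.
        curvature = fluorescence[i + 1] - 2.0 * fluorescence[i] + fluorescence[i - 1]
        if curvature > best_value:
            best_cycle, best_value = i + 1, curvature
    return best_cycle


# Synthetic logistic amplification curve (assumed parameters, 40 cycles).
curve = [100.0 / (1.0 + exp(-(cycle - 20) / 1.5)) for cycle in range(1, 41)]
print(second_derivative_maximum(curve))
```

Because the position of the curvature maximum is determined by the shape of the curve alone, this type of Cq estimate does not require a user-defined threshold.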
Deconvolution is a Perl-based program for the efficiency-independent absolute quantification of qPCR data. The method depends on a computer-assisted deconvolution that determines amplification behavior between the unknown template and an amplicon standard. Deconvolution was released in 2010, uses raw fluorescence values (before baseline subtraction), and outputs quantification results in text format.
LRE Analyzer is a Java-based, standalone, platform-independent desktop application offering automated analysis and data storage capabilities. It presents a state-of-the-art graphical interface and uses Microsoft Excel for import of raw fluorescence data (before baseline subtraction). The application allows users to conduct absolute quantification without construction of target-specific standard curves. Quantification results are presented in fluorescence units or in target molecules if a known quantity is amplified and used as a reference. The application allows users to visualize the raw fluorescence curve.
The MATLAB based tool LRE Analysis is based on the published LRE method [71]. It reads raw fluorescence values (after baseline subtraction) from an Excel file that needs to comply with a specific format. Currently, it supports one ABI system and performs data processing in a semi-automated way. Quantification results are exported in a new Excel file containing calculated values.
pyQPCR is a Python based application that provides a modern graphical user interface. It takes Cq values as input and supports data import from a few Eppendorf and ABI systems file formats. The methods implemented in pyQPCR are based on qBase [26], and the application is able to perform relative quantification and determination of efficiencies using standard curve calculation. Quantification results can be displayed as bar plots.
SASqPCR is a program that requires the commercial SAS software suite, but according to the authors extensive SAS programming knowledge is not required. The tool performs calculations on Cq values and incorporates functions for assessment of PCR efficiencies, validation of reference genes, and normalization using multiple reference genes. Furthermore, SASqPCR provides methods for the statistical comparisons of target gene expression of parallel samples.
In our review, we provide a comprehensive survey of software packages and tools for the analysis of qPCR data, including Cq calculation, efficiency determination, normalization, and quantification. In addition, we report tools dedicated to the analysis of dPCR experiments. The information we provide represents a valuable resource for researchers working with qPCR data and supports users in selecting the appropriate analysis tool for a specific application.
Surveyed software packages and tools are manifold and vary in the supported methods and usability. The choice of the correct method to analyze the qPCR data at hand strongly depends on the experimental setup and assay. Therefore, a number of review articles were recently published [15], [72] to facilitate the choice of the most suitable analysis method for a particular application.
Windows tools for quantification analysis include standalone applications and tools based on Microsoft Excel. LRE Analyzer is a recently developed, platform-independent tool with an intuitive interface that can be used for absolute quantification without the need for construction of target-specific standard curves. REST is a popular tool available on the Microsoft Windows platform, which has been released in four versions so far. Amongst the reviewed tools, three applications are built on top of Microsoft Excel, of which qBase has been cited the most. CopyCaller is the only tool dedicated to the analysis of CNVs.
Web-based tools include CAmpER, Cy0 Method, MAKERGAUL, PCR-Miner, and QPCR. QPCR offers the most comprehensive functionality and supports, in addition to storing and sharing of raw and processed data, relative quantification of qPCR data. MAKERGAUL and QPCR can also be installed on a private server where configuration and management may be challenging. PCR-Miner is a widely used tool for calculating Cq and efficiency values. The user interfaces of CAmpER, Cy0 Method, and PCR-Miner do not comply with current standards and thus the tools may be difficult to use.
Several packages are available for the statistical computing and graphics environment R; they usually do not offer a graphical user interface and need to be used from the command line. They provide an excellent choice for users who wish to keep the pipeline for the analysis transparent and highly customizable. The currently available packages enable the seamless integration of analysis strategies for experimental qPCR platforms and commercial platforms. Furthermore, it has been demonstrated on several occasions that custom-made cross-platform GUIs and report generators can be created with little programming effort. In order to facilitate data input, the package ReadqPCR has been developed to parse raw qPCR data from different platforms [73]. Moreover, most cyclers are able to export raw data in various forms, which can be imported by custom-made parsers via the powerful import and export facilities of R. Since all described R packages inherit the properties of the R computing environment, we would like to emphasize that parallelization of analysis steps and distribution as a service (e.g., shiny) are easy to implement. Among the surveyed R packages that support normalization and quantification, the script-based qpcR library currently provides the most comprehensive functionality, ranging from different methods for Cq and efficiency calculation to quantification and data visualization. HTqPCR provides different normalization techniques that can be used when reference genes are not present or not reliably expressed. dpcR is an R package for the analysis of dPCR data providing complete functionality to analyze the respective experiment.
In addition to R-based packages, tools for other platforms using MATLAB, Perl, Python, and SAS have been published, of which the Python-based tool pyQPCR offers the most extensive functionality. The tool offers a state-of-the-art user interface and is based on the highly cited methods from [26].
A recent survey of publications mentioning qPCR revealed that methods used for analyzing the data were poorly reported [74]. Consequently, the MIQE guidelines specify that, in addition to the analysis results, detailed information about the applied methods needs to be published. Furthermore, to allow repeatable and reproducible research, the computer code used to analyze the data has to be made available to others [75]. Amongst others, the MIQE guidelines list essential information that has to be submitted with the manuscript: qPCR analysis program, Cq method, normalization methods, and statistical methods for results significance. Moreover, the exact software version and source code of newly developed methods need to be published as well. The use of specific qPCR analysis software allows providing detailed information on the methods of data analysis and confidence estimation. We therefore encourage using published, peer-reviewed software that has proved to be suitable for qPCR data analysis.
In addition, the MIQE guidelines provide specific recommendations, which need to be considered when performing qPCR data analysis. Emphasis is placed on the importance of PCR efficiency, especially when reporting mRNA concentrations for target genes relative to those of reference genes. Researchers need to either ensure that the genes were amplified with comparable efficiencies or include the specific amplification efficiencies in the analysis. Furthermore, when data is normalized against reference genes, at least two reference genes have to be included in the analysis, unless data is presented that shows that a single reference gene is invariantly expressed under the experimental conditions described. Table 2 includes information about tools that adhere to these MIQE recommendations.
Table 2. Quantification software packages and tools for qPCR. Listed are software packages and tools capable of performing quantification analysis. The Cq column was ticked if the tool is able to perform Cq calculation from raw fluorescence data. The Efficiency column was ticked if the tool is able to calculate amplification efficiencies (based on dilution series or on raw fluorescence data). Tools are marked for absolute quantification (Abs quant) if they are able to output absolute quantification values. In order to qualify for relative quantification (Rel quant), fold change values after relative quantification need to be returned. Error propagation denotes the ability of the tool to propagate the error throughout the various analysis steps. The Normalization column was ticked if the tool implements one or several normalization techniques (not limited to normalization to reference genes). NA handling describes the possibility to deal with missing values after initial Cq calculation. The Graphs column was ticked if the tool includes methods for graphical presentation of (quantification) results. The Statistics column was ticked if the tool includes methods for statistical analysis or direct calls to underlying statistical frameworks after obtaining quantification or normalization results.
Tool | Cq | Efficiency | Abs quant | Rel quant | Error propagation | Normalization | NA handling | Graphs | Statistics | Compliant with MIQE recommendations a |
---|---|---|---|---|---|---|---|---|---|---|
CAmpER | + | + | − | + | − | − | − | + | − | − |
chipPCR | + | + | − | − | − | − | + | + | − | + |
Cy0 Method | + | − | − | − | − | − | − | − | − | + |
DART-PCR | + | − | − | + | − | + | − | + | − | − |
ddCT | − | − | − | + | − | + b , c | − | + | + | + |
Deconvolution | − | − | + | − | − | − | − | − | − | + |
dpcR | − | − | + | − | − | − | + | + | + | + |
EasyqpcR | − | + | − | + | − | + b , c | + | − | − | + |
HTqPCR | − | − | − | + | − | + b | + | + | + | − |
FPK-PCR | + | + | − | − | − | − | − | − | − | − |
LinRegPCR | + | + | + | − | − | − | − | + | − | + |
LRE Analysis | − | − | + | − | − | − | − | − | − | + |
LRE Analyzer (lreqpcr) | − | − | + | − | − | − | − | + | − | + |
MAKERGAUL | + | − | + d | − | − | − | − | − | − | + |
NormqPCR | − | − | − | + | − | + b , c | + | − | − | + |
PCR-Miner | + | + | − | − | − | − | − | − | − | + |
pyQPCR | − | + | − | + | + | + b , c | + | + | − | + |
qBase | − | − | − | + | + | + b , c | + | + | + | + |
qCalculator | − | + | − | + | − | + c | + | + | − | − |
QPCR | + | + | − | + | + | + b , c | + | + | + | + |
qpcR | + | + | + | + | + | + b , c | + | + | − | + |
qPCR-DAMS | − | − | + | + | − | + b , c | + | − | − | + |
qpcrNorm | − | − | − | − | − | + b | − | + | + | − |
REST | − | − | − | + | + | + b , c | − | + | + | + |
SARS | − | − | − | + | − | + b , c | − | − | + | + |
SASqPCR | − | + | − | + | − | + b , c | − | − | + | + |
a MIQE compliant: rel quant → includes PCR efficiency, normalization against multiple reference genes.