Objective Comparison of High Throughput qPCR Data Analysis Methods
For the past 25 years, real-time quantitative PCR (qPCR) has been the method of choice for measuring gene expression in biological research and diagnostics, relying on semi-automated data analysis methods designed for low throughput. Modern platforms have since greatly increased sample throughput: a single high-throughput qPCR (HT-qPCR) experiment can produce up to ten thousand reaction curves, which requires fully automated analysis. An extensive comparison of existing analysis methods in this new context has not yet been performed. Consequently, the community has not yet agreed on a reference method for analyzing HT-qPCR data. In this work, we evaluate and compare available qPCR data analysis methods when ported to high throughput. To perform this comparison on common ground, we developed a preprocessing approach based on a robust high-throughput fitting procedure that corrects bias and baseline before gene expression is quantified by each individual method. Using four quantitative criteria, we then compared the results obtained on high-throughput experiments with five reference methods designed for low-throughput qPCR data analysis (Cq, Cy0, logistic5p, LRE, LinReg), as well as with deep learning. The advantages and disadvantages of these methods in this new context are discussed. Although deep learning has the advantage of not requiring any preprocessing steps, we nevertheless conclude that Cq, one of the oldest methods, should be preferred over the other approaches for analyzing high-throughput qPCR experiments because of its simplicity, its robustness to dataset variability, and the fact that it does not require a large training set.
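To make the recommended Cq approach concrete, the following minimal sketch shows one common way a quantification cycle can be computed from a single reaction curve: subtract a baseline, then find the fractional cycle at which the signal first crosses a fixed threshold. The function names (estimate_cq, synthetic_curve), the mean-of-early-cycles baseline, the threshold value, and the synthetic sigmoid curves are all illustrative assumptions; this is not the preprocessing or fitting procedure developed in this work.

```python
import math


def estimate_cq(fluorescence, threshold, n_baseline=5):
    """Return the fractional cycle at which the baseline-corrected signal
    first crosses `threshold`, or None if it never does.

    Assumption: the baseline is the mean of the first `n_baseline` cycles.
    """
    baseline = sum(fluorescence[:n_baseline]) / n_baseline
    corrected = [f - baseline for f in fluorescence]
    for i in range(1, len(corrected)):
        lo, hi = corrected[i - 1], corrected[i]
        if lo < threshold <= hi:
            # Cycles are 1-indexed, so corrected[i - 1] is cycle i.
            # Linear interpolation between cycle i and cycle i + 1.
            return i + (threshold - lo) / (hi - lo)
    return None


def synthetic_curve(midpoint, n_cycles=40, plateau=1.0, slope=1.5,
                    background=0.05):
    """Hypothetical sigmoid amplification curve, used here for illustration only."""
    return [
        background + plateau / (1.0 + math.exp(-(c - midpoint) / slope))
        for c in range(1, n_cycles + 1)
    ]


# Two synthetic reactions whose curves are shifted by five cycles.
cq_early = estimate_cq(synthetic_curve(midpoint=20), threshold=0.1)
cq_late = estimate_cq(synthetic_curve(midpoint=25), threshold=0.1)
print(cq_early, cq_late, cq_late - cq_early)
```

In this toy setting the difference between the two Cq values recovers the five-cycle shift between the curves, which is the property that relative quantification relies on. A flat curve that never reaches the threshold yields None, a case that an automated high-throughput pipeline would need to handle explicitly.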