PuSH - Publication Server of Helmholtz Zentrum München

Probabilistic PCA of censored data: Accounting for uncertainties in the visualisation of high-throughput single-cell qPCR data.

Bioinformatics 30, 1867-1875 (2014)
MOTIVATION: High-throughput single-cell qPCR is a promising technique allowing for new insights into complex cellular processes. However, the PCR reaction can only be detected up to a certain detection limit; a failed reaction could be due to low or absent expression, so the true expression level is unknown. As this censoring can affect a high proportion of the data, it is one of the main challenges when dealing with single-cell qPCR data. PCA is an important tool for visualising the structure of high-dimensional data as well as for identifying sub-populations of cells. However, to date it is not clear how to perform a PCA of censored data. We present a probabilistic approach which accounts for the censoring and evaluate it on two typical data-sets containing single-cell qPCR data.

RESULTS: We use the Gaussian Process Latent Variable Model (GPLVM) framework to account for censoring by introducing an appropriate noise model and allowing a different kernel for each dimension. We evaluate this new approach on two typical qPCR data-sets (of mouse embryonic stem cells and blood stem/progenitor cells, respectively) by performing linear and non-linear probabilistic PCA. Taking the censoring into account results in a 2D representation of the data which better reflects its known structure: in both data-sets our new approach results in a better separation of known cell types, and in one data-set it reveals subpopulations which could not be resolved using standard PCA.

AVAILABILITY: The implementation is based on the existing GPLVM toolbox (1); extensions for noise models and kernels accounting for censoring are available from http://icb.helmholtz-muenchen.de/censgplvm.
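The idea of a noise model that accounts for censoring can be illustrated with a minimal sketch. This is not the authors' implementation or the GPLVM toolbox API; it assumes a Gaussian noise model with a single standard deviation, assumes values at or above a hypothetical detection limit are right-censored, and uses illustrative names (`censored_loglik`, `limit`, `means`) that do not come from the paper. Observed values contribute a Gaussian density term, while censored values contribute only the probability that the true value lies beyond the limit.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def normal_sf(x, mean, sd):
    """Survival function P(Y >= x) for Y ~ N(mean, sd^2)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def censored_loglik(values, means, sd, limit):
    """Log-likelihood under an assumed right-censored Gaussian noise model.

    values: measured values; entries >= limit are treated as censored
    means:  latent (model-predicted) values, one per measurement
    sd:     noise standard deviation (assumed shared by all measurements)
    limit:  hypothetical detection limit
    """
    total = 0.0
    for y, m in zip(values, means):
        if y >= limit:
            # Censored: only "true value >= limit" is known.
            # The floor guards against log(0) when the tail underflows.
            total += math.log(max(normal_sf(limit, m, sd), 1e-300))
        else:
            # Observed: ordinary Gaussian density term.
            total += normal_logpdf(y, m, sd)
    return total
```

Under these assumptions, a censored measurement whose latent mean sits exactly at the limit contributes log(0.5), and a censored measurement whose latent mean lies far beyond the limit contributes almost nothing, so such points are not forced towards the limit value as they would be under a plain Gaussian likelihood.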
Publication type Article: Journal article
Document type Scientific Article
Keywords Gene-expression Analysis; Principal Component; Detection Limits; Stem; Heterogeneity; Hematopoiesis; Blastocyst; Zygote