PuSH - Publication Server of Helmholtz Zentrum München

Probabilistic PCA of censored data: Accounting for uncertainties in the visualisation of high-throughput single-cell qPCR data.

Bioinformatics 30, 1867-1875 (2014)
Free by publisher
MOTIVATION: High-throughput single-cell qPCR is a promising technique that allows new insights into complex cellular processes. However, the PCR reaction can only be detected up to a certain detection limit; failed reactions could be due to low or absent expression, and in these cases the true expression level is unknown. Because this censoring can affect a high proportion of the data, it is one of the main challenges when dealing with single-cell qPCR data. PCA is an important tool for visualising the structure of high-dimensional data and for identifying sub-populations of cells. However, to date it is not clear how to perform a PCA of censored data. We present a probabilistic approach which accounts for the censoring and evaluate it on two typical data-sets containing single-cell qPCR data.

RESULTS: We use the Gaussian Process Latent Variable Model (GPLVM) framework to account for censoring by introducing an appropriate noise model and allowing a different kernel for each dimension. We evaluate this new approach on two typical qPCR data-sets (of mouse embryonic stem cells and blood stem/progenitor cells, respectively) by performing linear and non-linear probabilistic PCA. Taking the censoring into account results in a 2D representation of the data which better reflects its known structure: in both data-sets our new approach yields a better separation of known cell types, and in one data-set it reveals subpopulations which could not be resolved using standard PCA.

AVAILABILITY: The implementation was based on the existing GPLVM toolbox (1); extensions for noise models and kernels accounting for censoring are available from http://icb.helmholtz-muenchen.de/censgplvm.
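The idea of a noise model that accounts for censoring can be illustrated with a small sketch. This is not the authors' GPLVM implementation. It assumes a Gaussian noise model on an expression scale where values at or below a detection limit are censored (a Tobit-style likelihood), and the function names, the `limit` parameter and the direction of censoring are hypothetical choices made for illustration. Detected values contribute the usual Gaussian density, while censored values contribute the probability that the true value lies at or below the limit.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(x, mu, sigma):
    """Log of P(Y <= x) for Y ~ N(mu, sigma^2)."""
    z = (x - mu) / sigma
    # 0.5 * erfc(-z / sqrt(2)) equals the standard normal CDF at z.
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    # Clamp to avoid log(0) when the probability underflows (illustrative guard).
    return math.log(max(p, 1e-300))


def censored_loglik(values, censored, means, sigma, limit):
    """Log-likelihood of measurements under an assumed censored Gaussian model.

    values   -- measured values (ignored where censored is True)
    censored -- flags marking failed reactions (hypothetical encoding)
    means    -- model predictions, e.g. from a low-dimensional projection
    sigma    -- noise standard deviation
    limit    -- assumed detection limit on the expression scale
    """
    total = 0.0
    for y, is_censored, mu in zip(values, censored, means):
        if is_censored:
            # Only known: the true value lies at or below the limit.
            total += normal_logcdf(limit, mu, sigma)
        else:
            total += normal_logpdf(y, mu, sigma)
    return total
```

Under these assumptions, a censored measurement does not pull the model towards one particular value. It only favours predictions that lie at or below the detection limit, which is the behaviour a plain Gaussian likelihood applied to substituted values cannot provide.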
Tags
Icb_Latent Causes Icb_ML
Publication type Article: Journal article
Document type Scientific article
Keywords Gene-expression Analysis; Principal Component; Detection Limits; Stem; Heterogeneity; Hematopoiesis; Blastocyst; Zygote
ISSN (print) / ISBN 1367-4803
Journal Bioinformatics
Source Volume: 30, Issue: 13, Pages: 1867-1875
Publisher Oxford University Press
Place of publication Oxford
Review status