Green Open Access is possible once the postprint has been submitted to the ZB.
Learning Tn5 sequence bias from ATAC-seq on naked chromatin.
Lect. Notes Comput. Sc. 12396 LNCS, 105-114 (2020)
Technological advances in the last decade have resulted in an explosion of biological data. Sequencing methods in particular provide large-scale data sets that serve as a resource for applying machine learning in biology. By measuring DNA accessibility, for instance, enzymatic hypersensitivity assays facilitate the identification of regions of open chromatin in the genome, marking potential locations of regulatory elements. ATAC-seq is the primary method of choice to determine these footprints. It allows measurements at the cellular level, complementing the recent progress in single-cell transcriptomics. However, as the method-specific enzymes tend to bind preferentially to certain sequences, the accessibility profile is confounded by binding specificity. The inference of open chromatin should therefore be adjusted for this bias. To enable such corrections, we built a deep learning model that learns the sequence specificity of ATAC-seq's enzyme Tn5 on naked DNA. We identified binding preferences and demonstrate that cleavage patterns specific to Tn5 can successfully be discovered by means of convolutional neural networks. In the future, such models can be combined with accessibility analysis to predict bias on new sequences and to provide a better picture of the regulatory landscape of the genome.
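To illustrate the basic operation behind a convolutional model of sequence preference, the following minimal sketch one-hot encodes a DNA sequence and slides a single filter over it. It is not the model described in the article: the function names, the filter width and the filter weights (`TOY_KERNEL`, which rewards the arbitrary 3-mer GTC) are assumptions made up for illustration. A trained network would learn many such filters from observed Tn5 cut sites.

```python
# Hypothetical sketch of one convolutional filter scanning DNA.
# Not the authors' architecture; all names and weights are illustrative.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d(encoded, kernel):
    """Valid-mode cross-correlation of a one-hot sequence with a (width x 4) kernel."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(total)
    return scores


def relu(values):
    """Rectified linear unit: negative scores are clipped to zero."""
    return [max(0.0, v) for v in values]


# Made-up weights: +1 for the rewarded base at each position, -1 otherwise.
TOY_KERNEL = [
    [-1, -1, 1, -1],  # position 0 prefers G
    [-1, -1, -1, 1],  # position 1 prefers T
    [-1, 1, -1, -1],  # position 2 prefers C
]


def best_match(seq, kernel):
    """Return (start position, activation) of the strongest filter response."""
    if len(seq) < len(kernel):
        raise ValueError("sequence is shorter than the filter")
    activations = relu(conv1d(one_hot(seq), kernel))
    position = max(range(len(activations)), key=activations.__getitem__)
    return position, activations[position]


if __name__ == "__main__":
    print(best_match("AAGTCAA", TOY_KERNEL))
```

In a bias-correction setting, activations of this kind, computed with learned rather than hand-set weights, would feed later layers that predict how likely the enzyme is to cut at each position, independent of chromatin state.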
Publication type Article: Journal article
Document type Scientific article
Keywords Convolutional Neural Networks ; Deep Learning ; Regulatory Element Discovery ; Sequence Preference Bias ; Single-cell ATAC-seq
ISSN (print) / ISBN 0302-9743
Journal Lecture Notes in Computer Science
Source details Volume: 12396 LNCS, Pages: 105-114
Place of publication Berlin [etc.]
Institute(s) Institute of Computational Biology (ICB)