Open Access Green as soon as Postprint is submitted to ZB.
Learning Tn5 sequence bias from ATAC-seq on naked chromatin.
Lect. Notes Comput. Sc. 12396 LNCS, 105-114 (2020)
Technological advances over the last decade have resulted in an explosion of biological data. Sequencing methods in particular provide large-scale data sets as a resource for incorporating machine learning into the biological field. By measuring DNA accessibility, for instance, enzymatic hypersensitivity assays facilitate the identification of regions of open chromatin in the genome, marking potential locations of regulatory elements. ATAC-seq is the primary method of choice for determining these footprints. It allows measurements at the cellular level, complementing recent progress in single-cell transcriptomics. However, because the method-specific enzymes tend to bind preferentially to certain sequences, the accessibility profile is confounded by binding specificity, and the inference of open chromatin should be adjusted for this bias. To enable such corrections, we built a deep learning model that learns the sequence specificity of ATAC-seq’s enzyme Tn5 on naked DNA. We found binding preferences and demonstrated that cleavage patterns specific to Tn5 can be discovered by means of convolutional neural networks. In the future, such models can be combined with accessibility analysis to predict bias on new sequences and to provide a better picture of the regulatory landscape of the genome.
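To make the idea of a convolutional filter detecting a sequence pattern concrete, the following is a minimal sketch and not the architecture used in the paper. It one-hot encodes a DNA string and slides a single filter along it with NumPy. The function names `one_hot` and `conv1d_scan`, the example sequence, and the 3-mer "GTC" are all hypothetical and chosen for illustration only; "GTC" is not a reported Tn5 preference, and a trained network would learn its filter weights from data instead of having them set by hand.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (len, 4) array; unknown bases (e.g. N) stay all-zero."""
    idx = {b: i for i, b in enumerate(BASES)}
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in idx:
            out[pos, idx[base]] = 1.0
    return out


def conv1d_scan(x, kernel):
    """Valid cross-correlation of a (L, 4) input with a (k, 4) kernel: one score per window."""
    k = kernel.shape[0]
    n_windows = x.shape[0] - k + 1
    return np.array([np.sum(x[i:i + k] * kernel) for i in range(n_windows)])


# Hypothetical hand-set filter that rewards the 3-mer "GTC" (illustration only).
kernel = one_hot("GTC")

# Hypothetical input sequence; the trailing N contributes nothing to any score.
scores = conv1d_scan(one_hot("AAGTCAAN"), kernel)

# Index of the window where the filter responds most strongly.
best = int(np.argmax(scores))
```

In a bias model of the kind the abstract describes, many such filters would be learned jointly and their responses combined by further layers; this sketch shows only the scanning step for one fixed filter.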
Publication type Article: Journal article
Document type Scientific Article
Keywords Convolutional Neural Networks; Deep Learning; Regulatory Element Discovery; Sequence Preference Bias; Single-cell ATAC-seq
ISSN (print) / ISBN 0302-9743
Source details Volume: 12396 LNCS, Pages: 105-114
Publishing Place Berlin [u.a.]
Institute(s) Institute of Computational Biology (ICB)