
Rigorous integration of single-cell ATAC-seq data using regularized barycentric mapping

Project description

Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) profiles genome-wide chromatin accessibility and provides insight into gene regulation mechanisms. As sequencing technology advances, scATAC-seq datasets increasingly comprise many samples generated under varied conditions, which introduces complex, multifactorial batch effects and calls for reliable batch integration tools. Although numerous batch integration tools exist for single-cell RNA sequencing (scRNA-seq) data, their effectiveness on scATAC-seq data is limited because of the characteristic differences between the two data types. Existing integration methods for scATAC-seq data have several fundamental limitations, such as disrupting biological heterogeneity and correcting only the low-dimensional representation, which can distort the data and hinder downstream analysis. Here we propose Fountain, a deep learning framework for scATAC-seq data integration via rigorous barycentric mapping. Fountain regularizes the barycentric mapping with geometric information from the data to achieve integration that preserves biological heterogeneity. Through comprehensive experiments on multiple datasets spanning various laboratory protocols, sample compositions and species, we demonstrate the advantages of Fountain over existing methods in batch correction and biological conservation. In addition, a trained Fountain model can integrate data from new batches alongside already integrated data without retraining, which facilitates the incorporation of additional data and enables continuous online data integration. We also provide a reconstruction strategy to obtain batch-corrected ATAC profiles, which better capture cellular heterogeneity and reveal cell type-specific implications in analyses such as expression enrichment analysis and partitioned heritability analysis.
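To make the term "barycentric mapping" concrete, here is a minimal NumPy sketch of the generic optimal-transport construction: compute an entropy-regularized transport plan between two batches with Sinkhorn iterations, then move each source cell to the plan-weighted average of the target cells. This is not Fountain's implementation and does not use the scFountain API. The function names (`sinkhorn_plan`, `barycentric_map`), the entropic regularizer, the parameter values and the toy 2-D data are all assumptions chosen for illustration, and the geometric regularization described above is not modeled.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, eps=0.1, n_iter=500):
    """Entropy-regularized transport plan between weight vectors a and b.

    Hypothetical helper for illustration; eps and n_iter are arbitrary choices.
    """
    K = np.exp(-cost / eps)            # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)              # scale columns toward marginal b
        u = a / (K @ v)                # scale rows toward marginal a
    return u[:, None] * K * v[None, :]


def barycentric_map(source, target, eps=0.1, n_iter=500):
    """Map each source point to the plan-weighted mean of the target points."""
    n, m = len(source), len(target)
    a = np.full(n, 1.0 / n)            # uniform weights on source cells
    b = np.full(m, 1.0 / m)            # uniform weights on target cells
    # Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
    cost = ((source[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()
    plan = sinkhorn_plan(cost, a, b, eps, n_iter)
    # Row-normalize the plan so each mapped point is a convex combination
    # of target points.
    return (plan @ target) / plan.sum(axis=1, keepdims=True)


# Toy data (assumed): two "batches" in a 2-D embedding, offset along axis 0.
rng = np.random.default_rng(0)
batch_a = rng.normal(size=(50, 2))
batch_b = rng.normal(size=(60, 2)) + np.array([5.0, 0.0])

mapped = barycentric_map(batch_a, batch_b)
print(mapped.shape)
```

Because every mapped point is a convex combination of target points, the mapped batch lies inside the region occupied by the target batch. A smaller `eps` gives a sharper, closer-to-one-to-one assignment, while a larger `eps` pulls all mapped points toward the target mean.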

Download files


Source Distribution

scFountain-0.0.3.tar.gz (11.5 kB)

Built Distribution

scFountain-0.0.3-py3-none-any.whl (13.2 kB, Python 3)
