
Module for quality control of ATAC-seq data


Periodicity Evaluation in scATAC-seq data for quality assessment

A Python tool for ATAC-seq quality control in single cells. At the bulk level, quality control approaches rely on four key aspects:

- signal-to-noise ratio
- library complexity
- ratio of mitochondrial to nuclear DNA
- fragment length distribution

PEAKQC relies on the evaluation of the fragment length distribution. At the bulk level this evaluation is done visually, which is not possible at the single-cell level. PEAKQC overcomes this constraint with a convolution-based algorithmic approach.
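As an illustration of the idea only, the sketch below scores how periodic a single cell's fragment length distribution is by sliding a cosine kernel over its length histogram. This is not PEAKQC's implementation: the function names, the cosine kernel, the assumed 200 bp period, and the max-response score are all illustrative assumptions, and the two cells are simulated.

```python
import math
import random


def length_histogram(lengths, max_len=800):
    """Count fragments per length (bp), ignoring lengths outside 1..max_len."""
    counts = [0.0] * (max_len + 1)
    for length in lengths:
        if 0 < length <= max_len:
            counts[length] += 1.0
    total = sum(counts)
    # Normalise so cells with different fragment counts are comparable.
    return [c / total for c in counts] if total else counts


def cosine_kernel(period=200, cycles=2):
    """Zero-mean cosine kernel; `period` (bp) is an assumed nucleosome spacing."""
    size = period * cycles
    kernel = [math.cos(2 * math.pi * i / period) for i in range(size)]
    mean = sum(kernel) / size
    return [k - mean for k in kernel]


def periodicity_score(lengths, period=200):
    """Toy score: strongest response of the histogram to the periodic kernel."""
    hist = length_histogram(lengths)
    kernel = cosine_kernel(period)
    best = 0.0
    # Sliding dot product (cross-correlation, i.e. convolution with the
    # flipped kernel), evaluated only where the kernel fits entirely.
    for start in range(len(hist) - len(kernel) + 1):
        window = hist[start:start + len(kernel)]
        response = sum(h * k for h, k in zip(window, kernel))
        best = max(best, response)
    return best


rng = random.Random(0)
# Simulated cell with nucleosome-like peaks vs. a cell without any structure.
periodic_cell = [round(rng.gauss(rng.choice([200, 400, 600]), 20)) for _ in range(3000)]
flat_cell = [rng.randint(1, 800) for _ in range(3000)]

print("periodic cell score:", periodicity_score(periodic_cell))
print("flat cell score:    ", periodicity_score(flat_cell))
```

In this toy setup the cell with regularly spaced peaks responds much more strongly to the kernel than the flat one, which is the property a per-cell score needs in order to replace visual inspection.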

API Documentation

Detailed API documentation is available on our documentation page: https://loosolab.pages.gwdg.de/software/peakqc/

Workflow

To run the tool, provide an AnnData object and the fragments corresponding to the cells in it. The fragments can be read either directly from a BAM file or from a fragments file in BED format. If a fragments BED file is available, using it is recommended because it shortens the runtime.
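Before running the tool, it can help to confirm that the fragments really correspond to the cells in the AnnData object. The sketch below is independent of PEAKQC: it assumes a tab-separated fragments file with the columns chrom, start, end, barcode (further columns are ignored), and both helper names are hypothetical.

```python
import gzip
from collections import defaultdict


def read_fragment_lengths(path):
    """Collect fragment lengths per barcode from a fragments BED file.

    Assumed layout (not taken from the PEAKQC docs): tab-separated columns
    chrom, start, end, barcode; any extra columns are ignored.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = defaultdict(list)
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and comment/header lines.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            start, end, barcode = int(fields[1]), int(fields[2]), fields[3]
            lengths[barcode].append(end - start)
    return dict(lengths)


def barcode_overlap(cell_barcodes, fragment_barcodes):
    """Fraction of cell barcodes that have at least one fragment."""
    cells = set(cell_barcodes)
    if not cells:
        return 0.0
    return len(cells & set(fragment_barcodes)) / len(cells)
```

With an AnnData object whose barcodes are the .obs index, `barcode_overlap(adata.obs.index, read_fragment_lengths('path/to/fragments.bed'))` should be close to 1.0; a value near 0 usually points to mismatched barcode formats.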

Installation

PyPI

pip install peakqc

From Source

1. Environment & Package Installation

  1. Clone the repository. This downloads the repository into the current directory.
git clone git@gitlab.gwdg.de:loosolab/software/peakqc.git
  2. Change the working directory to the newly created repository directory.
cd peakqc
  3. Install the analysis environment. Note: mamba is faster than conda, but it must be installed first.
mamba env create -f peakqc_env.yml
  4. Activate the environment.
conda activate peakqc
  5. Install PEAKQC into the environment.
pip install .

2. Package Installation

  1. Clone the repository. This downloads the repository into the current directory.
git clone git@gitlab.gwdg.de:loosolab/software/peakqc.git
  2. Change the working directory to the newly created repository directory.
cd peakqc
  3. Install PEAKQC into the current environment.
pip install .
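After either installation route, a quick check from Python confirms that the package is visible in the active environment. This is a generic sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of PEAKQC.

```python
from importlib import metadata


def installed_version(package="peakqc"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version:
    print(f"peakqc {version} is installed")
else:
    print("peakqc is not installed in this environment")
```

If the second message appears, the most likely cause is that the peakqc environment was not activated before running pip.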

Quickstart

Below is a minimal example showing how to integrate fragment length distribution (FLD) scoring into a Jupyter Notebook. A fully worked example is available at paper/example_notebook.ipynb.

  1. Load your AnnData object
import scanpy as sc

# replace with your path to the .h5ad file
anndata = sc.read_h5ad('path/to/your_data.h5ad')

Note: We recommend storing your cell barcodes as the .obs index of the AnnData object. If your barcodes are instead in a specific .obs column, you can override this via the barcode_col parameter (see below).

  2. Import the FLD scoring function
from peakqc.fld_scoring import add_fld_metrics
  3. Prepare fragment files

    • Provide either a BED or BAM file via fragments=.

    • BED files are recommended for faster runtime.

    • Example:

fragments = 'path/to/fragments.bed'      # or .bam
  4. Run FLD scoring
adata = add_fld_metrics(adata=anndata,
                        fragments=fragments,
                        barcode_col=None,  # set to an .obs column name if barcodes are not the index
                        plot=True,
                        save_density=None,
                        save_overview=None,
                        sample=0,
                        n_threads=8,
                        sample_size=5000,
                        mc_seed=42,
                        mc_samples=1000)
  5. Filter on PEAKQC scores

    PEAKQC scores correlate positively with the quality of the FLD pattern. In our experience, keeping only cells with a PEAKQC score above 100 is generally effective for filtering out low-quality cells. However, optimal thresholds can vary between datasets and should be tuned to achieve reliable results.

    Threshold selection may also depend on the specific requirements of your downstream analysis and should be adjusted accordingly.
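Since the threshold has to be tuned per dataset, it is useful to see how many cells survive at several candidate cut-offs before committing to one. The sketch below works on plain Python values; the helper names and the example scores are made up for illustration, and the name of the .obs column holding the PEAKQC score is not stated here, so it is left out on purpose.

```python
def fraction_kept(scores, thresholds=(50, 100, 150, 200)):
    """Map each candidate threshold to the fraction of cells scoring above it."""
    scores = list(scores)
    if not scores:
        return {t: 0.0 for t in thresholds}
    return {t: sum(s > t for s in scores) / len(scores) for t in thresholds}


def passing_barcodes(score_by_barcode, threshold=100):
    """Barcodes whose score exceeds `threshold` (100 is only a starting point)."""
    return [bc for bc, score in score_by_barcode.items() if score > threshold]


# Hypothetical scores for four cells, purely for demonstration.
example_scores = {"AAAC": 30, "CCCA": 120, "GGGT": 250, "TTTG": 95}

for threshold, fraction in fraction_kept(example_scores.values()).items():
    print(f"threshold {threshold}: {fraction:.0%} of cells kept")
print("kept at 100:", passing_barcodes(example_scores))
```

With a real AnnData object, the scores would be read from the corresponding column of adata.obs, and the surviving barcodes can then be used to subset the object.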

For a step-by-step walkthrough with plotting examples, see the example notebook at paper/example_notebook.ipynb.

Download files

Source Distribution

peakqc-0.1.6.tar.gz (95.2 MB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3
  • SHA256: cd41369c31caaa0dedd38080dae56de599ebebb9b1b7b2829ca805c7d2f2044c
  • MD5: 2b416c385a08f3de285dfa577e362fae
  • BLAKE2b-256: 0a0c9b6d79c013b54b4878906fafc1c1a09ac89cc50c2aa81bba54d70d371d2d

Built Distribution

peakqc-0.1.6-py3-none-any.whl (20.5 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3
  • SHA256: d4138d073870674ce673658dda98ec0d370a8bd6d3e73a3f3d015d55a00f3c36
  • MD5: 2a4dc9222190bd8fa6672e31344f5094
  • BLAKE2b-256: 883a099a7673f6c0ca0ed00ffecb44abd35c3c79e29930ec5d9689f1b6c133be
