
RAD-seq EM mixture with logistic prior: scan, annotate, model, diff

Under active development — interfaces may evolve
License: CC BY-NC-ND 4.0

accmix: Accessibility Mixture Model

CLI toolkit to (1) scan PWMs genome-wide, (2) compute accessibility-derived site scores s_l, (3) annotate with TSS/conservation/TPM, and (4) fit and evaluate a Gaussian EM with a logistic prior.

Installation

Install from a local clone of the GitHub repository (editable mode):

pip install -e .

Install from PyPI:

pip install accmix

Documentation

Full user and API documentation is available at:

Data layout

Example inputs are referenced under data/...:

  • data/fasta/test.fa – small test genome FASTA.
  • data/pwms/M00124_example.txt – example PWM for scanning.
  • data/clipseq/ELAVL1_HeLa.bed – example CLIP-seq peaks.

CLI overview

The main entry point is accmix with subcommands:

1. Scan PWM and GC

accmix scan \
  -f data/fasta/test.fa \
  -p data/pwms/M00124_example.txt \
  -o results/M00124_example

Outputs (for the example PWM):

  • results/M00124_example_topA.tsv.gz
  • results/M00124_example_botB.tsv.gz

2. Annotate accessibility (compute s_l)

accmix annotate-acc \
  -n data/AS/ANC1C.hisat3n_table.bed6 \
  -f data/AS/ANC1xC.hisat3n_table.bed6 \
  -t results/M00124_example_topA.tsv.gz \
  -o results/M00124_example_sl.parquet

Important options:

  • -M / --M – inner flank size (default: 50).
  • -N / --N – outer flank size (default: 500). Outer flank length is N - M.
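The relationship between M and N can be made concrete with a small sketch. It assumes that both flanks are measured outward from the site boundaries and that coordinates are half-open; the README does not state how accmix positions the windows, so `flank_windows` is a hypothetical helper for illustration, not part of the package.

```python
def flank_windows(site_start, site_end, M=50, N=500):
    """Return inner and outer flank intervals (half-open) on both sides of a site.

    Assumption: flanks are measured outward from the site boundaries.
    The inner flank covers the first M bases; the outer flank covers
    bases M..N, so its length is N - M.
    """
    if not 0 <= M <= N:
        raise ValueError("expected 0 <= M <= N")
    # Clamp left-hand coordinates at 0 so windows never run off the sequence start.
    inner_left = (max(site_start - M, 0), site_start)
    inner_right = (site_end, site_end + M)
    outer_left = (max(site_start - N, 0), max(site_start - M, 0))
    outer_right = (site_end + M, site_end + N)
    return {"inner": [inner_left, inner_right], "outer": [outer_left, outer_right]}


# Example: an 8-bp site at positions 1000..1008 with the default M and N.
windows = flank_windows(1000, 1008)
for name, intervals in windows.items():
    print(name, intervals)
```

With the defaults, each inner flank is 50 bases long and each outer flank is 450 bases long (N - M).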

3. Annotate TSS / conservation / TPM

accmix annotate-tss \
  -i results/M00124_example_sl.parquet \
  -o results/M00124_example_annotated.parquet \
  -r data/evaluation/RNAseq_HeLa_TPM.parquet \
  -c data/evaluation/phastCons100way.bed.gz \
  -p data/evaluation/phastCons100way.parquet \
  -y data/evaluation/phyloP100way.parquet \
  -R data/fasta/test.fa

4. Fit EM model (Gaussian + logistic prior)

accmix model \
  -i results/M00124_example_annotated.parquet \
  -o results/RBP_Motif \
  -r ExampleRBP \
  -m M00124

Outputs:

  • results/RBP_Motif.XXXXXX.model.parquet – input data with prior_p and posterior_r.
  • results/RBP_Motif.XXXXXX.model.json – fitted model parameters.
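To illustrate what "Gaussian EM with a logistic prior" can mean, here is a minimal NumPy sketch of a two-component Gaussian mixture over site scores in which the per-site mixing probability is a logistic function of covariates. It is a sketch under assumptions, not the accmix implementation: the function `fit_em`, the synthetic data, the single covariate, and the gradient-ascent update for the logistic weights are all illustrative choices. Only the names `prior_p` and `posterior_r` are taken from the output description above.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def posterior(s, prior_p, mu0, sd0, mu1, sd1):
    # Bayes rule: responsibility of the "bound" component (1) for each site.
    num = prior_p * normal_pdf(s, mu1, sd1)
    den = num + (1.0 - prior_p) * normal_pdf(s, mu0, sd0)
    return num / np.maximum(den, 1e-300)


def fit_em(s, X, n_iter=50, lr=0.5, grad_steps=20):
    """Fit a 2-component Gaussian mixture on s with prior sigmoid(w . [1, X])."""
    n = len(s)
    Xb = np.column_stack([np.ones(n), X])  # prepend an intercept column
    w = np.zeros(Xb.shape[1])
    # Initialise the components from the lower and upper halves of the scores.
    med = np.median(s)
    mu0, mu1 = s[s <= med].mean(), s[s > med].mean()
    sd0 = sd1 = s.std() + 1e-6
    for _ in range(n_iter):
        # E-step: posterior responsibilities under the current parameters.
        r = posterior(s, sigmoid(Xb @ w), mu0, sd0, mu1, sd1)
        # M-step (Gaussians): responsibility-weighted means and std devs.
        w1, w0 = r.sum() + 1e-12, (1.0 - r).sum() + 1e-12
        mu1, mu0 = (r * s).sum() / w1, ((1.0 - r) * s).sum() / w0
        sd1 = np.sqrt((r * (s - mu1) ** 2).sum() / w1) + 1e-6
        sd0 = np.sqrt(((1.0 - r) * (s - mu0) ** 2).sum() / w0) + 1e-6
        # M-step (logistic prior): gradient ascent with r as soft labels.
        for _ in range(grad_steps):
            w = w + lr * Xb.T @ (r - sigmoid(Xb @ w)) / n
    prior_p = sigmoid(Xb @ w)
    posterior_r = posterior(s, prior_p, mu0, sd0, mu1, sd1)
    params = {"w": w, "mu0": mu0, "sd0": sd0, "mu1": mu1, "sd1": sd1}
    return params, prior_p, posterior_r


# Synthetic demonstration data (not from accmix): one covariate x drives the prior.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = rng.random(n) < sigmoid(-1.0 + 2.0 * x)  # hidden component labels
s = np.where(z, rng.normal(2.0, 1.0, n), rng.normal(-1.0, 1.0, n))

params, prior_p, posterior_r = fit_em(s, x[:, None])
print({k: np.round(v, 2) for k, v in params.items()})
```

In this sketch `prior_p` depends only on the covariates, while `posterior_r` also takes the observed score into account, which matches the distinction between the two columns in the model output.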

Notes

  • Dependencies: polars, pyranges, metagene, numpy, scipy, pyarrow, numba, tqdm, typer[all], scikit-learn, matplotlib, seaborn, pandas.
  • The CLI options mirror the underlying scripts; run accmix <command> --help for full details.
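The four steps above can be chained from Python by assembling the same argument lists shown in the CLI overview. The sketch below only builds and prints the commands; `build_pipeline` is a hypothetical helper, and it assumes the scan output is named by appending `_topA.tsv.gz` to the `-o` prefix, as in the outputs listed for step 1.

```python
import shlex


def build_pipeline(prefix="results/M00124_example", model_prefix="results/RBP_Motif"):
    """Return the argv lists for scan, annotate-acc, annotate-tss and model."""
    scan = [
        "accmix", "scan",
        "-f", "data/fasta/test.fa",
        "-p", "data/pwms/M00124_example.txt",
        "-o", prefix,
    ]
    annotate_acc = [
        "accmix", "annotate-acc",
        "-n", "data/AS/ANC1C.hisat3n_table.bed6",
        "-f", "data/AS/ANC1xC.hisat3n_table.bed6",
        "-t", prefix + "_topA.tsv.gz",  # assumed: scan output = prefix + suffix
        "-o", prefix + "_sl.parquet",
    ]
    annotate_tss = [
        "accmix", "annotate-tss",
        "-i", prefix + "_sl.parquet",
        "-o", prefix + "_annotated.parquet",
        "-r", "data/evaluation/RNAseq_HeLa_TPM.parquet",
        "-c", "data/evaluation/phastCons100way.bed.gz",
        "-p", "data/evaluation/phastCons100way.parquet",
        "-y", "data/evaluation/phyloP100way.parquet",
        "-R", "data/fasta/test.fa",
    ]
    model = [
        "accmix", "model",
        "-i", prefix + "_annotated.parquet",
        "-o", model_prefix,
        "-r", "ExampleRBP",
        "-m", "M00124",
    ]
    return [scan, annotate_acc, annotate_tss, model]


commands = build_pipeline()
for argv in commands:
    # Print each command; to execute it, use subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Keeping each command as a list avoids shell quoting issues if the commands are later run with `subprocess.run`.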

All files under data/ can be downloaded from:

https://(university of chicago's short name, with 8 letters).box.com/s/(remove following first nine letters and final 6 letters)meipiannitsmywbgimvl47w9nynq66hur3c8dnirvzhende

If you use this package, please cite:

Li Y., accmix: Accessibility Mixture Model, GitHub repository, https://github.com/yangli04/RAD-seq_EM_model
