Phenotype-based prioritization of variants with CADA

CADA: The Next Generation

This is a re-implementation of the CADA method for phenotype-similarity prioritization.

Running Hyperparameter Tuning

Install with tune feature enabled:

pip install "cada-prio[tune]"

Run tuning, for example on the "classic" model. Optuna keeps the study in the database given as the first argument, so you can start several runs in parallel as long as they all point to the same database. In the example below, each run uses 4 CPUs and performs 1 trial.

cada-prio tune run-optuna \
    sqlite:///local_data/cada-tune.sqlite \
    --path-hgnc-json data/classic/hgnc_complete_set.json \
    --path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
    --path-hpo-obo data/classic/hp.obo \
    --path-clinvar-phenotype-links data/classic/cases_train.jsonl \
    --path-validation-links data/classic/cases_validate.jsonl \
    --n-trials 1 \
    --cpus=4
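
The parallel setup described above can be sketched as a small wrapper script that starts several workers against the same database. This is only an illustration under stated assumptions: `N_WORKERS`, `DRY_RUN`, `run_worker`, and `launch_workers` are hypothetical names introduced here and are not part of cada-prio. The command and its paths are copied from the example above. By default the script only prints the commands. Note that SQLite may become a bottleneck with many concurrent writers, so a server-based database could be a better fit for larger runs.

```shell
#!/bin/sh
# Sketch: start several tuning workers that share one Optuna database.
# N_WORKERS and DRY_RUN are illustrative variables, not cada-prio options.
set -eu

N_WORKERS="${N_WORKERS:-3}"   # number of parallel worker processes
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = run them

run_worker() {
    # Same invocation as in the example above; every worker points to
    # the same SQLite file so that all trials end up in one study.
    set -- cada-prio tune run-optuna \
        sqlite:///local_data/cada-tune.sqlite \
        --path-hgnc-json data/classic/hgnc_complete_set.json \
        --path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
        --path-hpo-obo data/classic/hp.obo \
        --path-clinvar-phenotype-links data/classic/cases_train.jsonl \
        --path-validation-links data/classic/cases_validate.jsonl \
        --n-trials 1 \
        --cpus=4
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

launch_workers() {
    i=0
    while [ "$i" -lt "$N_WORKERS" ]; do
        run_worker &          # each worker runs as a background job
        i=$((i + 1))
    done
    # Wait for all workers; exit codes of individual workers are not checked.
    wait
}

launch_workers
```

With the defaults, the script prints three identical commands. Setting `DRY_RUN=0` would execute them, with each worker using 4 CPUs and contributing 1 trial.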

Managing GitHub Project with Terraform

# export GITHUB_OWNER=bihealth
# export GITHUB_TOKEN=ghp_<thetoken>

# cd utils/terraform

# terraform init
# terraform import github_repository.cada-prio cada-prio
# terraform validate
# terraform fmt
# terraform plan
# terraform apply

Changelog

0.7.0 (2024-08-29)

Features

0.6.1 (2023-11-16)

Bug Fixes

  • pinning python to 3.11 for build so we have setuptools (#36) (54d4e8c)

0.6.0 (2023-11-16)

Features

  • adding API prefix, OpenAPI and docs to REST server (#35) (1a2f605)
  • adding classic and current model (#25) (44ddf24)

0.5.0 (2023-09-18)

Features

  • adding "tune run-optuna" command (#23) (6cc753b)
  • re-useable implementation of "tune train-eval" (#21) (c80c4bf)

0.4.0 (2023-09-14)

Features

  • adding dump-graph to cli (#18) (3aace31)
  • adding param-opt command with single parameter evaluation (#20) (83141c6)
  • allow running with legacy model/graph data (#16) (9d3cc7c)
  • embedding parameters can be provided via CLI and contains seeds (#19) (bbd5d86)

0.3.1 (2023-09-13)

Bug Fixes

  • add missing line endings to hgnc_info.jsonl (#13) (aa14b9b)
  • properly parsing comma-separated list on REST API (#14) (97fdfee)

0.3.0 (2023-09-11)

Features

  • also adding gene-to-phen edges from HPO (#9) (d5a8337)

0.2.1 (2023-09-08)

Bug Fixes

  • removing spurious debug print statement (#7) (98e7443)

0.2.0 (2023-09-08)

Features

  • gene to phenotype links file can be gzipped (#5) (66c48bf)

0.1.0 (2023-09-07)

Features

  • adding REST API server for prediction (#4) (8bb7516)
  • initial training implementation (#1) (10d3a7c)
  • prioritization prediction with model (#3) (48d504c)
