Phenotype-based prioritization of variants with CADA

CADA: The Next Generation

This is a re-implementation of the CADA method for phenotype-similarity prioritization.

Running Hyperparameter Tuning

Install with the "tune" extra enabled (the quotes keep shells such as zsh from interpreting the brackets):

pip install "cada-prio[tune]"

Run tuning, e.g., on the "classic" model. Because Optuna keeps its trials in the database given as the first argument, you can start several runs in parallel as long as they all share that database. In the example below, each run uses 4 CPUs and performs 1 trial.

cada-prio tune run-optuna \
    sqlite:///local_data/cada-tune.sqlite \
    --path-hgnc-json data/classic/hgnc_complete_set.json \
    --path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
    --path-hpo-obo data/classic/hp.obo \
    --path-clinvar-phenotype-links data/classic/cases_train.jsonl \
    --path-validation-links data/classic/cases_validate.jsonl \
    --n-trials 1 \
    --cpus=4
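
To run several such workers in parallel against the same database, a small launcher can help. The following is a minimal sketch, not part of cada-prio: the helper names `build_tune_command` and `run_workers` are hypothetical, and the paths and flags are copied from the command above. It assumes that `cada-prio` is on the `PATH` and that all workers can reach the same SQLite file.

```python
import shlex
import subprocess
from typing import List

# Database URL and data directory as used in the command above.
DB_URL = "sqlite:///local_data/cada-tune.sqlite"
DATA_DIR = "data/classic"


def build_tune_command(n_trials: int = 1, cpus: int = 4) -> List[str]:
    """Return the argv list for one `cada-prio tune run-optuna` worker.

    Hypothetical helper; it only assembles the flags shown in the README.
    """
    return [
        "cada-prio", "tune", "run-optuna",
        DB_URL,
        "--path-hgnc-json", f"{DATA_DIR}/hgnc_complete_set.json",
        "--path-hpo-genes-to-phenotype",
        f"{DATA_DIR}/genes_to_phenotype.all_source_all_freqs_etc.txt",
        "--path-hpo-obo", f"{DATA_DIR}/hp.obo",
        "--path-clinvar-phenotype-links", f"{DATA_DIR}/cases_train.jsonl",
        "--path-validation-links", f"{DATA_DIR}/cases_validate.jsonl",
        "--n-trials", str(n_trials),
        f"--cpus={cpus}",
    ]


def run_workers(n_workers: int) -> List[int]:
    """Start n_workers tuning processes at once and return their exit codes."""
    procs = [subprocess.Popen(build_tune_command()) for _ in range(n_workers)]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Print the command one worker would run; nothing is launched here.
    print(shlex.join(build_tune_command()))
    # To actually launch the workers (requires cada-prio[tune]), call e.g.:
    # run_workers(n_workers=2)
```

With `n_workers=2` and the defaults above, this would occupy 8 CPUs in total and add 2 trials to the shared study, assuming both processes finish successfully.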

Managing the GitHub Project with Terraform

# export GITHUB_OWNER=bihealth
# export GITHUB_TOKEN=ghp_<thetoken>

# cd utils/terraform

# terraform init
# terraform import github_repository.cada-prio cada-prio
# terraform validate
# terraform fmt
# terraform plan
# terraform apply

Changelog

0.6.1 (2023-11-16)

Bug Fixes

  • pinning python to 3.11 for build so we have setuptools (#36) (54d4e8c)

0.6.0 (2023-11-16)

Features

  • adding API prefix, OpenAPI and docs to REST server (#35) (1a2f605)
  • adding classic and current model (#25) (44ddf24)

0.5.0 (2023-09-18)

Features

  • adding "tune run-optuna" command (#23) (6cc753b)
  • reusable implementation of "tune train-eval" (#21) (c80c4bf)

0.4.0 (2023-09-14)

Features

  • adding dump-graph to cli (#18) (3aace31)
  • adding param-opt command with single parameter evaluation (#20) (83141c6)
  • allow running with legacy model/graph data (#16) (9d3cc7c)
  • embedding parameters can be provided via CLI and contains seeds (#19) (bbd5d86)

0.3.1 (2023-09-13)

Bug Fixes

  • add missing line endings to hgnc_info.jsonl (#13) (aa14b9b)
  • properly parsing comma-separated list on REST API (#14) (97fdfee)

0.3.0 (2023-09-11)

Features

  • also adding gene-to-phen edges from HPO (#9) (d5a8337)

0.2.1 (2023-09-08)

Bug Fixes

  • removing spurious debug print statement (#7) (98e7443)

0.2.0 (2023-09-08)

Features

  • gene to phenotype links file can be gzipped (#5) (66c48bf)

0.1.0 (2023-09-07)

Features

  • adding REST API server for prediction (#4) (8bb7516)
  • initial training implementation (#1) (10d3a7c)
  • prioritization prediction with model (#3) (48d504c)
