Data validation library for GWAS Catalog summary statistics

gwascatalog.sumstatlib

This library contains Pydantic models for new types of genetic variation stored in the GWAS Catalog ("beyond SNPs").

It provides no user-facing applications. Instead, it holds the validation and domain logic used by summary statistic validation applications (gwascatalog.sumstatapp).

Developer notes

Data model overview

This library contains Pydantic models that validate summary statistics from gene-based GWAS and copy number variant (CNV) GWAS. Other types of data (e.g. SNPs) might be validated in the future.

classDiagram
    direction TB

    class BaseSumstatModel {
        <<abstract>>

        %% configuration (conceptual)
        +p_value : PValue | None
        +neg_log10_p_value : NegLog10pValue | None

        +z_score : ZScore | None
        +odds_ratio : OddsRatio | None
        +beta : Beta | None
        +hazard_ratio : HazardRatio | None
        +standard_error : StandardError | None

        +confidence_interval_lower : ConfidenceIntervalLower | None
        +confidence_interval_upper : ConfidenceIntervalUpper | None

        +n : SampleSizePerVariant | None

        --
        -_primary_effect_size : Literal["beta","z_score","hazard_ratio","odds_ratio"] | None

        -_allow_zero_pvalues : bool
    }

    class CNVSumstatModel {
        <<final>>

        +MIN_RECORDS : int
        +FIELD_MAP : Mapping[str,int]
        +VALID_FIELD_NAMES : list[str]

        +chromosome : Chromosome
        +base_pair_start : BasePairStart
        +base_pair_end : BasePairEnd
        +statistical_model_type : StatisticalModelTypeField

        --
        -_assembly : GenomeAssembly
    }

    class GeneSumstatModel {
        <<final>>

        +MIN_RECORDS : int
        +FIELD_MAP : Mapping[str,int]
        +VALID_FIELD_NAMES : list[str]

        +ensembl_gene_id : EnsemblGeneID | None
        +hgnc_symbol : HGNCGeneSymbol | None

        +chromosome : Chromosome | None
        +base_pair_start : BasePairStart | None
        +base_pair_end : BasePairEnd | None
    }

    BaseSumstatModel <|-- GeneSumstatModel
    BaseSumstatModel <|-- CNVSumstatModel
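
The base model carries both p_value and neg_log10_p_value, which are related by neg_log10_p_value = -log10(p_value). The sketch below uses only the standard library to show why the log-scale form is useful for very small p-values. It is an illustration, not the library's own code: the helper to_neg_log10 is hypothetical, and how the library treats zero p-values (see _allow_zero_pvalues) is not described here.

```python
import math


def to_neg_log10(p_value: float) -> float:
    """Convert a p-value in (0, 1] to -log10(p). Hypothetical helper."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError(f"p-value must be in (0, 1], got {p_value!r}")
    return -math.log10(p_value)


# 1e-400 is smaller than the smallest positive float64 (about 5e-324),
# so parsing it as a float silently underflows to zero.
underflowed = float("1e-400")

# The same evidence stored on the -log10 scale is an ordinary float.
neg_log10_p = 400.0

print(underflowed == 0.0)
print(to_neg_log10(5e-8))
```

A file that reports extremely small p-values as plain floats can therefore contain zeros that are not true zeros, which is one situation where a log-scale column avoids losing information.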

Implementing a new data model

If you want to implement a new data model you should:

  1. Create a new Python package inside src/gwascatalog/sumstatlib
  2. Set up annotated types for each field in the new model, importing and reusing types from the core package where possible
  3. Compose a new data model from the annotated types, inheriting from the abstract BaseSumstatModel class
  4. Add tests for your new types and model
  5. Add your model to __all__ in the library's root __init__.py

Download files

Source Distribution

gwascatalog_sumstatlib-1.0.0.tar.gz (20.7 kB view details)

Uploaded Source

Built Distribution

gwascatalog_sumstatlib-1.0.0-py3-none-any.whl (21.3 kB view details)

Uploaded Python 3

File details

Details for the file gwascatalog_sumstatlib-1.0.0.tar.gz.

File metadata

  • Download URL: gwascatalog_sumstatlib-1.0.0.tar.gz
  • Upload date:
  • Size: 20.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for gwascatalog_sumstatlib-1.0.0.tar.gz
Algorithm Hash digest
SHA256 2d6f7423420af8ab93b9af43ad85712cefc49713a980d920e3a48fc653217db3
MD5 6f601512516e0064c6b99cb093dbed1a
BLAKE2b-256 1931b6cdeb5ffe3ca6b2e5f25c3d1691497a0ed1030cbaeb765dea0a23ade88c

File details

Details for the file gwascatalog_sumstatlib-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: gwascatalog_sumstatlib-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for gwascatalog_sumstatlib-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0598d5d64be7b9d9bec19c5a7cfbbbf55fe8ab78c726579fca7e297c2723a9ae
MD5 126e1b48a94063591381ecd88d959583
BLAKE2b-256 f24ce2a47f8c4bff2b1f5c403994de26dbb194d9cb0c8f3f1720489d39a2d05b
