Library-first extraction helpers for bioinformatics resource snapshots.

bioextract

Install

  • pip install bioextract

STRINGdb

from bioextract.stringdb import StringDb, StringResourceLimits

selection = (
    StringDb.from_files(
        file_aliases="9606.protein.aliases.v12.0.txt.gz",
        file_links="9606.protein.links.v12.0.txt.gz",
        limits=StringResourceLimits(num_input_ids_max=50_000),
    )
    .select_ids(["P04637", "EGFR", "CDK2"])
    .with_score_min(400)
)

df_mapping = selection.extract_string_mapping()
df_unmapped = selection.extract_unmapped_input_ids()
df_edges = selection.extract_edges()

print(df_mapping)
print(df_unmapped)
print(df_edges)

from bioextract.stringdb import StringDb

df_group_edges = (
    StringDb.from_files(
        file_aliases="9606.protein.aliases.v12.0.txt.gz",
        file_links="9606.protein.links.v12.0.txt.gz",
    )
    .select_groups(
        {
            "TumorA": ["TP53", "EGFR"],
            "TumorB": ["CDK2", "TP53"],
        }
    )
    .with_score_min(400)
    .extract_edges()
)
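
The selection steps above (mapping input IDs to STRING protein IDs, reporting unmapped inputs, and keeping edges above a score threshold) can be pictured with a standard-library sketch. This is not bioextract's implementation. It assumes the usual STRING download layout (a tab-separated aliases file with a `#string_protein_id`, `alias`, `source` header and a space-separated links file with `protein1 protein2 combined_score`), and it assumes that an edge is kept only when both endpoints were selected. The helper names `map_input_ids` and `filter_edges` and all sample rows and scores are made up for illustration.

```python
import csv
import io

# Made-up sample rows in the assumed STRING file layouts (not real scores).
ALIASES_TEXT = (
    "#string_protein_id\talias\tsource\n"
    "9606.ENSP00000269305\tTP53\texample_source\n"
    "9606.ENSP00000269305\tP04637\texample_source\n"
    "9606.ENSP00000275493\tEGFR\texample_source\n"
    "9606.ENSP00000266970\tCDK2\texample_source\n"
)
LINKS_TEXT = (
    "protein1 protein2 combined_score\n"
    "9606.ENSP00000269305 9606.ENSP00000275493 650\n"
    "9606.ENSP00000269305 9606.ENSP00000266970 920\n"
    "9606.ENSP00000275493 9606.ENSP00000266970 310\n"
)


def map_input_ids(aliases_text, input_ids):
    """Hypothetical helper: map each input ID to the STRING IDs it aliases."""
    wanted = set(input_ids)
    mapping = {}
    reader = csv.reader(io.StringIO(aliases_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and the commented header line.
        if not row or row[0].startswith("#"):
            continue
        string_id, alias = row[0], row[1]
        if alias in wanted:
            mapping.setdefault(alias, set()).add(string_id)
    # Inputs that never appeared in the alias column, in input order.
    unmapped = [input_id for input_id in input_ids if input_id not in mapping]
    return mapping, unmapped


def filter_edges(links_text, string_ids, score_min):
    """Hypothetical helper: keep edges between selected proteins with score >= score_min."""
    edges = []
    for line in links_text.splitlines()[1:]:  # skip the header line
        protein1, protein2, score = line.split()
        if protein1 in string_ids and protein2 in string_ids and int(score) >= score_min:
            edges.append((protein1, protein2, int(score)))
    return edges


mapping, unmapped = map_input_ids(ALIASES_TEXT, ["P04637", "EGFR", "CDK2", "NOT_A_GENE"])
selected = set().union(*mapping.values())
edges = filter_edges(LINKS_TEXT, selected, score_min=400)

print(mapping)
print(unmapped)
print(edges)
```

In this sketch a UniProt accession (`P04637`) and gene symbols (`EGFR`, `CDK2`) resolve through the same alias table, which is one plausible reason the examples above mix identifier types in a single `select_ids` call.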

OmniPath

from bioextract.omnipath import OmniPathDb

selection = (
    OmniPathDb.from_files(
        file_enzsub="enzsub.tsv.gz",
        file_interactions="interactions.tsv.gz",
    )
    .select_ids(["P31749", "AKT1", "BAD"])
    .with_enzsub()
)

df_enzsub = selection.extract_enzsub()
df_unmapped = selection.extract_unmapped_input_ids()

print(df_enzsub)
print(df_unmapped)

from bioextract.omnipath import OmniPathDb

df_group_interactions = (
    OmniPathDb.from_files(file_interactions="interactions.tsv.gz")
    .select_groups(
        {
            "TumorA": ["AKT1", "MTOR"],
            "TumorB": ["EGFR", "ERBB2"],
        }
    )
    .with_interactions()
    .extract_interactions()
)
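
As a rough picture of what an enzyme-substrate selection might involve, the sketch below filters a tab-separated table by input IDs that may be either accessions or gene symbols. It is not bioextract's implementation. The column names (`enzyme`, `substrate`, `enzyme_genesymbol`, `substrate_genesymbol`) are assumptions about the enzsub export, the helper `select_enzsub` is hypothetical, and the sample rows are made up for illustration rather than taken from OmniPath.

```python
import csv
import io

# Made-up rows in an assumed enzsub layout; not real OmniPath records.
ENZSUB_TEXT = (
    "enzyme\tsubstrate\tenzyme_genesymbol\tsubstrate_genesymbol\tmodification\n"
    "P31749\tQ92934\tAKT1\tBAD\tphosphorylation\n"
    "X00001\tX00002\tGENEX\tGENEY\tphosphorylation\n"
)

# Columns that may carry an identifier the caller supplied (assumed names).
ID_COLUMNS = ("enzyme", "substrate", "enzyme_genesymbol", "substrate_genesymbol")


def select_enzsub(enzsub_text, input_ids):
    """Hypothetical helper: keep rows that mention any input ID, and list unused inputs."""
    wanted = set(input_ids)
    rows = []
    seen = set()
    reader = csv.DictReader(io.StringIO(enzsub_text), delimiter="\t")
    for row in reader:
        hits = {row[column] for column in ID_COLUMNS} & wanted
        if hits:
            rows.append(row)
            seen |= hits
    # Inputs that matched no row, in input order.
    unmapped = [input_id for input_id in input_ids if input_id not in seen]
    return rows, unmapped


rows, unmapped = select_enzsub(ENZSUB_TEXT, ["P31749", "AKT1", "BAD", "NOT_A_GENE"])

print(rows)
print(unmapped)
```

Matching on both accession and gene-symbol columns means `P31749` and `AKT1` hit the same row here, so passing both forms of one protein does not duplicate output in this sketch.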

Development

  • PYTHONPATH=src pytest
  • PYTHONPATH=src python scripts/benchmark_stringdb.py

Release

  • GitHub Actions provides two workflows:
    • .github/workflows/py-ci.yml for test-and-build checks on push and pull request
    • .github/workflows/publish.yml for tag-triggered PyPI publishing
  • Release tags must be canonical PEP 440 versions such as 0.1.1
  • The publish workflow expects PyPI trusted publishing to be configured for the pypi environment
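
To check a tag locally before pushing, one option is the canonical-form regular expression given in PEP 440 (Appendix B). The sketch below is a minimal example of that check. How the publish workflow itself validates tags is not stated here, so treat the function name `is_canonical` and its use as a pre-push check as assumptions.

```python
import re

# Canonical-form pattern from PEP 440: optional epoch, release segments
# without leading zeros, then optional pre-, post-, and dev-release parts.
_CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version: str) -> bool:
    """Return True if the string is a canonical PEP 440 version."""
    return _CANONICAL.match(version) is not None


for tag in ["0.1.1", "v0.1.1", "0.1.1-rc1", "1.0.0rc1", "0.01.1"]:
    print(tag, is_canonical(tag))
```

Under this check a tag with a leading `v`, a hyphenated pre-release, or a zero-padded segment is rejected, while `0.1.1` and `1.0.0rc1` pass.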

Download files

Source Distribution

  • File: bioextract-0.0.4.tar.gz (18.3 kB)
  • Tags: Source
  • Uploaded using Trusted Publishing via twine/6.1.0 CPython/3.13.12
  • Publisher: publish.yml on FuqingZh/bioextract
  • SHA256: 41a50b8dd543271759bfad1f41930c2a1fec7586f3c898e48176090b0874fa02
  • MD5: ee7b05fd617dd9abf59ff5dbea54a4d5
  • BLAKE2b-256: 4a1250380c62f5a2dae6d2d555d164cda7eb0fcabd148b9d0f18e664f322e283

Built Distribution

  • File: bioextract-0.0.4-py3-none-any.whl (17.4 kB)
  • Tags: Python 3
  • Uploaded using Trusted Publishing via twine/6.1.0 CPython/3.13.12
  • Publisher: publish.yml on FuqingZh/bioextract
  • SHA256: 10760534b8606853a64aa901b660e70c21ba5dc424c5ad1b4f05805b6a09c452
  • MD5: bf1585fa04460a507a2d3545f25add96
  • BLAKE2b-256: 8fe4a4801323e654e4ca142564d6e83797f20771afde667571fe280533ddb2c4