bioextract

Library-first extraction helpers for bioinformatics resource snapshots.

Install

  • pip install bioextract

STRINGdb

from bioextract.stringdb import StringDb, StringResourceLimits

selection = (
    StringDb.from_files(
        file_aliases="9606.protein.aliases.v12.0.txt.gz",
        file_links="9606.protein.links.v12.0.txt.gz",
        limits=StringResourceLimits(num_input_ids_max=50_000),
    )
    .select_ids(["P04637", "EGFR", "CDK2"])
    .with_score_min(400)
)

df_mapping = selection.extract_string_mapping()
df_unmapped = selection.extract_unmapped_input_ids()
df_edges = selection.extract_edges()

print(df_mapping)
print(df_unmapped)
print(df_edges)

# Group-wise selection: map each group name to its own list of input IDs.
from bioextract.stringdb import StringDb

df_group_edges = (
    StringDb.from_files(
        file_aliases="9606.protein.aliases.v12.0.txt.gz",
        file_links="9606.protein.links.v12.0.txt.gz",
    )
    .select_groups(
        {
            "TumorA": ["TP53", "EGFR"],
            "TumorB": ["CDK2", "TP53"],
        }
    )
    .with_score_min(400)
    .extract_edges()
)
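
For orientation, STRING link files are space-separated tables with the columns protein1, protein2 and combined_score, where the score is an integer on a 0 to 1000 scale and 400 is STRING's conventional "medium confidence" cutoff. The sketch below shows what a minimum-score filter amounts to on such a table. It does not use bioextract at all: it assumes that with_score_min(400) keeps rows whose combined_score is at least 400 (the inclusive bound is an assumption), the helper name filter_links is hypothetical, and the IDs and scores in the toy table are made up.

```python
import csv
import io

# Toy stand-in for a STRING links file: space-separated with one header row.
# The protein IDs and scores below are invented for illustration only.
LINKS_TEXT = """protein1 protein2 combined_score
9606.ENSP_A 9606.ENSP_B 999
9606.ENSP_A 9606.ENSP_C 420
9606.ENSP_B 9606.ENSP_C 180
"""


def filter_links(text, score_min):
    """Return (protein1, protein2, score) tuples with score >= score_min.

    Hypothetical helper, not part of bioextract.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=" ")
    return [
        (row["protein1"], row["protein2"], int(row["combined_score"]))
        for row in reader
        if int(row["combined_score"]) >= score_min
    ]


edges = filter_links(LINKS_TEXT, score_min=400)
for edge in edges:
    print(edge)
```

Raising the threshold to 700 would keep only the first row of the toy table, which is why the cutoff is worth choosing deliberately before extracting edges.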

OmniPath

from bioextract.omnipath import OmniPathDb

selection = (
    OmniPathDb.from_files(
        file_enzsub="enzsub.tsv.gz",
        file_interactions="interactions.tsv.gz",
    )
    .select_ids(["P31749", "AKT1", "BAD"])
    .with_enzsub()
)

df_enzsub = selection.extract_enzsub()
df_unmapped = selection.extract_unmapped_input_ids()

print(df_enzsub)
print(df_unmapped)

# Group-wise selection: map each group name to its own list of input IDs.
from bioextract.omnipath import OmniPathDb

df_group_interactions = (
    OmniPathDb.from_files(file_interactions="interactions.tsv.gz")
    .select_groups(
        {
            "TumorA": ["AKT1", "MTOR"],
            "TumorB": ["EGFR", "ERBB2"],
        }
    )
    .with_interactions()
    .extract_interactions()
)

Development

  • PYTHONPATH=src pytest
  • PYTHONPATH=src python scripts/benchmark_stringdb.py

Release

  • GitHub Actions provides:
    • .github/workflows/py-ci.yml for test-and-build checks on push and pull request
    • .github/workflows/publish.yml for tag-triggered PyPI publishing
  • Release tags must be canonical PEP 440 versions such as 0.1.1
  • The publish workflow expects PyPI trusted publishing to be configured for the pypi environment
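
A tag can be checked for canonical PEP 440 form before it is pushed. The sketch below adapts the canonical-version regular expression given in PEP 440 (Appendix B), using re.fullmatch in place of the original ^ and $ anchors. The function name is_canonical is hypothetical, and nothing here is claimed to be what publish.yml itself runs.

```python
import re

# Canonical-form pattern adapted from PEP 440, Appendix B.
# Anchors are dropped because re.fullmatch already requires a full match.
_CANONICAL_PEP440 = re.compile(
    r"([1-9][0-9]*!)?"              # optional epoch, e.g. "1!"
    r"(0|[1-9][0-9]*)"              # first release segment
    r"(\.(0|[1-9][0-9]*))*"         # further release segments
    r"((a|b|rc)(0|[1-9][0-9]*))?"   # optional pre-release
    r"(\.post(0|[1-9][0-9]*))?"     # optional post-release
    r"(\.dev(0|[1-9][0-9]*))?"      # optional dev release
)


def is_canonical(version: str) -> bool:
    """Return True if `version` is a canonical PEP 440 version string."""
    return _CANONICAL_PEP440.fullmatch(version) is not None


for tag in ["0.1.1", "v0.1.1", "0.1.1rc1", "0.1.1-rc1"]:
    print(tag, is_canonical(tag))
```

Under this check a tag such as v0.1.1 is rejected because of the leading "v", and 0.1.1-rc1 is rejected because the canonical pre-release spelling is 0.1.1rc1.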

Download files

Source Distribution

bioextract-0.0.2.tar.gz (18.3 kB)

Uploaded Source

Built Distribution

bioextract-0.0.2-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file bioextract-0.0.2.tar.gz.

File metadata

  • Download URL: bioextract-0.0.2.tar.gz
  • Size: 18.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for bioextract-0.0.2.tar.gz
Algorithm Hash digest
SHA256 b59be4ae131b33b3436db2bfa84bf4323b32c80c4ced05ed597df267d4c73141
MD5 8451d26b7739e78f7293fce5626d8a30
BLAKE2b-256 6d1f69f39e45a3ef0516752dc1962aab88e2fdea5af3f5f0bf49bf8c058c7f5a
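
A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch: the helper names sha256_of and matches_sha256 are hypothetical, and the commented call assumes the source distribution was saved to the current directory under its published file name. The demo at the end runs on a temporary file so that the block works without any download.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Digest published for bioextract-0.0.2.tar.gz in the listing above.
EXPECTED_SDIST_SHA256 = (
    "b59be4ae131b33b3436db2bfa84bf4323b32c80c4ced05ed597df267d4c73141"
)
# Assumes the file is in the current directory:
# matches_sha256("bioextract-0.0.2.tar.gz", EXPECTED_SDIST_SHA256)

# Self-contained demo on a temporary file with known contents.
payload = b"example payload"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_ok = matches_sha256(tmp.name, hashlib.sha256(payload).hexdigest())
demo_bad = matches_sha256(tmp.name, "0" * 64)
os.remove(tmp.name)
print(demo_ok, demo_bad)
```

pip can enforce the same check at install time through a requirements file that pins hashes, which is usually preferable to verifying by hand.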

Provenance

The following attestation bundles were made for bioextract-0.0.2.tar.gz:

Publisher: publish.yml on FuqingZh/bioextract

File details

Details for the file bioextract-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: bioextract-0.0.2-py3-none-any.whl
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for bioextract-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 202ed4873304a442df01f784620810afd8f0cd6ca34916fd6109633af12a2642
MD5 1d3f8414bb45c98e7142e9eed0d30f40
BLAKE2b-256 b31d2873082a15ccb3d85cf586e275bf58b50bd6dd416e6c13d78c3c7fc9ecb8

Provenance

The following attestation bundles were made for bioextract-0.0.2-py3-none-any.whl:

Publisher: publish.yml on FuqingZh/bioextract
