Modernized InterMine WebService client (Python 3.14+)

Project description

intermine314

Modern InterMine client for Python 3.14+ with:

  • query execution (Service + Query)
  • parallel export with bounded memory (ParallelOptions)
  • ELT workflows to Parquet, DuckDB, and Polars (fetch_from_mine)
  • Tor-safe transport defaults (socks5h:// policy in strict Tor mode)
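
The `socks5h://` scheme matters because it asks the proxy to resolve hostnames, whereas plain `socks5://` resolves them locally and can leak DNS lookups outside Tor. The check below is a minimal sketch of such a policy, not the library's actual implementation; the function names `is_dns_safe_proxy` and `require_dns_safe_proxy` are hypothetical.

```python
from urllib.parse import urlsplit


def is_dns_safe_proxy(proxy_url: str) -> bool:
    """Return True only when the proxy URL uses the socks5h scheme."""
    # urlsplit lowercases the scheme, so "SOCKS5H://..." is accepted too.
    return urlsplit(proxy_url).scheme == "socks5h"


def require_dns_safe_proxy(proxy_url: str) -> str:
    """Hypothetical strict-mode guard: reject proxies that resolve DNS locally."""
    if not is_dns_safe_proxy(proxy_url):
        raise ValueError(
            f"strict Tor mode requires a socks5h:// proxy, got {proxy_url!r}"
        )
    return proxy_url
```

For example, `require_dns_safe_proxy("socks5h://127.0.0.1:9050")` returns the URL unchanged, while the same address with `socks5://` raises `ValueError`.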

Repository: https://github.com/karikris/intermine314

Install

pip install intermine314

Optional extras:

pip install "intermine314[speed]"   # orjson
pip install "intermine314[proxy]"   # PySocks
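
Optional accelerators such as orjson are commonly wired in with an import fallback, so the package still works when the extra is absent. The sketch below shows that general pattern only; how intermine314 selects its JSON backend is not described here, and the `loads` helper is a hypothetical name.

```python
import json

try:
    import orjson  # present only when the "speed" extra is installed
except ImportError:
    orjson = None


def loads(payload: bytes):
    """Decode JSON bytes with orjson when available, else the stdlib."""
    if orjson is not None:
        return orjson.loads(payload)
    return json.loads(payload)
```

Both backends accept `bytes`, so callers do not need to know which one is active.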

Quick Start

from intermine314.service import Service

service = Service("https://maizemine.rnet.missouri.edu/maizemine/service")
query = service.select("Gene.primaryIdentifier", "Gene.symbol")

for row in query.rows(size=5):
    print(row)

Parallel export is configured exclusively through ParallelOptions:

from intermine314.query.builder import ParallelOptions

query.to_parquet(
    "/tmp/genes_parts",
    batch_size=5000,
    parallel_options=ParallelOptions(
        max_workers=8,
        profile="large_query",
        ordered="unordered",
        inflight_limit=8,
        max_inflight_bytes_estimate=64 * 1024 * 1024,
    ),
)
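
To illustrate what an in-flight limit means for unordered parallel work, here is a standard-library sketch that never keeps more than `inflight_limit` tasks submitted at once and yields results as they finish. It is an assumption-laden illustration, not the code behind `ParallelOptions`; `bounded_unordered_map` is a hypothetical helper, and the byte-estimate limit is not modeled.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, as_completed, wait


def bounded_unordered_map(fn, items, *, max_workers=8, inflight_limit=8):
    """Yield fn(item) for each item in completion order with bounded backlog."""
    if inflight_limit < 1:
        raise ValueError("inflight_limit must be at least 1")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = set()
        for item in items:
            pending.add(pool.submit(fn, item))
            if len(pending) >= inflight_limit:
                # Block until at least one task finishes before submitting more.
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                for future in done:
                    yield future.result()
        # Drain whatever is still running once the input is exhausted.
        for future in as_completed(pending):
            yield future.result()
```

Because at most `inflight_limit` futures exist at any time, memory held by unconsumed results stays bounded regardless of how many items the input produces.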

API Migration Notes

Compatibility aliases were removed to keep the runtime API minimal and explicit:

  • service.query(...) -> use service.select(...)
  • Service.get_mine_info(...) -> use Registry(...).info(...)
  • Service.get_all_mines(...) -> use Registry(...).all_mines(...)
  • Query aliases removed: filter, add_column*, add_views, order_by, all, size, summarize, c. Use the canonical Query methods instead (where, add_view, add_sort_order, count, column).

Minimal high-level ELT workflow:

from intermine314 import fetch_from_mine

result = fetch_from_mine(
    mine_url="https://maizemine.rnet.missouri.edu/maizemine/service",
    root_class="Gene",
    views=["Gene.primaryIdentifier", "Gene.symbol"],
    parquet_path="/tmp/genes.parquet",
    page_size=2_000,
    max_workers=8,
    inflight_limit=8,
    max_inflight_bytes_estimate=64 * 1024 * 1024,
)

managed_result = fetch_from_mine(
    mine_url="https://maizemine.rnet.missouri.edu/maizemine/service",
    root_class="Gene",
    views=["Gene.primaryIdentifier", "Gene.symbol"],
    parquet_path="/tmp/genes.parquet",
    max_workers=8,
    managed=True,
)
with managed_result["duckdb_connection"] as con:
    count = con.execute(
        f'SELECT COUNT(*) FROM "{managed_result["duckdb_table"]}"'
    ).fetchone()[0]
    print(count)
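
The query above interpolates a table name into SQL text, which is only safe when the name is a trusted identifier. A minimal sketch of one conservative validation rule follows; the exact rule intermine314 applies is not stated here, and `quote_identifier` is a hypothetical helper.

```python
import re

# Assumed policy: ASCII letters, digits, and underscores, not starting with a digit.
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Return the name wrapped in double quotes, or raise if it looks unsafe."""
    if not _IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return f'"{name}"'
```

With this helper, a statement could be built as `f"SELECT COUNT(*) FROM {quote_identifier(table)}"`, and a name containing quotes or semicolons is rejected before it reaches the database.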

Development

make lint
make test
make docs

Repository-only support directories:

  • docs/, samples/, and scripts/ are for development and examples.
  • They are intentionally excluded from published package artifacts.

Test Modes

Default pytest runs the offline invariant suite (fast and deterministic):

  • Tor strict DNS-safe proxy enforcement (socks5h:// requirement).
  • Streaming response closure on early iterator termination.
  • Session ownership lifecycle (close() closes only owned resources).
  • Executor lifecycle closure under early parallel termination.
  • Runtime defaults validation for parallel/query behavior.
  • Storage policy single-source checks (Parquet compression + DuckDB identifier validation).
  • DuckDB managed connection lifecycle closure.
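
As an illustration of the "streaming response closure on early iterator termination" invariant, the sketch below wraps a response in a generator whose `finally` clause closes it even when the consumer stops early. `FakeResponse` and `stream_rows` are hypothetical stand-ins, not classes from the package or its test suite.

```python
class FakeResponse:
    """Stand-in for a streaming HTTP response (hypothetical)."""

    def __init__(self, lines):
        self._lines = list(lines)
        self.closed = False

    def iter_lines(self):
        yield from self._lines

    def close(self):
        self.closed = True


def stream_rows(response):
    """Yield lines from the response and always close it afterwards."""
    try:
        for line in response.iter_lines():
            yield line
    finally:
        # Runs on exhaustion, on errors, and when the generator is closed early.
        response.close()


response = FakeResponse(["a", "b", "c"])
rows = stream_rows(response)
first = next(rows)
rows.close()  # early termination triggers the finally block
```

Calling `close()` on the generator (or wrapping it in `contextlib.closing`) makes the cleanup deterministic instead of relying on garbage collection.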

Live network smoke tests are opt-in by filename (live_*.py):

INTERMINE314_RUN_LIVE_TESTS=1 pytest -q tests/live_*.py
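
An environment-variable opt-in like the one above is often implemented with a small predicate that live test modules consult before running. The following is a sketch under that assumption; `live_tests_enabled` is a hypothetical helper, and whether the project accepts values other than `1` is not stated.

```python
import os
from typing import Mapping

LIVE_TESTS_ENV = "INTERMINE314_RUN_LIVE_TESTS"


def live_tests_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in variable is set to exactly "1"."""
    return environ.get(LIVE_TESTS_ENV, "") == "1"
```

A live test module could then guard itself with `pytest.mark.skipif(not live_tests_enabled(), reason="live tests are opt-in")`.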

Benchmark commands and benchmark-specific docs live in benchmarks/README.md. Benchmarks are runner-script based (python -m benchmarks...); benchmark pytest globs are not part of CI or the default test workflow.

License

MIT (see LICENSE).

Download files

Source Distribution

intermine314-0.1.7.tar.gz (70.9 kB)

Uploaded Source

Built Distribution

intermine314-0.1.7-py3-none-any.whl (86.3 kB)

Uploaded Python 3

File details

Details for the file intermine314-0.1.7.tar.gz.

File metadata

  • Download URL: intermine314-0.1.7.tar.gz
  • Upload date:
  • Size: 70.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for intermine314-0.1.7.tar.gz
Algorithm Hash digest
SHA256 18974b4013016ac524cf6e784ee565e5cee19022c63c8349453582b29c9e2ee6
MD5 a263498bc2907cfd3871487567d9cd5e
BLAKE2b-256 c7b70949a1ea960bb93eb370b420fc8f2656a4c665c221ae4aed0c07ff450107

Provenance

The following attestation bundles were made for intermine314-0.1.7.tar.gz:

Publisher: publish-pypi.yml on karikris/intermine314

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file intermine314-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: intermine314-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 86.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for intermine314-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 31f64151cb65bc3ff9b5d3418327e06d048bbb38e5d4ffa45480086c08167ee5
MD5 cce64cc143597a4ed9ae01656ad9c5c5
BLAKE2b-256 f20c36c5bed62535c556976fabe27f584acb0d7e322b2349de384259bbc3c845

Provenance

The following attestation bundles were made for intermine314-0.1.7-py3-none-any.whl:

Publisher: publish-pypi.yml on karikris/intermine314

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
