Modernized InterMine WebService client (Python 3.14+)

intermine314

A modern Python 3.14+ client for InterMine web services, focused on reliable, high-throughput research workflows.

Ownership and Credit

Copyright (c) 2026 Monash University, Plant Energy and Biotechnology Lab.

Owners:

  • Kris Kari
  • Dr. Maria Ermakova
  • Plant Energy and Biotechnology Lab, Monash University
  • Contact: toffe.kari@gmail.com

Original credit:

  • Original InterMine team and community contributors.

License

Licensed under the MIT License. See LICENSE-LGPL; despite its filename, that file now contains the active MIT license text and notice.

Requirements

  • Python 3.14+
  • Core workflow dependencies, installed by default: polars and duckdb (used for the Parquet path).

Supported Mines

Priority support is focused on:

  • MaizeMine
  • ThaleMine
  • LegumeMine
  • OakMine
  • WheatMine

WheatMine service endpoint for API clients:

  • https://urgi.versailles.inrae.fr/WheatMine/service (no trailing slash)

MaizeMine service endpoint for API clients:

  • https://maizemine.rnet.missouri.edu/maizemine/service (no trailing slash)
  • fallback: http://maizemine.rnet.missouri.edu:8080/maizemine/service
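Because the endpoints above are given without a trailing slash, it can help to normalize a user-supplied URL before passing it on. The following is a minimal sketch using only the standard library; normalize_service_url is a hypothetical helper, not part of intermine314, and it also drops any query string or fragment.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_service_url(url: str) -> str:
    """Return the service root without a trailing slash (hypothetical helper)."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    # Strip trailing slashes from the path only; scheme, host, and port are kept.
    path = parts.path.rstrip("/")
    # Query string and fragment are intentionally dropped.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


# Endpoints copied from the list above.
MAIZEMINE_PRIMARY = "https://maizemine.rnet.missouri.edu/maizemine/service"
MAIZEMINE_FALLBACK = "http://maizemine.rnet.missouri.edu:8080/maizemine/service"

print(normalize_service_url(MAIZEMINE_PRIMARY + "/"))
```

The normalized string can then be used wherever a mine URL is expected, for example as mine_url in the Quick Example.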

Installation

pip install intermine314

Optional extras:

# Faster JSON decode path
pip install "intermine314[speed]"

Repository: https://github.com/karikris/intermine314

Quick Example

from intermine314 import fetch_from_mine

result = fetch_from_mine(
    mine_url="https://maizemine.rnet.missouri.edu/maizemine/service",
    root_class="Gene",
    views=["Gene.primaryIdentifier", "Gene.symbol", "Gene.length"],
    joins=[],
    size=50_000,
    workflow="elt",                  # elt | etl
    production_profile="auto",       # mine-aware profile resolution
    parquet_path="/tmp/maize_genes.parquet",
    duckdb_table="genes",
)

duckdb_con = result["duckdb_connection"]
print(duckdb_con.execute("select count(*) from genes").fetchall())

Parallel Worker Defaults

intermine314 uses adaptive defaults when max_workers is omitted:

  • LegumeMine: 4 workers.
  • MaizeMine: 8 workers.
  • ThaleMine, OakMine, WheatMine: 16 workers.
  • Unknown mines: fall back to 16 workers.

Production workflows use six named profiles:

  • ELT: elt_default_w4, elt_server_limited_w8, elt_full_w16
  • ETL: etl_default_w4, etl_server_limited_w8, etl_full_w16
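The six profile names follow a visible pattern: workflow, tier, then a w-suffixed number. The sketch below splits a name into those parts. Reading the suffix as a worker count is an assumption based on the naming alone, and parse_profile_name is a hypothetical helper rather than a library function.

```python
import re
from typing import NamedTuple

# Names copied from the list above.
PROFILE_NAMES = (
    "elt_default_w4", "elt_server_limited_w8", "elt_full_w16",
    "etl_default_w4", "etl_server_limited_w8", "etl_full_w16",
)


class ProfileName(NamedTuple):
    workflow: str  # "elt" or "etl"
    tier: str      # e.g. "default", "server_limited", "full"
    workers: int   # assumption: the number after "_w" is a worker count


_PATTERN = re.compile(r"(elt|etl)_([a-z_]+)_w(\d+)")


def parse_profile_name(name: str) -> ProfileName:
    """Split a profile name into workflow, tier, and numeric suffix."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized profile name: {name!r}")
    workflow, tier, workers = match.groups()
    return ProfileName(workflow, tier, int(workers))


for name in PROFILE_NAMES:
    print(parse_profile_name(name))
```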

Parallel query APIs default to pagination="auto". Tune max_workers to your hardware and network: 4-8 for constrained systems, 16-32 for high-core systems, and fewer workers if the mine rate-limits requests.

Throughput tip: for highest raw throughput, use ordered=False (or ordered="unordered").
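To see why unordered delivery tends to be faster, compare the two ways of collecting results from a thread pool. This is a generic standard-library illustration, not intermine314's scheduler; fetch_page is a stand-in that sleeps instead of calling a mine.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_page(page: int) -> int:
    """Stand-in for one paged request; earlier pages are deliberately slower."""
    time.sleep(0.05 * (3 - page))
    return page


pages = [0, 1, 2, 3]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Ordered: results are yielded in submission order, so a slow early page
    # holds back later pages that have already finished.
    ordered = list(pool.map(fetch_page, pages))

    # Unordered: each result is yielded as soon as its page finishes.
    futures = [pool.submit(fetch_page, page) for page in pages]
    unordered = [future.result() for future in as_completed(futures)]

print(ordered)
print(sorted(unordered) == ordered)  # same pages, possibly a different order
```

Both runs return the same pages; only the arrival order differs, which is the trade-off to weigh when downstream code does not depend on row order.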

Presets: config/parallel-profiles.toml. Mine policies: config/mine-parallel-preferences.toml.

Configuration Files

  • config/runtime-defaults.toml
    • Runtime defaults for omitted query parameters.
    • Override path: INTERMINE314_RUNTIME_DEFAULTS_PATH=/abs/path/to/runtime-defaults.toml.
  • config/mine-parallel-preferences.toml
    • Mine registry, production profile policies, and benchmark profile policies.
    • Shared defaults in [defaults.mine]; per-mine overrides in [mines.<name>].
  • config/parallel-profiles.toml
    • Parallel profile presets.

Settable Parameters

1) Package Runtime Defaults (config/runtime-defaults.toml)

Loaded at import time and used when arguments are omitted:

  • default_parallel_workers
  • default_parallel_page_size
  • default_parallel_pagination (auto|offset|keyset)
  • default_parallel_profile (default|large_query|unordered|mostly_ordered)
  • default_parallel_ordered_mode (ordered|unordered|window|mostly_ordered)
  • default_large_query_mode (true|false)
  • default_parallel_prefetch (integer or "auto")
  • default_parallel_inflight_limit (integer or "auto")
  • default_order_window_pages
  • default_keyset_batch_size
  • keyset_auto_min_size

2) Service Constructor

Set on Service(...):

  • root
  • username, password
  • token
  • prefetch_depth
  • prefetch_id_only

3) Per-call Query Parameters

Set on query calls (run_parallel, iter_batches, dataframe, to_parquet, to_duckdb):

  • start, size, page_size
  • max_workers
  • ordered
  • prefetch
  • inflight_limit
  • ordered_window_pages
  • profile
  • large_query_mode
  • pagination
  • keyset_path
  • keyset_batch_size
  • batch_size (batch helpers / exporters)
  • compression (Parquet: zstd|snappy|gzip|brotli|lz4|uncompressed)
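When several of these parameters are set together, it can be convenient to collect and sanity-check them in one place before making the call. The following is a minimal sketch. build_query_options is a hypothetical helper, the accepted values are taken from the lists in this section, and dropping None values assumes that omitted arguments fall back to the runtime defaults as described earlier.

```python
# Accepted values copied from this section.
PAGINATION_MODES = {"auto", "offset", "keyset"}
PARQUET_COMPRESSION = {"zstd", "snappy", "gzip", "brotli", "lz4", "uncompressed"}


def build_query_options(**options):
    """Validate a few per-call options and drop those left as None."""
    pagination = options.get("pagination")
    if pagination is not None and pagination not in PAGINATION_MODES:
        raise ValueError(f"unknown pagination mode: {pagination!r}")
    compression = options.get("compression")
    if compression is not None and compression not in PARQUET_COMPRESSION:
        raise ValueError(f"unknown Parquet compression: {compression!r}")
    max_workers = options.get("max_workers")
    if max_workers is not None and max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    # Omitted (None) options are removed so the callee can apply its defaults.
    return {key: value for key, value in options.items() if value is not None}


options = build_query_options(
    page_size=5000,        # illustrative value
    max_workers=8,
    ordered=False,
    pagination="keyset",
    compression="zstd",
    prefetch=None,         # dropped: left to the runtime default
)
print(options)
```

The resulting dictionary could then be passed as keyword arguments to one of the query calls listed above, provided that call accepts each key.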

4) Mine Profile Parameters (config/mine-parallel-preferences.toml)

  • production_profile_switch_rows
  • production_elt_small_profile
  • production_elt_large_profile
  • production_etl_small_profile
  • production_etl_large_profile
  • benchmark_small_profile
  • benchmark_large_profile
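The parameter names above suggest a row-count threshold that switches between a small and a large profile per workflow. A minimal sketch of that selection follows. The threshold value, the profile assignments, and the choice to treat a row count at or above the threshold as large are all assumptions; pick_production_profile is a hypothetical helper.

```python
# Illustrative settings; keys match the list above, values are assumptions.
SETTINGS = {
    "production_profile_switch_rows": 50_000,
    "production_elt_small_profile": "elt_default_w4",
    "production_elt_large_profile": "elt_full_w16",
    "production_etl_small_profile": "etl_default_w4",
    "production_etl_large_profile": "etl_full_w16",
}


def pick_production_profile(settings: dict, workflow: str, expected_rows: int) -> str:
    """Choose the small or large profile for a workflow (hypothetical helper)."""
    if workflow not in ("elt", "etl"):
        raise ValueError(f"workflow must be 'elt' or 'etl', got {workflow!r}")
    # Assumption: row counts at or above the threshold count as "large".
    threshold = settings["production_profile_switch_rows"]
    tier = "large" if expected_rows >= threshold else "small"
    return settings[f"production_{workflow}_{tier}_profile"]


print(pick_production_profile(SETTINGS, "elt", 10_000))
print(pick_production_profile(SETTINGS, "etl", 50_000))
```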

Benchmark and performance-optimization workflows are documented in BENCHMARK.md.

Testing

Run unit tests:

python -m pytest -q tests

Run dataframe/parquet compatibility smoke check:

python setup.py analyticscheck

Run live tests (if endpoint/test credentials are available):

INTERMINE314_RUN_LIVE_TESTS=1 TESTMODEL_URL="https://<mine>/service" python -m pytest -q tests

Run via tox (if installed):

python -m tox -e py314
python -m tox -e py314-analytics
python -m tox -e lint

Notes

Links to legacy upstream docs and tutorials are intentionally omitted while this Python 3.14 line is being stabilized. The published sdist is slimmed to runtime-relevant package and config files; docs, tests, samples, and benchmarking material are excluded.

Download files

Release 0.1.5 is published as a source distribution and a pure-Python wheel. Both files were uploaded using Trusted Publishing (publisher: publish-pypi.yml on karikris/intermine314) via twine/6.1.0 CPython/3.13.7.

Source distribution: intermine314-0.1.5.tar.gz (94.7 kB)

  • SHA256: a23166622b5a1cc9b0e6d55aa67a7b2120c272bf21d650cbb127ad1033f46d9c
  • MD5: 3475decd2b9e6a0f69f45a1197a5a3f3
  • BLAKE2b-256: 1a13265f0c99d19cc0be3acd7f280e2ae1ab50eea2542049c68a3a6b7c0296ff

Built distribution: intermine314-0.1.5-py3-none-any.whl (96.5 kB, Python 3)

  • SHA256: f3aaa8fb417bc08a41fba9baee36cba259dec79c9a692cb7609762e205b804ae
  • MD5: 0b32ad70da4c40aaa8d0cdec8730bb90
  • BLAKE2b-256: ffaee6d0e89191c6bacdbc6ad607f94b8ad9903921c4cab8d412e02427b77010