Modernized InterMine WebService client (Python 3.14+)
intermine314
Python 3.14+ client for InterMine web services.
Modern InterMine client focused on reliable, high-throughput research workflows.
Ownership and Credit
Copyright (c) 2026 Monash University, Plant Energy and Biotechnology Lab.
Owners:
- Kris Kari
- Dr. Maria Ermakova
- Plant Energy and Biotechnology Lab, Monash University
- Contact: toffe.kari@gmail.com
Original credit:
- Original InterMine team and community contributors.
License
Licensed under the MIT License (see LICENSE-LGPL, which now contains the active MIT license text and notice).
Requirements
- Python 3.14+
- Core workflow dependencies are required by default: polars, duckdb (Parquet path).
Supported Mines
Priority support is focused on:
- MaizeMine
- ThaleMine
- LegumeMine
- OakMine
- WheatMine
WheatMine service endpoint for API clients:
- https://urgi.versailles.inrae.fr/WheatMine/service (no trailing slash)
MaizeMine service endpoint for API clients:
- https://maizemine.rnet.missouri.edu/maizemine/service (no trailing slash)
- fallback: http://maizemine.rnet.missouri.edu:8080/maizemine/service
Installation
pip install intermine314
Optional extras:
# Faster JSON decode path
pip install "intermine314[speed]"
# Benchmark script dependencies
pip install "intermine314[benchmark,speed]"
Repository: https://github.com/karikris/intermine314
Quick Example
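from intermine314.webservice import Service


def process(row):
    # Placeholder handler; replace with your own per-row logic.
    print(row)


# Service root for MaizeMine (no trailing slash).
service = Service("https://maizemine.rnet.missouri.edu/maizemine/service")
query = service.new_query("Gene")
query.add_view("Gene.primaryIdentifier", "Gene.symbol", "Gene.length")

parallel_options = {
    "pagination": "auto",
    "profile": "large_query",
    "ordered": "unordered",
    "inflight_limit": 8,
}

# Rows arrive as dicts; order is not guaranteed with ordered="unordered".
for row in query.run_parallel(row="dict", **parallel_options):
    process(row)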
Parallel Worker Defaults
intermine314 uses adaptive defaults when max_workers is omitted:
- LegumeMine: 4 workers.
- MaizeMine, ThaleMine, OakMine, WheatMine: 16 workers up to 50,000 rows, then 12.
- Unknown mines: fallback to 16 workers.
Parallel query APIs default to pagination="auto".
Tune by hardware/network: 4-8 for constrained systems, 16-32 for high-core systems, and lower workers if the mine rate-limits.
Throughput tip: for raw max throughput benchmarking, use ordered=False (or ordered="unordered").
Presets: config/parallel-profiles.toml. Mine policies: config/mine-parallel-preferences.toml.
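The adaptive defaults listed above can be read as a small decision function. This is a sketch of the documented policy only, not the package's actual code: `default_workers` is a hypothetical name, and treating an unknown row count as "small" is an assumption.

```python
from typing import Optional

LARGE_RESULT_THRESHOLD = 50_000  # "16 workers up to 50,000 rows, then 12"
SCALED_MINES = {"maizemine", "thalemine", "oakmine", "wheatmine"}


def default_workers(mine: str, rows: Optional[int] = None) -> int:
    """Pick a worker count for a mine, mirroring the documented defaults."""
    name = mine.strip().lower()
    if name == "legumemine":
        return 4
    if name in SCALED_MINES:
        # Assumption: when the row count is unknown, use the small-result value.
        if rows is not None and rows > LARGE_RESULT_THRESHOLD:
            return 12
        return 16
    # Unknown mines fall back to 16 workers.
    return 16
```

Passing `max_workers` explicitly on a query call would bypass any default of this kind.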
Configuration Files
- config/runtime-defaults.toml
  - Runtime defaults for omitted query parameters.
  - Override path: INTERMINE314_RUNTIME_DEFAULTS_PATH=/abs/path/to/runtime-defaults.toml.
- config/mine-parallel-preferences.toml
  - Mine registry, production worker policies, benchmark profile mapping.
  - Shared defaults in [defaults.mine]; per-mine overrides in [mines.<name>].
- config/benchmark-targets.toml
  - Benchmark endpoints, matrix sizes, targeted export table specs.
  - Shared defaults in [defaults.target] and [defaults.targeted_exports].
  - Add custom targets under [targets.<name>].
- config/parallel-profiles.toml
  - Parallel profile presets.
Settable Parameters
1) Package Runtime Defaults (config/runtime-defaults.toml)
Loaded at import time and used when arguments are omitted:
- default_parallel_workers
- default_parallel_page_size
- default_parallel_pagination (auto|offset|keyset)
- default_parallel_profile (default|large_query|unordered|mostly_ordered)
- default_parallel_ordered_mode (ordered|unordered|window|mostly_ordered)
- default_large_query_mode (true|false)
- default_parallel_prefetch (integer or "auto")
- default_parallel_inflight_limit (integer or "auto")
- default_order_window_pages
- default_keyset_batch_size
- keyset_auto_min_size
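Because several of these keys accept only a fixed set of values, a custom runtime-defaults file can be sanity-checked before it is used. The sketch below encodes the value sets listed above; `check_runtime_defaults` is a hypothetical helper, and whether the package performs similar validation itself is not stated here.

```python
ALLOWED_VALUES = {
    "default_parallel_pagination": {"auto", "offset", "keyset"},
    "default_parallel_profile": {"default", "large_query", "unordered", "mostly_ordered"},
    "default_parallel_ordered_mode": {"ordered", "unordered", "window", "mostly_ordered"},
}
INT_OR_AUTO_KEYS = ("default_parallel_prefetch", "default_parallel_inflight_limit")


def check_runtime_defaults(values: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        if key in values and values[key] not in allowed:
            problems.append(f"{key}: {values[key]!r} not in {sorted(allowed)}")
    for key in INT_OR_AUTO_KEYS:
        if key in values:
            value = values[key]
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not (is_int or value == "auto"):
                problems.append(f"{key}: expected an integer or 'auto', got {value!r}")
    return problems
```

Keys that are absent are skipped, so the check works on partial override files.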
2) Service Constructor
Set on Service(...):
- root
- username, password
- token
- prefetch_depth
- prefetch_id_only
3) Per-call Query Parameters
Set on query calls (run_parallel, iter_batches, dataframe, to_parquet, to_duckdb):
- start, size, page_size
- max_workers
- ordered
- prefetch
- inflight_limit
- ordered_window_pages
- profile
- large_query_mode
- pagination
- keyset_path
- keyset_batch_size
- batch_size (batch helpers / exporters)
- compression (Parquet: zstd|snappy|gzip|brotli|lz4|uncompressed)
OakMine Large Export Pattern
For OakMine-scale pulls, avoid one wide join across GO/domains/other collections. Use chunked core + edge exports:
- core_protein (entity table)
- edge_go
- edge_domain
scripts/benchmarks.py --benchmark-target oakmine now runs this targeted strategy by default (--oakmine-targeted-exports).
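The reason to avoid one wide join is row multiplication: joining two independent collections on the same entity yields one row per combination. The toy example below uses made-up identifiers and plain Python to compare row counts; it only illustrates the shape of the core + edge layout and does not call intermine314.

```python
from itertools import product

# Made-up example entity: one protein with 3 GO terms and 4 domains.
protein = {
    "id": "P1",
    "go_terms": ["GO_a", "GO_b", "GO_c"],
    "domains": ["D1", "D2", "D3", "D4"],
}


def wide_join_rows(p: dict) -> list[tuple]:
    """One row per (GO term, domain) combination, as a single wide join produces."""
    return [(p["id"], go, dom) for go, dom in product(p["go_terms"], p["domains"])]


def core_edge_tables(p: dict) -> dict[str, list[tuple]]:
    """Split the same data into an entity table plus one edge table per collection."""
    return {
        "core_protein": [(p["id"],)],
        "edge_go": [(p["id"], go) for go in p["go_terms"]],
        "edge_domain": [(p["id"], dom) for dom in p["domains"]],
    }


wide_count = len(wide_join_rows(protein))
split_count = sum(len(rows) for rows in core_edge_tables(protein).values())
```

For this entity the wide join holds 3 x 4 = 12 rows, while the split layout holds 1 + 3 + 4 = 8, and the gap widens as more collections are added.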
Benchmark Matrix Defaults
scripts/benchmarks.py now defaults to a 6-scenario fetch matrix:
- 10k, 25k, 50k with benchmark_profile_1 (intermine + intermine314 w2-w18)
- 100k, 250k, 500k with benchmark_profile_2 (intermine314 w4, w8, w12, w16)
This can be tuned with --matrix-* flags or disabled with --no-matrix-six.
All targets use /service as API root (no trailing slash).
For large MaizeMine retrievals, use the benchmark target preset and template/list-driven core+edge exports:
python scripts/benchmarks.py --benchmark-target maizemine --workers auto --benchmark-profile auto
LegumeMine profile mapping:
- <= 50k rows: benchmark_profile_4 (intermine + intermine314 w4, w6, w8)
- > 50k rows: benchmark_profile_3 (intermine314 w4, w6, w8)
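The LegumeMine mapping above amounts to a threshold lookup. As a rough sketch (the function name and the returned dictionary layout are hypothetical, not the benchmark script's internals):

```python
def legumemine_benchmark_profile(rows: int) -> dict:
    """Map a row count to the documented LegumeMine benchmark profile."""
    if rows <= 50_000:
        # Small runs also include the legacy intermine client as a baseline.
        return {
            "profile": "benchmark_profile_4",
            "include_intermine_baseline": True,
            "intermine314_workers": [4, 6, 8],
        }
    return {
        "profile": "benchmark_profile_3",
        "include_intermine_baseline": False,
        "intermine314_workers": [4, 6, 8],
    }
```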
Script Slimming and Memory Optimization
Use these replacements for lighter, lower-memory scripts:
- Skip CSV intermediates unless CSV parity is required.
  Heavy path:
  - scripts/bench_fetch.py CSV export
  - scripts/bench_io.py csv_to_parquet(...)
Lightweight replacement:
query.to_parquet(
    "results.parquet",
    single_file=True,
    parallel=True,
    pagination="auto",
    profile="large_query",
    ordered="unordered",
    inflight_limit=8,
)
- Prefer lazy scans over eager full-file loads.
import polars as pl

# Count rows lazily instead of loading the whole file into memory.
out = (
    pl.scan_parquet("results.parquet")
    .select(pl.len().alias("rows"))
    .collect()
)
- Replace wide multi-join views with targeted core + edge tables.
  - Keep only required columns in query.add_view(...).
  - Use target presets from config/benchmark-targets.toml.
  - Prefer template/list-driven chunking for OakMine/ThaleMine/WheatMine/MaizeMine.
- Keep in-flight work bounded.
  - Keep --auto-chunking enabled.
  - Tune --inflight-limit and prefetch to prevent unbounded memory growth.
  - Use ordered="unordered" for throughput runs (unless strict order is required).
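To show what "bounded in-flight work" means in practice, the sketch below caps the number of outstanding page fetches and yields pages as they complete (unordered). It is a generic standard-library pattern, not intermine314's implementation; `fetch_page` is a stand-in for a network request, and `iter_pages_bounded` is a hypothetical name.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def fetch_page(page_index: int) -> list[int]:
    """Stand-in for a network fetch; returns three fake rows per page."""
    return [page_index * 10 + i for i in range(3)]


def iter_pages_bounded(page_indices, fetch, max_workers=4, inflight_limit=8):
    """Yield pages in completion order with at most inflight_limit pending fetches."""
    pages = iter(page_indices)
    pending = set()
    exhausted = False
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            # Top up the pending set without exceeding the in-flight cap.
            while not exhausted and len(pending) < inflight_limit:
                try:
                    index = next(pages)
                except StopIteration:
                    exhausted = True
                    break
                pending.add(pool.submit(fetch, index))
            if not pending:
                break
            # Block until at least one fetch finishes, then hand its rows on.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                yield future.result()


rows = [
    row
    for page in iter_pages_bounded(range(5), fetch_page, max_workers=2, inflight_limit=3)
    for row in page
]
```

Since new fetches are submitted only as earlier ones finish, memory use is tied to `inflight_limit` rather than to the total number of pages.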
Testing
Run unit tests:
python -m pytest -q tests
Run dataframe/parquet compatibility smoke check:
python setup.py analyticscheck
Run live tests (if endpoint/test credentials are available):
INTERMINE314_RUN_LIVE_TESTS=1 TESTMODEL_URL="https://<mine>/service" python -m pytest -q tests
Run via tox (if installed):
python -m tox -e py314
python -m tox -e py314-analytics
python -m tox -e lint
Notes
Legacy upstream doc/tutorial links are intentionally omitted while this Python 3.14 line is being stabilized. Published sdist is slimmed to runtime-relevant package/config files (docs/tests/samples/scripts excluded).