
Reproducible benchmark suite for memory/QA systems — June + pluggable competitors.

Project description

june-bench

A pip-installable, reproducible benchmark suite for memory / QA systems — June + pluggable competitors — over LoCoMo, LongMemEval, HotpotQA/2Wiki/MuSiQue, and FinanceBench, with the same data and the same scorer.

pip install june-bench
june-bench list
june-bench run --system echo --dataset smoke --split smoke    # offline, no key, no download

A benchmark is run(system, dataset) → records → score. Two typed ports are the only extension points:

  • System — the thing benchmarked. JuneApiSystem (default; a thin HTTP client to June's /v1/answer, so no June source is shipped), JuneLocalSystem ([june-local] extra; a source-protected compiled wheel), CogneeSystem ([cognee] extra), or any future system as one adapter.
  • Dataset — what it runs on. The four benchmarks behind a registry.
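The run(system, dataset) → records → score shape and its two ports could be sketched roughly as below. This is a minimal illustration, not june-bench's actual contracts: every name here (Example, Record, EchoSystem, SmokeDataset, the answer and examples methods, and the toy score function) is hypothetical, and the echo behaviour of returning the question unchanged is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Example:
    """One benchmark item (hypothetical shape)."""
    question: str
    gold: str


@dataclass(frozen=True)
class Record:
    """One answered item, ready for scoring (hypothetical shape)."""
    question: str
    gold: str
    prediction: str


class System(Protocol):
    """Port 1: the thing being benchmarked."""
    name: str

    def answer(self, question: str) -> str: ...


class Dataset(Protocol):
    """Port 2: what the system is run on."""
    name: str

    def examples(self, split: str) -> Iterable[Example]: ...


class EchoSystem:
    """Toy system: returns the question as its answer (assumed behaviour)."""
    name = "echo"

    def answer(self, question: str) -> str:
        return question


class SmokeDataset:
    """Toy in-memory dataset standing in for a bundled smoke fixture."""
    name = "smoke"

    def examples(self, split: str) -> Iterable[Example]:
        return [Example("ping", "ping"), Example("2+2?", "4")]


def run(system: System, dataset: Dataset, split: str = "smoke") -> list[Record]:
    """Ask the system every question in the split and collect records."""
    return [
        Record(ex.question, ex.gold, system.answer(ex.question))
        for ex in dataset.examples(split)
    ]


def score(records: list[Record]) -> float:
    """Placeholder scorer: fraction of predictions equal to the gold answer."""
    if not records:
        return 0.0
    return sum(r.prediction == r.gold for r in records) / len(records)


records = run(EchoSystem(), SmokeDataset())
print(score(records))
```

Because both ports are structural protocols in this sketch, a new competitor would only need a class with a matching answer method; nothing in run or score has to change.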

The scorer is the canonical SQuAD/HotpotQA exact match (EM) and F1, plus selective accuracy, coverage, and cost, so results are comparable with Cognee's. Tiny smoke fixtures ship in the wheel as an offline proof that the wiring works; full splits are fetched from a pinned release and SHA-verified. No score is ever baked into the package: every result row records dataset, scorer, system, model, and cost, so a stranger can reproduce a published number.
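For readers unfamiliar with the metrics, here is a sketch of SQuAD-style EM and token-level F1, written from the well-known SQuAD v1.1 evaluation recipe and not taken from june-bench's scorer. It omits HotpotQA's yes/no special case. The selective_summary helper is an assumption about what "selective accuracy" and "coverage" mean here: coverage as the fraction of questions answered, and selective accuracy as EM over the answered ones only, with None standing for an abstention.

```python
import re
import string
from collections import Counter
from typing import Optional

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def selective_summary(pairs: list[tuple[Optional[str], str]]) -> dict[str, float]:
    """Assumed definitions: coverage = answered / total,
    selective accuracy = mean EM over answered items only."""
    answered = [(p, g) for p, g in pairs if p is not None]
    coverage = len(answered) / len(pairs) if pairs else 0.0
    accuracy = (
        sum(exact_match(p, g) for p, g in answered) / len(answered)
        if answered
        else 0.0
    )
    return {"coverage": coverage, "selective_accuracy": accuracy}
```

A system that abstains on hard questions lowers its coverage but can raise its selective accuracy, which is why the two numbers are only meaningful when reported together.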

Status: SB0 (contracts + no-deps smoke + skeleton). Datasets, June/Cognee systems, and the full CLI land in SB1–SB6.

Download files

Source Distribution

june_bench-0.0.1.tar.gz (24.5 kB)

Uploaded Source

Built Distribution

june_bench-0.0.1-py3-none-any.whl (22.7 kB)

Uploaded Python 3

File details

Details for the file june_bench-0.0.1.tar.gz.

File metadata

  • Download URL: june_bench-0.0.1.tar.gz
  • Upload date:
  • Size: 24.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for june_bench-0.0.1.tar.gz

Algorithm    Hash digest
SHA256       6e4b85337fec55ee90d1ae1f320dce5f9a56f97bb3c02c82e54988b4c5a341e0
MD5          780c73607b6e1046437be7500e722438
BLAKE2b-256  fff5d88e590a025cefb41f1164d1c13f32d33c2c512619d0a0995b6fcd469080
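
A downloaded file can be checked against a published digest with the standard library alone. The sketch below uses the SHA256 value listed above for the source distribution; the helper names sha256_of and verify, and the local path in the usage comment, are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for june_bench-0.0.1.tar.gz.
EXPECTED_SHA256 = "6e4b85337fec55ee90d1ae1f320dce5f9a56f97bb3c02c82e54988b4c5a341e0"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (hypothetical local path):
# ok = verify(Path("june_bench-0.0.1.tar.gz"), EXPECTED_SHA256)
```

The same check applies to any other fetched artifact, as long as the expected digest comes from a source you trust independently of the download.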

Provenance

The following attestation bundles were made for june_bench-0.0.1.tar.gz:

Publisher: publish-bench.yml on Junemind/june-brain

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file june_bench-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: june_bench-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 22.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for june_bench-0.0.1-py3-none-any.whl

Algorithm    Hash digest
SHA256       db9b3e1123ed1130695ec4a6454f30fcaa0eb8954af95382c1222ae5dccb09a5
MD5          6fd1752f1a4156c487fac0a1b5d873a2
BLAKE2b-256  f4d98ea5a39d0b1e6125442eb583e277a65aae79464be2397bdb69610a12886d

Provenance

The following attestation bundles were made for june_bench-0.0.1-py3-none-any.whl:

Publisher: publish-bench.yml on Junemind/june-brain
