Local open-source evaluation tooling for rubric validation, linting, and deterministic scoring.

Maintainers

auraoneai

Project description

AuraOne EvalKit

AuraOne EvalKit is a standalone local Python package for rubric validation, rubric linting, and deterministic scoring. It installs as auraone-evalkit, imports as auraone_evalkit, and exposes the evalkit CLI.

EvalKit does not require an AuraOne account, API key, hosted tenant, database, or private reviewer pool. The files in examples/tutorial/ are synthetic tutorial data only. They are not expert-authored, human-validated, or benchmark-grade, and they are not safety certifications or claims about model quality.

Package Distinction

AuraOne also publishes hosted SDKs and a hosted CLI. They are separate from EvalKit:

| Tool | Package or binary | Purpose |
| --- | --- | --- |
| EvalKit | `auraone-evalkit`, `auraone_evalkit`, `evalkit` | Local open-source rubric tools. No API key. |
| Hosted Python SDK | `auraone-sdk` | Hosted AuraOne API client. Uses hosted services. |
| Hosted TypeScript SDK | `@auraone/sdk` | Hosted AuraOne API client for Node/TypeScript. Uses hosted services. |
| Hosted API CLI | `aura` | Hosted AuraOne command line workflows. Separate from `evalkit`. |

Use evalkit for local files and tutorial workflows. Use auraone-sdk, @auraone/sdk, or aura only when you intend to call hosted AuraOne services.

Install

From this repository:

cd opensource/evalkit
python -m pip install -e .

After install:

evalkit --help
evalkit --version
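
To confirm the install from Python rather than from the shell, a minimal sketch using the standard library's `importlib.metadata` could look like the following. The helper name `installed_version` is illustrative and is not part of EvalKit.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "auraone-evalkit") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "auraone-evalkit is not installed")
```

The lookup uses the distribution name (`auraone-evalkit`), not the import name (`auraone_evalkit`).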

Five-Minute Quickstart

Validate the synthetic tutorial rubric:

evalkit validate-rubric examples/tutorial/rubric.jsonl

Lint the same rubric:

evalkit lint-rubric examples/tutorial/rubric.jsonl

Score the synthetic tutorial model outputs. If --labels is omitted, EvalKit looks for labels.jsonl next to the responses file.

evalkit score \
  --rubric examples/tutorial/rubric.jsonl \
  --responses examples/tutorial/model_outputs.jsonl \
  --out /tmp/evalkit-tutorial-scores.json

Expected summary for the bundled tutorial data:

{
  "average_score": 0.645833,
  "pass_rate": 0.666667,
  "scored_outputs": 3
}

The full deterministic expected output is stored in examples/tutorial/expected_scores.json.
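
One way to check a run against the expected summary is to compare the values with a small tolerance, since the summary is rounded to six decimals. The sketch below is an assumption-laden helper, not EvalKit code: `summary_mismatches` is a hypothetical name, and it takes the summary as a plain dict because this README does not document where the summary sits inside the output file.

```python
import math

# Summary values copied from the expected tutorial output shown above.
EXPECTED_SUMMARY = {
    "average_score": 0.645833,
    "pass_rate": 0.666667,
    "scored_outputs": 3,
}


def summary_mismatches(actual: dict, expected: dict = EXPECTED_SUMMARY, tol: float = 1e-6) -> list:
    """Return the keys whose values differ from the expected summary."""
    mismatches = []
    for key, want in expected.items():
        got = actual.get(key)
        if isinstance(want, float):
            # Floats are compared with an absolute tolerance to absorb rounding.
            ok = isinstance(got, (int, float)) and math.isclose(got, want, abs_tol=tol)
        else:
            ok = got == want
        if not ok:
            mismatches.append(key)
    return mismatches
```

An empty list means every expected key matched within the tolerance.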

Commands

`evalkit validate-rubric`

Validates JSONL or JSON-array rubric files against the AuraOne EvalKit rubric contract.

evalkit validate-rubric examples/tutorial/rubric.jsonl --format json

Validation errors include row number, field, message, and a suggested fix.
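
As an illustration of accepting both JSONL and JSON-array input, a loader might branch on the first non-whitespace character. This is a sketch of the general technique, not EvalKit's actual parser; `load_rubric_rows` is a hypothetical name.

```python
import json


def load_rubric_rows(text: str) -> list:
    """Parse rubric rows from JSONL text or from a single JSON array."""
    stripped = text.strip()
    if stripped.startswith("["):
        # JSON-array form: the whole document is one list of objects.
        rows = json.loads(stripped)
    else:
        # JSONL form: one JSON object per non-empty line.
        rows = [json.loads(line) for line in stripped.splitlines() if line.strip()]
    if not all(isinstance(row, dict) for row in rows):
        raise ValueError("every rubric row must be a JSON object")
    return rows
```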

`evalkit lint-rubric`

Runs rubric quality checks that catch common authoring problems before scoring.

evalkit lint-rubric examples/tutorial/rubric.jsonl --format json

The v0.1 linter includes rules for compound criteria, vague wording, missing examples, missing weight, duplicate IDs, duplicate text, inconsistent severity, unscorable language, unavailable context, unclear scoring boundaries, and weight totals.
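
To give a sense of what deterministic lint rules can look like, the sketch below implements two of the listed checks, duplicate IDs and missing weight, over already-parsed rows. The function name, rule identifiers, and message wording are hypothetical and are not taken from the EvalKit source.

```python
from collections import Counter


def lint_rows(rows: list) -> list:
    """Return (row_number, rule, message) findings for two simple rules."""
    findings = []
    id_counts = Counter(row.get("criterion_id") for row in rows)
    for number, row in enumerate(rows, start=1):
        cid = row.get("criterion_id")
        # Flag every row that shares its criterion_id with another row.
        if cid is not None and id_counts[cid] > 1:
            findings.append(
                (number, "duplicate-id", f"criterion_id {cid!r} appears {id_counts[cid]} times")
            )
        # Flag rows where weight is absent or null.
        if row.get("weight") is None:
            findings.append((number, "missing-weight", "row has no weight"))
    return findings
```

Because the rules depend only on the rows themselves, the same input always yields the same findings in the same order.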

`evalkit score`

Aggregates per-criterion labels into deterministic weighted scores.

evalkit score \
  --rubric examples/tutorial/rubric.jsonl \
  --responses examples/tutorial/model_outputs.jsonl \
  --labels examples/tutorial/labels.jsonl \
  --format json \
  --out /tmp/evalkit-tutorial-scores.json

Supported output formats are json, jsonl, csv, and report-json.
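
For readers unfamiliar with the difference between the jsonl and csv shapes, the sketch below serializes the same records both ways using only the standard library. The record keys used in the usage note are placeholders; EvalKit's actual output columns are not documented in this README.

```python
import csv
import io
import json


def to_jsonl(records: list) -> str:
    """Serialize records as one JSON object per line."""
    return "".join(json.dumps(record, sort_keys=True) + "\n" for record in records)


def to_csv(records: list) -> str:
    """Serialize records as CSV with a header row of sorted keys."""
    fieldnames = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

For example, `to_csv([{"output_id": "o1", "score": 0.5}])` produces a header line followed by one data line.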

Data Contracts

Rubric rows are JSON objects with required fields:

criterion_id
domain
task_type
criterion
weight
severity
scoring_type
examples
edge_cases
disagreement_risk

See docs/schema/rubric-schema.md for the full schema and examples.
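
The required-field list above can be checked with a few lines of Python. The sketch below reports each missing field with a row number, field, message, and suggested fix, mirroring the error parts described for `evalkit validate-rubric`. The dictionary keys and message wording are assumptions, and the real validator checks more than field presence.

```python
# Required rubric fields, as listed in this README.
REQUIRED_FIELDS = (
    "criterion_id", "domain", "task_type", "criterion", "weight",
    "severity", "scoring_type", "examples", "edge_cases", "disagreement_risk",
)


def missing_field_errors(rows: list) -> list:
    """Return one error record per required field missing from a row."""
    errors = []
    for number, row in enumerate(rows, start=1):
        for field in REQUIRED_FIELDS:
            if field not in row:
                errors.append({
                    "row": number,
                    "field": field,
                    "message": f"missing required field '{field}'",
                    "suggested_fix": f"add a '{field}' value to row {number}",
                })
    return errors
```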

Scoring labels use:

output_id
criterion_id
score
applicable (optional)
rationale (optional)

Scores are normalized by scoring type, multiplied by criterion weight, and divided by the applicable rubric weight. Missing labels are reported in every output record. In --strict mode, missing labels fail the command.
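
The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the EvalKit implementation: the scoring type names `binary` and `scale_1_5` and their normalization are hypothetical, criteria with missing labels are left out of the denominator, and an output with no applicable weight scores 0.0. Check docs/schema/rubric-schema.md for the real scoring types.

```python
def normalize(score: float, scoring_type: str) -> float:
    """Map a raw label score to [0, 1]. Both type names are hypothetical."""
    if scoring_type == "binary":
        return 1.0 if score else 0.0
    if scoring_type == "scale_1_5":
        return (score - 1) / 4
    raise ValueError(f"unknown scoring type: {scoring_type}")


def score_output(rubric: list, labels: list, output_id: str, strict: bool = False) -> dict:
    """Aggregate per-criterion labels for one output into a weighted score."""
    by_criterion = {
        label["criterion_id"]: label for label in labels if label["output_id"] == output_id
    }
    weighted_sum = 0.0
    applicable_weight = 0.0
    missing = []
    for row in rubric:
        label = by_criterion.get(row["criterion_id"])
        if label is None:
            missing.append(row["criterion_id"])
            continue  # assumption: unlabeled criteria do not count toward the score
        if not label.get("applicable", True):
            continue  # non-applicable criteria drop out of numerator and denominator
        weighted_sum += normalize(label["score"], row["scoring_type"]) * row["weight"]
        applicable_weight += row["weight"]
    if strict and missing:
        raise ValueError(f"missing labels for {output_id}: {missing}")
    score = weighted_sum / applicable_weight if applicable_weight else 0.0
    return {"output_id": output_id, "score": round(score, 6), "missing_labels": missing}
```

Dividing by the applicable weight rather than the total rubric weight keeps a criterion marked non-applicable from lowering the score.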

Documentation

docs/architecture/two-package-architecture.md
docs/schema/rubric-schema.md
Repository roadmap context: ../../opensource.md
Public AuraOne open resources: https://auraone.ai/open

Limitations

v0.1 ships local tooling and synthetic tutorial fixtures only.
The tutorial data is not a benchmark and should not be used to compare vendors or publish model claims.
The linter is a deterministic authoring aid, not a replacement for domain review.
The scorer aggregates labels supplied by the user. It does not generate labels, call LLM judges, or contact AuraOne hosted services.

Development

Run focused checks from opensource/evalkit:

python -m pytest tests/test_package_imports.py tests/schema/test_rubric_schema.py tests/scoring/test_score_cli.py tests/linting/test_rules.py tests/examples/test_tutorial_dataset.py
python -m pip wheel . --no-deps -w /tmp/evalkit-wheel

Release history

0.2.0: May 12, 2026
0.1.1: May 11, 2026
0.1.0 (this version): May 11, 2026

Download files

Source Distribution

auraone_evalkit-0.1.0.tar.gz (45.4 kB), uploaded May 11, 2026, Source

Built Distribution

auraone_evalkit-0.1.0-py3-none-any.whl (58.5 kB), uploaded May 11, 2026, Python 3

File details

Details for the file auraone_evalkit-0.1.0.tar.gz.

File metadata

Download URL: auraone_evalkit-0.1.0.tar.gz
Upload date: May 11, 2026
Size: 45.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for auraone_evalkit-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `bfeed3de5d73260d5f60244add83e2bdc1711f3d748b32f142ddb07be6514059` |
| MD5 | `e78edf873f3737a5257556c2f75cf913` |
| BLAKE2b-256 | `a9b21ed0d770bd91a64e83de9fc80d7cec4b631130ef5dd5b7152609217dc24e` |
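
A downloaded file can be checked against the SHA256 digest listed above with the standard library's `hashlib`. The sketch below is a generic verification helper, not part of EvalKit; the function names are hypothetical, and the path you pass is wherever you saved the download.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == published_hex.strip().lower()
```

For the source distribution, `published_hex` would be the SHA256 value from the hash table for auraone_evalkit-0.1.0.tar.gz.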

Provenance

The following attestation bundles were made for auraone_evalkit-0.1.0.tar.gz:

Publisher: release-python.yml on auraoneai/open

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: auraone_evalkit-0.1.0.tar.gz
- Subject digest: bfeed3de5d73260d5f60244add83e2bdc1711f3d748b32f142ddb07be6514059
- Sigstore transparency entry: 1509593962
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: auraoneai/open@b4b3c994fcd36155414c7af1fc002607b5abb607
- Branch / Tag: refs/heads/main
- Owner: https://github.com/auraoneai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-python.yml@b4b3c994fcd36155414c7af1fc002607b5abb607
- Trigger Event: workflow_dispatch

File details

Details for the file auraone_evalkit-0.1.0-py3-none-any.whl.

File metadata

Download URL: auraone_evalkit-0.1.0-py3-none-any.whl
Upload date: May 11, 2026
Size: 58.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for auraone_evalkit-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `2f2f1f0bc7d228a2d707c755ccb5db99a98b4d464a1975fac3939b2f59b24af8` |
| MD5 | `d5f2c408f45f81bf26a60d8f6527b4c2` |
| BLAKE2b-256 | `cc697006facc9da51e631601d0867e1d4ba3326ff0bded554289e055b3d53d7e` |

Provenance

The following attestation bundles were made for auraone_evalkit-0.1.0-py3-none-any.whl:

Publisher: release-python.yml on auraoneai/open

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: auraone_evalkit-0.1.0-py3-none-any.whl
- Subject digest: 2f2f1f0bc7d228a2d707c755ccb5db99a98b4d464a1975fac3939b2f59b24af8
- Sigstore transparency entry: 1509594039
- Sigstore integration time: May 11, 2026
Source repository:
- Permalink: auraoneai/open@b4b3c994fcd36155414c7af1fc002607b5abb607
- Branch / Tag: refs/heads/main
- Owner: https://github.com/auraoneai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-python.yml@b4b3c994fcd36155414c7af1fc002607b5abb607
- Trigger Event: workflow_dispatch

auraone-evalkit 0.1.0
