A graph-based toolkit for evaluating LLM and RAG outputs with repeatable quality metrics and reporting.
neXa-gauge
A graph-based evaluation toolkit for LLM and RAG systems with repeatable quality checks, upfront cost visibility, caching for reuse, and clean per-case outputs for analysis.
- Graph-native evaluation flow (scan -> claims -> metrics -> eval)
- Cost visibility before runtime with estimate-first execution
- Cache-aware runs to avoid duplicate spend and recomputation
- Coverage across relevance, grounding, redteam, GEval, and reference scoring
- Production-friendly CLI for run, estimate, and cache management
- Scales with control across utility and metric nodes
- Bring your own model: Ollama support coming soon
Install
PyPI (recommended)
pip install nexa-gauge
With Hugging Face adapter support:
pip install "nexa-gauge[huggingface]"
From source (development)
git clone git@github.com:Sardhendu/nexa-gauge.git
cd nexa-gauge
pip install -e .
Quick Start
# set your provider key
export OPENAI_API_KEY="<your-key>"
# inspect CLI
nexagauge --help
# estimate first
nexagauge estimate grounding --input sample.json --limit 5
# run and write reports
nexagauge run eval --input sample.json --limit 5 --output-dir ./report
CLI Overview
nexagauge run <target_node> --input <source> [flags]
nexagauge estimate <target_node> --input <source> [flags]
Most-used flags:
- data: --input, --adapter, --split, --start, --end, --limit
- model routing: --model, --llm-model, --llm-fallback
- cache: --force, --no-cache, --cache-dir
- execution: --max-workers, --max-in-flight, --continue-on-error
- debug: --debug (enables node logs; hides progress bar)
- output (run): --output-dir
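For scripted runs, the two command shapes above can be assembled programmatically. The following is a minimal sketch that uses only the subcommands and flags listed in this section. `build_command` is a hypothetical helper, not part of nexa-gauge, and treating flags such as `--force` as value-less switches is an assumption.

```python
import shlex


def build_command(action: str, target_node: str, input_path: str, **flags) -> list[str]:
    """Assemble an argv list for `nexagauge run|estimate <target_node>`.

    Hypothetical helper: keyword names map to long flags by swapping "_"
    for "-" (limit=5 -> --limit 5). True emits a bare switch (assumed for
    flags like --force); None and False are skipped.
    """
    if action not in ("run", "estimate"):
        raise ValueError(f"unsupported action: {action!r}")
    argv = ["nexagauge", action, target_node, "--input", input_path]
    for name, value in flags.items():
        if value is None or value is False:
            continue
        argv.append("--" + name.replace("_", "-"))
        if value is not True:
            argv.append(str(value))
    return argv


# Mirror the Quick Start: estimate first, then run and write reports.
estimate = build_command("estimate", "grounding", "sample.json", limit=5)
run = build_command("run", "eval", "sample.json", limit=5, output_dir="./report")
print(shlex.join(estimate))
print(shlex.join(run))
```

Either list can be passed to `subprocess.run(argv, check=True)` once the estimate looks acceptable.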
Node Topology
Canonical nodes:
- scan
- chunk
- claims
- dedup
- geval_steps
- relevance
- grounding
- redteam
- geval
- reference
- eval
- report
Typical paths:
- grounding: scan -> chunk -> claims -> dedup -> grounding
- relevance: scan -> chunk -> claims -> dedup -> relevance
- geval: scan -> geval_steps -> geval
- eval: full graph execution and aggregation
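To illustrate how a target node implies its upstream nodes, here is a minimal sketch of a prerequisite graph and a depth-first resolver. The edges in `PREREQS` are inferred from the typical paths above and are an assumption about the real graph. Nodes whose dependencies are not documented here (redteam, reference, eval, report) are left out rather than guessed, and `resolve_path` is a hypothetical name.

```python
# Prerequisite edges inferred from the documented typical paths (assumption).
PREREQS: dict[str, list[str]] = {
    "scan": [],
    "chunk": ["scan"],
    "claims": ["chunk"],
    "dedup": ["claims"],
    "grounding": ["dedup"],
    "relevance": ["dedup"],
    "geval_steps": ["scan"],
    "geval": ["geval_steps"],
}


def resolve_path(target: str) -> list[str]:
    """Return the nodes needed to reach `target`, prerequisites first."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(node: str) -> None:
        if node in seen:
            return
        seen.add(node)
        for dep in PREREQS[node]:  # KeyError for nodes not modelled here
            visit(dep)
        order.append(node)

    visit(target)
    return order


print(" -> ".join(resolve_path("grounding")))
print(" -> ".join(resolve_path("geval")))
```

Because shared prefixes such as scan -> chunk -> claims -> dedup appear in several paths, a resolver of this kind is also where cached upstream results would be reused.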
Configuration
See .env.example for environment settings.
Minimum for LLM-backed runs:
- OPENAI_API_KEY (or alternative provider key)
- LLM_MODEL (default available)
Per-node overrides are supported:
- LLM_{NODE}_MODEL
- LLM_{NODE}_FALLBACK_MODEL
- LLM_{NODE}_TEMPERATURE
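One plausible reading of the per-node variables is that a node-specific value wins over the global `LLM_MODEL`. The sketch below shows that lookup order. The precedence, the upper-casing of node names, and the `resolve_model` helper are all assumptions for illustration, and the model names are placeholders. Check .env.example for the actual behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_model(node: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick a model for `node`: LLM_{NODE}_MODEL first, then LLM_MODEL.

    Assumed precedence; returns None when neither variable is set.
    """
    env = os.environ if env is None else env
    per_node_key = f"LLM_{node.upper()}_MODEL"
    return env.get(per_node_key) or env.get("LLM_MODEL")


# Placeholder values, not real model identifiers.
example_env = {
    "LLM_MODEL": "default-model",
    "LLM_GROUNDING_MODEL": "grounding-model",
}
print(resolve_model("grounding", example_env))
print(resolve_model("relevance", example_env))
```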
For Maintainers
uv sync
make lint
make test
make ci
Releases are automated with release-please:
- use Conventional Commit PR titles (feat:, fix:, deps:, chore:, etc.) so merged commits are parseable
- if using merge commits, ensure the merge message includes a conventional title (or use squash merge with a conventional PR title)
- a Release PR is created/updated automatically and auto-merged after required checks pass
- release bump scope is repo-level (nexa-gauge root version), not every package-level file
- publish runs from .github/workflows/release.yml after release creation
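A PR title can be checked locally before merging. This is a minimal sketch of a title check that follows the general Conventional Commits shape (`type(scope)!: subject`). It is not release-please's own parser, it accepts any lowercase type because the list above ends in "etc.", and `is_conventional_title` is a hypothetical helper.

```python
import re

# type, optional (scope), optional "!" for breaking changes, then ": subject"
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Return True when `title` looks like a Conventional Commit header."""
    return TITLE_RE.match(title) is not None


for title in ("feat: add cache dir flag", "fix(cli)!: handle empty input", "Update README"):
    print(f"{title!r}: {is_conventional_title(title)}")
```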
Build distributions:
uv build
Expected artifacts:
- dist/nexa_gauge-<version>-py3-none-any.whl
- dist/nexa_gauge-<version>.tar.gz
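After a build, the presence of both artifacts can be confirmed with a short check. This sketch only encodes the two file name patterns listed above. `expected_artifacts` and `missing_artifacts` are hypothetical helpers, and the version string passed in is whatever the build was expected to produce.

```python
from pathlib import Path


def expected_artifacts(version: str, dist_dir: str = "dist") -> list[Path]:
    """Build the wheel and sdist paths expected for `version`."""
    base = Path(dist_dir)
    return [
        base / f"nexa_gauge-{version}-py3-none-any.whl",
        base / f"nexa_gauge-{version}.tar.gz",
    ]


def missing_artifacts(version: str, dist_dir: str = "dist") -> list[Path]:
    """Return the expected artifacts that are not present on disk."""
    return [p for p in expected_artifacts(version, dist_dir) if not p.is_file()]


# An empty list means both distributions were produced.
print([p.name for p in missing_artifacts("0.1.3")])
```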
Project Standards
- License: MIT
- Security policy: SECURITY.md
- Contributing guide: CONTRIBUTING.md
- Code of conduct: CODE_OF_CONDUCT.md
Documentation
File details
Details for the file nexa_gauge-0.1.3.tar.gz.
File metadata
- Download URL: nexa_gauge-0.1.3.tar.gz
- Upload date:
- Size: 119.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 492c8c0a8b08add0dcca13542eb564dab91f3b15342098138ad6311d74d31c9d |
| MD5 | 3960f9b755a9995a8f44e6145be453f7 |
| BLAKE2b-256 | 7e66891ee98a2377d61eafffb06278ab39a2518481250f7e42dd3edee64bfc5a |
Provenance
The following attestation bundles were made for nexa_gauge-0.1.3.tar.gz:
- Publisher: release.yml on harneXa/nexa-gauge
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: nexa_gauge-0.1.3.tar.gz
- Subject digest: 492c8c0a8b08add0dcca13542eb564dab91f3b15342098138ad6311d74d31c9d
- Sigstore transparency entry: 1361209460
- Sigstore integration time:
- Permalink: harneXa/nexa-gauge@f29d7765ab909076a10a96f6673c2c7c2fd67c67
- Branch / Tag: refs/heads/main
- Owner: https://github.com/harneXa
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f29d7765ab909076a10a96f6673c2c7c2fd67c67
- Trigger Event: workflow_dispatch
File details
Details for the file nexa_gauge-0.1.3-py3-none-any.whl.
File metadata
- Download URL: nexa_gauge-0.1.3-py3-none-any.whl
- Upload date:
- Size: 98.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c30b7ea76e18e49c8ef72f1dc7247c7763f2c76f7c4d25c9c8b61129364dff8 |
| MD5 | 18a5d1507cb4e60dd5ba2aa212f47230 |
| BLAKE2b-256 | 50aafe3c26058c48d50a54d38b8812ef052b19c921a10006617b54ca2a32fd3d |
Provenance
The following attestation bundles were made for nexa_gauge-0.1.3-py3-none-any.whl:
- Publisher: release.yml on harneXa/nexa-gauge
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: nexa_gauge-0.1.3-py3-none-any.whl
- Subject digest: 9c30b7ea76e18e49c8ef72f1dc7247c7763f2c76f7c4d25c9c8b61129364dff8
- Sigstore transparency entry: 1361209490
- Sigstore integration time:
- Permalink: harneXa/nexa-gauge@f29d7765ab909076a10a96f6673c2c7c2fd67c67
- Branch / Tag: refs/heads/main
- Owner: https://github.com/harneXa
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f29d7765ab909076a10a96f6673c2c7c2fd67c67
- Trigger Event: workflow_dispatch