CLI toolkit for Gemini Enterprise Connector evaluation — init, check, run, and evaluate RAG pipelines against golden datasets.
Gemini Enterprise Connector — Evaluation Toolkit
CLI toolkit for evaluating Gemini Enterprise Connector RAG pipelines. Compares actual responses against a golden dataset, producing pass/fail grades, root cause analysis, and an interactive HTML dashboard.
Installation
Using uv (recommended)
uv pip install ge-eval
Using pip
pip install ge-eval
From source
git clone https://github.com/yapweiyih/gemini-enterprise-connector-eval.git
cd gemini-enterprise-connector-eval
uv sync
Quick Start
# Step 1: Scaffold a working directory with sample inputs
ge-eval init
# Step 2: Edit .env with your Google Cloud project settings
# Step 3: Validate configuration
ge-eval check
# Step 4: Query the API (sends questions, gets responses)
ge-eval run
# Step 5: Run LLM-judge evaluation
ge-eval eval
# Step 6: View results in the browser
ge-eval serve
# Then open http://localhost:8080
Note: Run ge-eval init to generate an INSTRUCTION.md file with a comprehensive setup guide covering authentication, input structure, and command details.
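The non-interactive quick-start steps can also be chained from a script. The following is a minimal sketch, not part of ge-eval itself: `run_quick_start` and `QUICK_START_STEPS` are hypothetical names, and it assumes each ge-eval command returns a nonzero exit code on failure, which this README does not state. `ge-eval init` is left out because `.env` has to be edited by hand afterwards, and `ge-eval serve` is left out because it keeps running.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands taken from the quick start, in the order they are meant to run.
QUICK_START_STEPS: List[List[str]] = [
    ["ge-eval", "check"],
    ["ge-eval", "run"],
    ["ge-eval", "eval"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_quick_start(
    steps: Sequence[Sequence[str]] = QUICK_START_STEPS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> List[str]:
    """Run each step in order and stop at the first nonzero exit code.

    Returns the commands that completed successfully.
    Assumption: a nonzero exit code signals failure.
    """
    completed: List[str] = []
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            print(f"stopped: {' '.join(cmd)} exited with {code}")
            break
        completed.append(" ".join(cmd))
    return completed
```

Calling `run_quick_start()` from the scaffolded working directory runs the three commands back to back. The `runner` parameter exists only so the control flow can be exercised without invoking the real CLI.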
What the Pipeline Does
ge-eval eval orchestrates 6 steps in sequence:
| Step | Action | Description |
|---|---|---|
| 1 | Validate Inputs | Checks that golden dataset, CSV, and HTML folder exist; validates question alignment |
| 2 | Generate Summaries | Extracts agent trajectories from dolphin debug HTML files and writes them to outputs/summaries/ |
| 3 | Run LLM Judge | Calls Gemini to evaluate all questions |
| 4 | Enrich Golden Source | Adds expected citations from golden dataset to the CSV |
| 5 | Normalize Columns | Reorders CSV to final 15-column schema |
| 6 | Generate Report | Creates outputs/G_REPORT.md with stats and detailed RCA |
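The sequential, fail-fast shape of the steps above can be sketched as follows. This is an illustration under stated assumptions, not the ge-eval implementation: the function names, the `question` key, and the three-column schema in the usage note are all hypothetical. Only steps 1 and 5 are shown, since they can be demonstrated on in-memory rows.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, list]  # shared state handed from one step to the next


def validate_inputs(ctx: Context) -> None:
    """Step 1 (sketch): questions must match, in order, in both inputs.

    Assumption: each row is a dict with a "question" key.
    """
    golden = [row["question"] for row in ctx["golden"]]
    actual = [row["question"] for row in ctx["responses"]]
    if golden != actual:
        raise ValueError("question mismatch between golden dataset and CSV")


def normalize_columns(ctx: Context) -> None:
    """Step 5 (sketch): rewrite every row in the fixed column order.

    Columns missing from a row are filled with an empty string.
    """
    schema = ctx["schema"]
    ctx["responses"] = [
        {col: row.get(col, "") for col in schema} for row in ctx["responses"]
    ]


def run_pipeline(
    steps: List[Tuple[str, Callable[[Context], None]]], ctx: Context
) -> List[str]:
    """Run steps in order; an exception from any step aborts the rest."""
    done: List[str] = []
    for name, step in steps:
        step(ctx)
        done.append(name)
    return done
```

For example, with `ctx = {"golden": [{"question": "q1"}], "responses": [{"answer": "a1", "question": "q1"}], "schema": ["question", "answer", "grade"]}`, running both steps leaves each response row ordered as question, answer, grade, with an empty grade.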
CLI Commands
| Command | Description |
|---|---|
| ge-eval init | Scaffold a working directory with sample inputs, .env, and INSTRUCTION.md |
| ge-eval check | Validate config, env vars, and input file alignment |
| ge-eval run | Batch-query the GE streamAssist API |
| ge-eval eval | Run the full LLM-judge evaluation pipeline |
| ge-eval serve | Start a local HTTP server for the evaluation viewer |
| ge-eval clean | Remove all files from inputs/, outputs/, and INSTRUCTION.md |
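For readers who want to wrap or mirror this command set, a subcommand layout like the one above can be modeled with the standard library's argparse. This is only a sketch: how ge-eval actually builds its CLI is not documented here, and `COMMANDS` and `build_parser` are hypothetical names.

```python
import argparse

# Command names and descriptions copied from the table above.
COMMANDS = {
    "init": "Scaffold a working directory with sample inputs, .env, and INSTRUCTION.md",
    "check": "Validate config, env vars, and input file alignment",
    "run": "Batch-query the GE streamAssist API",
    "eval": "Run the full LLM-judge evaluation pipeline",
    "serve": "Start a local HTTP server for the evaluation viewer",
    "clean": "Remove all files from inputs/, outputs/, and INSTRUCTION.md",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per entry in COMMANDS."""
    parser = argparse.ArgumentParser(prog="ge-eval")
    sub = parser.add_subparsers(dest="command", required=True)
    for name, help_text in COMMANDS.items():
        sub.add_parser(name, help=help_text)
    return parser
```

`build_parser().parse_args(["check"]).command` yields the selected subcommand name, and an unknown subcommand makes argparse exit with a usage error.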
Documentation
Full documentation is available at: https://yapweiyih.github.io/gemini-enterprise-connector-eval/
After running ge-eval init, see the generated INSTRUCTION.md for a
detailed setup guide including authentication, input structure, and output
column schema.
Contributing
Contributions are welcome! See CONTRIBUTING.md for
guidelines on how to get started.
Development Setup
# Clone and install dev dependencies
git clone https://github.com/yapweiyih/gemini-enterprise-connector-eval.git
cd gemini-enterprise-connector-eval
uv sync
# Run tests
uv run pytest tests/ -v
# Build documentation locally
uv run mkdocs serve
Running Tests
# Run all tests
uv run pytest tests/ -v
# Run specific test module
uv run pytest tests/test_ge_eval_cli.py -v
# Run with coverage
uv run pytest tests/ --cov=ge_eval -v
License
This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
File details
Details for the file ge_eval-0.1.2.tar.gz.
File metadata
- Download URL: ge_eval-0.1.2.tar.gz
- Upload date:
- Size: 61.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b166e41103a8ac1d4519913176c5adb4b1ed851cf55fa9703ff90955aeb840b9 |
| MD5 | af1897f89a7c85fe4dd3782c28bcdde0 |
| BLAKE2b-256 | 4bdde35373bc53c409c54324f184e7fdd01161d59fbfdf4df64c13d518f94ed6 |
File details
Details for the file ge_eval-0.1.2-py3-none-any.whl.
File metadata
- Download URL: ge_eval-0.1.2-py3-none-any.whl
- Upload date:
- Size: 92.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | adefffc54952506afba06a0088cffd8d9c47a57a849d61f1876347816336473f |
| MD5 | 70f850c48f7ffe8130e9d1f427509464 |
| BLAKE2b-256 | af9b538e6cec26c30b9e1eb142be535772f7643498aa1446ccb0b34b1d64043e |
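A downloaded file can be checked against the SHA256 digests listed above with Python's hashlib. The helper names `sha256_of` and `matches_digest` are hypothetical, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches_digest("ge_eval-0.1.2.tar.gz", "b166e41103a8ac1d4519913176c5adb4b1ed851cf55fa9703ff90955aeb840b9")` should return True if the file in the current directory is intact.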