Automatically generate evals for every AI change
Parity
Parity analyzes behavior-defining AI changes in pull requests, discovers the most relevant existing eval target, validates the real coverage gaps, and proposes native eval additions that fit the target suite.
Parity is not an eval runner. It does not create or mutate hosted evaluator infrastructure. It reuses the eval system you already have.
What Parity Does
For each PR that changes prompts, instructions, guardrails, judges, validators, or similar behavior-defining assets, Parity:
- Detects the behavioral change.
- Resolves the best matching eval target and method.
- Validates which gaps are actually uncovered.
- Synthesizes native eval additions for that target.
- Writes only `native_ready` evals after explicit approval.
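As a mental model for the final step (this is an illustration of the gating rule stated above, not Parity's actual code), only proposals that are both native-ready and explicitly approved ever reach writeback:

```python
from dataclasses import dataclass

@dataclass
class EvalProposal:
    name: str
    native_ready: bool  # fits the target suite's native format
    approved: bool      # explicit human approval (e.g. a PR label)

def writable(proposals: list[EvalProposal]) -> list[EvalProposal]:
    """Keep only proposals that are both native-ready and approved."""
    return [p for p in proposals if p.native_ready and p.approved]

# Hypothetical batch: only the first proposal survives the gate.
batch = [
    EvalProposal("greeting-tone", native_ready=True, approved=True),
    EvalProposal("refusal-check", native_ready=True, approved=False),
    EvalProposal("format-drift", native_ready=False, approved=True),
]
assert [p.name for p in writable(batch)] == ["greeting-tone"]
```

The proposal names and fields here are invented for illustration; the point is that neither condition alone is sufficient for writeback.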
Support
| Path | Status | Notes |
|---|---|---|
| Promptfoo | Strong | Best fully native path. Assertions are row-local and writeback is straightforward. |
| LangSmith | Strong | Strong dataset discovery and writeback. Evaluator reuse is supported; evaluator mutation is out of scope. |
| Braintrust | Supported with limitations | Writeback works. Target discovery is weaker and evaluator recovery depends more on repo assets. |
| Arize Phoenix | Supported with limitations | Dataset read/write works. Evaluator discovery is weaker than Promptfoo and LangSmith. |
| Bootstrap mode | Built in | If no safe target is found, Parity proposes starter evals and abstains from unsafe writeback. |
More detail: docs/platforms.md
Public Commands
These are the commands most users need:
- `parity init` — scaffold `parity.yaml`, the GitHub Actions workflow, and `context/` stubs
- `parity doctor` — verify your setup and environment
- `parity run-stage 1` — detect behavioral artifact changes in a PR
- `parity run-stage 2` — analyze coverage gaps against existing evals
- `parity run-stage 3` — synthesize native eval proposals
- `parity write-evals` — write approved evals to your platform after merge
- `parity setup-mcp` — generate an MCP server config from `parity.yaml` (for local agent tooling)
Internal CI commands (post-comment, resolve-run-id, etc.) are used by the generated workflow and are not intended to be called directly.
Quick Start
```
pip install parity-ai
parity init
```
Then:
- Fill in the generated `context/` files.
- Add GitHub secrets: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, and any platform keys you use.
- Commit `parity.yaml`, `.github/workflows/parity.yml`, and `context/`.
- Open a PR that changes AI behavior.
- Add the fixed approval label `parity:approve` before merging if you want Parity to write approved evals back after merge.
Docs
License
Project details
Release history
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file parity_ai-0.1.16.tar.gz.
File metadata
- Download URL: parity_ai-0.1.16.tar.gz
- Upload date:
- Size: 101.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `259d24bd14c310323fd26326f9fb4b021a598396c674bca409c030649d92f2a4` |
| MD5 | `5420faa39c4d8b44da9ceee377af9415` |
| BLAKE2b-256 | `5983a261850b4ca3e85dfd33f6984f0a2177eb9cf93ae612d7fcf4c5ef167241` |
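The published digests can be checked locally before installing. A minimal sketch using Python's standard `hashlib` (the filename is the one listed on this page; compare the result against the SHA256 row in the table above):

```python
import hashlib

def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Stream a file in chunks and return its hex digest."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# After downloading the sdist, file_digest("parity_ai-0.1.16.tar.gz")
# should equal the SHA256 value in the table above.
```

Note that for the BLAKE2b-256 row, `hashlib.blake2b` defaults to a 64-byte digest, so use `hashlib.blake2b(digest_size=32)` rather than `hashlib.new("blake2b")`.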
File details
Details for the file parity_ai-0.1.16-py3-none-any.whl.
File metadata
- Download URL: parity_ai-0.1.16-py3-none-any.whl
- Upload date:
- Size: 123.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `56edaab8e50a97fa2f0c65186b4bdf39559db99f86f0b8bee105061aa268d3b4` |
| MD5 | `d1494a0a769fc1100390c9370951d8eb` |
| BLAKE2b-256 | `45c3b0eec13934a9712c7a6436a6eeb1a6dc1ab19c84a809fae8c5c3ef6ef4ff` |