Automatically generate evals for every AI change

Parity

PyPI | License: MIT | Python 3.11+

Parity analyzes behavior-defining AI changes in pull requests, discovers the most relevant existing eval target, validates the real coverage gaps, and proposes native eval additions that fit the target suite.

Parity is not an eval runner. It does not create or mutate hosted evaluator infrastructure. It reuses the eval system you already have.

What Parity Does

For each PR that changes prompts, instructions, guardrails, judges, validators, or similar behavior-defining assets, Parity:

  1. Detects the behavioral change.
  2. Resolves the best matching eval target and method.
  3. Validates which gaps are actually uncovered.
  4. Synthesizes native eval additions for that target.
  5. Writes only native_ready evals after explicit approval.
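
The gating in step 5 can be pictured with a small sketch. This is not Parity's internal code: the `ProposedEval` class, its `native_ready` field, and the `select_for_writeback` function are hypothetical names chosen for illustration, assuming each proposal carries a readiness status and that approval is a single yes/no signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedEval:
    """Hypothetical stand-in for one synthesized eval proposal."""
    name: str
    native_ready: bool  # assumed flag mirroring the "native_ready" status


def select_for_writeback(proposals, approved):
    """Return the proposals eligible for writeback.

    Nothing is eligible without explicit approval; with approval,
    only proposals marked native_ready pass through.
    """
    if not approved:
        return []
    return [p for p in proposals if p.native_ready]


proposals = [
    ProposedEval("refund-policy-tone", native_ready=True),
    ProposedEval("escalation-guardrail", native_ready=False),
]

print([p.name for p in select_for_writeback(proposals, approved=True)])
# -> ['refund-policy-tone']
print(select_for_writeback(proposals, approved=False))
# -> []
```

The point of the sketch is that both conditions must hold: an approved PR still skips proposals that are not native_ready, and a native_ready proposal is never written without approval.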

Support

Path           | Status                     | Notes
Promptfoo      | Strong                     | Best fully native path. Assertions are row-local and writeback is straightforward.
LangSmith      | Strong                     | Strong dataset discovery and writeback. Evaluator reuse is supported; evaluator mutation is out of scope.
Braintrust     | Supported with limitations | Writeback works. Target discovery is weaker and evaluator recovery depends more on repo assets.
Arize Phoenix  | Supported with limitations | Dataset read/write works. Evaluator discovery is weaker than Promptfoo and LangSmith.
Bootstrap mode | Built in                   | If no safe target is found, Parity proposes starter evals and abstains from unsafe writeback.

More detail: docs/platforms.md

Public Commands

These are the commands most users need:

  • parity init — scaffold parity.yaml, the GitHub Actions workflow, and context/ stubs
  • parity doctor — verify your setup and environment
  • parity run-stage 1 — detect behavioral artifact changes in a PR
  • parity run-stage 2 — analyze coverage gaps against existing evals
  • parity run-stage 3 — synthesize native eval proposals
  • parity write-evals — write approved evals to your platform after merge
  • parity setup-mcp — generate an MCP server config from parity.yaml (for local agent tooling)

Internal CI commands (post-comment, resolve-run-id, etc.) are used by the generated workflow and are not intended to be called directly.
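
If you want to try the public commands in order from a script, a minimal sketch might look like the following. It uses only the command names listed above. Whether the stages need extra arguments, environment variables, or a CI context is not stated here, and the sketch assumes the CLI exits with a non-zero status on failure. `run_pipeline` and its `runner` parameter are hypothetical helper names.

```python
import shutil
import subprocess

# Command names taken from the list above; no extra flags are assumed.
STAGE_COMMANDS = [
    ["parity", "doctor"],
    ["parity", "run-stage", "1"],
    ["parity", "run-stage", "2"],
    ["parity", "run-stage", "3"],
]


def run_pipeline(commands=STAGE_COMMANDS, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    Returns the commands that were attempted. Assumes a non-zero
    return code signals failure.
    """
    attempted = []
    for cmd in commands:
        result = runner(cmd)
        attempted.append(cmd)
        if result.returncode != 0:
            break
    return attempted


if __name__ == "__main__":
    if shutil.which("parity") is None:
        print("parity is not on PATH; install it with: pip install parity-ai")
    else:
        run_pipeline()
```

The `runner` parameter exists only so the ordering logic can be exercised without invoking the real CLI.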

Quick Start

pip install parity-ai
parity init

Then:

  1. Fill in the generated context/ files.
  2. Add GitHub secrets: ANTHROPIC_API_KEY, OPENAI_API_KEY, and any platform keys you use.
  3. Commit parity.yaml, .github/workflows/parity.yml, and context/.
  4. Open a PR that changes AI behavior.
  5. Add the fixed approval label parity:approve before merging if you want Parity to write approved evals back after merge.
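
Before opening the first PR, it can help to confirm that the files from step 3 are present and to see how the approval label from step 5 would be checked. The sketch below is illustrative only: the paths and the label come from the steps above, while `missing_setup_paths` and `writeback_requested` are hypothetical helpers and not part of Parity.

```python
from pathlib import Path

# Paths named in step 3 of the Quick Start.
EXPECTED_PATHS = ["parity.yaml", ".github/workflows/parity.yml", "context"]

# Fixed approval label named in step 5.
APPROVAL_LABEL = "parity:approve"


def missing_setup_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


def writeback_requested(pr_labels):
    """Return True if the PR labels include the approval label."""
    return APPROVAL_LABEL in pr_labels


if __name__ == "__main__":
    missing = missing_setup_paths()
    print("missing:", missing if missing else "nothing")
    print("writeback requested:", writeback_requested(["bug", "parity:approve"]))
```

The label match is an exact string comparison, which fits the description of parity:approve as a fixed label.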

Docs

License

MIT
