Automated quality assurance for AI applications
pixie-qa
An agent skill that makes your coding agent the QA engineer for LLM applications.
What the Skill Does
The qa-eval skill guides your coding agent through the full eval-based QA loop for LLM applications:
- Understand the code: read the codebase, trace the data flow, learn what the code is supposed to do
- Instrument it: add `enable_storage()` and `@observe` so every run is captured to a local SQLite database
- Build a dataset: save representative traces as test cases with `pixie dataset save`
- Write eval tests: generate `test_*.py` files with `assert_dataset_pass` and appropriate evaluators
- Validate datasets: `pixie dataset validate [dir_or_dataset_path]` to catch schema/config errors early
- Run the tests: `pixie test` to run all evals and report per-case scores
- Analyse results: `pixie analyze <test_id>` to get LLM-generated analysis of test results
- Investigate failures: look up the stored trace for each failure, diagnose, fix, repeat
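To make the capture-then-score idea concrete, here is a minimal, self-contained sketch using only the Python standard library. It does not use the pixie package: `make_observe`, `load_traces`, and `exact_match` are hypothetical stand-ins written for illustration, and the table layout is an assumption, not pixie's actual schema or API.

```python
import functools
import json
import sqlite3


def make_observe(conn):
    """Build a decorator that records each call in SQLite.

    Hypothetical stand-in for illustration; NOT pixie's real @observe.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS traces (name TEXT, input TEXT, output TEXT)"
    )

    def observe(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            # Store inputs and output as JSON so they can be replayed as test cases.
            conn.execute(
                "INSERT INTO traces VALUES (?, ?, ?)",
                (
                    fn.__name__,
                    json.dumps({"args": args, "kwargs": kwargs}),
                    json.dumps(result),
                ),
            )
            conn.commit()
            return result

        return wrapper

    return observe


def load_traces(conn, name):
    """Return (input, output) pairs for one function, in insertion order."""
    rows = conn.execute(
        "SELECT input, output FROM traces WHERE name = ? ORDER BY rowid", (name,)
    ).fetchall()
    return [(json.loads(i), json.loads(o)) for i, o in rows]


def exact_match(output, expected):
    """Toy evaluator: 1.0 on an exact match, 0.0 otherwise."""
    return 1.0 if output == expected else 0.0


conn = sqlite3.connect(":memory:")  # a real setup would use a file on disk
observe = make_observe(conn)


@observe
def answer(question):
    # Placeholder for a real LLM call.
    return question.strip().upper()


answer("hello")
answer(" pixie ")

# Treat the captured runs as a dataset and score each case.
traces = load_traces(conn, "answer")
expected = ["HELLO", "PIXIE"]
scores = [exact_match(out, exp) for (_, out), exp in zip(traces, expected)]
print(scores)
```

In the real workflow the skill generates the instrumentation and tests for you; the sketch only shows why stored traces are useful as test cases.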
Getting Started
1. Add the skill to your coding agent
npx skills add yiouli/pixie-qa
The skill installs the accompanying Python package automatically when it is used.
2. Ask your coding agent to set up evals
While working on a Python-based AI project, open a conversation with your coding agent and say something like:
"setup QA for my agent"
Your coding agent will read your code, instrument it, build a dataset from a few real runs, write and run eval-based tests, investigate failures, and fix them.
Python Package
The pixie-qa Python package (imported as pixie) is what Claude installs and uses inside your project. For the package API and CLI reference, see docs/package.md.
Web UI
View all eval artifacts (results, markdown docs, datasets, and legacy scorecards) in a live-updating local web UI:
pixie start # initializes pixie_qa/ (if needed) and opens http://localhost:7118
pixie start my_dir # use a custom artifact root
pixie init # scaffolds pixie_qa/ without starting the server
The web UI provides tabbed navigation for results, scorecards (legacy), datasets, and markdown files. Changes to artifacts are pushed to the browser in real time via SSE.
The server writes a server.lock file to the artifact root directory on startup (containing the port number) and removes it on shutdown, allowing other processes to discover whether the server is already running.
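As a rough sketch of how another process might use the lock file, the snippet below reads server.lock and probes the recorded port. It assumes the file holds only the port number as plain text, which the description above suggests but does not spell out. `read_lock_port` and `server_is_running` are hypothetical helper names, not part of the pixie package.

```python
import socket
from pathlib import Path
from typing import Optional


def read_lock_port(artifact_root: Path) -> Optional[int]:
    """Return the port recorded in server.lock, or None if missing or unparseable.

    Assumes the file contains just the port number as text.
    """
    try:
        port = int((artifact_root / "server.lock").read_text().strip())
    except (OSError, ValueError):
        return None
    return port if 0 < port < 65536 else None


def server_is_running(
    artifact_root: Path, host: str = "127.0.0.1", timeout: float = 0.5
) -> bool:
    """Check the lock file, then confirm something is listening on that port."""
    port = read_lock_port(artifact_root)
    if port is None:
        return False
    # A crash could leave a stale lock file behind, so probe the port as well.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# pixie_qa/ is the default artifact root mentioned above.
print(server_is_running(Path("pixie_qa")))
```

Probing the port in addition to reading the file is a design choice of this sketch: it guards against a lock file that was not cleaned up after an abnormal exit.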
Configuration
Pixie reads configuration from environment variables and a local .env file through a single central config layer. Existing process env vars win over .env values.
Useful settings include:
- `PIXIE_ROOT` to move all generated artefacts under a different root directory
- `PIXIE_RATE_LIMIT_ENABLED=true` to enable evaluator throttling for `pixie test`
- `PIXIE_RATE_LIMIT_RPS`, `PIXIE_RATE_LIMIT_RPM`, `PIXIE_RATE_LIMIT_TPS`, and `PIXIE_RATE_LIMIT_TPM` to tune request and token throughput for LLM-as-judge evaluators
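The precedence rule (process environment wins over .env) can be illustrated with a small standalone sketch. This is not pixie's config layer: `parse_dotenv` and `resolve_config` are hypothetical helpers, the .env parsing is deliberately simplistic, filtering on the `PIXIE_` prefix is an assumption of the sketch, and the values shown are illustrative only.

```python
import os
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(
    dotenv_text: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Merge .env values with process env vars; process env vars win."""
    environ = os.environ if environ is None else environ
    merged = parse_dotenv(dotenv_text)
    # Overlay only PIXIE_* variables (an assumption made for this sketch).
    merged.update({k: v for k, v in environ.items() if k.startswith("PIXIE_")})
    return merged


# Illustrative values only; they are not defaults taken from pixie.
dotenv_text = """
# local settings
PIXIE_ROOT=artifacts
PIXIE_RATE_LIMIT_ENABLED=true
PIXIE_RATE_LIMIT_RPS=2
"""

# Simulate a process environment that overrides PIXIE_ROOT.
config = resolve_config(dotenv_text, environ={"PIXIE_ROOT": "/tmp/pixie_ci"})
print(config)
```

Here `PIXIE_ROOT` resolves to the process value while the rate-limit settings fall through from the .env text, which mirrors the precedence described above.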