Automated quality assurance for AI applications

pixie-qa

An agent skill that makes coding agents the QA engineer for LLM applications.

What the Skill Does

The qa-eval skill guides your coding agent through the full eval-based QA loop for LLM applications:

  1. Understand the code: read the codebase, trace the data flow, learn what the code is supposed to do
  2. Instrument it: use wrap() for data-object tracing and OpenInference auto-instrumentation for LLM span capture
  3. Build a dataset: create JSON datasets of representative inputs and expected outputs
  4. Write eval tests: generate test_*.py files with appropriate evaluators
  5. Run the tests: pixie test to run all evals and report per-case scores
  6. Analyse results: pixie analyze <test_id> to get LLM-generated analysis of test results
  7. Investigate failures: diagnose failures, fix, repeat
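
To make steps 3 to 5 concrete, here is a minimal, library-agnostic sketch of an eval loop: a JSON dataset of inputs and expected outputs, a toy evaluator, and a per-case score report. It deliberately does not use the pixie API. The dataset fields, `exact_match`, `run_evals`, and `fake_app` are all hypothetical names chosen for illustration, and the files the skill generates may look quite different.

```python
import json
from typing import Callable

# Hypothetical dataset: representative inputs paired with expected outputs.
DATASET_JSON = """
[
  {"input": "2 + 2", "expected": "4"},
  {"input": "capital of France", "expected": "Paris"}
]
"""


def exact_match(output: str, expected: str) -> float:
    """Toy evaluator: 1.0 when the normalized strings agree, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_evals(
    app: Callable[[str], str],
    cases: list[dict],
    evaluator: Callable[[str, str], float] = exact_match,
) -> list[dict]:
    """Run every case through the app and record a per-case score."""
    results = []
    for case in cases:
        output = app(case["input"])
        results.append(
            {
                "input": case["input"],
                "output": output,
                "score": evaluator(output, case["expected"]),
            }
        )
    return results


def fake_app(prompt: str) -> str:
    """Stand-in for the LLM application under test (one answer is wrong on purpose)."""
    return {"2 + 2": "4", "capital of France": "London"}.get(prompt, "")


results = run_evals(fake_app, json.loads(DATASET_JSON))
for r in results:
    print(f"{r['input']!r}: score={r['score']}")
```

The failing case is the kind of result that step 7 would then investigate.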

Getting Started

1. Add the skill to your coding agent

npx skills add yiouli/pixie-qa

The accompanying Python package is installed automatically by the skill when it is used.

2. Ask your coding agent to set up evals

While developing a Python-based AI project, open a conversation with your coding agent and say something like:

"set up QA for my agent"

Your coding agent will read your code, instrument it, build a dataset from a few real runs, write and run eval-based tests, investigate failures, and fix them.

Python Package

The pixie-qa Python package (imported as pixie) is what Claude installs and uses inside your project. API docs are auto-generated by pdoc3 into docs/pixie/index.md via pre-commit. The markdown renderer uses scripts/pdoc_templates/text.mako so async functions and methods are explicitly shown as async def in signatures.

Install hooks once per clone:

uv run pre-commit install

Web UI

View all eval artifacts (results, markdown docs, datasets, and legacy scorecards) in a live-updating local web UI:

pixie start              # initializes pixie_qa/ (if needed) and opens http://localhost:7118
pixie start my_dir       # use a custom artifact root
pixie init               # scaffolds pixie_qa/ without starting the server

The web UI provides tabbed navigation for results, scorecards (legacy), datasets, and markdown files. Changes to artifacts are pushed to the browser in real time via SSE.

The server writes a server.lock file to the artifact root directory on startup (containing the port number) and removes it on shutdown, allowing other processes to discover whether the server is already running.
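
As an illustration of how another process might use the lock file, the sketch below looks for server.lock under the artifact root and parses the port from it. It assumes the file holds nothing but the port number as plain text (the exact file format is not specified here), and `read_server_port` is a hypothetical helper, not part of the pixie package.

```python
from pathlib import Path
from typing import Optional


def read_server_port(artifact_root: str = "pixie_qa") -> Optional[int]:
    """Return the port recorded in server.lock, or None if no usable lock exists.

    Assumption: the lock file contains only the port number as text.
    """
    lock_file = Path(artifact_root) / "server.lock"
    try:
        text = lock_file.read_text(encoding="utf-8").strip()
    except FileNotFoundError:
        return None  # no lock file: the server is presumably not running
    try:
        return int(text)
    except ValueError:
        return None  # unexpected contents: treat as "unknown"


port = read_server_port()
if port is None:
    print("no running server found")
else:
    print(f"server appears to be running at http://localhost:{port}")
```

A lock file can be left behind if a process is killed before it can clean up, so a careful caller would probably also check that the port actually responds.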

Configuration

Pixie reads configuration from environment variables and a local .env file through a single central config layer. Existing process env vars win over .env values.
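
The precedence rule can be pictured with a small sketch: values parsed from .env form the base, and anything already set in the process environment overrides them. This is only an illustration of the rule under simplifying assumptions (a bare KEY=VALUE parser, a PIXIE_ prefix filter); `parse_dotenv` and `resolve_config` are hypothetical names, and pixie's actual config layer may parse and merge differently.

```python
import os
from typing import Mapping


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(
    dotenv_text: str, environ: Mapping[str, str] = os.environ
) -> dict[str, str]:
    """Merge .env values with process env vars; process env vars win."""
    merged = parse_dotenv(dotenv_text)
    merged.update({k: v for k, v in environ.items() if k.startswith("PIXIE_")})
    return merged


dotenv_text = "PIXIE_ROOT=from_dotenv\nPIXIE_RATE_LIMIT_ENABLED=true\n"
config = resolve_config(dotenv_text, {"PIXIE_ROOT": "from_env"})
print(config)
```

Here PIXIE_ROOT resolves to the process value, while PIXIE_RATE_LIMIT_ENABLED, which is set only in .env, is taken from the file.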

Useful settings include:

  • PIXIE_ROOT to move all generated artifacts under a different root directory
  • PIXIE_RATE_LIMIT_ENABLED=true to enable evaluator throttling for pixie test
  • PIXIE_RATE_LIMIT_RPS, PIXIE_RATE_LIMIT_RPM, PIXIE_RATE_LIMIT_TPS, and PIXIE_RATE_LIMIT_TPM to tune request and token throughput for LLM-as-judge evaluators
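
To show how such settings could interact, the sketch below reads the rate-limit variables and derives a minimum delay between requests that respects both request caps. It rests on assumptions: that RPS/RPM mean requests per second/minute and TPS/TPM mean tokens per second/minute, that "true" is the enabling value, and that even spacing is an acceptable approximation. `RateLimitSettings`, `load_rate_limits`, and `min_request_interval` are hypothetical names, the numbers are illustrative, and pixie's own throttling may work differently.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitSettings:
    enabled: bool
    rps: Optional[float]  # assumed: requests per second
    rpm: Optional[float]  # assumed: requests per minute
    tps: Optional[float]  # assumed: tokens per second
    tpm: Optional[float]  # assumed: tokens per minute


def _num(env: Mapping[str, str], key: str) -> Optional[float]:
    """Read an optional numeric setting; unset or empty means no limit."""
    raw = env.get(key)
    return float(raw) if raw else None


def load_rate_limits(env: Mapping[str, str]) -> RateLimitSettings:
    return RateLimitSettings(
        enabled=env.get("PIXIE_RATE_LIMIT_ENABLED", "").strip().lower() == "true",
        rps=_num(env, "PIXIE_RATE_LIMIT_RPS"),
        rpm=_num(env, "PIXIE_RATE_LIMIT_RPM"),
        tps=_num(env, "PIXIE_RATE_LIMIT_TPS"),
        tpm=_num(env, "PIXIE_RATE_LIMIT_TPM"),
    )


def min_request_interval(settings: RateLimitSettings) -> float:
    """Seconds between evenly spaced requests under the stricter request cap."""
    if not settings.enabled:
        return 0.0
    per_second = [
        rate
        for rate in (settings.rps, settings.rpm / 60 if settings.rpm else None)
        if rate
    ]
    return 1.0 / min(per_second) if per_second else 0.0


# Illustrative values only.
settings = load_rate_limits(
    {
        "PIXIE_RATE_LIMIT_ENABLED": "true",
        "PIXIE_RATE_LIMIT_RPS": "4",
        "PIXIE_RATE_LIMIT_RPM": "120",
    }
)
interval = min_request_interval(settings)
print(f"wait at least {interval} s between evaluator requests")
```

With these values the per-minute cap is the stricter one (120 per minute is 2 per second), so the spacing is driven by it rather than by the per-second cap.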
