TestThread 🧵

pytest for AI agents.

The open-source testing framework that tells you if your AI agent is actually working — or quietly breaking.



The Problem

You build an AI agent. It works in testing. You ship it.

Then it starts hallucinating. Returning wrong formats. Calling the wrong tools. Breaking your pipeline.

You find out when something downstream crashes — not before.

TestThread fixes that.


What TestThread Does

Define what your agent should do. TestThread runs it, checks the output, and tells you exactly what passed and what failed.

  • ✅ Define test suites per agent
  • ✅ Add test cases with expected outputs
  • ✅ Run suites against your live agent endpoint
  • ✅ Get pass/fail results with reasons
  • ✅ Track pass rate over time
  • ✅ Catch regressions before they hit production

Quick Start

pip install requests

import requests

BASE = "https://test-thread-production.up.railway.app"

# Create a test suite
suite = requests.post(f"{BASE}/suites", json={
    "name": "My Agent Tests",
    "description": "Testing my AI agent",
    "agent_endpoint": "https://your-agent.com/run"
}).json()

# Add a test case
requests.post(f"{BASE}/suites/{suite['id']}/cases", json={
    "name": "Basic response check",
    "input": "What is 2 + 2?",
    "expected_output": "4",
    "match_type": "contains"
})

# Run the suite
result = requests.post(f"{BASE}/suites/{suite['id']}/run").json()
print(f"Passed: {result['passed']} | Failed: {result['failed']}")
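The `passed` and `failed` fields printed above are enough to compute a pass rate and compare two runs. The following is a minimal sketch, assuming both fields are integer counts; `pass_rate` and `regressed` are hypothetical helper names for illustration, not part of TestThread.

```python
def pass_rate(result: dict) -> float:
    """Fraction of cases that passed in one run result.

    Assumes `result` carries integer `passed` and `failed` counts,
    as suggested by the Quick Start output (an assumption).
    """
    total = result["passed"] + result["failed"]
    # An empty run has no meaningful rate; report 0.0 rather than divide by zero.
    return result["passed"] / total if total else 0.0


def regressed(previous: dict, current: dict) -> bool:
    """True when the current run's pass rate is lower than the previous run's."""
    return pass_rate(current) < pass_rate(previous)
```

A CI step could call `regressed(last_result, result)` and fail the build when it returns True.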

Match Types

Type      Description
contains  Output contains the expected string
exact     Output matches exactly
regex     Output matches a regex pattern
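To make the three match types concrete, here is a minimal sketch of how such a check could be written locally. It is not TestThread's actual implementation. In particular, treating `regex` as a search anywhere in the output (rather than a full match) is an assumption, and `output_matches` is a hypothetical name.

```python
import re


def output_matches(output: str, expected: str, match_type: str) -> bool:
    """Compare an agent's output to the expected value for one match type."""
    if match_type == "contains":
        return expected in output
    if match_type == "exact":
        return output == expected
    if match_type == "regex":
        # Assumption: the pattern may match anywhere in the output.
        return re.search(expected, output) is not None
    raise ValueError(f"unknown match_type: {match_type!r}")
```

With the Quick Start case, an agent reply of "The answer is 4." would pass under `contains` but fail under `exact`, which is why `contains` or `regex` tends to suit free-form agent output better.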

Live Dashboard

View your test results visually at test-thread.lovable.app


API Reference

Full docs at test-thread-production.up.railway.app/docs

Method  Endpoint             Description
GET     /                    Health check
POST    /suites              Create test suite
GET     /suites              List all suites
POST    /suites/{id}/cases   Add test case
GET     /suites/{id}/cases   List test cases
POST    /suites/{id}/run     Run suite
GET     /runs                List all runs
GET     /runs/{id}           Get run details
GET     /dashboard/stats     Dashboard stats
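The GET endpoints above can all be read with one small helper. This is a sketch using only the standard library. `get_json` is a hypothetical name, and the shape of each JSON response is not documented here, so the helper simply returns whatever the server sends.

```python
import json
from urllib.request import urlopen

BASE = "https://test-thread-production.up.railway.app"


def get_json(path: str, base: str = BASE, opener=urlopen):
    """GET `base + path` and decode the JSON body.

    `opener` is injectable so the helper can be exercised without a network;
    by default it is urllib's `urlopen`.
    """
    with opener(f"{base}{path}") as resp:
        return json.load(resp)
```

For example, `get_json("/runs")` would fetch the run list and `get_json("/dashboard/stats")` the dashboard stats; pass a different `base` to point at a self-hosted instance.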

Part of the Thread Suite

TestThread is part of a suite of open-source reliability tools for AI agents.

Tool                        What it does
Iron-Thread                 Validates AI output structure before it hits your database
TestThread                  Tests whether your agent behaves correctly across runs
PromptThread (coming soon)  Versions and tracks prompt performance over time

Self-Host

git clone https://github.com/eugene001dayne/test-thread.git
cd test-thread
pip install -r requirements.txt
uvicorn main:app --reload

License

Apache 2.0 — free to use, modify, and distribute.


Built for developers who ship AI agents and need to know they work.
