pytest for AI agents
TestThread 🧵
pytest for AI agents.
The open-source testing framework that tells you if your AI agent is actually working — or quietly breaking.
The Problem
You build an AI agent. It works in testing. You ship it.
Then it starts hallucinating. Returning wrong formats. Calling the wrong tools. Breaking your pipeline.
You find out when something downstream crashes — not before.
TestThread fixes that.
What TestThread Does
Define what your agent should do. TestThread runs it, checks the output, and tells you exactly what passed and what failed.
- ✅ Define test suites per agent
- ✅ Add test cases with expected outputs
- ✅ Run suites against your live agent endpoint
- ✅ Get pass/fail results with reasons
- ✅ Track pass rate over time
- ✅ Catch regressions before they hit production
Quick Start
pip install requests

import requests

BASE = "https://test-thread-production.up.railway.app"

# Create a test suite
suite = requests.post(f"{BASE}/suites", json={
    "name": "My Agent Tests",
    "description": "Testing my AI agent",
    "agent_endpoint": "https://your-agent.com/run"
}).json()

# Add a test case
requests.post(f"{BASE}/suites/{suite['id']}/cases", json={
    "name": "Basic response check",
    "input": "What is 2 + 2?",
    "expected_output": "4",
    "match_type": "contains"
})

# Run the suite
result = requests.post(f"{BASE}/suites/{suite['id']}/run").json()
print(f"Passed: {result['passed']} | Failed: {result['failed']}")
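The `passed` and `failed` counts printed by the Quick Start can be turned into a pass rate, and into an exit code so a CI job fails when any case fails. This is a minimal sketch, assuming the run response carries integer `passed` and `failed` fields as used above. `pass_rate` and `exit_code_for` are hypothetical helper names, not part of TestThread.

```python
def pass_rate(result: dict) -> float:
    """Return the fraction of passed cases in a run result (0.0 for an empty run)."""
    passed = int(result.get("passed", 0))
    failed = int(result.get("failed", 0))
    total = passed + failed
    return passed / total if total else 0.0


def exit_code_for(result: dict) -> int:
    """Return 0 when no case failed and 1 otherwise, for use with sys.exit()."""
    return 0 if int(result.get("failed", 0)) == 0 else 1


if __name__ == "__main__":
    example = {"passed": 3, "failed": 1}  # illustrative data, not real TestThread output
    print(f"Pass rate: {pass_rate(example):.0%}")
```

In a CI script, `sys.exit(exit_code_for(result))` after the run call would stop the pipeline on the first failing suite.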
Match Types
| Type | Description |
|---|---|
| contains | Output contains the expected string |
| exact | Output matches exactly |
| regex | Output matches a regex pattern |
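To preview how a test case might behave before adding it, the three match types can be approximated locally. This is only a sketch of the semantics described in the table, not TestThread's actual implementation. In particular, it assumes that `regex` may match anywhere in the output and that comparisons are case-sensitive. `output_matches` is a hypothetical name.

```python
import re


def output_matches(output: str, expected: str, match_type: str) -> bool:
    """Check an agent output against an expected value (assumed semantics)."""
    if match_type == "contains":
        return expected in output
    if match_type == "exact":
        return output == expected
    if match_type == "regex":
        # Assumption: the pattern may match anywhere in the output.
        return re.search(expected, output) is not None
    raise ValueError(f"Unknown match_type: {match_type!r}")


if __name__ == "__main__":
    answer = "The answer is 4."
    for kind, expected in [("contains", "4"), ("exact", "4"), ("regex", r"\b4\b")]:
        print(kind, output_matches(answer, expected, kind))
```

With the sample answer above, `contains` and `regex` succeed while `exact` does not, which is why `contains` is usually the more forgiving choice for free-text agent replies.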
Live Dashboard
View your test results visually at test-thread.lovable.app
API Reference
Full docs at test-thread-production.up.railway.app/docs
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Health check |
| POST | /suites | Create test suite |
| GET | /suites | List all suites |
| POST | /suites/{id}/cases | Add test case |
| GET | /suites/{id}/cases | List test cases |
| POST | /suites/{id}/run | Run suite |
| GET | /runs | List all runs |
| GET | /runs/{id} | Get run details |
| GET | /dashboard/stats | Dashboard stats |
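The endpoints in the table can be wrapped in a small client so that calling code does not repeat URLs. The class below is a hypothetical sketch, not a published TestThread SDK. The paths and request fields come from the table and the Quick Start, while the response shapes are not documented here and are returned as parsed JSON without interpretation. `requests` is imported only when no session is supplied, so the class can be exercised with a stand-in session.

```python
BASE = "https://test-thread-production.up.railway.app"


class ThreadClient:
    """Hypothetical thin wrapper over the TestThread HTTP endpoints."""

    def __init__(self, base_url: str = BASE, session=None, timeout: float = 30.0):
        self.base_url = base_url.rstrip("/")
        if session is None:
            import requests  # imported lazily; only needed for real HTTP calls
            session = requests.Session()
        # Any object with a requests-style .request() method works here.
        self.session = session
        self.timeout = timeout

    def _call(self, method: str, path: str, **kwargs):
        response = self.session.request(
            method, f"{self.base_url}{path}", timeout=self.timeout, **kwargs
        )
        response.raise_for_status()  # surface HTTP errors instead of parsing them
        return response.json()

    def health(self):
        return self._call("GET", "/")

    def create_suite(self, name: str, description: str, agent_endpoint: str):
        payload = {"name": name, "description": description,
                   "agent_endpoint": agent_endpoint}
        return self._call("POST", "/suites", json=payload)

    def list_suites(self):
        return self._call("GET", "/suites")

    def add_case(self, suite_id, name: str, input_text: str,
                 expected_output: str, match_type: str = "contains"):
        payload = {"name": name, "input": input_text,
                   "expected_output": expected_output, "match_type": match_type}
        return self._call("POST", f"/suites/{suite_id}/cases", json=payload)

    def list_cases(self, suite_id):
        return self._call("GET", f"/suites/{suite_id}/cases")

    def run_suite(self, suite_id):
        return self._call("POST", f"/suites/{suite_id}/run")

    def list_runs(self):
        return self._call("GET", "/runs")

    def get_run(self, run_id):
        return self._call("GET", f"/runs/{run_id}")

    def dashboard_stats(self):
        return self._call("GET", "/dashboard/stats")
```

Passing the session in rather than creating it inside each method keeps connection reuse in one place and lets tests substitute a fake that records calls.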
Part of the Thread Suite
TestThread is part of a suite of open-source reliability tools for AI agents.
| Tool | What it does |
|---|---|
| Iron-Thread | Validates AI output structure before it hits your database |
| TestThread | Tests whether your agent behaves correctly across runs |
| PromptThread (coming soon) | Versions and tracks prompt performance over time |
Self-Host
git clone https://github.com/eugene001dayne/test-thread.git
cd test-thread
pip install -r requirements.txt
uvicorn main:app --reload
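Once a self-hosted instance is running, client scripts need to point at it instead of the hosted API. A minimal sketch follows, assuming the server was started with the command above and no extra flags, so uvicorn listens on its default `127.0.0.1:8000`. The environment variable name `TESTTHREAD_BASE_URL` and the function `resolve_base_url` are hypothetical choices for illustration.

```python
import os

HOSTED_BASE = "https://test-thread-production.up.railway.app"
LOCAL_BASE = "http://127.0.0.1:8000"  # uvicorn's default host and port


def resolve_base_url(environ=os.environ) -> str:
    """Pick the API base URL, falling back to the hosted instance.

    TESTTHREAD_BASE_URL is a hypothetical variable name, not one TestThread defines.
    """
    return environ.get("TESTTHREAD_BASE_URL", HOSTED_BASE).rstrip("/")


if __name__ == "__main__":
    print(resolve_base_url())
```

Setting `TESTTHREAD_BASE_URL=http://127.0.0.1:8000` before running the Quick Start script, and using `BASE = resolve_base_url()`, would send the same requests to the local instance.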
License
Apache 2.0 — free to use, modify, and distribute.
Built for developers who ship AI agents and need to know they work.
File details
Details for the file testthread-0.4.0.tar.gz.
File metadata
- Download URL: testthread-0.4.0.tar.gz
- Upload date:
- Size: 3.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4422a2ece37379a45340adbf79fc1accb9d826219fd3c963058dabd014397b81 |
| MD5 | 28eef852dc8a3c021900e7bf41432994 |
| BLAKE2b-256 | f6253b972a070a29b9c034660433e678bf50966d5c315b2a62fc07ca3c1ad539 |
File details
Details for the file testthread-0.4.0-py3-none-any.whl.
File metadata
- Download URL: testthread-0.4.0-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43eb3cd1f5a72c527b43efc1aeb93df1f75c127c4a3fcf09d655ce472dc3771d |
| MD5 | 978349be5e1f9879dc7fbce2eb776a5e |
| BLAKE2b-256 | 6394c9654f9ea1d75f3efd663d99c5cea9b14c1f22cfdf68042467f3b421425b |