Skip to main content

pytest for AI agents

Project description

TestThread 🧵

pytest for AI agents.

The open-source testing framework that tells you if your AI agent is actually working — or quietly breaking.

License API Dashboard


The Problem

You build an AI agent. It works in testing. You ship it.

Then it starts hallucinating. Returning wrong formats. Calling the wrong tools. Breaking your pipeline.

You find out when something downstream crashes — not before.

TestThread fixes that.


What TestThread Does

Define what your agent should do. TestThread runs it, checks the output, and tells you exactly what passed and what failed.

  • ✅ Define test suites per agent
  • ✅ Add test cases with expected outputs
  • ✅ Run suites against your live agent endpoint
  • ✅ Get pass/fail results with reasons
  • ✅ Track pass rate over time
  • ✅ Catch regressions before they hit production

Quick Start

pip install requests
import requests

BASE = "https://test-thread-production.up.railway.app"

# Create a test suite
suite = requests.post(f"{BASE}/suites", json={
    "name": "My Agent Tests",
    "description": "Testing my AI agent",
    "agent_endpoint": "https://your-agent.com/run"
}).json()

# Add a test case
requests.post(f"{BASE}/suites/{suite['id']}/cases", json={
    "name": "Basic response check",
    "input": "What is 2 + 2?",
    "expected_output": "4",
    "match_type": "contains"
})

# Run the suite
result = requests.post(f"{BASE}/suites/{suite['id']}/run").json()
print(f"Passed: {result['passed']} | Failed: {result['failed']}")

Match Types

Type Description
contains Output contains the expected string
exact Output matches exactly
regex Output matches a regex pattern

Live Dashboard

View your test results visually at test-thread.lovable.app


API Reference

Full docs at test-thread-production.up.railway.app/docs

Method Endpoint Description
GET / Health check
POST /suites Create test suite
GET /suites List all suites
POST /suites/{id}/cases Add test case
GET /suites/{id}/cases List test cases
POST /suites/{id}/run Run suite
GET /runs List all runs
GET /runs/{id} Get run details
GET /dashboard/stats Dashboard stats

Part of the Thread Suite

TestThread is part of a suite of open-source reliability tools for AI agents.

Tool What it does
Iron-Thread Validates AI output structure before it hits your database
TestThread Tests whether your agent behaves correctly across runs
PromptThread (coming soon) Versions and tracks prompt performance over time

Self-Host

git clone https://github.com/eugene001dayne/test-thread.git
cd test-thread
pip install -r requirements.txt
uvicorn main:app --reload

License

Apache 2.0 — free to use, modify, and distribute.


Built for developers who ship AI agents and need to know they work.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

testthread-0.6.0.tar.gz (3.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

testthread-0.6.0-py3-none-any.whl (4.0 kB view details)

Uploaded Python 3

File details

Details for the file testthread-0.6.0.tar.gz.

File metadata

  • Download URL: testthread-0.6.0.tar.gz
  • Upload date:
  • Size: 3.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for testthread-0.6.0.tar.gz
Algorithm Hash digest
SHA256 9858bb083fdc805724fb6e8c805a0db44e863466fd2cfcbe8904872c1752e2e8
MD5 1eb2dab926b04f2053c5ef98298308f0
BLAKE2b-256 d0d0b39a35cc833f7092c023e9ac14a30c6bd6b4ec3d035b5a830b92691050a8

See more details on using hashes here.

File details

Details for the file testthread-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: testthread-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 4.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for testthread-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2d251d8b60f5973d0ad7a5ca555106eebac01f02086916cf75beba4281f4f53d
MD5 a8569af5e158dbc08dcf3d35b296d6ec
BLAKE2b-256 8c729b28701194f750a318255408f5975872c8fe3239fd9a73d7cf77e43784fe

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page