TestThread 🧵

pytest for AI agents.

The open-source testing framework that tells you if your AI agent is actually working — or quietly breaking.


The Problem

You build an AI agent. It works in testing. You ship it.

Then it starts hallucinating. Returning wrong formats. Calling the wrong tools. Breaking your pipeline.

You find out when something downstream crashes — not before.

TestThread fixes that.


What TestThread Does

Define what your agent should do. TestThread runs it, checks the output, and tells you exactly what passed and what failed.

  • ✅ Define test suites per agent
  • ✅ Add test cases with expected outputs
  • ✅ Run suites against your live agent endpoint
  • ✅ Get pass/fail results with reasons
  • ✅ Track pass rate over time
  • ✅ Catch regressions before they hit production

Quick Start

pip install requests

import requests

BASE = "https://test-thread-production.up.railway.app"

# Create a test suite
suite = requests.post(f"{BASE}/suites", json={
    "name": "My Agent Tests",
    "description": "Testing my AI agent",
    "agent_endpoint": "https://your-agent.com/run"
}).json()

# Add a test case
requests.post(f"{BASE}/suites/{suite['id']}/cases", json={
    "name": "Basic response check",
    "input": "What is 2 + 2?",
    "expected_output": "4",
    "match_type": "contains"
})

# Run the suite
result = requests.post(f"{BASE}/suites/{suite['id']}/run").json()
print(f"Passed: {result['passed']} | Failed: {result['failed']}")
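
The `passed` and `failed` counts printed above are enough to gate a CI job. The following is a minimal sketch, assuming the run response carries those two fields as integers, as the Quick Start suggests. `pass_rate` and `gate` are hypothetical helper names and are not part of TestThread.

```python
def pass_rate(result: dict) -> float:
    """Return the fraction of passed cases, or 0.0 when the run had no cases."""
    total = result["passed"] + result["failed"]
    return result["passed"] / total if total else 0.0


def gate(result: dict, min_rate: float = 1.0) -> bool:
    """Return True when the run meets a caller-chosen pass-rate threshold.

    The threshold is an assumption of this sketch; TestThread itself
    only reports the counts.
    """
    return pass_rate(result) >= min_rate
```

In a CI script, one might call `sys.exit(0 if gate(result) else 1)` right after running the suite, so that any failed case fails the build.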

Match Types

Type      Description
contains  Output contains the expected string
exact     Output matches exactly
regex     Output matches a regex pattern
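
To make the three match types concrete, here is a local sketch of how they could be evaluated. It is an illustration only: the server's real rules (case sensitivity, whitespace trimming, whether `regex` must match the whole output or any part of it) are not documented here, so this sketch assumes case-sensitive comparison and `re.search` semantics. The function name `matches` is hypothetical.

```python
import re


def matches(output: str, expected: str, match_type: str) -> bool:
    """Check an agent output against an expectation (assumed semantics)."""
    if match_type == "contains":
        # Substring test, case-sensitive by assumption.
        return expected in output
    if match_type == "exact":
        # Full string equality, no trimming by assumption.
        return output == expected
    if match_type == "regex":
        # Pattern may match anywhere in the output (re.search), by assumption.
        return re.search(expected, output) is not None
    raise ValueError(f"unknown match_type: {match_type!r}")
```

For example, with the Quick Start case, an agent reply of "The answer is 4." satisfies `contains` for "4" but not `exact`.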

Live Dashboard

View your test results visually at test-thread.lovable.app


API Reference

Full docs at test-thread-production.up.railway.app/docs

Method  Endpoint             Description
GET     /                    Health check
POST    /suites              Create test suite
GET     /suites              List all suites
POST    /suites/{id}/cases   Add test case
GET     /suites/{id}/cases   List test cases
POST    /suites/{id}/run     Run suite
GET     /runs                List all runs
GET     /runs/{id}           Get run details
GET     /dashboard/stats     Dashboard stats
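
The endpoint table can be turned into a small lookup so that scripts do not hand-build URLs. This is a sketch rather than an official client: the methods and paths come from the table above, while the short names (`create_suite`, `run_suite`, and so on) and the `build_request` helper are hypothetical.

```python
from urllib.parse import quote

BASE = "https://test-thread-production.up.railway.app"

# Hypothetical short names mapped to (method, path template) from the table.
ENDPOINTS = {
    "health": ("GET", "/"),
    "create_suite": ("POST", "/suites"),
    "list_suites": ("GET", "/suites"),
    "add_case": ("POST", "/suites/{id}/cases"),
    "list_cases": ("GET", "/suites/{id}/cases"),
    "run_suite": ("POST", "/suites/{id}/run"),
    "list_runs": ("GET", "/runs"),
    "get_run": ("GET", "/runs/{id}"),
    "dashboard_stats": ("GET", "/dashboard/stats"),
}


def build_request(name: str, base: str = BASE, **path_params) -> tuple:
    """Return (method, url) for a named endpoint, URL-quoting path parameters."""
    method, template = ENDPOINTS[name]
    quoted = {key: quote(str(value), safe="") for key, value in path_params.items()}
    return method, base.rstrip("/") + template.format(**quoted)
```

With `requests` installed, a call such as `requests.request(*build_request("get_run", id=run_id))` would fetch one run; the shape of the JSON it returns is not described here, so check the full docs before relying on specific fields.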

Part of the Thread Suite

TestThread is part of a suite of open-source reliability tools for AI agents.

Tool                        What it does
Iron-Thread                 Validates AI output structure before it hits your database
TestThread                  Tests whether your agent behaves correctly across runs
PromptThread (coming soon)  Versions and tracks prompt performance over time

Self-Host

git clone https://github.com/eugene001dayne/test-thread.git
cd test-thread
pip install -r requirements.txt
uvicorn main:app --reload
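
Once the server is running locally, the Quick Start code only needs a different `BASE`. Uvicorn listens on http://127.0.0.1:8000 unless told otherwise. The sketch below picks the base URL from an environment variable; the variable name `TESTTHREAD_BASE` and the helper `resolve_base` are assumptions made for illustration, not something TestThread defines.

```python
import os

HOSTED_BASE = "https://test-thread-production.up.railway.app"
LOCAL_BASE = "http://127.0.0.1:8000"  # uvicorn's default host and port


def resolve_base(env=None) -> str:
    """Return the API base URL, preferring the (hypothetical) TESTTHREAD_BASE variable."""
    env = os.environ if env is None else env
    # Strip any trailing slash so paths like "/suites" can be appended directly.
    return env.get("TESTTHREAD_BASE", HOSTED_BASE).rstrip("/")
```

Setting `TESTTHREAD_BASE=http://127.0.0.1:8000` before running a script would point it at the self-hosted instance, and a GET on `/` (the health check) is a quick way to confirm the server is up.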

License

Apache 2.0 — free to use, modify, and distribute.


Built for developers who ship AI agents and need to know they work.
