TestThread 🧵

pytest for AI agents.

The open-source testing framework that tells you if your AI agent is actually working — or quietly breaking.



The Problem

You build an AI agent. It works in testing. You ship it.

Then it starts hallucinating. Returning wrong formats. Calling the wrong tools. Breaking your pipeline.

You find out when something downstream crashes — not before.

TestThread fixes that.


What TestThread Does

Define what your agent should do. TestThread runs it, checks the output, and tells you exactly what passed and what failed.

  • ✅ Define test suites per agent
  • ✅ Add test cases with expected outputs
  • ✅ Run suites against your live agent endpoint
  • ✅ Get pass/fail results with reasons
  • ✅ Track pass rate over time
  • ✅ Catch regressions before they hit production

Quick Start

# Install the only dependency first (shell): pip install requests
import requests

BASE = "https://test-thread-production.up.railway.app"

# Create a test suite
suite = requests.post(f"{BASE}/suites", json={
    "name": "My Agent Tests",
    "description": "Testing my AI agent",
    "agent_endpoint": "https://your-agent.com/run"
}).json()

# Add a test case
requests.post(f"{BASE}/suites/{suite['id']}/cases", json={
    "name": "Basic response check",
    "input": "What is 2 + 2?",
    "expected_output": "4",
    "match_type": "contains"
})

# Run the suite
result = requests.post(f"{BASE}/suites/{suite['id']}/run").json()
print(f"Passed: {result['passed']} | Failed: {result['failed']}")
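The run result exposes `passed` and `failed` counts, which makes it easy to gate a CI job on them. The following is a minimal sketch, assuming both keys hold integer counts as the print line above suggests; the helper name `exit_code_for` is hypothetical and not part of TestThread.

```python
def exit_code_for(result: dict) -> int:
    """Map a run result to a process exit code: 0 for success, 1 for failure."""
    # Assumption: `passed` and `failed` are integer counts in the run response.
    passed = int(result.get("passed", 0))
    failed = int(result.get("failed", 0))
    # An empty run (nothing passed, nothing failed) is treated as a failure,
    # so a misconfigured suite does not silently pass CI.
    if failed > 0 or passed == 0:
        return 1
    return 0
```

In a CI script, calling `sys.exit(exit_code_for(result))` after the run would fail the job whenever any case fails.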

Match Types

| Type | Description |
|------|-------------|
| contains | Output contains the expected string |
| exact | Output matches exactly |
| regex | Output matches a regex pattern |
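To make the three match types concrete, here is a rough local sketch of how such checks could be evaluated. It is not TestThread's implementation: the function name `matches` is hypothetical, and details such as whether `exact` trims whitespace or whether `regex` must match the whole output are assumptions (this sketch does not trim, and uses a partial search).

```python
import re


def matches(output: str, expected: str, match_type: str) -> bool:
    """Return True when `output` satisfies `expected` under `match_type`."""
    if match_type == "contains":
        # Substring check: the expected text appears anywhere in the output.
        return expected in output
    if match_type == "exact":
        # Strict equality, with no whitespace trimming (an assumption).
        return output == expected
    if match_type == "regex":
        # `expected` is treated as a pattern; a match anywhere counts (an assumption).
        return re.search(expected, output) is not None
    raise ValueError(f"unknown match_type: {match_type!r}")
```

For an agent that answers "The answer is 4.", `contains` with expected output `4` would pass, while `exact` would fail because of the surrounding words.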

Live Dashboard

View your test results visually at test-thread.lovable.app


API Reference

Full docs at test-thread-production.up.railway.app/docs

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | / | Health check |
| POST | /suites | Create test suite |
| GET | /suites | List all suites |
| POST | /suites/{id}/cases | Add test case |
| GET | /suites/{id}/cases | List test cases |
| POST | /suites/{id}/run | Run suite |
| GET | /runs | List all runs |
| GET | /runs/{id} | Get run details |
| GET | /dashboard/stats | Dashboard stats |
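The run endpoints can feed a simple pass-rate calculation. The sketch below assumes that `GET /runs` returns a JSON list of run objects, each with the same `passed` and `failed` integer keys seen in the Quick Start result; the response shape is not documented here, so check it against the full API docs before relying on it. The helper name `pass_rate` is hypothetical.

```python
def pass_rate(runs: list[dict]) -> float:
    """Fraction of passed cases across all runs, or 0.0 when there is no data."""
    # Assumption: each run dict carries integer `passed` and `failed` counts.
    total_passed = sum(int(run.get("passed", 0)) for run in runs)
    total_failed = sum(int(run.get("failed", 0)) for run in runs)
    total = total_passed + total_failed
    return total_passed / total if total else 0.0
```

Given the assumed shape, `pass_rate(requests.get(f"{BASE}/runs").json())` would yield a single number to compare between deployments.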

Part of the Thread Suite

TestThread is part of a suite of open-source reliability tools for AI agents.

| Tool | What it does |
|------|--------------|
| Iron-Thread | Validates AI output structure before it hits your database |
| TestThread | Tests whether your agent behaves correctly across runs |
| PromptThread (coming soon) | Versions and tracks prompt performance over time |

Self-Host

git clone https://github.com/eugene001dayne/test-thread.git
cd test-thread
pip install -r requirements.txt
uvicorn main:app --reload
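With the server running locally, client code can target it instead of the hosted instance. A minimal sketch, assuming uvicorn's default bind address of `127.0.0.1:8000`; the `TESTTHREAD_BASE` environment variable and the `base_url` helper are hypothetical conveniences, not something TestThread itself reads.

```python
import os

# uvicorn listens on 127.0.0.1:8000 unless --host/--port are given.
DEFAULT_LOCAL = "http://127.0.0.1:8000"


def base_url(env=None) -> str:
    """Pick the API base URL from a (hypothetical) env var, else the local default."""
    env = os.environ if env is None else env
    # Strip a trailing slash so f"{base}/suites" never produces a double slash.
    return env.get("TESTTHREAD_BASE", DEFAULT_LOCAL).rstrip("/")
```

Setting `BASE = base_url()` lets the same script run against a self-hosted server or the hosted API without code changes.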

License

Apache 2.0 — free to use, modify, and distribute.


Built for developers who ship AI agents and need to know they work.
