Declarative test framework for Intent API applications
intent-api-test
Declarative test framework for Intent API applications.
Tests are declarations, not functions. AI agents write them. Humans read them. Dashboards display them.
from intent_api_test import test, chain, step, expect, ref, configure
test("Create brand",
    model="Brand",
    action="create",
    payload={"name": "Acme Corp"},
    actor="member",
    expect=expect.success({"name": "Acme Corp", "id": expect.any(int)}),
)

chain("Brand lifecycle",
    steps=[
        step("Create", model="Brand", action="create",
             payload={"name": "Test"}, expect=expect.success(), save_as="brand"),
        step("Read back", model="Brand", action="read",
             id=ref("brand.id"), expect=expect.success({"name": "Test"})),
        step("Delete", model="Brand", action="delete",
             id=ref("brand.id"), expect=expect.success()),
    ],
    actor="member",
)
Install
pip install intent-api-test # v0.2.0+
# With cloud reporter + HTTPBackend
pip install 'intent-api-test[cloud]' # adds httpx
# Local dev (from product repo)
pip install -e ../intent-api-test
How It Works
The DirectBackend calls IntentRouter._dispatch() in-process — no HTTP server needed. Tests run against the real services with a real database session, then roll back. Full auth/policy/quota pipeline applies if the router has a runtime configured.
The HTTPBackend (Phase 2) sends real HTTP requests to a running server — same test declarations, different transport. Used for staging/production smoke testing.
Setup
1. Create tests/conftest.py
This file runs when the CLI imports it. Call configure() at module level — not inside a pytest fixture (fixtures only run under pytest; intent-test run imports this as a plain module).
# tests/conftest.py
import uuid
from sqlalchemy.dialects.postgresql import insert as pg_insert
# Fixed UUIDs — referenced in actor definitions and seed data
SEED_USER_ID = uuid.UUID("00000000-0000-0000-0000-000000000001")
SEED_TEAM_ID = uuid.UUID("00000000-0000-0000-0000-000000000002")
def seed_db():
    """
    Create test entities via raw SQLAlchemy.
    ON CONFLICT DO NOTHING makes this idempotent, so it is safe to run on every import.
    Use fixed UUIDs so actor definitions can reference them at configure() time.
    """
    from app.database import SessionLocal
    from app.models import Team, TeamMembership, User

    db = SessionLocal()
    try:
        db.execute(
            pg_insert(User.__table__).values(
                id=SEED_USER_ID,
                email="test@myapp-test.dev",
                clerk_user_id="test_clerk_001",
                first_name="Test", last_name="User",
            ).on_conflict_do_nothing(index_elements=["id"])
        )
        db.execute(
            pg_insert(Team.__table__).values(
                id=SEED_TEAM_ID, name="Test Team", owner_id=SEED_USER_ID,
            ).on_conflict_do_nothing(index_elements=["id"])
        )
        db.execute(
            pg_insert(TeamMembership.__table__).values(
                id=uuid.UUID("00000000-0000-0000-0000-000000000010"),
                user_id=SEED_USER_ID, team_id=SEED_TEAM_ID,
                role="owner", status="active",
            ).on_conflict_do_nothing(index_elements=["user_id", "team_id"])
        )
        db.commit()
    except Exception:
        db.rollback()
        raise
    finally:
        db.close()


def _setup_configure():
    from app.database import get_db
    from app.main import intent_router
    from intent_api_test import configure

    configure(
        product="my-app",
        router=intent_router,
        get_db=get_db,
        actors={
            "admin": {
                "role": "admin",
                "surface": "standard",
                "plan": "pro",
                "id": str(SEED_USER_ID),  # overrides "test-admin" default
                "team_id": str(SEED_TEAM_ID),
            },
            "free_user": {
                "role": "member",
                "surface": "standard",
                "plan": "free",
                "id": str(SEED_USER_ID),
                "team_id": str(SEED_TEAM_ID),
            },
            "machine": {
                "surface": "machine",
                "team_id": str(SEED_TEAM_ID),
            },
        },
        seed=None,  # seed handled by seed_db() above
    )


seed_db()           # ← module level: runs under both pytest and intent-test CLI
_setup_configure()  # ← module level: same
2. Write test files
Any file matching tests/test_*.py is discovered automatically.
# tests/test_brand.py
from tests.conftest import SEED_TEAM_ID
from intent_api_test import test, chain, step, expect, ref
test("List brands returns list",
    model="Brand", action="list",
    actor="admin",
    expect=expect.success(expect.list()),
)

test("Create brand",
    model="Brand", action="create",
    payload={"name": "Acme"},
    actor="admin",
    expect=expect.success({"name": "Acme", "id": expect.any(int)}),
)

chain("Brand lifecycle",
    steps=[
        step("Create", model="Brand", action="create",
             payload={"name": "Test Brand"}, expect=expect.success(), save_as="brand"),
        step("Read back", model="Brand", action="read",
             id=ref("brand.id"), expect=expect.success({"name": "Test Brand"})),
        step("Update", model="Brand", action="update",
             id=ref("brand.id"), payload={"name": "Updated"},
             expect=expect.success({"name": "Updated"})),
        step("Delete", model="Brand", action="delete",
             id=ref("brand.id"), expect=expect.success()),
    ],
    actor="admin",
)
3. Run
cd my-app-backend
intent-test run
Actors
Actors map a name to a user identity. The surface field routes the dispatch; all other fields become attributes on a SimpleNamespace user object passed to the service handler.
actors={
    # Standard Clerk-authenticated user
    "admin": {
        "role": "admin",
        "surface": "standard",         # dispatch surface
        "plan": "pro",                 # user.plan
        "id": str(SEED_USER_ID),       # user.id; REQUIRED if service calls get_user_team()
        "team_id": str(SEED_TEAM_ID),  # user.team_id + IntentContext.team_id
    },
    # Guest: no auth, user=None
    "guest": {
        "surface": "guest",
    },
    # Machine surface (API key auth)
    "machine": {
        "surface": "machine",
        "team_id": str(SEED_TEAM_ID),
        "gateway_id": str(SEED_GW_ID),  # any extra field → user.gateway_id etc.
    },
}
The "id" override: By default the framework generates user.id = "test-{actor_name}". If your service calls get_user_team(db, user), which queries TeamMembership.user_id == user.id, you must override "id" with a real seeded User UUID. Set "id": str(REAL_UUID) in the actor dict; because "id" is not in the exclusion set, it replaces the default.
Context override per test: Pass context={"team_id": "other-team"} to a specific test() or step() call to merge those fields on top of the actor defaults.
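As a rough illustration of the rules above, the sketch below turns an actor dict into a user object. The helper name build_user, the contents of EXCLUDED, and the fallback surface are assumptions made for this example; the framework's internals may differ.

```python
from types import SimpleNamespace

# Hypothetical exclusion set: fields consumed for routing rather than copied
# onto the user. "surface" is the only such field this README names.
EXCLUDED = {"surface"}


def build_user(actor_name, actor, context=None):
    """Return (surface, user), merging per-test context on top of the actor."""
    merged = {**actor, **(context or {})}
    surface = merged.get("surface", "standard")  # fallback is an assumption
    if surface == "guest":
        return surface, None  # guest: no auth, user=None
    fields = {"id": f"test-{actor_name}"}  # default id, replaced if "id" is set
    fields.update({k: v for k, v in merged.items() if k not in EXCLUDED})
    return surface, SimpleNamespace(**fields)


surface, user = build_user(
    "admin", {"role": "admin", "surface": "standard", "plan": "pro"}
)
print(surface, user.id, user.plan)

_, seeded = build_user(
    "admin",
    {"surface": "standard", "id": "0000-0001"},
    context={"team_id": "other-team"},
)
print(seeded.id, seeded.team_id)
```

The second call shows both mechanisms at once: the explicit "id" replaces the generated default, and the per-test context adds team_id on top of the actor's own fields.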
expect API
# Success — response has no error
expect.success()
# Success with partial shape match (extra keys in response are ignored)
expect.success({"name": "Acme", "id": expect.any(int)})
# Success — response data is a list
expect.success(expect.list())
# Type matcher — use inside shape dicts
expect.any(int) # isinstance check
expect.any(str)
expect.any(float)
# Error matchers
expect.denied() # PERMISSION_DENIED
expect.error() # any error
expect.error("QUOTA_EXCEEDED") # specific error code
expect.upgrade_required() # PLAN_UPGRADE_REQUIRED
expect.upgrade_required(feature="exports") # + message contains feature name
expect.quota_exceeded() # QUOTA_EXCEEDED
# Custom assertion
expect.custom(lambda r: r.data["count"] > 0)
# With chain context:
expect.custom(lambda r, ctx: r.data["brand_id"] == ctx["brand"]["id"])
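To make "partial shape match" concrete, here is a minimal standalone sketch of how such matching could work. AnyOf and matches_shape are hypothetical names invented for this example and are not part of intent_api_test.

```python
class AnyOf:
    """Hypothetical stand-in for expect.any(type): matches via isinstance."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def matches(self, value):
        return isinstance(value, self.expected_type)


def matches_shape(actual, expected):
    """Every expected key must match; extra keys in `actual` are ignored."""
    if isinstance(expected, AnyOf):
        return expected.matches(actual)
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and matches_shape(actual[key], value)
            for key, value in expected.items()
        )
    return actual == expected


response = {"id": 7, "name": "Acme", "created_at": "2024-01-01"}
print(matches_shape(response, {"name": "Acme", "id": AnyOf(int)}))  # True
print(matches_shape(response, {"name": "Other"}))                   # False
```

One Python detail to keep in mind with isinstance-based matchers: bool is a subclass of int, so a check like this accepts True where an int is expected.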
ref() and Chains
ref("key.path") is a lazy reference resolved at step execution time from the chain context dict. Supports dot-separated paths and numeric list indices.
chain("Create post under brand",
    steps=[
        step("Create brand", model="Brand", action="create",
             payload={"name": "Acme"}, expect=expect.success(), save_as="brand"),
        # ref() resolves "brand.id" from chain context after step 1
        step("Create post", model="BlogPost", action="create",
             payload={"brand_id": ref("brand.id"), "title": "Hello"},
             expect=expect.success(), save_as="post"),
        # numeric index: ref("brand.tags.0")
        step("Read post", model="BlogPost", action="read",
             id=ref("post.id"), expect=expect.success({"title": "Hello"})),
    ],
    actor="admin",
)
save_as stores the full response data dict in the chain context under that key.
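The dot-path lookup described above can be pictured with a small standalone function. resolve_path is a hypothetical helper written for this example; it only illustrates the documented behaviour (dot-separated keys, numeric list indices) and is not the framework's code.

```python
def resolve_path(context, path):
    """Walk a dot-separated path; numeric segments index into lists."""
    current = context
    for segment in path.split("."):
        if isinstance(current, (list, tuple)):
            current = current[int(segment)]
        else:
            current = current[segment]
    return current


# Shape of the context after a step saved its response with save_as="brand".
chain_context = {"brand": {"id": 42, "tags": ["retail", "b2b"]}}

print(resolve_path(chain_context, "brand.id"))      # 42
print(resolve_path(chain_context, "brand.tags.0"))  # retail
```

Because resolution happens when the step runs rather than when the chain is declared, a path can point at data that only exists once an earlier step has completed.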
Chain isolation: All steps share one DB session. Uncommitted changes from step N are visible to step N+1. The session is rolled back at chain completion.
Important: If your service calls db.commit() inside the handler, that commit can persist to the database despite the framework's rollback in some environments. Place destructive tests (delete, create-then-verify-deleted) last in the file. Seeds should use ON CONFLICT DO NOTHING so they self-heal on the next run if a previous run's delete was committed permanently.
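The session behaviour above can be reproduced with the standard library. This sketch uses an in-memory SQLite connection as a stand-in for the shared SQLAlchemy session, so it is an analogy rather than the framework's actual mechanism; the brand table is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brand (id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()

# "Step 1": insert without committing.
conn.execute("INSERT INTO brand (name) VALUES ('Test')")

# "Step 2": same connection, so the uncommitted row is visible.
visible_in_chain = conn.execute("SELECT COUNT(*) FROM brand").fetchone()[0]

# End of chain: roll back, and the row is gone.
conn.rollback()
after_rollback = conn.execute("SELECT COUNT(*) FROM brand").fetchone()[0]

# A handler that commits on its own escapes the later rollback.
conn.execute("INSERT INTO brand (name) VALUES ('Committed')")
conn.commit()
conn.rollback()
after_commit = conn.execute("SELECT COUNT(*) FROM brand").fetchone()[0]

print(visible_in_chain, after_rollback, after_commit)  # 1 0 1
```

The last count is the case the warning is about: once a handler has committed, a rollback issued afterwards has nothing left to undo.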
Custom actions
Always use action="custom", command="command_name":
test("Generate blog post",
    model="BlogPost",
    action="custom",
    command="generate",
    payload={"topic": "SEO basics"},
    actor="admin",
    expect=expect.success({"generated": expect.any(bool)}),
)
Smoke Suites
Smoke suites are curated lists of test labels that run as a deploy gate.
# tests/test_smoke.py
from intent_api_test import smoke
smoke("pre-deploy", [
    "List brands returns list",
    "Create brand",
    "Brand lifecycle",
])
intent-test smoke "pre-deploy" # run the suite
intent-test smoke "pre-deploy" --output out.json # + JSON report
intent-test smoke "pre-deploy" --allow-missing # warn on unresolved labels
intent-test smoke --list # list all defined suites
Runs in the order listed in smoke() — not file insertion order. Unresolved labels fail the run with a clear error. --allow-missing downgrades to a warning. Exit code 0/1 same as run.
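The ordering and --allow-missing rules can be summarised in a few lines. resolve_suite is a hypothetical function written for illustration, with declarations modelled as plain dicts; it is not how the CLI is necessarily implemented.

```python
def resolve_suite(labels, declared, allow_missing=False):
    """Return declarations in suite order; fail or warn on unknown labels."""
    by_label = {decl["label"]: decl for decl in declared}
    missing = [label for label in labels if label not in by_label]
    if missing and not allow_missing:
        raise LookupError(f"Unresolved smoke labels: {missing}")
    for label in missing:
        print(f"warning: unresolved smoke label {label!r}")
    return [by_label[label] for label in labels if label in by_label]


# Declared in file order: "Create brand" first.
declared = [{"label": "Create brand"}, {"label": "List brands returns list"}]

# The suite lists them the other way round, and suite order wins.
ordered = resolve_suite(["List brands returns list", "Create brand"], declared)
print([decl["label"] for decl in ordered])

# With allow_missing=True an unknown label is skipped with a warning.
partial = resolve_suite(["Create brand", "Nope"], declared, allow_missing=True)
print([decl["label"] for decl in partial])
```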
Coverage
intent-test coverage introspects the ServiceRegistry, lists every action implemented by each service, and shows how many tests cover it.
intent-test coverage # print coverage report
intent-test coverage --output cov.json # write JSON coverage report
intent-test coverage --fail-under 80 # exit 1 if coverage < 80%
Output:
intent-api-test — clawtrail — Coverage
Gateway (10 declarations)
✅ list — 2 tests
✅ read — 2 tests
✅ health_check — 2 tests (standard + machine)
ApiKey (0 declarations)
❌ create — no tests
❌ list — no tests
Coverage: 1/2 services fully covered · 3/5 actions tested · 60%
Chain steps count toward coverage — a chain step on Gateway.read increments that action's count.
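For readers who want to see how the summary line is derived, the sketch below recomputes it from a registry and a list of tested (service, action) pairs. The function name and data shapes are assumptions for this example; only the arithmetic mirrors the report above.

```python
def coverage_report(registry, tested):
    """registry: {service: [actions]}.
    tested: one (service, action) pair per test or chain step."""
    counts = {}
    for pair in tested:
        counts[pair] = counts.get(pair, 0) + 1

    total = sum(len(actions) for actions in registry.values())
    covered = sum(
        1
        for service, actions in registry.items()
        for action in actions
        if counts.get((service, action))
    )
    fully_covered = sum(
        1
        for service, actions in registry.items()
        if all(counts.get((service, action)) for action in actions)
    )
    percent = 100 * covered / total if total else 100.0
    return {
        "services_full": fully_covered,
        "actions_covered": covered,
        "actions_total": total,
        "percent": percent,
    }


registry = {
    "Gateway": ["list", "read", "health_check"],
    "ApiKey": ["create", "list"],
}
tested = [("Gateway", "list"), ("Gateway", "read"), ("Gateway", "health_check")] * 2

report = coverage_report(registry, tested)
print(report)

# Equivalent of --fail-under 80: exit 1 when coverage is below the threshold.
fail_under = 80
exit_code = 1 if report["percent"] < fail_under else 0
print(exit_code)  # 1
```

With three of five actions tested the result is 60%, so a threshold of 80 would fail the run.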
Tags
Tag tests and filter by tag at run time:
test("Critical path", model="Brand", action="list",
actor="admin", expect=expect.success(), tags=["critical"])
chain("Full lifecycle", steps=[...], actor="admin", tags=["smoke", "critical"])
intent-test run --tag critical # only tests tagged critical
intent-test run --tag critical --tag smoke # OR: runs tests with either tag
intent-test run --model Brand --tag critical # AND: Brand tests that are critical
Multiple --tag flags combine with OR. Combined with --model or --actor they add an AND condition.
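The OR/AND combination can be expressed as a single predicate. The sketch below is a hypothetical model of the filter, with declarations as plain dicts; it is meant to clarify the semantics, not to describe the CLI's code.

```python
def selected(decl, tags=(), model=None, actor=None):
    """OR across tags; AND with model and actor when they are given."""
    if tags and not set(tags) & set(decl.get("tags", [])):
        return False
    if model is not None and decl.get("model") != model:
        return False
    if actor is not None and decl.get("actor") != actor:
        return False
    return True


declarations = [
    {"label": "Critical path", "model": "Brand", "actor": "admin", "tags": ["critical"]},
    {"label": "Smoke only", "model": "BlogPost", "actor": "admin", "tags": ["smoke"]},
    {"label": "Untagged", "model": "Brand", "actor": "admin", "tags": []},
]

# --tag critical --tag smoke
either = [d["label"] for d in declarations if selected(d, tags=["critical", "smoke"])]
# --model Brand --tag critical
brand_critical = [
    d["label"] for d in declarations if selected(d, tags=["critical"], model="Brand")
]
print(either)
print(brand_critical)
```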
HTTPBackend
Test against a real running server with the same declarations:
# tests/conftest.py (staging environment)
import os

from intent_api_test import configure

configure(
    product="my-app",
    backend="http",
    base_url="https://staging.my-app.com",
    actors={
        "admin": {
            "surface": "standard",
            "token": os.environ["STAGING_JWT"],  # → Authorization: Bearer {token}
        },
        "machine": {
            "surface": "machine",
            "api_key": os.environ["STAGING_API_KEY"],  # → Authorization: Bearer {key}
        },
        "guest": {
            "surface": "guest",  # no Authorization header
        },
    },
)
Surface routing: standard → /api/intent, admin → /api/admin-intent, guest → /api/guest-intent, machine → /api/machine-intent.
No DB isolation — HTTPBackend sends real HTTP requests. Creates/deletes persist on the target server. Designed for staging smoke tests, not for unit testing.
Requires pip install 'intent-api-test[cloud]' (adds httpx).
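The routing table and header rules above amount to a small lookup. The following sketch shows one way to derive the URL and headers for an actor; build_request and SURFACE_PATHS are hypothetical names, and no HTTP request is actually sent.

```python
# Paths copied from the surface routing described above.
SURFACE_PATHS = {
    "standard": "/api/intent",
    "admin": "/api/admin-intent",
    "guest": "/api/guest-intent",
    "machine": "/api/machine-intent",
}


def build_request(base_url, actor):
    """Return (url, headers) for an actor; guests get no Authorization header."""
    url = base_url.rstrip("/") + SURFACE_PATHS[actor["surface"]]
    credential = actor.get("token") or actor.get("api_key")
    headers = {"Authorization": f"Bearer {credential}"} if credential else {}
    return url, headers


base = "https://staging.my-app.com"
print(build_request(base, {"surface": "standard", "token": "jwt-123"}))
print(build_request(base, {"surface": "machine", "api_key": "key-456"}))
print(build_request(base, {"surface": "guest"}))
```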
Cloud Reporter
Send results to Intent API Cloud after a run:
# Set key once
export INTENT_CLOUD_API_KEY=ct_live_...
intent-test run --report intent-cloud
intent-test smoke "pre-deploy" --report intent-cloud
intent-test coverage --report intent-cloud
# Or pass key inline
intent-test run --report intent-cloud --api-key ct_live_...
# Override endpoint (self-hosted)
intent-test run --report intent-cloud --report-url https://my-cloud.example.com/api/machine-intent
Sends a TestReport.create or TestCoverage.create intent after the run. Best-effort: network failures log a warning to stderr and never change the exit code.
Requires pip install 'intent-api-test[cloud]'.
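"Best-effort" here means the reporter must never turn a passing run into a failing one. A minimal sketch of that pattern, with the transport passed in as a plain callable (report_best_effort and failing_send are hypothetical names for this example):

```python
import sys


def report_best_effort(send, payload, exit_code):
    """Call send(payload); on failure, warn on stderr and keep exit_code as is."""
    try:
        send(payload)
    except Exception as exc:
        print(f"warning: cloud report failed: {exc}", file=sys.stderr)
    return exit_code


def failing_send(payload):
    # Simulates an unreachable endpoint.
    raise ConnectionError("network unreachable")


final_code = report_best_effort(failing_send, {"passed": 12, "failed": 0}, exit_code=0)
print(final_code)  # 0
```

The exit code is decided by the test results before reporting starts, and the reporting step only passes it through.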
CLI Reference
Commands:
  intent-test run         Run all tests
  intent-test smoke       Run a named smoke suite
  intent-test coverage    Show coverage across registered services

intent-test run [OPTIONS]
  --output FILE       Write JSON report to FILE
  --model MODEL       Only run tests for MODEL
  --actor ACTOR       Only run tests for ACTOR
  --tag TAG           Only run tests with TAG (repeatable, OR logic)
  --report TARGET     Report target: "intent-cloud"
  --api-key KEY       API key for --report intent-cloud
  --report-url URL    Override cloud endpoint
  --version / --help

intent-test smoke SUITE_NAME [OPTIONS]
  --allow-missing     Warn instead of failing on unresolved labels
  --list              List all defined smoke suites
  --output FILE       Write JSON report
  --report / --api-key / --report-url (same as run)

intent-test coverage [OPTIONS]
  --output FILE       Write JSON coverage report
  --fail-under PCT    Exit 1 if coverage < PCT
  --report / --api-key / --report-url (same as run)

Exit codes:
  0   All tests passed (coverage above threshold if --fail-under used)
  1   Any failure / unresolved smoke label / coverage below threshold
The CLI:
- Adds CWD to sys.path (so from tests.conftest import X works)
- Imports tests/conftest.py first, which runs seed_db() and configure()
- Imports tests/test_*.py files alphabetically
- Executes all declarations in file insertion order
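The import-then-execute flow in the list above works because a declaration only records data. The sketch below models that idea with a module-level list; Declaration, REGISTRY, and declare_test are hypothetical names and do not reflect the package's real internals.

```python
from dataclasses import dataclass, field


@dataclass
class Declaration:
    label: str
    model: str
    action: str
    actor: str
    payload: dict = field(default_factory=dict)


REGISTRY = []  # hypothetical module-level registry, filled at import time


def declare_test(label, *, model, action, actor, payload=None):
    """Record a declaration; nothing is dispatched when this is called."""
    REGISTRY.append(Declaration(label, model, action, actor, payload or {}))


# Importing a test file would run calls like these, in file order.
declare_test("Create brand", model="Brand", action="create",
             actor="admin", payload={"name": "Acme"})
declare_test("List brands", model="Brand", action="list", actor="admin")

# A runner later walks the registry in insertion order.
for decl in REGISTRY:
    print(decl.label, decl.model, decl.action)
```

Keeping declarations as plain data is also what lets the same list be filtered, reordered for smoke suites, or counted for coverage without executing anything.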
Filtering chains: --model Brand includes a chain if ANY step has model="Brand". The full chain runs — steps are not trimmed. Tag filtering only matches chain-level tags, not individual step tags.
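Chains follow slightly different rules from single tests, which the following sketch tries to pin down. chain_selected is a hypothetical helper with chains modelled as dicts; it illustrates the stated rules only.

```python
def chain_selected(chain, model=None, tags=()):
    """Model matches if ANY step uses it; tags are checked at chain level only."""
    if model is not None and not any(
        step["model"] == model for step in chain["steps"]
    ):
        return False
    if tags and not set(tags) & set(chain.get("tags", [])):
        return False
    return True


lifecycle = {
    "label": "Create post under brand",
    "tags": ["smoke"],
    "steps": [
        {"model": "Brand", "action": "create", "tags": ["critical"]},
        {"model": "BlogPost", "action": "create"},
    ],
}

print(chain_selected(lifecycle, model="Brand"))      # True: one step matches
print(chain_selected(lifecycle, model="Gateway"))    # False: no step matches
print(chain_selected(lifecycle, tags=["critical"]))  # False: step tags are ignored
```

When a chain is selected, all of its steps run, including those for other models.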
Tips
Idempotent seeds: Use ON CONFLICT DO NOTHING (or ON CONFLICT DO UPDATE) so tests can be re-run after failures without "unique constraint" errors.
Declaration order = execution order: Tests and chains run in the exact order they appear in the file. If test A creates data that test B needs, declare A before B. If test C deletes data, declare C after all tests that need that data.
seed_ctx from seeds: If you use configure(seed=[...]) (the framework's tuple-based seed mechanism), save_as values from seeds are available as ref() targets in all tests and chains. Use this for data that must exist before any test runs.
Multi-surface actors: The same product can define actors for different surfaces. A machine actor tests machine-intent paths; the admin actor (surface "standard") tests standard paths. Both can coexist in the same test file.
ClickHouse / non-transactional databases: Tests against services that read from ClickHouse work normally — ClickHouse queries are read-only from the framework's perspective. Assert shapes (expect.any(int), expect.list()) rather than exact values since ClickHouse data varies by environment.
Pydantic model responses: Services that return MutationResponse or other Pydantic models from intent_api.service work transparently. The framework converts them to dicts automatically before shape matching and before storing in chain context. expect.success({"success": True, "id": expect.any(str)}) and ref("brand.id") both work correctly against Pydantic model responses (v0.1.2+).
Smoke suites run in list order: The order of labels in smoke("suite", ["A", "B", "C"]) is the execution order. This differs from run, which executes in file insertion order.
HTTPBackend vs DirectBackend: DirectBackend (default) is for unit testing — in-process, transaction-isolated. HTTPBackend is for integration testing — real HTTP, no isolation, changes persist. Use backend="http" in your staging conftest, keep the default in your dev conftest.
License
IACL v1.0 — free for all use including commercial. No competing framework or hosted service.