Agentic Test Data Harness: memory, multi-agent swarms, permission gates, coverage analysis. Provider-agnostic (Gemini, OpenAI, Anthropic, Ollama).
FixtureForge
Agentic Test Data Harness for Python.
Generate realistic, context-aware fixtures — deterministic in CI, AI-powered in development.
The Problem
# This is what most test data looks like:
user = User(name="Test User", email="test@test.com", bio="Lorem ipsum...")
# It doesn't catch real-world edge cases.
# It doesn't feel like production data.
# And writing 500 of them by hand? Not happening.
FixtureForge solves this in two modes:
# CI mode — deterministic, zero AI, seed-controlled. Same seed = same data. Always.
forge = Forge(use_ai=False, seed=42)
users = forge.create_batch(User, count=500)
# Dev mode — AI-generated, context-aware, realistic
forge = Forge()
reviews = forge.create_batch(Review, count=50, context="angry holiday customers")
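The CI-mode guarantee ("same seed = same data") is the usual property of a seeded pseudo-random generator. The sketch below illustrates that property with the standard library only. `make_users` and the word lists are hypothetical and are not part of FixtureForge.

```python
import random

# Hypothetical sample pools, used only for this illustration.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret", "Dennis"]
DOMAINS = ["example.com", "example.org", "example.net"]


def make_users(count: int, seed: int) -> list[dict]:
    """Hypothetical helper: build `count` user dicts from a private, seeded RNG."""
    rng = random.Random(seed)  # isolated from the global random state
    users = []
    for i in range(1, count + 1):
        name = rng.choice(FIRST_NAMES)
        users.append({
            "id": i,  # sequential, no randomness involved
            "name": name,
            "email": f"{name.lower()}{rng.randint(1, 999)}@{rng.choice(DOMAINS)}",
        })
    return users


first_run = make_users(500, seed=42)
second_run = make_users(500, seed=42)
other_seed = make_users(500, seed=43)
```

Two runs with the same seed compare equal record for record, which is what makes snapshot-style assertions safe in CI.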
Installation
pip install fixtureforge
With your preferred AI provider:
pip install "fixtureforge[anthropic]" # Claude
pip install "fixtureforge[openai]" # GPT
pip install "fixtureforge[gemini]" # Google Gemini
pip install "fixtureforge[all]" # All providers
Quick Start
from fixtureforge import Forge
from pydantic import BaseModel
class User(BaseModel):
    id: int
    name: str
    email: str
    bio: str
forge = Forge() # auto-detects provider from env vars
users = forge.create_batch(User, count=50, context="SaaS platform users")
That's it. FixtureForge:
- Assigns sequential IDs automatically
- Generates `name` and `email` with Faker (zero API cost)
- Sends only `bio` to the AI, in a single batch call for all 50 records
Core Concepts
Intelligent Field Routing
Every field is classified into a tier. Only semantic fields hit the AI:
| Tier | Fields | Generator | Cost |
|---|---|---|---|
| Structural | id, user_id, order_id | Internal counters / FK registry | Free |
| Standard | name, email, phone, address, date | Faker | Free |
| Computed | @computed_field properties | Pydantic | Free |
| Semantic | bio, description, review, message | LLM (batched) | API tokens |
100 users with 2 semantic fields = 2 API calls, not 200.
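The routing table and the call-count arithmetic can be sketched as follows. This is a minimal illustration that classifies fields by name only. The function names, the lookup sets, and the fallback for unknown fields are assumptions, and the Computed tier is left out because it depends on Pydantic metadata.

```python
# Assumption: name-based lookup sets copied from the tier table above.
STANDARD_FIELDS = {"name", "email", "phone", "address", "date"}
SEMANTIC_FIELDS = {"bio", "description", "review", "message"}


def classify_field(name: str) -> str:
    """Hypothetical classifier: return the tier for a field name."""
    if name == "id" or name.endswith("_id"):
        return "structural"
    if name in SEMANTIC_FIELDS:
        return "semantic"
    # Assumption: anything else is treated as a Faker-style standard field.
    return "standard"


def batched_api_calls(field_names: list[str]) -> int:
    """One batched call per semantic field, independent of record count."""
    return sum(1 for f in field_names if classify_field(f) == "semantic")


fields = ["id", "name", "email", "bio", "description"]
tiers = {f: classify_field(f) for f in fields}
calls_batched = batched_api_calls(fields)   # one call per semantic field
calls_unbatched = 100 * calls_batched       # one call per record per field
```

With two semantic fields and 100 records, batching gives 2 calls where a per-record approach would need 200.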
CI Mode vs Dev Mode
# CI — fully deterministic, no network, reproducible
forge = Forge(use_ai=False, seed=42)
# Dev — AI-powered, realistic context
forge = Forge(provider_name="anthropic", model="claude-haiku-4-5-20251001")
# Large datasets — seed+interpolation, constant cost regardless of count
forge.create_large(Order, count=100_000, seed_ratio=0.01) # pays for ~1k, delivers 100k
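The comment on `create_large` implies that only `count × seed_ratio` records are paid for. How FixtureForge expands the seeds is not described here, so the sketch below is one possible reading: produce a small pool of expensive seed values, then fill the full dataset by sampling from that pool. `seed_pool_size` and `expand_from_seeds` are hypothetical names.

```python
import random
from typing import Iterator


def seed_pool_size(count: int, seed_ratio: float) -> int:
    """Number of records that would need the expensive (AI) path."""
    return max(1, round(count * seed_ratio))


def expand_from_seeds(seeds: list[str], count: int, seed: int = 0) -> Iterator[dict]:
    """Assumed expansion strategy: reuse seed values via a seeded RNG."""
    rng = random.Random(seed)
    for i in range(1, count + 1):
        yield {"id": i, "note": rng.choice(seeds)}


pool = seed_pool_size(100_000, 0.01)
seeds = [f"expensive seed record {n}" for n in range(pool)]  # stand-in for AI output
records = list(expand_from_seeds(seeds, 100_000))
```

Under this reading, the AI cost is fixed by the pool size (1,000 here) while the delivered record count is 100,000.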
Verbose Mode
See exactly where each value comes from:
forge = Forge(seed=42, verbose=True)  # AI left enabled so the semantic field below is tagged [ai]
user = forge.create(User)
# [structural] id = 1
# [faker] name = 'Allison Hill'
# [faker] email = 'donaldgarcia@example.net'
# [ai] bio = 'Passionate developer with 8 years...'
Providers
FixtureForge auto-detects your provider from environment variables:
export ANTHROPIC_API_KEY=... # → Claude (default: claude-haiku-4-5-20251001)
export OPENAI_API_KEY=... # → GPT (default: gpt-4o-mini)
export GOOGLE_API_KEY=... # → Gemini (default: gemini-2.0-flash)
export GROQ_API_KEY=... # → Groq (default: llama-3.3-70b-versatile)
# No key? → Ollama (localhost:11434) → Deterministic-only
Or be explicit:
forge = Forge(provider_name="anthropic", model="claude-sonnet-4-6")
forge = Forge(provider_name="ollama", model="llama3.2")
forge = Forge(use_ai=False) # zero cost, zero network
Foreign Key Relationships
Register parent records first — child FKs resolve automatically:
# Step 1: generate customers
customers = forge.create_batch(Customer, count=10)
# Step 2: orders automatically reference real customer IDs
orders = forge.create_batch(Order, count=100)
# order.customer_id → always a valid customer.id
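A registry that remembers parent IDs and hands them out to child records is one way to get this behaviour. The sketch below is an assumption about the general technique, not FixtureForge's internal registry. `FKRegistry` and its methods are hypothetical.

```python
import random


class FKRegistry:
    """Hypothetical registry mapping a model name to the IDs generated so far."""

    def __init__(self, seed: int = 0) -> None:
        self._ids: dict[str, list[int]] = {}
        self._rng = random.Random(seed)

    def register(self, model_name: str, record_id: int) -> None:
        self._ids.setdefault(model_name.lower(), []).append(record_id)

    def resolve(self, field_name: str) -> int:
        """Map a field such as 'customer_id' to an already registered customer ID."""
        parent = field_name.removesuffix("_id")
        known = self._ids.get(parent)
        if not known:
            raise LookupError(f"no {parent!r} records registered yet")
        return self._rng.choice(known)

    def clear(self) -> None:
        self._ids.clear()


registry = FKRegistry(seed=42)
for customer_id in range(1, 11):
    registry.register("Customer", customer_id)

orders = [
    {"id": i, "customer_id": registry.resolve("customer_id")}
    for i in range(1, 101)
]
```

Generating parents first matters in this design: resolving a child key before any parent exists raises instead of inventing an ID.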
DataSwarms — Parallel Multi-Model Generation
Generate multiple models in parallel with shared AI cache.
The first model warms the cache; every subsequent model inherits it (~90% cheaper per model).
results = forge.swarm(
    models=[User, Order, Product, Payment],
    counts=[10, 50, 100, 30],
    contexts=["SaaS users", "E-commerce orders", None, None],
)
# returns:
# {
# "User": [...10 users...],
# "Order": [...50 orders...],
# "Product": [...100 products...],
# "Payment": [...30 payments...],
# }
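Because every model after the first pays roughly 10% of full price, 5 models cost about as much as 1.5 models generated separately.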
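The cost claim follows from simple arithmetic: the first model pays full price and each later model pays the discounted rate. The helper below is a hypothetical back-of-the-envelope calculation that takes the ~90% figure above at face value.

```python
def swarm_cost(n_models: int, cached_discount: float = 0.90) -> float:
    """Cost in 'full-price model' units.

    Assumption: the first model pays 1.0 and every later model pays
    (1 - cached_discount), per the ~90% figure quoted above.
    """
    if n_models <= 0:
        return 0.0
    return 1.0 + (n_models - 1) * (1.0 - cached_discount)


one_model = swarm_cost(1)
five_models = swarm_cost(5)  # 1.0 + 4 * 0.1, about 1.4
```

Five models come out at about 1.4 units, in line with the rounded "1.5 models" figure.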
Permission Gates
FixtureForge classifies models by data sensitivity and gates dangerous operations:
class SafeUser(BaseModel):
    id: int
    name: str  # SAFE: auto-approved

class CustomerProfile(BaseModel):
    id: int
    ssn: str  # SENSITIVE: requires FORGE_ALLOW_PII=1
    salary: float  # SENSITIVE

class SecurityTest(BaseModel):
    id: int
    sql_injection: str  # DANGEROUS: requires interactive confirmation

# PII auto-approved
forge = Forge(allow_pii=True)

# CI/headless: dangerous ops silently rejected
forge = Forge(interactive=False)
Three levels: safe (auto) → sensitive (env gate) → dangerous (human prompt).
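The three-level gate can be sketched as a classifier plus a decision function. This is an illustration under assumptions. The name-based lookup sets come from the example models above, and `classify`, `is_allowed`, and the `confirm` callback are hypothetical, not FixtureForge's API.

```python
import os
from enum import IntEnum
from typing import Callable, Iterable, Optional


class Sensitivity(IntEnum):
    SAFE = 0
    SENSITIVE = 1
    DANGEROUS = 2


# Assumption: fields are classified by name, using the examples above.
SENSITIVE_FIELDS = {"ssn", "salary"}
DANGEROUS_FIELDS = {"sql_injection"}


def classify(field_names: Iterable[str]) -> Sensitivity:
    """A model is as sensitive as its most sensitive field."""
    level = Sensitivity.SAFE
    for name in field_names:
        if name in DANGEROUS_FIELDS:
            level = max(level, Sensitivity.DANGEROUS)
        elif name in SENSITIVE_FIELDS:
            level = max(level, Sensitivity.SENSITIVE)
    return level


def is_allowed(
    level: Sensitivity,
    *,
    allow_pii: bool = False,
    interactive: bool = True,
    confirm: Optional[Callable[[], bool]] = None,
) -> bool:
    if level == Sensitivity.SAFE:
        return True  # auto-approved
    if level == Sensitivity.SENSITIVE:
        return allow_pii or os.environ.get("FORGE_ALLOW_PII") == "1"  # env gate
    # Dangerous: needs a human answer, so headless runs are rejected.
    if not interactive or confirm is None:
        return False
    return bool(confirm())
```

Passing the confirmation step in as a callback keeps the gate testable, since a test can supply `lambda: True` instead of a real prompt.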
Domain Rules — ForgeMemory
Persist business rules that survive across sessions.
Rules are re-read on every generation call — update a rule, next call respects it immediately.
forge.memory.add_rule("financial", "Users under 18 get restricted account type")
forge.memory.add_rule("user", "Israeli phone numbers use format 05x-xxx-xxxx")
forge.memory.add_rule("orders", "Max 3 active loans per customer at any time")
# Rules inject into AI prompts automatically
users = forge.create_batch(User, count=50, context="Israeli SaaS platform")
Skeptical Memory — rules are hints, not truth. FixtureForge validates stored rules against the live schema before every generation call.
Progressive Forgetting — field names and types are never stored (re-derivable from the model). Only business rules that exist nowhere else in the code are kept.
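The two behaviours described above, re-reading rules on every call and distrusting stale ones, can be sketched with a small file-backed store. The storage format (one JSON file), the `RuleStore` class, and the "skip rules whose topic has no live model" check are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


class RuleStore:
    """Hypothetical file-backed rule store, keyed by topic."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> dict[str, list[str]]:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def add_rule(self, topic: str, rule: str) -> None:
        rules = self.load()
        rules.setdefault(topic, []).append(rule)
        self.path.write_text(json.dumps(rules, indent=2), encoding="utf-8")


def build_prompt(store: RuleStore, context: str, live_topics: set[str]) -> str:
    rules = store.load()  # re-read on every call, so edits apply immediately
    lines = [f"Context: {context}"]
    for topic, topic_rules in sorted(rules.items()):
        if topic not in live_topics:
            continue  # assumed validation: ignore rules with no matching model
        lines.extend(f"- [{topic}] {rule}" for rule in topic_rules)
    return "\n".join(lines)


store = RuleStore(Path(tempfile.mkdtemp()) / "rules.json")
store.add_rule("user", "Israeli phone numbers use format 05x-xxx-xxxx")
store.add_rule("legacy", "Rule for a model that no longer exists")
prompt = build_prompt(store, "Israeli SaaS platform", live_topics={"user"})
```

Only business rules are stored here. Field names and types are absent from the file because they can be re-derived from the model on each call.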
ForgeDream — Coverage Analysis
Find gaps in your test-data coverage automatically:
import os
os.environ["FORGE_FLAG_DREAM"] = "1"
report = forge.dream(models=[User, Order], force=True)
print(report.summary())
# ForgeDream Report - 2026-04-08
# Coverage gaps found : 3
# Rule conflicts found : 0
# Top gaps:
# [User.age] no_boundary : No boundary-value rules for numeric field 'age'
# [User.email] no_invalid : No invalid-data rules for well-known field 'email'
# [Order.total] no_boundary: No boundary-value rules for numeric field 'total'
Four phases: Orient (read index) → Gather (find gaps) → Consolidate (merge rules) → Prune (trim to ≤200 lines).
Report saved as .forge/coverage_gaps.json.
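The gap kinds in the sample report (`no_boundary`, `no_invalid`) suggest a check along these lines: walk the model's fields and flag those with no matching rule. The sketch uses a plain dataclass instead of a Pydantic model to stay dependency-free. `find_gaps`, the rule layout, and the `WELL_KNOWN_FIELDS` set are hypothetical.

```python
from dataclasses import dataclass, fields

# Assumption: fields for which invalid-data cases are usually worth testing.
WELL_KNOWN_FIELDS = {"email", "phone"}


@dataclass
class User:
    id: int
    age: int
    email: str


def find_gaps(model: type, rules: dict[str, set[str]]) -> list[tuple[str, str]]:
    """`rules` maps a field name to the kinds of rules that cover it."""
    gaps = []
    for f in fields(model):
        if f.name == "id" or f.name.endswith("_id"):
            continue  # structural fields are skipped in this sketch
        kinds = rules.get(f.name, set())
        if f.type in (int, float) and "boundary" not in kinds:
            gaps.append((f"{model.__name__}.{f.name}", "no_boundary"))
        if f.name in WELL_KNOWN_FIELDS and "invalid" not in kinds:
            gaps.append((f"{model.__name__}.{f.name}", "no_invalid"))
    return gaps


gaps_before = find_gaps(User, {})
gaps_after = find_gaps(User, {"age": {"boundary"}})
```

Adding a boundary rule for `age` removes that gap on the next run, leaving only the missing invalid-data rule for `email`.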
Streaming — Memory-Safe Large Datasets
# Lazy evaluation — writes to disk one record at a time
for user in forge.create_stream(User, count=1_000_000, filename="users.json"):
    pass  # process one record, never loads all into memory
Supports .json, .csv, .sql output formats.
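The memory-safe behaviour comes from writing and yielding one record at a time. The generator below sketches that pattern with the standard library. It writes JSON Lines (one object per line) for simplicity, which is an assumption and may differ from the `.json` layout FixtureForge produces. `stream_to_jsonl` is a hypothetical name.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator


def stream_to_jsonl(path: Path, count: int) -> Iterator[dict]:
    """Write each record to disk as it is produced, then yield it."""
    with path.open("w", encoding="utf-8") as fh:
        for i in range(1, count + 1):
            record = {"id": i, "name": f"user-{i}"}  # stand-in for a generated record
            fh.write(json.dumps(record) + "\n")
            yield record  # the caller only ever holds one record


out_path = Path(tempfile.mkdtemp()) / "users.jsonl"
seen = 0
for _record in stream_to_jsonl(out_path, 10_000):
    seen += 1  # process one record at a time
```

The file is closed when the generator is exhausted, so the loop must run to completion (or the generator must be closed) before the output is read back.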
Export
from fixtureforge.core.exporter import DataExporter
users = forge.create_batch(User, count=100)
DataExporter.to_json(users, "users.json")
DataExporter.to_csv(users, "users.csv")
DataExporter.to_sql(users, "users.sql", table_name="users")
Response Cache
AI responses are cached locally for 7 days. Identical requests cost nothing after the first call.
forge = Forge(use_cache=True) # default — saves to ~/.fixtureforge/cache/
forge = Forge(use_cache=False) # disable caching
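A cache with a 7-day expiry can be sketched as a mapping from a request hash to a timestamped value. The version below is in-memory and takes an injectable clock so expiry can be tested. The on-disk layout under `~/.fixtureforge/cache/` is not modelled, and `ResponseCache` is a hypothetical class.

```python
import hashlib
import json
import time
from typing import Callable, Optional

TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


class ResponseCache:
    """Hypothetical in-memory cache keyed by a hash of the request."""

    def __init__(self, ttl: float = TTL_SECONDS, clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def key(provider: str, model: str, prompt: str) -> str:
        payload = json.dumps([provider, model, prompt])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value


fake_now = [0.0]  # mutable stand-in for the wall clock
cache = ResponseCache(clock=lambda: fake_now[0])
request_key = ResponseCache.key("anthropic", "some-model", "Write a short bio")
cache.put(request_key, "cached bio")
hit = cache.get(request_key)
fake_now[0] = TTL_SECONDS + 1
miss = cache.get(request_key)
```

Hashing provider, model, and prompt together means a request is a cache hit only when all three are identical.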
Feature Flags
from fixtureforge.config import is_enabled, flag_summary
flag_summary()
# {
# 'FORGE_SWARMS': True, # shipped
# 'FORGE_PERMISSIONS': True, # shipped
# 'FORGE_COMPRESSION': True, # shipped
# 'FORGE_MCP': True, # shipped
# 'FORGE_DREAM': False, # enable with FORGE_FLAG_DREAM=1
# 'FORGE_KAIROS': False, # coming in v2.x
# 'FORGE_ULTRAPLAN': False, # coming in v2.x
# }
Enable any staged feature with an env var:
FORGE_FLAG_DREAM=1 python run_tests.py
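An env-var override of this kind usually amounts to a default table plus a lookup. The sketch below assumes the flag `FORGE_DREAM` maps to the variable `FORGE_FLAG_DREAM`, as in the example above. The `flag_enabled` helper and the set of accepted truthy strings are hypothetical.

```python
import os
from typing import Mapping

# Assumption: a subset of the defaults shown in the summary above.
DEFAULTS = {"FORGE_SWARMS": True, "FORGE_DREAM": False}
TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical lookup: FORGE_X is overridden by the env var FORGE_FLAG_X."""
    env_name = name.replace("FORGE_", "FORGE_FLAG_", 1)
    override = environ.get(env_name)
    if override is None:
        return DEFAULTS.get(name, False)
    return override.strip().lower() in TRUTHY


dream_default = flag_enabled("FORGE_DREAM", {})
dream_enabled = flag_enabled("FORGE_DREAM", {"FORGE_FLAG_DREAM": "1"})
swarms_disabled = flag_enabled("FORGE_SWARMS", {"FORGE_FLAG_SWARMS": "0"})
```

Taking the environment as a parameter lets tests pass a plain dict instead of mutating `os.environ`.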
Stats & Diagnostics
forge.stats()
# {
# "registry": {"user": 50, "order": 200},
# "session_tokens": 1240,
# "memory": {"topics": 3, "total_kb": 2.4},
# "flags": {"FORGE_SWARMS": True, "FORGE_PERMISSIONS": True}
# }
forge.clear_registry() # reset FK registry between independent test scenarios
Architecture
FixtureForge v2.0
├── Config Layer        feature flags, env-var overrides
├── Security Layer      safe / sensitive / dangerous gates, mailbox pattern
├── Memory Layer        FORGE.md pointer index, on-demand topic files
├── Generation Layer    IntelligentRouter, SmartBatchEngine, DataSwarms
├── Compression Layer   Micro → Auto → Full (three-layer pipeline)
├── Export Layer        JSON / CSV / SQL / streaming
└── Background Layer    ForgeDream coverage analysis (feature-flagged)
Provider-agnostic: Claude, GPT, Gemini, Groq, Ollama, or no AI at all.
Pydantic v2 native: full support for @computed_field, validators, and constrained types.
CI-safe: seed= parameter guarantees identical output across runs.
Comparison
| Feature | FixtureForge | factory_boy | faker | hypothesis |
|---|---|---|---|---|
| AI-generated context | Yes | No | No | No |
| Deterministic (seed=) | Yes | Yes | Yes | Yes |
| FK relationships | Auto | Manual | No | No |
| Coverage analysis | Yes | No | No | Partial |
| CI-safe mode | Yes | Yes | Yes | Yes |
| Large datasets | Yes (100k+) | Manual | Manual | No |
| Permission gates | Yes | No | No | No |
FixtureForge is not a replacement for faker — it uses faker internally. It's not a replacement for hypothesis — it solves a different problem. It adds the layer between "I need realistic data" and "I need it to feel like production".
Requirements
- Python 3.11+
- pydantic >= 2.5
- faker >= 22.0
AI providers are optional extras — the core works with zero dependencies beyond pydantic and faker.
License
MIT — see LICENSE.
File details
Details for the file fixtureforge-2.0.2.tar.gz.
File metadata
- Download URL: fixtureforge-2.0.2.tar.gz
- Upload date:
- Size: 49.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8388aaff35992f36a400349091daec6b9b411011573c41f53f61906257a23af0 |
| MD5 | 49d7d9e3a2ce645ea0e1d5b95a24d0e0 |
| BLAKE2b-256 | 796e23a0df39e231759ac3b7f1073190371f341a6a7a9fe3efe94c6402dfe3a2 |
File details
Details for the file fixtureforge-2.0.2-py3-none-any.whl.
File metadata
- Download URL: fixtureforge-2.0.2-py3-none-any.whl
- Upload date:
- Size: 61.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab19cbf6118c7686c342457f8f1ef07432526ac8312c2231387f36a9d3f7718f |
| MD5 | 4e0ed1c97fc1a876a4690fc5bf01e548 |
| BLAKE2b-256 | 4cec66a5b5aebe42c8bc2094825aa569352a8085ecd38bb794e24868982bf7dc |