Agentic Test Data Harness: memory, multi-agent swarms, permission gates, coverage analysis. Provider-agnostic (Gemini, OpenAI, Anthropic, Ollama).

FixtureForge

Agentic Test Data Harness for Python.
Generate realistic, context-aware fixtures — deterministic in CI, AI-powered in development.

The Problem

# This is what most test data looks like:
user = User(name="Test User", email="test@test.com", bio="Lorem ipsum...")

# It doesn't catch real-world edge cases.
# It doesn't feel like production data.
# And writing 500 of them by hand? Not happening.

FixtureForge solves this in two modes:

# CI mode — deterministic, zero AI, seed-controlled. Same seed = same data. Always.
forge = Forge(use_ai=False, seed=42)
users = forge.create_batch(User, count=500)

# Dev mode — AI-generated, context-aware, realistic
forge = Forge()
reviews = forge.create_batch(Review, count=50, context="angry holiday customers")
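The "same seed = same data" guarantee of CI mode can be illustrated without FixtureForge at all. This is a minimal sketch, assuming a private seeded RNG per generator; `make_users` and its name list are hypothetical stand-ins, not part of the library:

```python
import random


def make_users(count: int, seed: int) -> list[dict]:
    """Generate records from a private, seeded RNG (no global state)."""
    rng = random.Random(seed)
    first_names = ["Ada", "Grace", "Linus", "Margaret"]  # illustrative data only
    return [
        {"id": i + 1, "name": rng.choice(first_names), "age": rng.randint(18, 90)}
        for i in range(count)
    ]


run_a = make_users(500, seed=42)
run_b = make_users(500, seed=42)
run_c = make_users(500, seed=7)
print(run_a == run_b)  # same seed, same data
print(run_a == run_c)  # different seed, different data
```

Keeping the RNG local to the generator, rather than seeding the global `random` module, is what keeps test runs from interfering with each other.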

Installation

pip install fixtureforge

With your preferred AI provider:

pip install "fixtureforge[anthropic]"   # Claude
pip install "fixtureforge[openai]"      # GPT
pip install "fixtureforge[gemini]"      # Google Gemini
pip install "fixtureforge[all]"         # All providers

Quick Start

from fixtureforge import Forge
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    bio: str

forge = Forge()  # auto-detects provider from env vars
users = forge.create_batch(User, count=50, context="SaaS platform users")

That's it. FixtureForge:

Assigns sequential IDs automatically
Generates name and email with Faker (zero API cost)
Sends only bio to the AI — in a single batch call for all 50 records

Core Concepts

Intelligent Field Routing

Every field is classified into a tier. Only semantic fields hit the AI:

Tier	Fields	Generator	Cost
Structural	`id`, `user_id`, `order_id`	Internal counters / FK registry	Free
Standard	`name`, `email`, `phone`, `address`, `date`	Faker	Free
Computed	`@computed_field` properties	Pydantic	Free
Semantic	`bio`, `description`, `review`, `message`	LLM (batched)	API tokens

100 users with 2 semantic fields = 2 API calls, not 200.
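The call count above follows from batching per field rather than per record. A minimal sketch of that routing idea, assuming a purely name-based classifier: the keyword sets mirror the table, but `classify_field`, `count_api_calls`, and the fallback for unknown names are illustrative assumptions, not FixtureForge's actual router.

```python
# Hypothetical name-based tier classifier (not FixtureForge's real router).
STANDARD = {"name", "email", "phone", "address", "date"}
SEMANTIC = {"bio", "description", "review", "message"}


def classify_field(field_name: str) -> str:
    """Return the tier for a field, based only on its name."""
    if field_name == "id" or field_name.endswith("_id"):
        return "structural"
    if field_name in SEMANTIC:
        return "semantic"
    # Assumption: anything else is handled locally, like the Standard tier.
    return "standard"


def count_api_calls(field_names: list[str]) -> int:
    """One batched call per semantic field, however many records are requested."""
    return sum(1 for f in field_names if classify_field(f) == "semantic")


fields = ["id", "name", "email", "bio", "description"]
tiers = {f: classify_field(f) for f in fields}
print(tiers)
print(count_api_calls(fields))  # 2
```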

CI Mode vs Dev Mode

# CI — fully deterministic, no network, reproducible
forge = Forge(use_ai=False, seed=42)

# Dev — AI-powered, realistic context
forge = Forge(provider_name="anthropic", model="claude-haiku-4-5-20251001")

# Large datasets: seed + interpolation, so AI cost tracks count * seed_ratio rather than count
forge.create_large(Order, count=100_000, seed_ratio=0.01)  # pays for ~1k, delivers 100k
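The "pays for ~1k, delivers 100k" arithmetic can be sketched as follows. How FixtureForge actually interpolates between seed records is not described here, so `expand` simply resamples seed values; both helper names are hypothetical.

```python
import random


def plan_large(count: int, seed_ratio: float) -> tuple[int, int]:
    """Split a request into AI-generated seed records and locally derived ones."""
    seeds = max(1, round(count * seed_ratio))
    return seeds, count - seeds


def expand(seed_values: list[str], count: int, rng: random.Random) -> list[str]:
    """Stand-in for interpolation: reuse seed values to reach the full count."""
    return [rng.choice(seed_values) for _ in range(count)]


seeds, derived = plan_large(100_000, 0.01)
print(seeds, derived)  # 1000 99000

sample = expand(["late delivery", "wrong size"], 10, random.Random(42))
print(len(sample))  # 10
```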

Verbose Mode

See exactly where each value comes from:

forge = Forge(seed=42, verbose=True)  # AI enabled here, so semantic fields are tagged [ai]
user = forge.create(User)

# [structural] id    = 1
# [faker]      name  = 'Allison Hill'
# [faker]      email = 'donaldgarcia@example.net'
# [ai]         bio   = 'Passionate developer with 8 years...'

Providers

FixtureForge auto-detects your provider from environment variables:

export ANTHROPIC_API_KEY=...   # → Claude (default: claude-haiku-4-5-20251001)
export OPENAI_API_KEY=...      # → GPT    (default: gpt-4o-mini)
export GOOGLE_API_KEY=...      # → Gemini (default: gemini-2.0-flash)
export GROQ_API_KEY=...        # → Groq   (default: llama-3.3-70b-versatile)
# No key? Falls back to Ollama (localhost:11434), then to deterministic-only mode
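The detection chain above can be sketched as a lookup over environment variables. The precedence among keys (taken here as the listing order), the provider name strings for OpenAI, Gemini, and Groq, and the `ollama_reachable` flag are assumptions made for illustration; `detect_provider` is not a FixtureForge function.

```python
import os

# (env var, provider name, default model), in assumed priority order.
PROVIDERS = [
    ("ANTHROPIC_API_KEY", "anthropic", "claude-haiku-4-5-20251001"),
    ("OPENAI_API_KEY", "openai", "gpt-4o-mini"),
    ("GOOGLE_API_KEY", "gemini", "gemini-2.0-flash"),
    ("GROQ_API_KEY", "groq", "llama-3.3-70b-versatile"),
]


def detect_provider(env, ollama_reachable: bool = False):
    """Return (provider, default_model); (None, None) means deterministic-only."""
    for var, provider, model in PROVIDERS:
        if env.get(var):
            return provider, model
    if ollama_reachable:
        return "ollama", None  # model choice for Ollama is left open in this sketch
    return None, None


print(detect_provider({"OPENAI_API_KEY": "sk-example"}))
print(detect_provider({}, ollama_reachable=True))
print(detect_provider({}))
# In real use, pass the process environment: detect_provider(os.environ)
```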

Or be explicit:

forge = Forge(provider_name="anthropic", model="claude-sonnet-4-6")
forge = Forge(provider_name="ollama", model="llama3.2")
forge = Forge(use_ai=False)  # zero cost, zero network

Foreign Key Relationships

# Step 1: generate customers
customers = forge.create_batch(Customer, count=10)

# Step 2: orders automatically reference real customer IDs
orders = forge.create_batch(Order, count=100)
# order.customer_id → always a valid customer.id
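One way such automatic references can work is a registry of generated primary keys that later models draw from. The sketch below assumes a `<model>_id` naming convention; `FKRegistry` and its methods are hypothetical and only illustrate the idea behind the FK registry mentioned in the routing table.

```python
import random


class FKRegistry:
    """Remembers generated primary keys so later models can reference them."""

    def __init__(self) -> None:
        self._ids: dict[str, list[int]] = {}

    def register(self, model_name: str, record_id: int) -> None:
        self._ids.setdefault(model_name.lower(), []).append(record_id)

    def resolve(self, field_name: str, rng: random.Random) -> int:
        """Map e.g. 'customer_id' to an existing id of the 'customer' model."""
        parent = field_name.removesuffix("_id")
        ids = self._ids.get(parent)
        if not ids:
            raise LookupError(f"no '{parent}' records generated yet")
        return rng.choice(ids)

    def clear(self) -> None:
        """Forget all keys, e.g. between independent test scenarios."""
        self._ids.clear()


rng = random.Random(42)
registry = FKRegistry()

customers = [{"id": i} for i in range(1, 11)]
for customer in customers:
    registry.register("Customer", customer["id"])

orders = [
    {"id": i, "customer_id": registry.resolve("customer_id", rng)}
    for i in range(1, 101)
]
print(all(o["customer_id"] in range(1, 11) for o in orders))  # True
```

This also shows why generation order matters: resolving `customer_id` before any customer exists has nothing valid to point at.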

DataSwarms — Parallel Multi-Model Generation

Generate multiple models in parallel with shared AI cache.
The first model warms the cache; every subsequent model inherits it (~90% cheaper per model).

results = forge.swarm(
    models=[User, Order, Product, Payment],
    counts=[10,   50,    100,     30],
    contexts=["SaaS users", "E-commerce orders", None, None],
)

# returns:
# {
#   "User":    [...10 users...],
#   "Order":   [...50 orders...],
#   "Product": [...100 products...],
#   "Payment": [...30 payments...],
# }

At roughly 90% off for every model after the first, 5 models cost about as much as 1.4 to 1.5 models.
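The cost estimate is simple arithmetic on the ~90% figure quoted above. A minimal sketch, assuming the first model pays full price and every later model gets the same discount (`swarm_cost` is a hypothetical helper):

```python
def swarm_cost(n_models: int, discount: float = 0.90) -> float:
    """Relative cost in units of 'one full-price model'."""
    if n_models <= 0:
        return 0.0
    return 1.0 + (n_models - 1) * (1.0 - discount)


for n in (1, 4, 5):
    print(n, round(swarm_cost(n), 2))
# 1 1.0
# 4 1.3
# 5 1.4
```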

Permission Gates

FixtureForge classifies models by data sensitivity and gates dangerous operations:

class SafeUser(BaseModel):
    id: int
    name: str          # SAFE — auto-approved

class CustomerProfile(BaseModel):
    id: int
    ssn: str           # SENSITIVE — requires FORGE_ALLOW_PII=1
    salary: float      # SENSITIVE

class SecurityTest(BaseModel):
    id: int
    sql_injection: str # DANGEROUS — requires interactive confirmation

# PII auto-approved
forge = Forge(allow_pii=True)

# CI/headless — dangerous ops silently rejected
forge = Forge(interactive=False)

Three levels: safe (auto) → sensitive (env gate) → dangerous (human prompt).
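The three-level gate can be sketched as a classifier plus a decision function. The field-name sets below come from the examples above; the classification rule, the `confirm` callback, and both function names are assumptions for illustration, not FixtureForge's implementation.

```python
from enum import Enum


class Level(Enum):
    SAFE = "safe"
    SENSITIVE = "sensitive"
    DANGEROUS = "dangerous"


# Illustrative keyword sets; a real classifier would need a broader list.
SENSITIVE_FIELDS = {"ssn", "salary"}
DANGEROUS_FIELDS = {"sql_injection"}


def classify_model(field_names) -> Level:
    """A model is as sensitive as its most sensitive field."""
    names = set(field_names)
    if names & DANGEROUS_FIELDS:
        return Level.DANGEROUS
    if names & SENSITIVE_FIELDS:
        return Level.SENSITIVE
    return Level.SAFE


def is_allowed(level: Level, env, interactive: bool, confirm=lambda: False) -> bool:
    """Safe: always. Sensitive: env gate. Dangerous: human confirmation only."""
    if level is Level.SAFE:
        return True
    if level is Level.SENSITIVE:
        return env.get("FORGE_ALLOW_PII") == "1"
    return interactive and confirm()  # headless runs reject without prompting


level = classify_model(["id", "ssn", "salary"])
print(level, is_allowed(level, {"FORGE_ALLOW_PII": "1"}, interactive=False))
```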

Domain Rules — ForgeMemory

Persist business rules that survive across sessions.
Rules are re-read on every generation call — update a rule, next call respects it immediately.

forge.memory.add_rule("financial", "Users under 18 get restricted account type")
forge.memory.add_rule("user", "Israeli phone numbers use format 05x-xxx-xxxx")
forge.memory.add_rule("orders", "Max 3 active loans per customer at any time")

# Rules inject into AI prompts automatically
users = forge.create_batch(User, count=50, context="Israeli SaaS platform")

Skeptical Memory — rules are hints, not truth. FixtureForge validates stored rules against the live schema before every generation call.

Progressive Forgetting — field names and types are never stored (re-derivable from the model). Only business rules that exist nowhere else in the code are kept.
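The "re-read on every generation call" behaviour can be sketched with a file-backed store that never caches rules in memory. The JSON layout, the prompt format, and the `RuleStore` class are hypothetical; FixtureForge's own storage (the FORGE.md index and topic files listed under Architecture) is not reproduced here.

```python
import json
import tempfile
from pathlib import Path


class RuleStore:
    """Persists rules per topic and reloads them from disk on every read."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def _load(self) -> dict[str, list[str]]:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def add_rule(self, topic: str, rule: str) -> None:
        rules = self._load()
        rules.setdefault(topic, []).append(rule)
        self.path.write_text(json.dumps(rules, indent=2), encoding="utf-8")

    def prompt_for(self, topic: str, context: str) -> str:
        rules = self._load().get(topic, [])  # fresh read, no in-memory cache
        return "\n".join([f"Context: {context}"] + [f"Rule: {r}" for r in rules])


store = RuleStore(Path(tempfile.mkdtemp()) / "rules.json")
store.add_rule("user", "Israeli phone numbers use format 05x-xxx-xxxx")
first = store.prompt_for("user", "Israeli SaaS platform")

store.add_rule("user", "Users under 18 get restricted account type")
second = store.prompt_for("user", "Israeli SaaS platform")
print(second)
```

Because nothing is cached, a rule edited on disk by another process is also picked up on the next call.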

ForgeDream — Coverage Analysis

Find gaps in your test-data coverage automatically:

import os
os.environ["FORGE_FLAG_DREAM"] = "1"

report = forge.dream(models=[User, Order], force=True)
print(report.summary())

# ForgeDream Report - 2026-04-08
#   Coverage gaps found  : 3
#   Rule conflicts found : 0
#   Top gaps:
#     [User.age]    no_boundary : No boundary-value rules for numeric field 'age'
#     [User.email]  no_invalid  : No invalid-data rules for well-known field 'email'
#     [Order.total] no_boundary : No boundary-value rules for numeric field 'total'

Four phases: Orient (read index) → Gather (find gaps) → Consolidate (merge rules) → Prune (trim to ≤200 lines).

Report saved as .forge/coverage_gaps.json.
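The Gather phase can be pictured as a scan over model fields for missing rule kinds. This sketch reproduces the three gaps from the sample report under stated assumptions: schemas are plain dicts of field name to type, rule coverage is a dict of covered kinds per field, and `find_gaps` is a hypothetical helper rather than ForgeDream's algorithm.

```python
from dataclasses import dataclass


@dataclass
class Gap:
    field: str
    kind: str
    message: str


WELL_KNOWN = {"email"}  # illustrative; fields with obvious invalid-data cases


def find_gaps(model: str, fields: dict, covered: dict) -> list[Gap]:
    """Report numeric fields without boundary rules and well-known fields
    without invalid-data rules. Structural id fields are skipped."""
    gaps = []
    for name, ftype in fields.items():
        if name == "id" or name.endswith("_id"):
            continue
        key = f"{model}.{name}"
        kinds = covered.get(key, set())
        if ftype in (int, float) and "boundary" not in kinds:
            gaps.append(Gap(key, "no_boundary",
                            f"No boundary-value rules for numeric field '{name}'"))
        if name in WELL_KNOWN and "invalid" not in kinds:
            gaps.append(Gap(key, "no_invalid",
                            f"No invalid-data rules for well-known field '{name}'"))
    return gaps


user_fields = {"id": int, "age": int, "email": str}
order_fields = {"id": int, "customer_id": int, "total": float}
gaps = find_gaps("User", user_fields, {}) + find_gaps("Order", order_fields, {})
for gap in gaps:
    print(f"[{gap.field}] {gap.kind}: {gap.message}")
```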

Streaming — Memory-Safe Large Datasets

# Lazy evaluation — writes to disk one record at a time
for user in forge.create_stream(User, count=1_000_000, filename="users.json"):
    pass  # process one record, never loads all into memory

Supports .json, .csv, .sql output formats.
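The write-as-you-iterate pattern can be sketched with a plain generator. `stream_to_json` and `fake_users` are hypothetical names, and the sketch covers only the JSON case; it is meant to show why memory use stays flat, not how `create_stream` is implemented.

```python
import json
import os
import tempfile
from typing import Iterable, Iterator


def fake_users(count: int) -> Iterator[dict]:
    """Lazy record source: nothing is materialised up front."""
    for i in range(1, count + 1):
        yield {"id": i, "name": f"user{i}"}


def stream_to_json(records: Iterable[dict], filename: str) -> Iterator[dict]:
    """Write each record to a JSON array on disk, then hand it to the caller."""
    with open(filename, "w", encoding="utf-8") as fh:
        fh.write("[")
        for index, record in enumerate(records):
            if index:
                fh.write(",")
            fh.write("\n  " + json.dumps(record))
            yield record
        fh.write("\n]\n")


path = os.path.join(tempfile.mkdtemp(), "users.json")
seen = sum(1 for _ in stream_to_json(fake_users(1000), path))
print(seen)  # 1000
```

One caveat of this particular sketch: if the caller stops iterating early, the closing bracket is never written and the file is not valid JSON, so the loop should be allowed to run to completion.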

Export

from fixtureforge.core.exporter import DataExporter

users = forge.create_batch(User, count=100)
DataExporter.to_json(users, "users.json")
DataExporter.to_csv(users, "users.csv")
DataExporter.to_sql(users, "users.sql", table_name="users")
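For readers curious what a SQL export involves, here is a minimal sketch that turns dict records into `INSERT` statements. It is not `DataExporter.to_sql`; `sql_literal` and `to_sql_inserts` are hypothetical, and table and column names are assumed to be trusted identifiers.

```python
def sql_literal(value) -> str:
    """Render a Python value as a SQL literal, doubling single quotes."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql_inserts(rows: list[dict], table_name: str) -> list[str]:
    """One INSERT statement per record, columns in dict order."""
    statements = []
    for row in rows:
        columns = ", ".join(row)
        values = ", ".join(sql_literal(v) for v in row.values())
        statements.append(f"INSERT INTO {table_name} ({columns}) VALUES ({values});")
    return statements


rows = [{"id": 1, "name": "O'Brien", "active": True, "bio": None}]
print(to_sql_inserts(rows, "users")[0])
# INSERT INTO users (id, name, active, bio) VALUES (1, 'O''Brien', TRUE, NULL);
```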

Response Cache

AI responses are cached locally for 7 days. Identical requests cost nothing after the first call.

forge = Forge(use_cache=True)   # default — saves to ~/.fixtureforge/cache/
forge = Forge(use_cache=False)  # disable caching
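A 7-day response cache boils down to a request hash plus a timestamp check. The sketch below keeps entries in memory and takes an injectable clock so expiry can be demonstrated; the key layout and the `ResponseCache` class are assumptions, and the on-disk format under ~/.fixtureforge/cache/ is not modelled.

```python
import hashlib
import json
import time

TTL_SECONDS = 7 * 24 * 60 * 60  # 7 days


class ResponseCache:
    def __init__(self, clock=time.time) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def key(provider: str, model: str, prompt: str) -> str:
        """Identical requests hash to the same key."""
        payload = json.dumps([provider, model, prompt])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > TTL_SECONDS:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value


now = [0.0]  # fake clock for the demonstration
cache = ResponseCache(clock=lambda: now[0])
k = ResponseCache.key("anthropic", "claude-haiku-4-5-20251001", "50 bios, SaaS users")
cache.put(k, "cached response")

now[0] = 6 * 24 * 60 * 60
hit = cache.get(k)
now[0] = 8 * 24 * 60 * 60
miss = cache.get(k)
print(hit, miss)  # cached response None
```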

Feature Flags

from fixtureforge.config import is_enabled, flag_summary

flag_summary()
# {
#   'FORGE_SWARMS':      True,   # shipped
#   'FORGE_PERMISSIONS': True,   # shipped
#   'FORGE_COMPRESSION': True,   # shipped
#   'FORGE_MCP':         True,   # shipped
#   'FORGE_DREAM':       False,  # enable with FORGE_FLAG_DREAM=1
#   'FORGE_KAIROS':      False,  # coming in v2.x
#   'FORGE_ULTRAPLAN':   False,  # coming in v2.x
# }

Enable any staged feature with an env var:

FORGE_FLAG_DREAM=1 python run_tests.py
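The env-var override can be sketched as a defaults table consulted after the environment. The defaults are copied from the `flag_summary()` output above; the `FORGE_FLAG_<NAME>` mapping is inferred from the `FORGE_FLAG_DREAM` example, and `flag_enabled` is a hypothetical stand-in for the library's `is_enabled`, including the assumption that any value other than "1" turns a flag off.

```python
import os

DEFAULTS = {
    "FORGE_SWARMS": True,
    "FORGE_PERMISSIONS": True,
    "FORGE_COMPRESSION": True,
    "FORGE_MCP": True,
    "FORGE_DREAM": False,
    "FORGE_KAIROS": False,
    "FORGE_ULTRAPLAN": False,
}


def flag_enabled(flag: str, env=None) -> bool:
    """Environment override first, built-in default second."""
    env = os.environ if env is None else env
    override = env.get("FORGE_FLAG_" + flag.removeprefix("FORGE_"))
    if override is not None:
        return override == "1"
    return DEFAULTS.get(flag, False)


print(flag_enabled("FORGE_DREAM", env={}))                         # False
print(flag_enabled("FORGE_DREAM", env={"FORGE_FLAG_DREAM": "1"}))  # True
```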

Stats & Diagnostics

forge.stats()
# {
#   "registry": {"user": 50, "order": 200},
#   "session_tokens": 1240,
#   "memory": {"topics": 3, "total_kb": 2.4},
#   "flags": {"FORGE_SWARMS": True, "FORGE_PERMISSIONS": True}
# }

forge.clear_registry()  # reset FK registry between independent test scenarios

Architecture

FixtureForge v2.0
├── Config Layer        feature flags, env-var overrides
├── Security Layer      safe / sensitive / dangerous gates, mailbox pattern
├── Memory Layer        FORGE.md pointer index, on-demand topic files
├── Generation Layer    IntelligentRouter, SmartBatchEngine, DataSwarms
├── Compression Layer   Micro → Auto → Full (three-layer pipeline)
├── Export Layer        JSON / CSV / SQL / streaming
└── Background Layer    ForgeDream coverage analysis (feature-flagged)

Provider-agnostic: Claude, GPT, Gemini, Groq, Ollama, or no AI at all.
Pydantic v2 native: full support for @computed_field, validators, and constrained types.
CI-safe: seed= parameter guarantees identical output across runs.

Comparison

Feature	FixtureForge	factory_boy	faker	hypothesis
AI-generated context	Yes	No	No	No
Deterministic (seed=)	Yes	Yes	Yes	Yes
FK relationships	Auto	Manual	No	No
Coverage analysis	Yes	No	No	Partial
CI-safe mode	Yes	Yes	Yes	Yes
Large datasets	Yes (100k+)	Manual	Manual	No
Permission gates	Yes	No	No	No

FixtureForge is not a replacement for faker — it uses faker internally. It's not a replacement for hypothesis — it solves a different problem. It adds the layer between "I need realistic data" and "I need it to feel like production".

Requirements

Python 3.11+
pydantic >= 2.5
faker >= 22.0

AI providers are optional extras — the core works with zero dependencies beyond pydantic and faker.

License

MIT — see LICENSE.

Release history

2.2.0 (Apr 27, 2026)
2.1.0 (Apr 13, 2026)
2.0.2 (Apr 8, 2026), this version
2.0.1 (Apr 8, 2026)
2.0.0 (Apr 8, 2026)
0.1.1 (Feb 13, 2026)
0.1.0 (Feb 12, 2026)

Download files

Source Distribution

fixtureforge-2.0.2.tar.gz (49.1 kB), uploaded Apr 8, 2026 (Source)

Built Distribution

fixtureforge-2.0.2-py3-none-any.whl (61.6 kB), uploaded Apr 8, 2026 (Python 3)

File details

Details for the file fixtureforge-2.0.2.tar.gz.

File metadata

Download URL: fixtureforge-2.0.2.tar.gz
Upload date: Apr 8, 2026
Size: 49.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for fixtureforge-2.0.2.tar.gz
Algorithm	Hash digest
SHA256	`8388aaff35992f36a400349091daec6b9b411011573c41f53f61906257a23af0`
MD5	`49d7d9e3a2ce645ea0e1d5b95a24d0e0`
BLAKE2b-256	`796e23a0df39e231759ac3b7f1073190371f341a6a7a9fe3efe94c6402dfe3a2`

File details

Details for the file fixtureforge-2.0.2-py3-none-any.whl.

File metadata

Download URL: fixtureforge-2.0.2-py3-none-any.whl
Upload date: Apr 8, 2026
Size: 61.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for fixtureforge-2.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ab19cbf6118c7686c342457f8f1ef07432526ac8312c2231387f36a9d3f7718f`
MD5	`4e0ed1c97fc1a876a4690fc5bf01e548`
BLAKE2b-256	`4cec66a5b5aebe42c8bc2094825aa569352a8085ecd38bb794e24868982bf7dc`
