AI-powered test data generator for QA engineers

testdata-ai

Stop writing test@test.com. Generate realistic, context-aware test data with GPT-4, Claude, or a local Ollama model — in one command.

pip install "testdata-ai[openai]"
testdata-ai generate --context ecommerce_customer --count 10

from testdata_ai import generate
users = generate("ecommerce_customer", count=50)  # list of 50 realistic dicts

Why testdata-ai?

13 built-in domains — e-commerce, banking, healthcare, HR, IoT, travel, and more
3 AI providers — OpenAI, Anthropic, or a local Ollama model (no API cost)
pytest plugin — session-scoped fixtures with caching, named seeds, and xdist support, auto-loaded

Feature	Faker	testdata-ai
Realistic emails	`test123@example.com`	`aisha.patel.2024@gmail.com`
Cultural diversity	Limited	Names from many cultures
Behavioral coherence	None	Age, location, and habits match
Edge-case variety	Manual	AI generates it automatically

Installation
Configuration
CLI
Python API
Custom Contexts
Pytest Plugin
Available Contexts
Development Roadmap

Installation

pip install "testdata-ai[openai]"       # OpenAI only
pip install "testdata-ai[anthropic]"    # Anthropic only
pip install "testdata-ai[ollama]"       # Ollama only (no extra packages — uses stdlib)
pip install "testdata-ai[all]"          # All providers

Development install (from source)

git clone https://github.com/testcraft-ai/testdata-ai.git
cd testdata-ai
python -m venv venv && source venv/bin/activate
pip install -e ".[all]"

Configuration

Create a .env file in the project root:

# Provider selection
AI_PROVIDER=openai          # openai | anthropic | ollama

# OpenAI
OPENAI_API_KEY=sk-proj-...
OPENAI_MODEL=gpt-4o-mini    # default; gpt-4o for higher quality
OPENAI_MAX_TOKENS=4096
OPENAI_TEMPERATURE=0.7

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-haiku-4-5-20251001   # default
ANTHROPIC_MAX_TOKENS=4096
ANTHROPIC_TEMPERATURE=0.7

# Ollama (local, no API key required)
OLLAMA_BASE_URL=http://localhost:11434  # default
OLLAMA_MODEL=qwen2.5:14b               # default
OLLAMA_MAX_TOKENS=4096
OLLAMA_TEMPERATURE=0.7

All environment variables are optional except the API key for the provider you use (OPENAI_API_KEY or ANTHROPIC_API_KEY); Ollama requires no API key. Defaults: gpt-4o-mini / claude-haiku-4-5-20251001 / qwen2.5:14b, temperature 0.7, max_tokens 4096.
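To make the defaults above concrete, here is a minimal sketch of how such settings could be resolved from environment variables. The helper `resolve_settings` is hypothetical and is not testdata-ai's actual loader; falling back to `openai` when `AI_PROVIDER` is unset is also an assumption, since the text does not state a default provider.

```python
import os

# Default models as documented above.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "anthropic": "claude-haiku-4-5-20251001",
    "ollama": "qwen2.5:14b",
}


def resolve_settings(env=None):
    """Return effective settings for a mapping of env vars (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: "openai" is used when AI_PROVIDER is not set.
    provider = env.get("AI_PROVIDER", "openai").lower()
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")

    prefix = provider.upper()
    settings = {
        "provider": provider,
        "model": env.get(f"{prefix}_MODEL", DEFAULT_MODELS[provider]),
        "temperature": float(env.get(f"{prefix}_TEMPERATURE", "0.7")),
        "max_tokens": int(env.get(f"{prefix}_MAX_TOKENS", "4096")),
    }
    if provider == "ollama":
        # Ollama runs locally and needs a base URL instead of an API key.
        settings["base_url"] = env.get("OLLAMA_BASE_URL", "http://localhost:11434")
    else:
        api_key = env.get(f"{prefix}_API_KEY")
        if not api_key:
            raise ValueError(f"{prefix}_API_KEY is required for provider {provider!r}")
        settings["api_key"] = api_key
    return settings
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try out, for example `resolve_settings({"AI_PROVIDER": "ollama"})`.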

CLI

After installation, use the testdata-ai command (or python -m testdata_ai):

`generate`

Generate test data records and output as JSON, JSONL, CSV, or YAML.

testdata-ai generate --context <name> [OPTIONS]

Option	Default	Description
`--context TEXT`	(required)	Context name (see Available Contexts)
`--count INTEGER`	`10`	Number of records to generate
`--batch-size INTEGER`	`10`	Records per AI call. For `count > batch-size`, records are output progressively
`-o, --output [json|jsonl|csv|yaml]`	`json`	Output format. Write to file via shell redirection: `-o csv > data.csv`
`--provider TEXT`	from env	AI provider override (`openai` / `anthropic` / `ollama`)
`--model TEXT`	from env	Model name override
`--max-tokens INTEGER`	from env	Max tokens per AI call (auto-adjusted to `batch-size` by default)
`--temperature FLOAT`	from env	Sampling temperature `0.0–1.0`
`--no-validate`	off	Skip schema validation
`--context-file PATH`	—	YAML or JSON file with custom context definitions (repeatable)
`-q, --quiet`	off	Suppress status messages (data only to stdout)

Examples:

# 10 e-commerce customers to stdout (JSON)
testdata-ai generate --context ecommerce_customer --count 10

# 50 SaaS trial users saved as CSV
testdata-ai generate --context saas_trial --count 50 -o csv > trials.csv

# 100 records in batches of 20 — JSONL lines appear after each batch
testdata-ai generate --context ecommerce_customer --count 100 --batch-size 20 -o jsonl

# Use Anthropic instead of the default provider
testdata-ai generate --context banking_user --count 5 --provider anthropic

# Use a local Ollama model
testdata-ai generate --context ecommerce_customer --count 10 --provider ollama

# Use a specific model with higher token budget
testdata-ai generate --context hr_employee --count 30 --model gpt-4o --max-tokens 8192

# Machine-readable output (no status messages, plain JSON)
testdata-ai generate --context iot_device --count 20 -q | jq '.[0]'

# Use as Python module (same interface)
python -m testdata_ai generate --context ecommerce_customer --count 5

# Load a custom context from a YAML file and generate data for it
testdata-ai generate --context game_character --context-file my_contexts.yaml --count 5

# Quiet: suppress all status messages including the "Loaded context(s)..." line
testdata-ai generate --context game_character --context-file my_contexts.yaml -q

Batch generation / streaming: Large counts are split into multiple AI calls of --batch-size records each. Progress is reported per batch on stderr. With -o jsonl, records are written to stdout as each batch completes, so output starts immediately rather than waiting for all records. With -o yaml, each batch is appended as it arrives. With -o json or -o csv, all records are accumulated and written at the end.
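The splitting described above can be sketched as simple arithmetic. The function `plan_batches` is a hypothetical illustration, and it assumes the final batch carries the remainder when count is not a multiple of the batch size; the text does not spell that detail out.

```python
def plan_batches(count, batch_size):
    """Return the number of records requested in each AI call (hypothetical helper)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    full_batches, remainder = divmod(count, batch_size)
    # Assumption: a smaller final batch covers any remainder.
    return [batch_size] * full_batches + ([remainder] if remainder else [])
```

With this model, `--count 100 --batch-size 20` corresponds to five calls of 20 records each, and a count below the batch size corresponds to a single call.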

Token auto-adjustment: When --max-tokens is not set, the CLI estimates the required token budget per batch and automatically increases it if needed, printing a yellow notice to stderr.

CSV output: Nested dicts are flattened with dot notation (e.g., location.city); lists are serialized as JSON strings.

JSONL output: One JSON object per line — records appear progressively as batches complete.

YAML output: Records are appended batch-by-batch as generation progresses.
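The CSV rule above (dot notation for nested dicts, JSON strings for lists) can be illustrated with a short sketch. `flatten_record` and `records_to_csv` are hypothetical names, not functions exported by testdata-ai, and the column ordering (first-seen order) is an assumption.

```python
import csv
import io
import json


def flatten_record(record, prefix=""):
    """Flatten nested dicts with dot notation; serialize lists as JSON strings."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def records_to_csv(records):
    """Render records as CSV text with one column per flattened key."""
    rows = [flatten_record(record) for record in records]
    # Union of all column names, kept in first-seen order (an assumption).
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For a record such as `{"name": "Aisha Patel", "location": {"city": "Mumbai"}}`, the header row becomes `name,location.city`.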

`list-contexts`

List all available contexts.

testdata-ai list-contexts [--category CATEGORY] [--context-file PATH]...

# List all contexts
testdata-ai list-contexts

# Filter by category
testdata-ai list-contexts --category finance
testdata-ai list-contexts --category healthcare

# Include custom contexts from a file
testdata-ai list-contexts --context-file my_contexts.yaml

`show-context`

Show full details of a context: fields, sample record, and prompt hints.

testdata-ai show-context <context> [--context-file PATH]...

testdata-ai show-context ecommerce_customer
testdata-ai show-context logistics_shipment

# Show a custom context defined in a file
testdata-ai show-context game_character --context-file my_contexts.yaml

`list-models` (Ollama only)

List models available in the running Ollama instance.

testdata-ai list-models [--provider ollama]

# Requires AI_PROVIDER=ollama in .env, or pass --provider explicitly
testdata-ai list-models
testdata-ai list-models --provider ollama

If no models are found, the command prints a hint to run ollama pull <model>.

Python API

`DataGenerator`

from testdata_ai import DataGenerator

# Default provider from .env
gen = DataGenerator()

# Explicit provider
gen = DataGenerator(provider="anthropic")

# Local Ollama model (no API key needed)
gen = DataGenerator(provider="ollama")
gen = DataGenerator(provider="ollama", model="mistral:latest")

# Full control
gen = DataGenerator(
    provider="openai",
    model="gpt-4o",
    temperature=0.9,
    max_tokens=8192,
)

# Pass API key directly (provider required when using api_key)
gen = DataGenerator(provider="openai", api_key="sk-proj-...")

# Generate records
customers = gen.generate("ecommerce_customer", count=10)
patients  = gen.generate("healthcare_patient", count=5)

# Large counts are split automatically into batches of 20 records (one AI call per batch)
many = gen.generate("banking_user", count=100, batch_size=20)

# Skip schema validation
records = gen.generate("banking_user", count=20, validate=False)

DataGenerator.generate() returns List[Dict[str, Any]] — a list of plain Python dicts. For count > batch_size, it automatically splits the work into multiple AI calls and combines the results.

Raises:

ValueError — unknown context, invalid JSON from AI, or bad arguments
testdata_ai.contexts.ValidationError — one or more records missing required fields (when validate=True)
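As a rough illustration of the missing-field check behind ValidationError, the sketch below compares each record's top-level keys with the keys of a context's sample record. The helper names are hypothetical, and the real validator may report problems differently.

```python
def missing_fields(record, sample):
    """Return the sample's top-level keys that are absent from the record."""
    return [field for field in sample if field not in record]


def find_invalid(records, sample):
    """Map record index to its missing fields, skipping records that are complete."""
    invalid = {}
    for index, record in enumerate(records):
        missing = missing_fields(record, sample)
        if missing:
            invalid[index] = missing
    return invalid
```

Only top-level keys are compared here, in line with the note elsewhere in this README that nested types are not validated at runtime.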

`generate()` convenience function

For one-off use without instantiating the class:

from testdata_ai import generate

customers = generate("ecommerce_customer", count=20)

# Large counts split automatically into 20-record batches
many = generate("ecommerce_customer", count=100, batch_size=20)

Configuration (provider, model, etc.) is read from environment variables. For explicit control use DataGenerator directly.

`generate_batched()` — streaming / incremental output

When you want to process or display records as they arrive rather than waiting for the full result:

from testdata_ai import DataGenerator
from testdata_ai.generator import generate_batched

# save_to_db, send_to_pipeline, and process are placeholders for your own functions.

# Process records in batches of 10 as each batch completes
for batch in generate_batched("ecommerce_customer", count=50, batch_size=10):
    print(f"Got {len(batch)} records")
    save_to_db(batch)       # commit each batch immediately
    send_to_pipeline(batch) # or stream to a downstream system

# Or use DataGenerator directly for repeated use
gen = DataGenerator(provider="anthropic")
for batch in gen.generate_batched("banking_user", count=100, batch_size=20):
    process(batch)

generate_batched() / DataGenerator.generate_batched() yield List[Dict[str, Any]] — one batch per iteration.
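Because each iteration yields a plain list of dicts, a consumer can write records out as soon as a batch arrives. The sketch below shows one possible JSONL writer; `write_jsonl` is a hypothetical helper that accepts any iterable of batches, so it can be tried with hard-coded data and no AI calls.

```python
import io
import json
import sys


def write_jsonl(batches, stream=None):
    """Write each record as one JSON line, flushing after every batch."""
    stream = sys.stdout if stream is None else stream
    total = 0
    for batch in batches:
        for record in batch:
            stream.write(json.dumps(record, ensure_ascii=False) + "\n")
        stream.flush()  # make the batch visible to downstream readers right away
        total += len(batch)
    return total


# Usage with in-memory stand-in batches (no AI provider involved).
demo_stream = io.StringIO()
written = write_jsonl([[{"id": 1}], [{"id": 2}, {"id": 3}]], demo_stream)
```

In real use, the first argument would be the iterator returned by generate_batched().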

`list_contexts()` / `get_context_schema()`

from testdata_ai import list_contexts, get_context_schema

# All context names
names = list_contexts()

# Filter by category
finance_contexts = list_contexts(category="finance")

# Inspect a schema
schema = get_context_schema("ecommerce_customer")
print(schema.fields)       # ['name', 'email', 'age', ...]
print(schema.description)  # 'e-commerce customer profiles'
print(schema.category)     # 'ecommerce'
print(schema.sample)       # full sample dict
print(schema.prompt_hints) # list of generation hints

Sample output

{
  "name": "Aisha Patel",
  "email": "aisha.patel.2024@gmail.com",
  "age": 28,
  "location": {
    "city": "Mumbai",
    "country": "India",
    "timezone": "Asia/Kolkata"
  },
  "shopping_behavior": {
    "frequency": "weekly",
    "avg_order_value": "$45-80",
    "preferred_categories": ["electronics", "books"],
    "device": "mobile",
    "payment_method": "upi"
  },
  "joined_date": "2023-04-15",
  "loyalty_tier": "silver"
}

Custom Contexts

The 13 built-in contexts cover common domains, but you can define your own for any data shape your project needs.

File-based (YAML or JSON)

Create a YAML file where each top-level key is a context name:

# my_contexts.yaml
game_character:
  description: "RPG game character profiles"
  category: "gaming"
  sample:
    character_id: "CHAR-0042"
    name: "Theron Blackwood"
    class: "Ranger"
    level: 15
    gold: 340
  prompt_hints:
    - "Fantasy names from diverse real-world cultures"
    - "Classes: Warrior, Mage, Ranger, Rogue, Cleric, Paladin, Druid, Bard"
    - "Level range 1-20; gold 10-5000 depending on level"

Load it with --context-file on any CLI command:

testdata-ai generate --context game_character --context-file my_contexts.yaml --count 5
testdata-ai list-contexts --context-file my_contexts.yaml
testdata-ai show-context game_character --context-file my_contexts.yaml

The flag is repeatable — pass multiple files to load several context collections at once.

JSON files are also supported (same structure, .json extension).
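For reference, the same game_character context can be produced as JSON from Python. This is only a sketch of the "same structure" statement above; the file name my_contexts.json in the comment is a hypothetical example.

```python
import json

# Same structure as the YAML example: each top-level key is a context name.
contexts = {
    "game_character": {
        "description": "RPG game character profiles",
        "category": "gaming",
        "sample": {
            "character_id": "CHAR-0042",
            "name": "Theron Blackwood",
            "class": "Ranger",
            "level": 15,
            "gold": 340,
        },
        "prompt_hints": [
            "Fantasy names from diverse real-world cultures",
            "Classes: Warrior, Mage, Ranger, Rogue, Cleric, Paladin, Druid, Bard",
            "Level range 1-20; gold 10-5000 depending on level",
        ],
    }
}

json_text = json.dumps(contexts, indent=2, ensure_ascii=False)

# To save it, write the text to a file with a .json extension, for example:
# pathlib.Path("my_contexts.json").write_text(json_text, encoding="utf-8")
```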

Programmatic (`register_context`)

from testdata_ai import register_context, ContextSchema

# Using ContextSchema
register_context("game_npc", ContextSchema(
    description="RPG non-player character profiles",
    category="gaming",
    sample={
        "npc_id": "NPC-0011",
        "name": "Mira Dawnwhisper",
        "role": "innkeeper",
        "disposition": "friendly",
        "gold": 80,
    },
    prompt_hints=[
        "Fantasy names from diverse real-world cultures",
        "Roles: innkeeper, blacksmith, guard, merchant, quest-giver",
        "Gold: 10-500 depending on role",
    ],
))

# Using a plain dict (no import of ContextSchema needed)
register_context("game_item", {
    "description": "RPG inventory items",
    "category": "gaming",
    "sample": {"item_id": "ITM-099", "name": "Elven Cloak", "rarity": "rare", "value_gold": 250},
    "prompt_hints": ["Rarities: common, uncommon, rare, epic, legendary"],
})

Both approaches register the context globally for the current process — DataGenerator and the pytest plugin pick it up immediately.

Loading from Python

from testdata_ai import load_contexts_from_file

names = load_contexts_from_file("my_contexts.yaml")  # returns ['game_character']

Schema rules

Field	Required	Notes
`description`	yes	Non-empty string
`sample`	yes	Non-empty dict; keys become the required field names
`prompt_hints`	yes	List of strings (empty list is allowed but reduces output quality)
`category`	no	Defaults to `"custom"`

Name rules: context names must start with a letter or underscore and contain only letters, digits, and underscores (snake_case recommended).

Warnings: register_context and load_contexts_from_file emit a UserWarning when prompt_hints is empty or when the sample contains nested dicts/lists (nested types are not validated at runtime).

Overwriting: pass overwrite=True to replace an existing context (including built-ins). A warning is emitted when a built-in is shadowed.

Atomicity: if a file contains multiple contexts and one fails validation, none of them are registered.
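The schema rules, name rule, and all-or-nothing behavior above can be sketched as follows. This is an illustrative approximation rather than the library's validator: `check_definition` and `register_all` are hypothetical names, the registry is a plain dict, and "letters" is assumed to mean ASCII letters.

```python
import re

# Assumption: names are limited to ASCII letters, digits, and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_definition(name, definition):
    """Return a list of problems; an empty list means the definition passes."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(f"{name!r}: invalid context name")
    description = definition.get("description")
    if not isinstance(description, str) or not description.strip():
        problems.append(f"{name!r}: description must be a non-empty string")
    sample = definition.get("sample")
    if not isinstance(sample, dict) or not sample:
        problems.append(f"{name!r}: sample must be a non-empty dict")
    hints = definition.get("prompt_hints")
    if not isinstance(hints, list) or not all(isinstance(h, str) for h in hints):
        problems.append(f"{name!r}: prompt_hints must be a list of strings")
    return problems


def register_all(definitions, registry):
    """Validate every definition first, then register all of them or none."""
    problems = [
        problem
        for name, definition in definitions.items()
        for problem in check_definition(name, definition)
    ]
    if problems:
        raise ValueError("; ".join(problems))
    for name, definition in definitions.items():
        # category is optional and defaults to "custom"
        registry[name] = {"category": "custom", **definition}
    return list(definitions)
```

Validating everything before touching the registry is what gives the all-or-nothing behavior: a single bad entry raises before any context is added.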

Pytest Plugin

The plugin ships with the package and is auto-loaded via the pytest11 entry point — no import or conftest setup needed.

Marker fixture: `testdata`

Function-scoped. Use with @pytest.mark.testdata to generate any context at any count. count defaults to 1 if omitted.

import pytest

@pytest.mark.testdata(context="ecommerce_customer", count=5)
def test_checkout_flow(testdata):
    assert len(testdata) == 5
    assert all("email" in row for row in testdata)

@pytest.mark.testdata(context="banking_user", count=1)
def test_single_bank_user(testdata):
    user = testdata[0]
    assert 300 <= user["credit_score"] <= 850

Auto-generated context fixtures

For every context, the plugin auto-generates two session-scoped fixtures:

Fixture name	Returns	Example
`<context>`	Single dict (1 record)	`ecommerce_customer`
`<context>s`	List of 10 dicts	`ecommerce_customers`

def test_single(ecommerce_customer):
    assert "email" in ecommerce_customer

def test_list(ecommerce_customers):
    assert len(ecommerce_customers) == 10

def test_patient(healthcare_patient):
    assert "blood_type" in healthcare_patient

def test_employees(hr_employees):
    assert all("salary" in e for e in hr_employees)

Caching and seeds

The plugin caches AI responses to avoid redundant API calls within and across test runs. Cache files live in .testdata_ai_cache/. Add .testdata_ai_cache/ and .testdata_ai.log to your .gitignore.

A seed is a named cache snapshot. Use --testdata-seed to name a cache and reuse it:

# First run: generate data and save under "smoke-seed"
pytest --testdata-seed smoke-seed

# Subsequent runs: reuse the cached data (no AI calls)
pytest --testdata-seed smoke-seed

# Reuse the most recently used named seed
pytest --testdata-last-seed

Without --testdata-seed, a temporary seed is created per run and deleted automatically when the session ends.
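The idea of a named seed can be pictured as a small file-backed cache: the first run stores generated records under the seed name, and later runs read them back without calling the AI provider. The sketch below is purely illustrative. The `SeedCache` class, the one-JSON-file-per-seed layout, and the `context:count` key format are all assumptions and do not describe the plugin's actual cache format.

```python
import json
import tempfile
from pathlib import Path


class SeedCache:
    """Hypothetical file-backed cache: one JSON file per named seed."""

    def __init__(self, root, seed):
        self.path = Path(root) / f"{seed}.json"
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def get_or_generate(self, context, count, generate):
        """Return cached records, calling generate(context, count) only on a miss."""
        key = f"{context}:{count}"  # assumed key format
        if key not in self.entries:
            self.entries[key] = generate(context, count)
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.path.write_text(json.dumps(self.entries), encoding="utf-8")
        return self.entries[key]


# Usage with a stand-in generator so no AI provider is needed.
calls = []


def fake_generate(context, count):
    calls.append((context, count))
    return [{"id": i} for i in range(count)]


with tempfile.TemporaryDirectory() as root:
    first = SeedCache(root, "smoke-seed").get_or_generate("banking_user", 2, fake_generate)
    second = SeedCache(root, "smoke-seed").get_or_generate("banking_user", 2, fake_generate)
```

The second lookup is served from the file written by the first, which mirrors the "subsequent runs reuse the cached data" behavior described above.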

Seed and cache management

These options perform an admin action and exit without running tests:

# List all available seeds
pytest --testdata-list-seeds

# Show what's cached in the current (or a specific) seed
pytest --testdata-show-cache
pytest --testdata-show-cache smoke-seed

# Delete a specific seed
pytest --testdata-delete-seed smoke-seed

# Delete the last used seed
pytest --testdata-delete-last

# Clear all seeds and reset the last-seeds queue
pytest --testdata-clear-cache

pytest-xdist support

When running with pytest-xdist, each worker will make its own AI calls unless you specify a shared named seed:

# Recommended: share one cache across all workers
pytest -n 4 --testdata-seed my-seed

Without --testdata-seed, a warning is printed per worker.

Manual fixture pattern

If you prefer explicit control in conftest.py:

# conftest.py
import pytest
from testdata_ai import DataGenerator

@pytest.fixture(scope="session")
def test_customers():
    gen = DataGenerator()
    return gen.generate("ecommerce_customer", count=10)

# test_checkout.py
def test_checkout_flow(test_customers):
    customer = test_customers[0]
    assert customer["email"]
    assert customer["age"] >= 18

Logging

The plugin writes structured logs to .testdata_ai.log (rotating, max 5 MB × 3 backups) and to stderr. Log entries include seed name and xdist worker ID.
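A comparable rotation policy can be set up with the standard library's RotatingFileHandler. This is a sketch of the documented limits (5 MB, 3 backups), not the plugin's own logging code; the function name and the log format are assumptions, 5 MB is taken as 5 * 1024 * 1024 bytes, and the seed name and worker ID fields are not modeled.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_log_handler(path=".testdata_ai.log"):
    """Create a rotating handler: roll over near 5 MB, keep 3 backup files."""
    handler = RotatingFileHandler(
        path,
        maxBytes=5 * 1024 * 1024,  # assumption: "5 MB" means 5 MiB
        backupCount=3,
        encoding="utf-8",
        delay=True,  # do not create the file until the first record is written
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler
```

Attaching the handler with `logging.getLogger("my_tests").addHandler(build_log_handler())` would give any logger the same rotation behavior; the logger name here is arbitrary.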

Available Contexts

Context	Category	Key Fields
`ecommerce_customer`	`ecommerce`	name, email, age, location, shopping_behavior, joined_date, loyalty_tier
`banking_user`	`finance`	name, email, age, account_type, balance, monthly_income, credit_score, branch, account_opened
`saas_trial`	`saas`	name, email, company, role, plan, signup_date, trial_expires, usage_stats
`healthcare_patient`	`healthcare`	patient_id, name, date_of_birth, gender, blood_type, primary_diagnosis, medications, allergies, insurance_provider, last_visit, attending_physician
`education_student`	`education`	student_id, name, email, age, major, minor, year, gpa, enrollment_status, courses, advisor
`b2b_lead`	`b2b`	lead_id, contact_name, email, phone, company, industry, company_size, job_title, lead_source, lead_score, deal_value, stage, notes
`hr_employee`	`hr`	employee_id, name, email, department, job_title, hire_date, salary, employment_type, manager, location, performance_rating
`real_estate_listing`	`real_estate`	listing_id, address, property_type, bedrooms, bathrooms, sqft, year_built, list_price, status, days_on_market, agent, features
`iot_device`	`iot`	device_id, device_type, manufacturer, firmware_version, location, status, battery_level, last_reading, alert_threshold, installed_date
`social_media_profile`	`social_media`	username, display_name, bio, followers, following, posts, verified, joined, category, engagement_rate, top_hashtags
`travel_booking`	`travel`	booking_id, passenger_name, email, trip_type, origin, destination, departure_date, return_date, cabin_class, total_price, currency, travelers, status, add_ons
`restaurant_order`	`food`	order_id, customer_name, restaurant, cuisine, items, subtotal, delivery_fee, tip, total, payment_method, order_type, status, ordered_at
`logistics_shipment`	`logistics`	tracking_number, carrier, origin, destination, ship_date, estimated_delivery, actual_delivery, weight_kg, dimensions_cm, contents, status, last_checkpoint

Run testdata-ai list-contexts to see all contexts, or testdata-ai show-context <name> for full field details and a sample record.

Development Roadmap

Done:

OpenAI + Anthropic + Ollama provider-agnostic architecture
13 built-in contexts across 13 categories
Schema validation with missing-field reporting
CLI (generate, list-contexts, show-context, list-models) with JSON, JSONL, CSV, and YAML output
Auto token estimation and adjustment
Spinner with elapsed time (animated on TTY, static on non-TTY)
python -m testdata_ai support
Pytest plugin: marker fixture, auto-context fixtures, seed/cache system
Seed cache management CLI options (list, show, delete, clear)
TEMP seed auto-cleanup after session
pytest-xdist support with shared named seeds
Rotating log file (.testdata_ai.log)
Batch generation / streaming — generate_batched(), --batch-size, progressive JSONL/YAML output
Custom contexts — register_context(), load_contexts_from_file(), --context-file CLI option

Next:

PyPI publish — pip install testdata-ai (requires python -m build + twine upload)
SQL output format — --output sql / -o sql (INSERT statements, configurable table name)
/docs folder — installation, quickstart, CLI reference, API reference, custom contexts, pytest integration
Async API — async def generate() / generate_batched() for high-throughput pipelines
Locale / language support — generate data in non-English languages (--locale pl, --locale ja)
Schema-from-model — infer ContextSchema from a Pydantic model or JSON Schema dict
pandas output — DataGenerator.to_dataframe() convenience method
More providers — Google Gemini, Mistral, Cohere
Relationship generation — generate_with_relationships() (e.g. customers + matching orders)

Contributing

Contributions welcome — see CONTRIBUTING.md for the full guide.

Found a bug? Open a bug report
Have an idea? Open a feature request
Want to code? Fork, branch, and open a PR

License

MIT License — see LICENSE

Built by TestCraft AI

Maintainers

mkocim

Release history

Version	Release date
0.12.0	Mar 11, 2026
0.11.0	Mar 10, 2026
0.10.0	Mar 10, 2026
0.9.0	Mar 10, 2026
0.8.0	Mar 9, 2026
0.7.0	Mar 6, 2026
0.6.0	Mar 5, 2026
0.5.0	Mar 5, 2026
0.4.0	Mar 5, 2026
0.3.0	Mar 4, 2026
0.2.0 (this version)	Mar 4, 2026
0.1.0	Mar 4, 2026

Download files

Source Distribution

testdata_ai-0.2.0.tar.gz (70.7 kB)

Uploaded Mar 4, 2026 Source

Built Distribution

testdata_ai-0.2.0-py3-none-any.whl (40.7 kB)

Uploaded Mar 4, 2026 Python 3

File details

Details for the file testdata_ai-0.2.0.tar.gz.

File metadata

Download URL: testdata_ai-0.2.0.tar.gz
Upload date: Mar 4, 2026
Size: 70.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for testdata_ai-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`94637a02ac46535af4a724999ffd6090f13416991f39c1e49ad48a503c3f6144`
MD5	`5b3177a43f743ecbd4feec4d88928565`
BLAKE2b-256	`ed9cd48e5c6f1d31b2f40e63bdd99d9318a616b7b2ddc801d188abdc4df9952a`

Provenance

The following attestation bundles were made for testdata_ai-0.2.0.tar.gz:

Publisher: publish.yml on testcraft-ai/testdata-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: testdata_ai-0.2.0.tar.gz
- Subject digest: 94637a02ac46535af4a724999ffd6090f13416991f39c1e49ad48a503c3f6144
- Sigstore transparency entry: 1032315581
- Sigstore integration time: Mar 4, 2026
Source repository:
- Permalink: testcraft-ai/testdata-ai@462d62f17e7d30f9a5e5b6196afa73a21b508ae7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/testcraft-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@462d62f17e7d30f9a5e5b6196afa73a21b508ae7
- Trigger Event: release

File details

Details for the file testdata_ai-0.2.0-py3-none-any.whl.

File metadata

Download URL: testdata_ai-0.2.0-py3-none-any.whl
Upload date: Mar 4, 2026
Size: 40.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for testdata_ai-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6734c98a797bb56174452f42b1ea1aca6831bf81bc335ffdf02e2fbddbbbc151`
MD5	`f63cfe54b514e34a77fc5b8950972672`
BLAKE2b-256	`775ce8dc40b4ff524e8e6ba6dc7e8e3869f98f562d26eca36caf0deb33ce7888`

Provenance

The following attestation bundles were made for testdata_ai-0.2.0-py3-none-any.whl:

Publisher: publish.yml on testcraft-ai/testdata-ai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: testdata_ai-0.2.0-py3-none-any.whl
- Subject digest: 6734c98a797bb56174452f42b1ea1aca6831bf81bc335ffdf02e2fbddbbbc151
- Sigstore transparency entry: 1032315790
- Sigstore integration time: Mar 4, 2026
Source repository:
- Permalink: testcraft-ai/testdata-ai@462d62f17e7d30f9a5e5b6196afa73a21b508ae7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/testcraft-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@462d62f17e7d30f9a5e5b6196afa73a21b508ae7
- Trigger Event: release

testdata-ai 0.2.0
