AI-powered test data generator that reads your database schema and fills it with realistic, FK-valid data in seconds

SeedForge

One command to fill your database with realistic test data.

SeedForge connects to your database, reads the schema (tables, columns, foreign keys, constraints), and generates realistic, FK-valid data — no code, no config, no seed scripts.

pip install seedforge
seedforge connect postgresql://user:pass@localhost/mydb
seedforge generate --rows 1000
# Done. 40 tables filled in 3 seconds.

Features

  • Zero-config: reads your DB schema automatically, no setup needed
  • FK integrity: resolves foreign keys via topological sort and inserts tables in dependency order
  • Smart heuristics: 80+ column-name patterns for realistic data (email → realistic email address, price → decimal, role → admin/user/moderator)
  • Deterministic: pass --seed to get the same data on every run
  • AI-powered: optional Claude integration (via the Anthropic API) for more context-aware data
  • Export: SQL or JSON file output
  • Privacy-first: the default generator runs entirely locally; only the optional AI mode calls an external API

Installation

pip install seedforge

# With MySQL support (quotes keep shells such as zsh from expanding the brackets)
pip install "seedforge[mysql]"

# With AI support (Claude API)
pip install "seedforge[ai]"

# Everything
pip install "seedforge[all]"

Quick Start

1. Connect

seedforge connect postgresql://user:pass@localhost:5432/mydb

Saves the connection to .seedforge.yaml so you don't have to type it again.

2. Inspect

seedforge inspect

Shows all tables, columns, types, foreign keys, and insertion order:

Found 18 tables (insertion order):

         1. users
┏━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┓
┃ Column     ┃ Type      ┃ Nullable ┃ FK →  ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━┩
│ id         │ serial    │ NO       │       │
│ email      │ varchar   │ NO       │       │
│ name       │ varchar   │ YES      │       │
└────────────┴───────────┴──────────┴───────┘

             2. orders
┏━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Column     ┃ Type      ┃ Nullable ┃ FK →       ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━┩
│ id         │ serial    │ NO       │            │
│ user_id    │ integer   │ NO       │ users.id   │
│ total      │ numeric   │ NO       │            │
└────────────┴───────────┴──────────┴────────────┘
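
The inspect output above comes down to reading column and foreign-key metadata from the database catalog. Here is a minimal sketch of that idea using only Python's built-in sqlite3 module. SQLite has no information_schema, so the sketch uses PRAGMA table_info and PRAGMA foreign_key_list as a stand-in. The read_schema function and its return shape are illustrative assumptions, not SeedForge's internals.

```python
import sqlite3


def read_schema(conn):
    """Return {table: {"columns": [...], "fks": [(column, parent_table, parent_column)]}}."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {"name": row[1], "type": row[2], "nullable": not row[3]}
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = [
            (row[3], row[2], row[4])
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        schema[table] = {"columns": columns, "fks": fks}
    return schema


# Demo schema mirroring the users/orders tables shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total NUMERIC NOT NULL
    );
""")
schema = read_schema(conn)
print(schema["orders"]["fks"])  # [('user_id', 'users', 'id')]
```

On PostgreSQL or MySQL the same information would come from information_schema queries instead of PRAGMAs.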

3. Generate

# Generate and insert 100 rows per table
seedforge generate --rows 100

# Preview without inserting
seedforge generate --rows 10 --dry-run

# Export to SQL file
seedforge generate --rows 1000 --export sql

# Export to JSON
seedforge generate --rows 1000 --export json

# Deterministic (same data every time)
seedforge generate --rows 100 --seed 42

# Only specific tables (auto-includes FK parents)
seedforge generate --tables orders,payments --rows 50

# Clean tables before generating
seedforge generate --rows 100 --clean
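
The "auto-includes FK parents" behaviour of --tables can be pictured as a transitive walk up the foreign-key graph. The sketch below is a hedged illustration of that idea. The with_fk_parents helper, the parents mapping, and the audit_log table are hypothetical and not taken from SeedForge's code.

```python
def with_fk_parents(requested, parents):
    """Expand `requested` with every table reachable by following FK parents."""
    selected = set()
    stack = list(requested)
    while stack:
        table = stack.pop()
        if table in selected:
            continue  # already visited; also guards against FK cycles
        selected.add(table)
        stack.extend(parents.get(table, ()))
    return selected


# Hypothetical schema: child table -> tables it references via foreign keys.
parents = {
    "users": [],
    "orders": ["users"],
    "payments": ["orders"],
    "audit_log": ["users"],
}

selected = with_fk_parents(["orders", "payments"], parents)
print(sorted(selected))  # ['orders', 'payments', 'users']
```

Here users is pulled in because orders references it, while audit_log stays out because nothing requested depends on it.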

4. AI Generate (optional)

export ANTHROPIC_API_KEY=sk-...
seedforge ai-generate --rows 20

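Uses the Claude API to generate context-aware data. Unlike seedforge generate, this command calls an external service, so it needs network access and a valid ANTHROPIC_API_KEY.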

How It Works

  1. Schema introspection — connects to your database, reads information_schema to get tables, columns, types, FK relationships, constraints, ENUMs
  2. Dependency graph — builds a directed graph from FK relationships, runs topological sort to determine insertion order (parents first)
  3. Smart heuristics — maps column names to appropriate generators (email → realistic email, phone → phone number, created_at → recent datetime)
  4. FK resolution — child rows automatically reference real IDs from already-generated parent rows
  5. Batch insert — fast bulk insertion with proper transaction handling
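
Steps 2 and 4 above can be sketched in a few lines with graphlib from the standard library (Python 3.9+). This is a minimal illustration under assumed names. The foreign_keys mapping, the five-rows-per-table loop, and the row layout are hypothetical, and SeedForge's real implementation may differ.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical schema: table -> {fk_column: parent_table}
foreign_keys = {
    "users": {},
    "orders": {"user_id": "users"},
    "payments": {"order_id": "orders"},
}

# TopologicalSorter expects node -> predecessors, so parents are emitted first.
graph = {table: set(fks.values()) for table, fks in foreign_keys.items()}
insertion_order = list(TopologicalSorter(graph).static_order())

rng = random.Random(42)
generated_ids = {}  # table -> IDs of rows generated so far
rows = {}           # table -> list of generated row dicts

for table in insertion_order:
    rows[table] = []
    for row_id in range(1, 6):
        row = {"id": row_id}
        # Each FK column points at an ID that already exists in the parent table.
        for column, parent in foreign_keys[table].items():
            row[column] = rng.choice(generated_ids[parent])
        rows[table].append(row)
    generated_ids[table] = [r["id"] for r in rows[table]]

print(insertion_order)  # ['users', 'orders', 'payments']
```

Note that static_order() raises graphlib.CycleError on circular or self-referencing foreign keys, so a real tool needs a separate strategy for those cases.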

Column Name Heuristics

SeedForge automatically detects what kind of data to generate based on column names:

Column name              Generated data
email                    john.smith@example.com
phone, mobile            +1-555-0123
first_name               John
last_name                Smith
username                 jsmith42
address, street          123 Main St, Apt 4
city                     San Francisco
country                  United States
price, amount, total     49.99
url, website             https://example.com
avatar, image_url        https://picsum.photos/seed/123/400/300
role                     admin, user, moderator
status                   active, pending, completed
plan                     free, pro, enterprise
created_at, updated_at   Recent datetime
is_active, verified      true (85% bias)
is_deleted, archived     false (90% bias)
password                 SHA-256 hash
token, api_key           Random hex string
uuid, guid               Valid UUID v4

...and 60+ more patterns
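
One way to picture this kind of name-based detection is an ordered list of patterns, each paired with a generator. The sketch below covers only a handful of the rows above. The RULES list, the regexes, and generate_value are hypothetical illustrations, not SeedForge's actual rule set.

```python
import random
import re
import uuid

# Hypothetical rules: (regex on the column name, generator taking an RNG).
RULES = [
    (re.compile(r"email"), lambda rng: f"user{rng.randint(1, 9999)}@example.com"),
    (re.compile(r"^(price|amount|total)$"), lambda rng: round(rng.uniform(1, 500), 2)),
    (re.compile(r"^role$"), lambda rng: rng.choice(["admin", "user", "moderator"])),
    # Biased booleans: true about 85% of the time, false about 90% of the time.
    (re.compile(r"^(is_active|verified)$"), lambda rng: rng.random() < 0.85),
    (re.compile(r"^(is_deleted|archived)$"), lambda rng: rng.random() >= 0.90),
    (re.compile(r"^(uuid|guid)$"),
     lambda rng: str(uuid.UUID(int=rng.getrandbits(128), version=4))),
]


def generate_value(column_name, rng):
    """Return a value from the first matching rule, or None if nothing matches."""
    for pattern, generator in RULES:
        if pattern.search(column_name.lower()):
            return generator(rng)
    return None  # a real tool would fall back to the column's SQL type here


rng = random.Random(0)
for column in ("email", "total", "role", "is_active", "uuid", "notes"):
    print(column, "->", generate_value(column, rng))
```

Passing the RNG in explicitly keeps the generators reproducible when a seed is supplied.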

Configuration

.seedforge.yaml (auto-created by seedforge connect):

db_url: postgresql://user:pass@localhost:5432/mydb
default_rows: 100
default_schema: public
seed: 42  # optional, for deterministic generation
exclude_tables:
  - _prisma_migrations
  - django_migrations
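
The seed setting works on the usual principle that a pseudo-random generator seeded with the same value produces the same sequence. A minimal sketch of that principle follows. The generate_rows function and its columns are hypothetical, and it says nothing about which RNG SeedForge itself uses.

```python
import random


def generate_rows(count, seed=None):
    """Generate `count` fake order rows; the same seed yields the same rows."""
    rng = random.Random(seed)  # private RNG, independent of global random state
    return [
        {
            "id": i,
            "total": round(rng.uniform(1, 500), 2),
            "status": rng.choice(["active", "pending", "completed"]),
        }
        for i in range(1, count + 1)
    ]


first = generate_rows(3, seed=42)
second = generate_rows(3, seed=42)
print(first == second)  # True
```

Leaving seed unset (None) lets Python seed the generator from the operating system, so each run differs.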

Supported Databases

  • PostgreSQL
  • MySQL / MariaDB
  • SQLite

Data Privacy

With the default generator, your data never leaves your machine. SeedForge connects directly to your database, generates data in memory, and inserts it. No cloud, no telemetry, no data collection. The optional ai-generate command is the exception, since it calls the Claude API.

License

MIT

Contributing

Issues and PRs welcome at github.com/silkhorizonstudios/seedforge.

Download files

Source Distribution

seedforge-0.2.0.tar.gz (26.5 kB)

Uploaded Source

Built Distribution

seedforge-0.2.0-py3-none-any.whl (24.9 kB)

Uploaded Python 3

File details

Details for the file seedforge-0.2.0.tar.gz.

File metadata

  • Download URL: seedforge-0.2.0.tar.gz
  • Size: 26.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for seedforge-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       977eb545a22fc662223f6aa9b0d20e142ca7715451218bc2a9cbf06c591b46b7
MD5          a47a400d0e429bf38adf95d89bcac581
BLAKE2b-256  6383376263ee51754b3bf7528e5d193baaa1999f4e9654e70fce0422e2920166

File details

Details for the file seedforge-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: seedforge-0.2.0-py3-none-any.whl
  • Size: 24.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for seedforge-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       77baf5cfa45aef6979ceafe17111bc519b0869561a575c6d80dc2c3d0c28cfdf
MD5          3df7c64700692d65e0bedcbacdedc728
BLAKE2b-256  5a0aa3bde237540ded9541b268d98a30bf65aa677426589d17025b00c52d19d8
