philiprehberger-data-factory

Lightweight test data generation with realistic fake values.

Installation

pip install philiprehberger-data-factory

Usage

Generate fake values

from philiprehberger_data_factory import fake

fake.name()      # "Alice Johnson"
fake.email()     # "bob.smith@example.com"
fake.integer()   # 472
fake.boolean()   # True
fake.text()      # "alpha bravo charlie delta echo"
fake.date()      # "2023-07-14"
fake.uuid()      # "a3b2c1d4-..."
fake.phone()     # "+1-555-123-4567"
fake.address()   # "742 Oak Street, Springfield"

Factory

Define a schema and generate records in bulk:

from philiprehberger_data_factory import Factory

user_factory = Factory({
    "name": "name",
    "email": "email",
    "age": "integer",
    "bio": "text",
    "joined": "date",
    "id": "uuid",
    "phone": "phone",
    "address": "address",
})

user = user_factory.build()
# {'name': 'Grace Wilson', 'email': 'henry.moore@mail.com', ...}

users = user_factory.build_batch(100)
# [{'name': ..., 'email': ..., ...}, ...]
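
To make the schema idea concrete, here is a minimal stdlib-only sketch of how a schema of provider names could be resolved into records. It is not the library's implementation: the PROVIDERS table, the value pools, and the build and build_batch functions below are hypothetical stand-ins.

```python
import random
import uuid

# Hypothetical provider table: names and value pools are illustrative only.
PROVIDERS = {
    "name": lambda: random.choice(["Alice Johnson", "Grace Wilson"]),
    "integer": lambda: random.randint(0, 1000),
    "uuid": lambda: str(uuid.uuid4()),
}


def build(schema):
    """Resolve each provider name in the schema to a freshly generated value."""
    return {field: PROVIDERS[provider]() for field, provider in schema.items()}


def build_batch(schema, n):
    """Call build() n times so every record gets independent values."""
    return [build(schema) for _ in range(n)]


schema = {"id": "uuid", "name": "name", "age": "integer"}
records = build_batch(schema, 3)
```

Each record is built independently, so fields such as id differ from record to record.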

Bulk generation with overrides

Use batch() to generate multiple records with optional per-item or shared overrides:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name", "role": "text"})

# Shared overrides applied to every record
admins = factory.batch(5, overrides={"role": "admin"})

# Per-item overrides
records = factory.batch(3, overrides=[
    {"role": "admin"},
    {"role": "editor"},
    {"role": "viewer"},
])
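
The difference between shared and per-item overrides can be sketched with plain dicts. This is an illustration of the merge semantics described above, not the library's code: apply_overrides is a hypothetical helper, and raising on a length mismatch is a choice made for this sketch only.

```python
def apply_overrides(records, overrides=None):
    """Merge overrides into records: a dict is shared, a list is per item."""
    if overrides is None:
        return [dict(r) for r in records]
    if isinstance(overrides, dict):
        # Shared: the same dict is merged into every record.
        return [{**r, **overrides} for r in records]
    if len(overrides) != len(records):
        # Assumption of this sketch: one override dict per record is required.
        raise ValueError("need exactly one override dict per record")
    # Per item: the i-th override is merged into the i-th record.
    return [{**r, **o} for r, o in zip(records, overrides)]


base = [{"name": "A", "role": "x"}, {"name": "B", "role": "y"}]
shared = apply_overrides(base, {"role": "admin"})
per_item = apply_overrides(base, [{"role": "admin"}, {"role": "viewer"}])
```

In both cases fields that are not overridden keep their generated values.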

Relationship / foreign-key support

Link factories so generated objects have consistent foreign-key references:

from philiprehberger_data_factory import Factory

user_factory = Factory({"id": "uuid", "name": "name"})
user = user_factory.build()

post_factory = Factory({"title": "text", "body": "text"})
post_factory.related(user_factory, field="user_id", source_field="id")

post = post_factory.build()
# post["user_id"] matches user["id"]
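
The property that matters for foreign keys is that every child reference resolves to an existing parent. The stdlib-only sketch below shows that invariant with plain dicts. How related() picks the parent internally is not documented here, so build_user, build_post, and the random choice of owner are assumptions made for illustration.

```python
import random
import uuid


def build_user():
    """Hypothetical parent record with a unique id."""
    return {"id": str(uuid.uuid4()), "name": random.choice(["Alice", "Bob"])}


def build_post(users):
    """Hypothetical child record that copies its key from an existing parent."""
    owner = random.choice(users)
    return {"title": "hello", "user_id": owner["id"]}


users = [build_user() for _ in range(3)]
posts = [build_post(users) for _ in range(10)]
user_ids = {u["id"] for u in users}
```

A quick consistency check for generated fixtures is to assert that every post["user_id"] is in the set of user ids.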

Statistical distribution profiles

Generate numeric fields following statistical distributions:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name"})
factory.field("age", distribution="normal", mean=30, std=10)
factory.field("score", distribution="uniform", min=0, max=100)
factory.field("wait_time", distribution="exponential", scale=5.0)

record = factory.build()
# {'name': 'Alice Johnson', 'age': 28.4, 'score': 73.2, 'wait_time': 3.1}
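
For intuition about what the three profiles produce, the sketch below draws from the same distributions using only Python's random module. It is not the library's code. The sample function is hypothetical, and it assumes that scale is the mean of the exponential distribution (the NumPy convention), which the text above does not state.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def sample(distribution, **params):
    """Draw one float from the named distribution."""
    if distribution == "normal":
        return rng.gauss(params["mean"], params["std"])
    if distribution == "uniform":
        return rng.uniform(params["min"], params["max"])
    if distribution == "exponential":
        # expovariate takes the rate (lambda), which is 1 / scale here.
        return rng.expovariate(1.0 / params["scale"])
    raise ValueError(f"unknown distribution: {distribution}")


ages = [sample("normal", mean=30, std=10) for _ in range(10_000)]
scores = [sample("uniform", min=0, max=100) for _ in range(10_000)]
waits = [sample("exponential", scale=5.0) for _ in range(10_000)]

mean_age = sum(ages) / len(ages)
mean_wait = sum(waits) / len(waits)
```

A normal distribution is unbounded, so a field such as age with mean 30 and std 10 will occasionally come out negative; clamp or round the value if the code under test expects a valid age.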

Weighted choices

Pick one option from a dict that maps each option to its weight:

from philiprehberger_data_factory import fake

tier = fake.weighted_choice({"gold": 0.1, "silver": 0.3, "bronze": 0.6})
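
Weighted selection of this kind can be reproduced with random.choices from the standard library. The sketch below is an illustration rather than the library's implementation, and the weighted_choice function defined here is a local stand-in. It counts 10,000 draws to show that the frequencies follow the weights.

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed so the sketch is repeatable
weights = {"gold": 0.1, "silver": 0.3, "bronze": 0.6}


def weighted_choice(options):
    """Pick one key, with probability proportional to its weight."""
    keys = list(options)
    return rng.choices(keys, weights=[options[k] for k in keys], k=1)[0]


counts = Counter(weighted_choice(weights) for _ in range(10_000))
```

random.choices treats weights as relative, so they do not have to sum to 1. Whether fake.weighted_choice accepts unnormalized weights is not stated above.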

Custom providers

Pass a callable instead of a provider string:

from philiprehberger_data_factory import fake, Factory

factory = Factory({
    "name": "name",
    "role": lambda: fake.choice(["admin", "editor", "viewer"]),
})
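
A callable is useful when a field needs state that a provider string cannot express, such as a sequential counter. The sketch below shows one way a schema entry could be dispatched: callables are called, strings are looked up. The resolve function and the providers registry are hypothetical and do not reflect the library's internals.

```python
import itertools


def resolve(spec, providers):
    """Return a value for one schema entry: call callables, look up strings."""
    if callable(spec):
        return spec()
    return providers[spec]()


counter = itertools.count(1)
providers = {"name": lambda: "Alice Johnson"}  # hypothetical fixed provider

schema = {
    "name": "name",                 # provider string
    "seq": lambda: next(counter),   # stateful custom provider
}
rows = [
    {field: resolve(spec, providers) for field, spec in schema.items()}
    for _ in range(3)
]
```

Because the callable is invoked once per record, the seq field takes the values 1, 2, 3 across the three rows.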

Reproducible output

Seed the generator to get the same values on every run:

from philiprehberger_data_factory import fake

fake.seed(42)
fake.name()  # always the same name for seed 42
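
Seeding works because a pseudo-random generator is deterministic once its starting state is fixed. The stdlib sketch below demonstrates the effect with random.Random. The sample_names function and the name pool are illustrative and are not taken from the library.

```python
import random


def sample_names(seed, n=3):
    """Draw n names from a fixed pool using a generator seeded with seed."""
    rng = random.Random(seed)
    pool = ["Alice Johnson", "Bob Smith", "Grace Wilson", "Henry Moore"]
    return [rng.choice(pool) for _ in range(n)]


first = sample_names(42)
second = sample_names(42)
```

The two lists are identical because both runs start from the same seed. In a test suite, seeding at the start of each test keeps failures reproducible.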

API

| Function / Class | Description |
| --- | --- |
| fake.name() | Random full name |
| fake.email() | Random email address |
| fake.integer(min, max) | Random integer (default 0-1000) |
| fake.decimal(min, max, precision) | Random float (default 0-1000, 2 decimals) |
| fake.boolean() | Random True or False |
| fake.choice(items) | Random element from a list |
| fake.text(words) | Random words (default 5) |
| fake.date(start, end) | Random ISO date string |
| fake.uuid() | Random UUID4 string |
| fake.phone() | Random phone number in +1-XXX-XXX-XXXX format |
| fake.address() | Random street address with city |
| fake.weighted_choice(options) | Weighted random selection from a dict of options to weights |
| fake.normal(mean, std) | Random float from a normal distribution |
| fake.exponential(scale) | Random float from an exponential distribution |
| fake.seed(n) | Set random seed for reproducibility |
| Factory(schema) | Create a factory from a schema dict |
| factory.build(**overrides) | Generate one record with optional field overrides |
| factory.build_batch(n) | Generate n records |
| factory.batch(n, overrides) | Generate n records with shared or per-item overrides |
| factory.field(name, distribution, **params) | Add a field with a statistical distribution (normal, uniform, exponential) |
| factory.related(other, field, source_field) | Link to another factory for foreign-key consistency |

Development

pip install -e .
python -m pytest tests/ -v

Support

If you find this project useful:

Star the repo

🐛 Report issues

💡 Suggest features

❤️ Sponsor development

🌐 All Open Source Projects

💻 GitHub Profile

🔗 LinkedIn Profile

License

MIT
