philiprehberger-data-factory

Lightweight test data generation with realistic fake values.

Installation

pip install philiprehberger-data-factory

Usage

Generate fake values

from philiprehberger_data_factory import fake

fake.name()      # "Alice Johnson"
fake.email()     # "bob.smith@example.com"
fake.integer()   # 472
fake.boolean()   # True
fake.text()      # "alpha bravo charlie delta echo"
fake.date()      # "2023-07-14"
fake.uuid()      # "a3b2c1d4-..."
fake.phone()     # "+1-555-123-4567"
fake.address()   # "742 Oak Street, Springfield"

Factory

Define a schema and generate records in bulk:

from philiprehberger_data_factory import Factory

user_factory = Factory({
    "name": "name",
    "email": "email",
    "age": "integer",
    "bio": "text",
    "joined": "date",
    "id": "uuid",
    "phone": "phone",
    "address": "address",
})

user = user_factory.build()
# {'name': 'Grace Wilson', 'email': 'henry.moore@mail.com', ...}

users = user_factory.build_batch(100)
# [{'name': ..., 'email': ..., ...}, ...]
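To make the schema idea concrete, here is a minimal sketch of how a schema-driven factory could work, using only the standard library. `MiniFactory` and `PROVIDERS` are hypothetical names for illustration; this is not the package's implementation, only one plausible shape for it.

```python
import random
import uuid

# Hypothetical provider registry: maps schema strings to zero-argument callables.
PROVIDERS = {
    "integer": lambda: random.randint(0, 1000),
    "boolean": lambda: random.choice([True, False]),
    "uuid": lambda: str(uuid.uuid4()),
}


class MiniFactory:
    """Illustrative stand-in, not the package's Factory class."""

    def __init__(self, schema):
        self.schema = dict(schema)

    def build(self, **overrides):
        # Generate every field from its provider, then let overrides win.
        record = {field: PROVIDERS[name]() for field, name in self.schema.items()}
        record.update(overrides)
        return record

    def build_batch(self, n):
        # Each record is built independently, so values differ between records.
        return [self.build() for _ in range(n)]


factory = MiniFactory({"id": "uuid", "age": "integer", "active": "boolean"})
one = factory.build(age=30)
many = factory.build_batch(100)
```

Keeping providers in a plain dict means a new field type is one extra entry rather than a new method.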

Bulk generation with overrides

Use batch() to generate multiple records with optional per-item or shared overrides:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name", "role": "text"})

# Shared overrides applied to every record
admins = factory.batch(5, overrides={"role": "admin"})

# Per-item overrides
records = factory.batch(3, overrides=[
    {"role": "admin"},
    {"role": "editor"},
    {"role": "viewer"},
])
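The two override forms can be normalized into one list before any records are built. The sketch below shows one way to do that; `resolve_overrides` is a hypothetical helper, and raising on a length mismatch is an assumption, since the README does not say what the package does when the list length differs from n.

```python
def resolve_overrides(n, overrides=None):
    """Return a list of n override dicts, one per record (hypothetical helper)."""
    if overrides is None:
        return [{} for _ in range(n)]
    if isinstance(overrides, dict):
        # Shared form: every record gets its own copy of the same overrides.
        return [dict(overrides) for _ in range(n)]
    overrides = list(overrides)
    if len(overrides) != n:
        # Assumption: a mismatch is treated as an error in this sketch.
        raise ValueError(f"expected {n} override dicts, got {len(overrides)}")
    # Per-item form: the i-th dict applies to the i-th record.
    return [dict(item) for item in overrides]


shared = resolve_overrides(3, {"role": "admin"})
per_item = resolve_overrides(
    3, [{"role": "admin"}, {"role": "editor"}, {"role": "viewer"}]
)
```

Copying each dict avoids one record's later mutation leaking into the others.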

Relationship / foreign-key support

Link factories so generated objects have consistent foreign-key references:

from philiprehberger_data_factory import Factory

user_factory = Factory({"id": "uuid", "name": "name"})
user = user_factory.build()

post_factory = Factory({"title": "text", "body": "text"})
post_factory.related(user_factory, field="user_id", source_field="id")

post = post_factory.build()
# post["user_id"] matches user["id"]
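The property that matters here is that no child record points at a parent that does not exist. The standard-library sketch below illustrates that check. `build_post` is a hypothetical helper, and picking a random parent from a pre-built pool is an assumption; the README does not describe which parent record the package selects.

```python
import random
import uuid

# A small pool of parent records, built up front.
users = [{"id": str(uuid.uuid4()), "name": f"user-{i}"} for i in range(3)]


def build_post(parents, field="user_id", source_field="id"):
    # Copy the key from an existing parent instead of generating a fresh one.
    parent = random.choice(parents)
    return {"title": "placeholder title", field: parent[source_field]}


posts = [build_post(users) for _ in range(10)]

# Referential-integrity check: every foreign key must resolve to a known user.
known_ids = {u["id"] for u in users}
orphans = [p for p in posts if p["user_id"] not in known_ids]
```

A check like the `orphans` list is worth keeping in tests that load generated data into a database with foreign-key constraints.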

Statistical distribution profiles

Generate numeric fields following statistical distributions:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name"})
factory.field("age", distribution="normal", mean=30, std=10)
factory.field("score", distribution="uniform", min=0, max=100)
factory.field("wait_time", distribution="exponential", scale=5.0)

record = factory.build()
# {'name': 'Alice Johnson', 'age': 28.4, 'score': 73.2, 'wait_time': 3.1}
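For intuition about what each profile produces, the three distributions can be sampled with the standard library's `random` module. This is a sketch, not the package's code; `sample` is a hypothetical helper, and it assumes `scale` means the mean of the exponential distribution (the NumPy convention), which the README does not confirm.

```python
import random

rng = random.Random(0)


def sample(distribution, **params):
    """Draw one value from the named distribution (illustrative only)."""
    if distribution == "normal":
        return rng.gauss(params["mean"], params["std"])
    if distribution == "uniform":
        return rng.uniform(params["min"], params["max"])
    if distribution == "exponential":
        # expovariate takes the rate; scale is assumed to be the mean, 1 / rate.
        return rng.expovariate(1.0 / params["scale"])
    raise ValueError(f"unknown distribution: {distribution}")


ages = [sample("normal", mean=30, std=10) for _ in range(10_000)]
scores = [sample("uniform", min=0, max=100) for _ in range(10_000)]
waits = [sample("exponential", scale=5.0) for _ in range(10_000)]

mean_age = sum(ages) / len(ages)
mean_wait = sum(waits) / len(waits)
```

A normal distribution is unbounded, so a field such as age can come out negative or implausibly large; clamp or round the value if the code under test expects a valid range.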

Weighted choices

from philiprehberger_data_factory import fake

# Keys are the options and values are their weights, so "bronze" is picked most often.
tier = fake.weighted_choice({"gold": 0.1, "silver": 0.3, "bronze": 0.6})
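Weighted selection of this kind can be reproduced with `random.choices` from the standard library, which is useful for checking what frequencies to expect. The `weighted_choice` function below is a stand-in written for this sketch, not the package's implementation. `random.choices` treats weights as relative, so they need not sum to 1; whether the package requires that is not stated in the README.

```python
import random
from collections import Counter


def weighted_choice(options, rng=random):
    """Pick one key from a dict of option -> weight (illustrative stand-in)."""
    names = list(options)
    weights = list(options.values())
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(1)
tiers = {"gold": 0.1, "silver": 0.3, "bronze": 0.6}

# Over many draws the observed frequencies approach the weights.
counts = Counter(weighted_choice(tiers, rng) for _ in range(10_000))
```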

Custom providers

Pass a callable instead of a provider string:

from philiprehberger_data_factory import fake, Factory

factory = Factory({
    "name": "name",
    "role": lambda: fake.choice(["admin", "editor", "viewer"]),
})
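A callable provider can also carry state, which helps for fields such as sequential IDs that no random provider produces. The sketch below shows the pattern with a closure. `sequence` and `build` are hypothetical helpers, and `build` assumes every schema value is a zero-argument callable, so it is not a replacement for the package's Factory.

```python
import itertools
import random


def sequence(start=1):
    """Return a callable that yields start, start + 1, ... on successive calls."""
    counter = itertools.count(start)
    return lambda: next(counter)


def build(schema):
    # Assumes every schema value is a zero-argument callable.
    return {field: provider() for field, provider in schema.items()}


schema = {
    "id": sequence(),
    "role": lambda: random.choice(["admin", "editor", "viewer"]),
}

rows = [build(schema) for _ in range(3)]
```

Because the counter lives inside the closure, each call to `sequence()` starts an independent series.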

Reproducible output

from philiprehberger_data_factory import fake

fake.seed(42)
fake.name()  # always the same name for seed 42
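Seeding works because a pseudo-random generator is deterministic for a given starting state. The standard-library sketch below shows the same effect with `random.Random`. `sample_names` and its name lists are made up for illustration and say nothing about how the package stores its seed or its name data.

```python
import random


def sample_names(seed, n=3):
    """Generate n names from a generator seeded with the given value."""
    rng = random.Random(seed)
    first = ["Alice", "Bob", "Grace", "Henry"]
    last = ["Johnson", "Smith", "Wilson", "Moore"]
    return [f"{rng.choice(first)} {rng.choice(last)}" for _ in range(n)]


# Two runs with the same seed produce identical output.
run_a = sample_names(42)
run_b = sample_names(42)
```

In a test suite, seeding at the start of each test keeps one test's draws from shifting the values another test sees.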

API

| Function / Class | Description |
| --- | --- |
| fake.name() | Random full name |
| fake.email() | Random email address |
| fake.integer(min, max) | Random integer (default 0-1000) |
| fake.decimal(min, max, precision) | Random float (default 0-1000, 2 decimals) |
| fake.boolean() | Random True or False |
| fake.choice(items) | Random element from a list |
| fake.text(words) | Random words (default 5) |
| fake.date(start, end) | Random ISO date string |
| fake.uuid() | Random UUID4 string |
| fake.phone() | Random phone number in +1-XXX-XXX-XXXX format |
| fake.address() | Random street address with city |
| fake.weighted_choice(options) | Weighted random selection from a dict of options to weights |
| fake.normal(mean, std) | Random float from a normal distribution |
| fake.exponential(scale) | Random float from an exponential distribution |
| fake.seed(n) | Set random seed for reproducibility |
| Factory(schema) | Create a factory from a schema dict |
| factory.build(**overrides) | Generate one record with optional field overrides |
| factory.build_batch(n) | Generate n records |
| factory.batch(n, overrides) | Generate n records with shared or per-item overrides |
| factory.field(name, distribution, **params) | Add a field with a statistical distribution (normal, uniform, exponential) |
| factory.related(other, field, source_field) | Link to another factory for foreign-key consistency |

Development

pip install -e .
python -m pytest tests/ -v

Support

If you find this package useful, consider giving it a star on GitHub. Stars help motivate continued maintenance and development.

License

MIT
