Lightweight test data generation with realistic fake values.

philiprehberger-data-factory

Installation

pip install philiprehberger-data-factory

Usage

Generate fake values

from philiprehberger_data_factory import fake

fake.name()      # "Alice Johnson"
fake.email()     # "bob.smith@example.com"
fake.integer()   # 472
fake.boolean()   # True
fake.text()      # "alpha bravo charlie delta echo"
fake.date()      # "2023-07-14"
fake.uuid()      # "a3b2c1d4-..."
fake.phone()     # "+1-555-123-4567"
fake.address()   # "742 Oak Street, Springfield"

Factory

Define a schema and generate records in bulk:

from philiprehberger_data_factory import Factory

user_factory = Factory({
    "name": "name",
    "email": "email",
    "age": "integer",
    "bio": "text",
    "joined": "date",
    "id": "uuid",
    "phone": "phone",
    "address": "address",
})

user = user_factory.build()
# {'name': 'Grace Wilson', 'email': 'henry.moore@mail.com', ...}

users = user_factory.build_batch(100)
# [{'name': ..., 'email': ..., ...}, ...]

Bulk generation with overrides

Use batch() to generate multiple records with overrides. Pass a dict to apply the same values to every record, or a list of dicts to override each record individually:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name", "role": "text"})

# Shared overrides applied to every record
admins = factory.batch(5, overrides={"role": "admin"})

# Per-item overrides
records = factory.batch(3, overrides=[
    {"role": "admin"},
    {"role": "editor"},
    {"role": "viewer"},
])
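
The difference between the two override forms can be pictured with a small stand-alone sketch. It uses only the standard library and is not the library's implementation: `apply_overrides` is a hypothetical helper, and both the positional pairing of per-item overrides with records and the length check are assumptions made for illustration.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from typing import Any


def apply_overrides(
    records: list[dict[str, Any]],
    overrides: Mapping[str, Any] | Sequence[Mapping[str, Any]],
) -> list[dict[str, Any]]:
    """Hypothetical helper: merge shared or per-item overrides into records."""
    if isinstance(overrides, Mapping):
        # Shared form: the same values are applied to every record.
        return [{**record, **overrides} for record in records]
    # Per-item form (assumed): the i-th override applies to the i-th record.
    # This sketch rejects a length mismatch; the library may behave differently.
    if len(overrides) != len(records):
        raise ValueError("per-item overrides must match the number of records")
    return [{**record, **item} for record, item in zip(records, overrides)]


base = [{"name": "A", "role": "x"}, {"name": "B", "role": "y"}]
shared = apply_overrides(base, {"role": "admin"})
per_item = apply_overrides(base, [{"role": "admin"}, {"role": "editor"}])
```

In both cases the original records are left untouched and new dicts are returned, which keeps generated base data reusable across tests.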

Relationship / foreign-key support

Link factories so generated objects have consistent foreign-key references:

from philiprehberger_data_factory import Factory

user_factory = Factory({"id": "uuid", "name": "name"})
user = user_factory.build()

post_factory = Factory({"title": "text", "body": "text"})
post_factory.related(user_factory, field="user_id", source_field="id")

post = post_factory.build()
# post["user_id"] matches user["id"]
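
As a rough illustration of what a consistent foreign-key reference means for plain dict records, the following sketch builds one parent and several children that all point at it. It uses only the standard library; `build_user` and `build_post` are hypothetical helpers and say nothing about how related() works internally.

```python
import uuid


def build_user() -> dict:
    """Hypothetical parent record with a UUID primary key."""
    return {"id": str(uuid.uuid4()), "name": "Example User"}


def build_post(user: dict) -> dict:
    """Hypothetical child record that copies the parent's key into user_id."""
    return {"title": "Example title", "body": "Example body", "user_id": user["id"]}


user = build_user()
posts = [build_post(user) for _ in range(3)]

# Every child references a key that exists in the parent data set.
dangling = [post for post in posts if post["user_id"] != user["id"]]
```

An empty `dangling` list is the property that matters when loading generated data into a database with foreign-key constraints.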

Statistical distribution profiles

Generate numeric fields following statistical distributions:

from philiprehberger_data_factory import Factory

factory = Factory({"name": "name"})
factory.field("age", distribution="normal", mean=30, std=10)
factory.field("score", distribution="uniform", min=0, max=100)
factory.field("wait_time", distribution="exponential", scale=5.0)

record = factory.build()
# {'name': 'Alice Johnson', 'age': 28.4, 'score': 73.2, 'wait_time': 3.1}
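
To get a feel for the three distribution shapes, here is a stand-alone sketch using Python's `random` module rather than the library itself. It assumes that `scale` is the mean of the exponential distribution (the convention NumPy uses); the standard library expects a rate instead, so the sketch passes `1 / scale`.

```python
import random

rng = random.Random(0)  # seeded so the sketch is repeatable


def sample_profile(rng: random.Random) -> dict:
    """Draw one value per distribution, mirroring the parameters above."""
    return {
        "age": rng.gauss(30, 10),        # normal: mean=30, std=10
        "score": rng.uniform(0, 100),    # uniform: min=0, max=100
        # expovariate takes a rate, so scale=5.0 (assumed to be the mean)
        # becomes a rate of 1 / 5.0.
        "wait_time": rng.expovariate(1 / 5.0),
    }


samples = [sample_profile(rng) for _ in range(10_000)]
mean_age = sum(s["age"] for s in samples) / len(samples)
mean_wait = sum(s["wait_time"] for s in samples) / len(samples)
```

A normal distribution is unbounded, so a field such as age can occasionally come out negative or fractional; clamp or round the value if the consuming code expects a valid integer age.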

Weighted choices

from philiprehberger_data_factory import fake

tier = fake.weighted_choice({"gold": 0.1, "silver": 0.3, "bronze": 0.6})
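
The effect of the weights is easiest to see over many draws. The sketch below reproduces the idea with `random.choices` from the standard library; it is not the library's implementation, and whether weighted_choice() accepts weights that do not sum to 1 is not stated here.

```python
import random
from collections import Counter

weights = {"gold": 0.1, "silver": 0.3, "bronze": 0.6}
rng = random.Random(7)  # seeded so the sketch is repeatable

# random.choices treats the weights as relative, so they need not sum to 1.
draws = rng.choices(list(weights), weights=list(weights.values()), k=10_000)
counts = Counter(draws)
bronze_share = counts["bronze"] / len(draws)
```

With these weights, roughly six in ten draws land on "bronze", which is useful for modelling skewed categories such as subscription tiers.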

Custom providers

Pass a zero-argument callable instead of a provider string to generate a field with your own logic:

from philiprehberger_data_factory import fake, Factory

factory = Factory({
    "name": "name",
    "role": lambda: fake.choice(["admin", "editor", "viewer"]),
})
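
One way to picture a schema that mixes provider names and callables is a lookup that calls the value directly when it is callable and otherwise resolves it by name. This is a minimal sketch under that assumption; `PROVIDERS`, `resolve`, and `build` are hypothetical names, not the library's internals.

```python
import random
from typing import Any, Callable, Union

# Hypothetical registry standing in for the built-in provider names.
PROVIDERS: dict[str, Callable[[], Any]] = {
    "integer": lambda: random.randint(0, 1000),
    "boolean": lambda: random.choice([True, False]),
}


def resolve(spec: Union[str, Callable[[], Any]]) -> Any:
    """Call a custom provider directly, or look a built-in one up by name."""
    if callable(spec):
        return spec()
    return PROVIDERS[spec]()


def build(schema: dict[str, Union[str, Callable[[], Any]]]) -> dict[str, Any]:
    """Produce one record by resolving every field in the schema."""
    return {field: resolve(spec) for field, spec in schema.items()}


record = build({
    "age": "integer",
    "role": lambda: random.choice(["admin", "editor", "viewer"]),
})
```

Because the callable runs once per record, each generated record gets a fresh value rather than one shared result.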

Reproducible output

from philiprehberger_data_factory import fake

fake.seed(42)
fake.name()  # always the same name for seed 42
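
The principle behind seeding can be shown with the standard library alone: a generator initialised with the same seed yields the same sequence. The sketch below is independent of the library, and `first_three` is a hypothetical helper.

```python
import random


def first_three(seed: int) -> list[int]:
    """Return the first three integers produced for a given seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 1000) for _ in range(3)]


run_a = first_three(42)
run_b = first_three(42)
repeatable = run_a == run_b
```

If fake keeps a single shared generator (an assumption), every call advances its state, so seeding at the start of each test keeps results independent of test order.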

Auto-incrementing IDs with sequence_field

from philiprehberger_data_factory import Factory

users = Factory({"name": "name", "email": "email"}).sequence_field("id")
users.build_batch(3)
# [{"id": 1, "name": ..., ...}, {"id": 2, ...}, {"id": 3, ...}]

# Custom start / step
orders = Factory({}).sequence_field("order_no", start=1000, step=5)

Each factory owns its own counter, so two factories that share the same schema produce independent sequences without colliding.
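
The per-factory counter can be pictured as each factory holding its own `itertools.count`. The class below is a hypothetical stand-in written for illustration, not the library's code.

```python
from itertools import count


class SequenceSketch:
    """Hypothetical stand-in for a factory that owns its own counter."""

    def __init__(self, start: int = 1, step: int = 1) -> None:
        # Each instance gets a separate counter, so sequences never interfere.
        self._counter = count(start, step)

    def next_value(self) -> int:
        return next(self._counter)


users = SequenceSketch()
orders = SequenceSketch(start=1000, step=5)

user_ids = [users.next_value() for _ in range(3)]      # [1, 2, 3]
order_nos = [orders.next_value() for _ in range(3)]    # [1000, 1005, 1010]
```

Advancing one counter has no effect on the other, which is the behavior described above for factories that share a schema.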

API

| Function / Class | Description |
| --- | --- |
| `fake.name()` | Random full name |
| `fake.email()` | Random email address |
| `fake.integer(min, max)` | Random integer (default 0-1000) |
| `fake.decimal(min, max, precision)` | Random float (default 0-1000, 2 decimals) |
| `fake.boolean()` | Random True or False |
| `fake.choice(items)` | Random element from a list |
| `fake.text(words)` | Random words (default 5) |
| `fake.date(start, end)` | Random ISO date string |
| `fake.uuid()` | Random UUID4 string |
| `fake.phone()` | Random phone number in +1-XXX-XXX-XXXX format |
| `fake.address()` | Random street address with city |
| `fake.weighted_choice(options)` | Weighted random selection from a dict of options to weights |
| `fake.normal(mean, std)` | Random float from a normal distribution |
| `fake.exponential(scale)` | Random float from an exponential distribution |
| `fake.seed(n)` | Set random seed for reproducibility |
| `Factory(schema)` | Create a factory from a schema dict |
| `factory.build(**overrides)` | Generate one record with optional field overrides |
| `factory.build_batch(n)` | Generate n records |
| `factory.batch(n, overrides)` | Generate n records with shared or per-item overrides |
| `factory.field(name, distribution, **params)` | Add a field with a statistical distribution (normal, uniform, exponential) |
| `factory.related(other, field, source_field)` | Link to another factory for foreign-key consistency |
| `factory.sequence_field(name, start=1, step=1)` | Register a monotonically increasing integer field (per-factory counter) |

Development

pip install -e .
python -m pytest tests/ -v

Support

If you find this project useful:

Star the repo

🐛 Report issues

💡 Suggest features

❤️ Sponsor development

🌐 All Open Source Projects

💻 GitHub Profile

🔗 LinkedIn Profile

License

MIT

