Python wrapper for DataSynth synthetic data generation

datasynth-py

Python wrapper for the DataSynth synthetic data generator.

Installation

From PyPI

pip install "datasynth-py[all]"

Or install specific extras:

pip install datasynth-py               # Core only (no dependencies)
pip install "datasynth-py[cli]"        # CLI generation (PyYAML)
pip install "datasynth-py[memory]"     # In-memory tables (pandas)
pip install "datasynth-py[streaming]"  # Streaming (websockets)
pip install "datasynth-py[all]"        # All optional dependencies
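
Each extra listed above pulls in one third-party package, so a quick import check shows which extras are usable in the current environment. The following is a minimal sketch; EXTRA_MODULES and available_extras are illustrative names and are not part of datasynth-py.

```python
from importlib.util import find_spec

# Extra name -> import name of the package it installs (taken from the list above).
EXTRA_MODULES = {"cli": "yaml", "memory": "pandas", "streaming": "websockets"}


def available_extras(modules=EXTRA_MODULES):
    """Return {extra: bool}, True when the extra's module can be imported."""
    return {extra: find_spec(name) is not None for extra, name in modules.items()}


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:<10} {'installed' if ok else 'missing'}")
```

A "missing" entry only means the optional package is absent; the core package has no dependencies and still imports.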

From Source

cd python
pip install -e ".[all]"

Quick Start

from datasynth_py import DataSynth, CompanyConfig, Config, GlobalSettings, ChartOfAccountsSettings

config = Config(
    global_settings=GlobalSettings(
        industry="retail",
        start_date="2024-01-01",
        period_months=12,
    ),
    companies=[
        CompanyConfig(code="C001", name="Retail Corp", currency="USD", country="US"),
    ],
    chart_of_accounts=ChartOfAccountsSettings(complexity="small"),
)

synth = DataSynth()
result = synth.generate(config=config, output={"format": "csv", "sink": "temp_dir"})
print(result.output_dir)
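
To inspect what a CSV run produced, the output directory can be scanned with the standard library. This is a hedged sketch: it assumes result.output_dir is a filesystem path containing CSV files that each start with a header row. The file names and layout written by DataSynth are not specified here, and summarize_csv_dir is a hypothetical helper, not part of the wrapper.

```python
import csv
from pathlib import Path


def summarize_csv_dir(output_dir):
    """Map each CSV file under output_dir (by relative path) to its data-row count."""
    root = Path(output_dir)
    summary = {}
    for path in sorted(root.rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # assumption: the first row is a header
            summary[path.relative_to(root).as_posix()] = sum(1 for _ in reader)
    return summary
```

With the Quick Start result, print(summarize_csv_dir(result.output_dir)) would list every generated table and its row count.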

Using Blueprints

from datasynth_py import DataSynth
from datasynth_py.config import blueprints

config = blueprints.retail_small(companies=4, transactions=10000)
synth = DataSynth()
result = synth.generate(config=config, output={"format": "parquet", "sink": "path", "path": "./output"})

Requirements

The wrapper shells out to the datasynth-data CLI binary. Build it with:

cargo build --release
# Use an absolute path so the binary is found from any working directory
export DATASYNTH_BINARY="$(pwd)/target/release/datasynth-data"

Or pass binary_path when creating the client:

synth = DataSynth(binary_path="/path/to/datasynth-data")
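
When the binary location varies between machines, it can help to resolve it once and pass the result as binary_path. The sketch below tries an explicit argument first, then the DATASYNTH_BINARY variable, then the system PATH. That lookup order is an assumption made for illustration and not necessarily what the wrapper does internally; resolve_binary is a hypothetical helper.

```python
import os
import shutil


def resolve_binary(binary_path=None, env=None):
    """Return an absolute path to datasynth-data, or raise FileNotFoundError."""
    env = os.environ if env is None else env
    candidate = (
        binary_path                          # 1. explicit argument
        or env.get("DATASYNTH_BINARY")       # 2. environment variable
        or shutil.which("datasynth-data")    # 3. search PATH
    )
    if not candidate:
        raise FileNotFoundError(
            "datasynth-data not found; build it or set DATASYNTH_BINARY"
        )
    return os.path.abspath(candidate)
```

Usage would then be synth = DataSynth(binary_path=resolve_binary()), which fails early with a clear message if no binary can be located.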

Documentation

See the Python Wrapper Guide for complete documentation.

License

Apache 2.0 License - see the main project LICENSE file.
