
Synthetic data generation for Databricks — with a built-in notebook UI


DashSynthetic — Databricks Library


Part of the Dashlibs suite — Databricks libraries built for business users.

Installation

%pip install dash-synthetic

Quick Start

import dashsynthetic
dashsynthetic.launch()   # Opens interactive UI in your Databricks notebook

The UI has two tabs:

  • Single Table — profile a source table/DataFrame/SQL query and generate synthetic data from it.
  • Multi-Table Relationships — define multiple tables, their primary keys, foreign keys, and master data columns (e.g. currency/country codes); the tool figures out the dependency order and generates every table with referentially valid foreign keys.
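The dependency ordering mentioned above can be illustrated with a topological sort over the foreign-key graph. This is a minimal sketch using the standard library's graphlib, not DashSynthetic's actual implementation, which this README does not describe. The function name generation_order and the "Transaction" table are hypothetical and exist only for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def generation_order(tables, foreign_keys):
    """Return table names so that every parent table precedes its children.

    tables: iterable of table names.
    foreign_keys: iterable of (child_table, parent_table) pairs.
    Raises graphlib.CycleError if the foreign keys form a cycle.
    """
    # Start with every table as a node that has no predecessors.
    sorter = TopologicalSorter({name: set() for name in tables})
    for child, parent in foreign_keys:
        # The child depends on the parent, so the parent must be generated first.
        sorter.add(child, parent)
    return list(sorter.static_order())


order = generation_order(
    ["Account", "Customer", "Transaction"],
    [("Account", "Customer"), ("Transaction", "Account")],
)
print(order)  # ['Customer', 'Account', 'Transaction']
```

Generating parents first means each child table can draw its foreign-key values from keys that already exist. A circular set of foreign keys has no valid order, which is why the sketch lets CycleError propagate rather than guessing.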

What it looks like

Single Table — profile a source and generate synthetic data from it:

[Screenshot: DashSynthetic single-table tab]

Multi-Table Relationships — define tables, primary/foreign keys, and master data columns:

[Screenshot: DashSynthetic multi-table relationships tab]
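To make the single-table "profile, then generate" flow concrete, here is a rough pure-Python sketch of the general idea. It is an assumption about how such a tool could work, not DashSynthetic's profiling algorithm: the helpers profile_column and sample_column are hypothetical, the source is a plain dict of non-empty columns rather than a Spark DataFrame, and the sampling strategy (uniform for numerics, frequency-weighted for categoricals) was chosen only for brevity.

```python
import random
from collections import Counter


def profile_column(values):
    """Summarise one non-empty column (hypothetical helper).

    Numeric columns keep their min/max; everything else keeps value frequencies.
    """
    if all(isinstance(v, (int, float)) for v in values):
        return {"kind": "numeric", "min": min(values), "max": max(values)}
    counts = Counter(values)
    return {
        "kind": "categorical",
        "values": list(counts),
        "weights": list(counts.values()),
    }


def sample_column(profile, n_rows, rng):
    """Draw n_rows synthetic values that respect the profile."""
    if profile["kind"] == "numeric":
        # Uniform between the observed bounds; real tools fit richer distributions.
        return [rng.uniform(profile["min"], profile["max"]) for _ in range(n_rows)]
    # Categorical values are drawn in proportion to how often they were observed.
    return rng.choices(profile["values"], weights=profile["weights"], k=n_rows)


source = {"age": [23, 35, 41, 58], "country": ["DE", "DE", "FR", "US"]}
rng = random.Random(42)  # seeded so runs are repeatable
profiles = {name: profile_column(column) for name, column in source.items()}
synthetic = {name: sample_column(p, 10, rng) for name, p in profiles.items()}
```

The synthetic columns share the source's value ranges and categories but contain none of its original rows, which is the property a profile-based generator is after.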

Python API

from dashsynthetic import RelationshipGraph, MultiTableGenerator

# Register each table with its source table and primary key.
graph = RelationshipGraph()
graph.add_table("Customer", table="catalog.schema.dim_customer", primary_key="customer_id")
graph.add_table("Account", table="catalog.schema.fact_account", primary_key="account_id",
                master_data_columns=["currency_code"])
# Link Account.customer_id to Customer.customer_id.
graph.add_foreign_key("Account", "customer_id", "Customer", "customer_id")

# Configure each table's row count, then generate all tables.
gen = MultiTableGenerator(graph)
gen.configure_table("Customer", n_rows=5000)
gen.configure_table("Account", n_rows=20000, output_table="catalog.schema.syn_account")
results = gen.run()   # {"Customer": df, "Account": df}, generated in dependency order
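The reason generation order matters for the example above is that a child table can only reference keys that already exist. The sketch below shows one simple way to get referentially valid foreign keys, by sampling each child key from the generated parent keys. It is not the library's implementation. The helpers generate_parent and generate_child are hypothetical, and treating master data columns as draws from a fixed list of reference values is an assumption, since this README does not say how they are handled.

```python
import random


def generate_parent(n_rows, primary_key):
    """Generate parent rows with sequential primary keys 1..n_rows."""
    return [{primary_key: i} for i in range(1, n_rows + 1)]


def generate_child(n_rows, primary_key, fk_column, parent_keys, master_data, rng):
    """Generate child rows whose foreign key always points at an existing parent.

    master_data maps a column name to the fixed list of values it may take
    (assumption: master data columns reuse real reference values).
    """
    rows = []
    for i in range(1, n_rows + 1):
        # Sampling from parent_keys guarantees the reference resolves.
        row = {primary_key: i, fk_column: rng.choice(parent_keys)}
        for column, allowed in master_data.items():
            row[column] = rng.choice(allowed)
        rows.append(row)
    return rows


rng = random.Random(0)  # seeded so runs are repeatable
customers = generate_parent(5, "customer_id")
customer_ids = [row["customer_id"] for row in customers]
accounts = generate_child(
    20, "account_id", "customer_id", customer_ids,
    {"currency_code": ["EUR", "GBP", "USD"]}, rng,
)
```

Because every customer_id in accounts is drawn from customer_ids, a join between the two synthetic tables never produces orphaned rows.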

Part of Dashlibs

Library          Purpose
dash-dq          Data Quality
dash-synthetic   Synthetic Data Generation
dash-observe     Data Observability (freshness, volume, schema)
dash-ml          ML Model Monitoring
dash-ingest      Data Ingestion
dash-gov         Data Governance
dash-ontology    Ontology & Lineage for AI
dash-ui          Shared UI components (PyPI: dash-uis)

Quality & Contributing

  • 16 unit tests that run without a Spark dependency: pytest tests/ -v. The relationship graph, generation ordering, and multi-table orchestration logic are all pure Python and fully covered.
  • Lint-clean (ruff check dashsynthetic/), PEP 561 typed (py.typed)
  • Every change ships through a reviewed pull request; CI (lint → test on Python 3.9–3.12 → build) gates every PR and every release
  • See CONTRIBUTING.md for dev setup, CHANGELOG.md for release history, SECURITY.md to report a vulnerability, and CODE_OF_CONDUCT.md

License

Apache 2.0
