Synthetic data generation for Databricks — with a built-in notebook UI

DashSynthetic — Databricks Library

Part of the Dashlibs suite — Databricks libraries built for business users.

Installation

%pip install dash-synthetic

Quick Start

import dashsynthetic
dashsynthetic.launch()   # Opens interactive UI in your Databricks notebook

The UI has two tabs:

  • Single Table — profile a source table/DataFrame/SQL query and generate synthetic data from it.
  • Multi-Table Relationships — define multiple tables, their primary keys, foreign keys, and master data columns (e.g. currency/country codes); the tool figures out the dependency order and generates every table with referentially valid foreign keys.
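The dependency ordering described above can be pictured as a topological sort over the foreign-key graph: a table is generated only after every table it references. The following is a minimal sketch using Python's standard-library graphlib, not DashSynthetic's actual implementation; the function name generation_order and the Transaction table are hypothetical, introduced only for illustration.

```python
from graphlib import TopologicalSorter


def generation_order(tables, foreign_keys):
    """Return table names so that every parent precedes its children.

    tables: iterable of table names (hypothetical input shape).
    foreign_keys: iterable of (child_table, parent_table) pairs.
    """
    sorter = TopologicalSorter()
    for table in tables:
        sorter.add(table)  # register tables even if they have no foreign keys
    for child, parent in foreign_keys:
        sorter.add(child, parent)  # the child depends on the parent
    # static_order() raises graphlib.CycleError on circular references
    return list(sorter.static_order())


order = generation_order(
    tables=["Transaction", "Account", "Customer"],
    foreign_keys=[("Account", "Customer"), ("Transaction", "Account")],
)
print(order)
```

A circular foreign-key chain has no valid generation order; in this sketch it surfaces as a CycleError rather than a silently wrong ordering.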

What it looks like

Single Table: profile a source and generate synthetic data from it.

[Screenshot: DashSynthetic single-table tab]

Multi-Table Relationships: define tables, primary/foreign keys, and master data columns.

[Screenshot: DashSynthetic multi-table relationships tab]

Python API

from dashsynthetic import RelationshipGraph, MultiTableGenerator

graph = RelationshipGraph()
graph.add_table("Customer", table="catalog.schema.dim_customer", primary_key="customer_id")
graph.add_table("Account", table="catalog.schema.fact_account", primary_key="account_id",
                master_data_columns=["currency_code"])
graph.add_foreign_key("Account", "customer_id", "Customer", "customer_id")

gen = MultiTableGenerator(graph)
gen.configure_table("Customer", n_rows=5000)
gen.configure_table("Account", n_rows=20000, output_table="catalog.schema.syn_account")
results = gen.run()   # {"Customer": df, "Account": df}, generated in dependency order
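To illustrate what "referentially valid foreign keys" means for the Customer/Account example, here is a small pure-Python sketch that needs neither Spark nor DashSynthetic. It assumes one simple strategy, drawing each child foreign key from the already generated parent keys; the library may work differently, and the helpers make_parent and make_child are hypothetical.

```python
import random


def make_parent(n_rows, key):
    """Generate parent rows with sequential primary keys (illustrative only)."""
    return [{key: i} for i in range(1, n_rows + 1)]


def make_child(n_rows, key, fk, parent_rows, parent_key, seed=0):
    """Generate child rows whose foreign key always points at an existing parent."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    parent_keys = [row[parent_key] for row in parent_rows]
    return [
        {key: i, fk: rng.choice(parent_keys)}  # sample only from real parent keys
        for i in range(1, n_rows + 1)
    ]


# The parent table must exist first, which is why dependency order matters.
customers = make_parent(5, "customer_id")
accounts = make_child(20, "account_id", "customer_id", customers, "customer_id")

# Referential integrity check: no account may reference a missing customer.
valid_ids = {c["customer_id"] for c in customers}
orphans = [a for a in accounts if a["customer_id"] not in valid_ids]
print(len(accounts), "accounts,", len(orphans), "orphans")
```

Sampling from the parent's keys guarantees validity by construction, so the orphan check at the end is a sanity test rather than a repair step.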

Part of Dashlibs

Library          Purpose
dash-dq          Data Quality
dash-synthetic   Synthetic Data Generation
dash-observe     Data Observability (freshness, volume, schema)
dash-ml          ML Model Monitoring
dash-ingest      Data Ingestion
dash-gov         Data Governance
dash-relate      Ontology & Lineage for AI
dash-ui          Shared UI components (PyPI: dash-uis)

Quality & Contributing

  • 16 unit tests that run without Spark: pytest tests/ -v. The relationship graph, generation ordering, and multi-table orchestration logic are pure Python and fully covered.
  • Lint-clean (ruff check dashsynthetic/), PEP 561 typed (py.typed)
  • Every change ships through a reviewed pull request; CI (lint → test on Python 3.9–3.12 → build) gates every PR and every release
  • See CONTRIBUTING.md for dev setup, CHANGELOG.md for release history, SECURITY.md to report a vulnerability, and CODE_OF_CONDUCT.md

License

Apache 2.0
