Synthetic data generation for Databricks — with a built-in notebook UI

DashSynthetic — Databricks Library

Part of the Dashlibs suite — Databricks libraries built for business users.

Installation

%pip install dash-synthetic

Quick Start

import dashsynthetic
dashsynthetic.launch()   # Opens interactive UI in your Databricks notebook

The UI has two tabs:

  • Single Table: profile a source table, DataFrame, or SQL query and generate synthetic data from it.
  • Multi-Table Relationships: define multiple tables with their primary keys, foreign keys, and master data columns (e.g. currency or country codes). The tool resolves the dependency order and generates every table with referentially valid foreign keys.
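
To make the idea of a dependency order concrete, here is a minimal sketch using the standard library's graphlib (Python 3.9+). It is not DashSynthetic's implementation. The function name generation_order and the (child, parent) pair format are assumptions made for illustration only.

```python
from graphlib import CycleError, TopologicalSorter


def generation_order(tables, foreign_keys):
    """Return table names so that every parent comes before its children.

    tables: iterable of table names (hypothetical input format).
    foreign_keys: iterable of (child_table, parent_table) pairs.
    Raises graphlib.CycleError if the foreign keys form a cycle.
    """
    sorter = TopologicalSorter()
    for table in tables:
        # Register every table, including those without foreign keys.
        sorter.add(table)
    for child, parent in foreign_keys:
        # add(node, *predecessors): the parent must be generated first.
        sorter.add(child, parent)
    return list(sorter.static_order())


order = generation_order(["Account", "Customer"], [("Account", "Customer")])
print(order)  # prints ['Customer', 'Account']
```

Generating parents first means each child table can draw its foreign-key values from keys that already exist. A circular reference has no valid order, which is why the sketch lets CycleError propagate.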

What it looks like

Single Table — profile a source and generate synthetic data from it:

DashSynthetic single-table tab
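
The "profile, then generate" workflow can be pictured with a small pure-Python sketch. This is an illustration under simple assumptions, not the library's profiler: numeric columns are summarised by their min and max, every other column by its value frequencies, and the helper names profile_column and sample_column are hypothetical.

```python
import random
from collections import Counter


def profile_column(values):
    """Summarise one non-empty column as a numeric range or category counts."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return {"kind": "numeric", "min": min(values), "max": max(values)}
    return {"kind": "categorical", "counts": Counter(values)}


def sample_column(profile, n_rows, rng):
    """Draw n_rows synthetic values that respect the profile."""
    if profile["kind"] == "numeric":
        # Uniform draw between the observed bounds (a deliberate simplification).
        return [rng.uniform(profile["min"], profile["max"]) for _ in range(n_rows)]
    categories = list(profile["counts"])
    weights = list(profile["counts"].values())
    # Categories are drawn in proportion to their observed frequency.
    return rng.choices(categories, weights=weights, k=n_rows)


# Toy source data standing in for a profiled table (made-up values).
source = {"balance": [10.0, 250.5, 99.9], "currency_code": ["USD", "EUR", "USD"]}
rng = random.Random(42)  # seeded so repeated runs give the same rows
profiles = {name: profile_column(col) for name, col in source.items()}
synthetic = {name: sample_column(p, 5, rng) for name, p in profiles.items()}
```

A real profiler would likely capture more than ranges and frequencies (nulls, distributions, string patterns). The sketch only shows the two-phase shape of the process.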

Multi-Table Relationships — define tables, primary/foreign keys, and master data columns:

DashSynthetic multi-table relationships tab
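
One way to read "referentially valid foreign keys" and "master data columns" is sketched below in plain Python. It assumes foreign keys are drawn from already generated parent keys, and that master data values are reused from a fixed list rather than invented. Both behaviours, and the helper generate_child_rows, are illustrative assumptions and not confirmed details of the library.

```python
import random


def generate_child_rows(parent_keys, n_rows, master_values, rng):
    """Build child rows whose foreign key always points at an existing parent."""
    rows = []
    for i in range(n_rows):
        rows.append(
            {
                "account_id": i + 1,  # fresh surrogate primary key
                "customer_id": rng.choice(parent_keys),  # existing parent key
                "currency_code": rng.choice(master_values),  # reused as-is
            }
        )
    return rows


rng = random.Random(7)  # seeded for reproducibility
customer_ids = list(range(1, 6))  # stands in for generated Customer keys
accounts = generate_child_rows(customer_ids, 20, ["USD", "EUR", "GBP"], rng)
```

Because every customer_id is picked from the parent key list, no account can reference a customer that does not exist.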

Python API

from dashsynthetic import RelationshipGraph, MultiTableGenerator

# Describe the tables and how they relate.
graph = RelationshipGraph()
graph.add_table("Customer", table="catalog.schema.dim_customer", primary_key="customer_id")
graph.add_table("Account", table="catalog.schema.fact_account", primary_key="account_id",
                master_data_columns=["currency_code"])
# Account.customer_id references Customer.customer_id.
graph.add_foreign_key("Account", "customer_id", "Customer", "customer_id")

# Set the number of rows to generate per table.
gen = MultiTableGenerator(graph)
gen.configure_table("Customer", n_rows=5000)
gen.configure_table("Account", n_rows=20000, output_table="catalog.schema.syn_account")
results = gen.run()   # {"Customer": df, "Account": df}, generated in dependency order
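
After a run, you may want to confirm that no foreign key is orphaned. The helper below is a library-independent sketch: orphan_keys is a hypothetical function that works on plain Python sequences of key values. The commented Spark snippet assumes that results holds Spark DataFrames, which the example above suggests but does not state.

```python
def orphan_keys(child_fk_values, parent_pk_values):
    """Return the foreign-key values that have no matching parent key."""
    parents = set(parent_pk_values)
    return sorted({fk for fk in child_fk_values if fk not in parents})


# With Spark DataFrames, the key columns could be collected first, e.g.
#   fks = [r["customer_id"] for r in results["Account"].select("customer_id").collect()]
#   pks = [r["customer_id"] for r in results["Customer"].select("customer_id").collect()]
# (an assumption about the result type; collect() only suits small tables).
print(orphan_keys([1, 2, 2, 9], [1, 2, 3]))  # prints [9]
```

An empty list means every child row points at an existing parent row.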

Part of Dashlibs

Library         Purpose
dash-dq         Data Quality
dash-synthetic  Synthetic Data Generation
dash-observe    Data Observability (freshness, volume, schema)
dash-ml         ML Model Monitoring
dash-ingest     Data Ingestion
dash-gov        Data Governance
dash-ontology   Ontology & Lineage for AI
dash-ui         Shared UI components (PyPI: dash-uis)

Quality & Contributing

  • 16 unit tests that run without a Spark dependency: pytest tests/ -v. The relationship graph, generation ordering, and multi-table orchestration logic are pure Python and fully covered.
  • Lint-clean (ruff check dashsynthetic/), PEP 561 typed (py.typed)
  • Every change ships through a reviewed pull request; CI (lint → test on Python 3.9–3.12 → build) gates every PR and every release
  • See CONTRIBUTING.md for dev setup, CHANGELOG.md for release history, SECURITY.md to report a vulnerability, and CODE_OF_CONDUCT.md

License

Apache 2.0
