Synthetic data generation and evaluation library

Welcome to the synthyverse!

An extensive ecosystem for synthetic data generation and evaluation in Python.

Read the docs for in-depth usage.

The synthyverse is a work in progress. Please provide any suggestions through a GitHub Issue.

Features

  • 🔧 Highly modular installation. Install only the modules you need to keep your installation lightweight.
  • 📚 Extensive library for synthetic data. Thanks to synthyverse's modular installation, any generator or metric can be added quickly without dependency conflicts. This allows the synthyverse to host a large number of generators and evaluation metrics, and to wrap around any existing synthetic data library.
  • ⚙️ Benchmarking module for simplified synthetic data pipelines. The benchmarking module executes a modular pipeline of synthetic data generation and evaluation. Choose a generator, set of evaluation metrics, and pipeline parameters, and obtain results on synthetic data quality.
  • 👷 Minimal preprocessing required. All preprocessing is handled by the synthyverse, so there is no need to scale features, one-hot encode them, or handle missing values yourself. Different preprocessing schemes can be selected by setting simple parameters.
  • 👍 Set constraints for your synthetic data. You can specify inter-column constraints which you want your synthetic data to follow. Constraints are modelled explicitly by the synthyverse, not through oversampling. This ensures efficient and reliable constraint setting.

Installation

The synthyverse is unique in its modular installation set-up. To avoid conflicting dependencies, we provide various installation templates. Each template installs only those dependencies which are required to access certain modules.

Templates provide installation for specific generators, the evaluation module, and more. Install multiple templates to get access to multiple modules of the synthyverse, e.g., multiple generators and evaluation.

We strongly advise installing only the templates you require for a specific run. Installing multiple templates can cause dependency conflicts, so use a separate virtual environment for each installation.
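The advice to keep installations in separate virtual environments can be sketched in Python with the standard-library `venv` module. This is a minimal sketch, not part of the synthyverse: the helper names `env_python`, `install_command`, and `create_env` are hypothetical, and the only thing taken from this page is the `synthyverse[template]` install syntax.

```python
import os
import subprocess
import venv
from pathlib import Path
from typing import List, Sequence


def env_python(env_dir: Path) -> Path:
    """Return the path of the Python interpreter inside a virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_command(env_dir: Path, templates: Sequence[str]) -> List[str]:
    """Build the pip command that installs the given templates into env_dir."""
    requirement = f"synthyverse[{','.join(templates)}]"
    return [str(env_python(env_dir)), "-m", "pip", "install", requirement]


def create_env(env_dir: Path, templates: Sequence[str]) -> None:
    """Create a fresh virtual environment and install the templates into it.

    Hypothetical helper: requires network access to download the packages.
    """
    venv.EnvBuilder(with_pip=True, clear=True).create(env_dir)
    subprocess.run(install_command(env_dir, templates), check=True)
```

For example, `create_env(Path("envs/ctgan"), ["ctgan", "eval"])` would build one environment for CTGAN plus evaluation. Calling it again with a different directory and template keeps the next generator's dependencies isolated from the first.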

Note that the core installation without any template doesn't install any modules.

Available Installation Templates

The following installation templates are available:

| Template Name | Category | Installation Command |
|---|---|---|
| arf | Generator | pip install synthyverse[arf] |
| bn | Generator | pip install synthyverse[bn] |
| cdtd | Generator | pip install synthyverse[cdtd] |
| ctabgan | Generator | pip install synthyverse[ctabgan] |
| ctgan | Generator | pip install synthyverse[ctgan] |
| forestdiffusion | Generator | pip install synthyverse[forestdiffusion] |
| nrgboost | Generator | pip install synthyverse[nrgboost] |
| permutation | Generator | pip install synthyverse[permutation] |
| realtabformer | Generator | pip install synthyverse[realtabformer] |
| smote | Generator | pip install synthyverse[smote] |
| synthpop | Generator | pip install synthyverse[synthpop] |
| tabargn | Generator | pip install synthyverse[tabargn] |
| tabddpm | Generator | pip install synthyverse[tabddpm] |
| tabsyn | Generator | pip install synthyverse[tabsyn] |
| tvae | Generator | pip install synthyverse[tvae] |
| unmaskingtrees | Generator | pip install synthyverse[unmaskingtrees] |
| xgenboost | Generator | pip install synthyverse[xgenboost] |
| base | Generator | pip install synthyverse[base] |
| eval | Evaluation | pip install synthyverse[eval] |
| full | All | pip install synthyverse[full] |

Note: You can install multiple templates by separating them with commas, e.g., pip install synthyverse[ctgan,eval]
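To check which templates an installed copy of the package declares, the standard-library `importlib.metadata` module can read the distribution's metadata. The sketch below assumes the templates are published as ordinary pip extras, which the `synthyverse[template]` syntax suggests but this page does not state outright. The function name `declared_extras` is hypothetical.

```python
from importlib import metadata
from typing import List


def declared_extras(distribution: str = "synthyverse") -> List[str]:
    """List the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return []
    # "Provides-Extra" appears once per declared extra; get_all returns
    # None when the header is absent, so fall back to an empty list.
    return sorted(meta.get_all("Provides-Extra") or [])
```

Calling `declared_extras()` in an environment where the package is installed should return the extras recorded in its metadata. This reports what the package declares, not which templates were actually installed.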

General Installation Template

pip install synthyverse[template]

Installation Examples

pip install synthyverse[ctgan]
pip install synthyverse[arf,bn,ctgan,tvae]
pip install synthyverse[ctgan,eval]

Usage

See the docs to learn how to use the synthyverse.

Tutorials

