Synthetic data generation and evaluation library

Welcome to the synthyverse!

An extensive ecosystem for synthetic data generation and evaluation in Python.

Read the docs for in-depth usage.

The synthyverse is a work in progress. Please provide any suggestions through a GitHub Issue.

Features

  • 🔧 Highly modular installation. Install only the modules you need to keep your installation lightweight.
  • 📚 Extensive library for synthetic data. Thanks to the modular installation, any generator or metric can be added quickly without dependency conflicts. This allows the synthyverse to host a large number of generators and evaluation metrics, and to wrap around any existing synthetic data library.
  • ⚙️ Benchmarking module for simplified synthetic data pipelines. The benchmarking module executes a modular pipeline of synthetic data generation and evaluation. Choose a generator, a set of evaluation metrics, and pipeline parameters, and obtain results on synthetic data quality.
  • 👷 Minimal preprocessing required. All preprocessing is handled by the synthyverse, so there is no need for scaling, one-hot encoding, or handling missing values yourself. Different preprocessing schemes can be selected by setting simple parameters.
  • 👍 Set constraints for your synthetic data. You can specify inter-column constraints that your synthetic data should follow. Constraints are modelled explicitly by the synthyverse, not through oversampling, which keeps constraint handling efficient and reliable.

Installation

The synthyverse is unique in its modular installation set-up. To avoid conflicting dependencies, we provide various installation templates. Each template installs only those dependencies which are required to access certain modules.

Templates provide installation for specific generators, the evaluation module, and more. Install multiple templates to get access to multiple modules of the synthyverse, e.g., multiple generators and evaluation.

We strongly advise installing only the templates you need for a specific run. Installing multiple templates together can cause dependency conflicts, so use a separate virtual environment for each installation.
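One way to keep installations separate is to dedicate one virtual environment to each template. The sketch below uses only Python's standard `venv` module. The `envs` directory and the helper names (`env_dir`, `env_python`, `install_command`, `create_env`) are illustrative assumptions and not part of the synthyverse.

```python
import sys
import venv
from pathlib import Path
from typing import List


def env_dir(template: str, root: Path = Path("envs")) -> Path:
    """Directory of the virtual environment dedicated to one template."""
    return root / f"synthyverse-{template}"


def env_python(template: str, root: Path = Path("envs")) -> Path:
    """Path of the Python interpreter inside that environment."""
    on_windows = sys.platform == "win32"
    bin_dir = "Scripts" if on_windows else "bin"
    exe = "python.exe" if on_windows else "python"
    return env_dir(template, root) / bin_dir / exe


def install_command(template: str, root: Path = Path("envs")) -> List[str]:
    """pip invocation that installs a single template into its own environment."""
    python = str(env_python(template, root))
    return [python, "-m", "pip", "install", f"synthyverse[{template}]"]


def create_env(template: str, root: Path = Path("envs")) -> Path:
    """Create the environment (with pip available) and return its directory."""
    target = env_dir(template, root)
    venv.create(target, with_pip=True)
    return target
```

Calling `create_env("ctgan")` and then passing the list returned by `install_command("ctgan")` to `subprocess.run` would install the `ctgan` template into `envs/synthyverse-ctgan` without touching any other environment.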

Note that the core installation without any template doesn't install any modules.

Available Installation Templates

The following installation templates are available:

Template Name     Category    Installation Command
arf               Generator   pip install synthyverse[arf]
bn                Generator   pip install synthyverse[bn]
cdtd              Generator   pip install synthyverse[cdtd]
ctabgan           Generator   pip install synthyverse[ctabgan]
ctgan             Generator   pip install synthyverse[ctgan]
forestdiffusion   Generator   pip install synthyverse[forestdiffusion]
nrgboost          Generator   pip install synthyverse[nrgboost]
permutation       Generator   pip install synthyverse[permutation]
realtabformer     Generator   pip install synthyverse[realtabformer]
smote             Generator   pip install synthyverse[smote]
synthpop          Generator   pip install synthyverse[synthpop]
tabargn           Generator   pip install synthyverse[tabargn]
tabddpm           Generator   pip install synthyverse[tabddpm]
tabsyn            Generator   pip install synthyverse[tabsyn]
tvae              Generator   pip install synthyverse[tvae]
unmaskingtrees    Generator   pip install synthyverse[unmaskingtrees]
base              Generator   pip install synthyverse[base]
eval              Evaluation  pip install synthyverse[eval]
full              All         pip install synthyverse[full]

Note: You can install multiple templates by separating them with commas, e.g., pip install synthyverse[ctgan,eval]
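The bracket syntax in these commands is pip's standard "extras" mechanism, so the templates an installed version declares can be read from its package metadata. The sketch below uses the standard-library `importlib.metadata` module. The function name `declared_extras` is an illustrative assumption, and the names it returns depend on the installed version.

```python
from importlib.metadata import PackageNotFoundError, metadata
from typing import List


def declared_extras(distribution: str = "synthyverse") -> List[str]:
    """Return the extras declared by an installed distribution, sorted by name.

    Returns an empty list when the distribution is not installed.
    """
    try:
        meta = metadata(distribution)
    except PackageNotFoundError:
        return []
    # "Provides-Extra" appears once per extra; get_all returns None if absent.
    return sorted(meta.get_all("Provides-Extra") or [])
```

Note that this lists the extras a distribution declares, not the ones actually installed in the current environment.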

General Installation Template

pip install synthyverse[template]

Installation Examples

pip install synthyverse[ctgan]
pip install synthyverse[arf,bn,ctgan,tvae]
pip install synthyverse[ctgan,eval]
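In some shells, notably zsh, square brackets are glob characters, so the unquoted commands above can fail with a "no matches found" error. Quoting the requirement avoids this. The sketch below builds a quoted command from a list of template names with the standard-library `shlex.quote`, which targets POSIX shells. The function name `pip_install_command` is an illustrative assumption.

```python
import shlex
from typing import Iterable


def pip_install_command(templates: Iterable[str], package: str = "synthyverse") -> str:
    """Build a shell-safe pip command for one or more installation templates."""
    names = [t.strip() for t in templates if t.strip()]
    # Without any template this falls back to the core installation.
    requirement = f"{package}[{','.join(names)}]" if names else package
    return f"pip install {shlex.quote(requirement)}"


print(pip_install_command(["ctgan", "eval"]))  # pip install 'synthyverse[ctgan,eval]'
```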

Usage

See the docs to learn how to use the synthyverse!

Tutorials
