
Synthetic data generation and evaluation library



Welcome to the synthyverse!

An extensive ecosystem for synthetic data generation and evaluation in Python.

Read the docs for in-depth usage.

The synthyverse is a work in progress. Please provide any suggestions through a GitHub Issue.

Features

  • 🔧 Highly modular installation. Install only the modules you need to keep your installation lightweight.
  • 📚 Extensive library for synthetic data. Because of the synthyverse's modular installation, any generator or metric can be added quickly without dependency conflicts. This allows the synthyverse to host a large number of generators and evaluation metrics, and to wrap around any existing synthetic data library.
  • ⚙️ Benchmarking module for simplified synthetic data pipelines. The benchmarking module executes a modular pipeline of synthetic data generation and evaluation. Choose a generator, a set of evaluation metrics, and pipeline parameters, and obtain results on synthetic data quality.
  • 👷 Minimal preprocessing required. The synthyverse handles all preprocessing, so you do not need to scale, one-hot encode, or handle missing values yourself. Different preprocessing schemes can be selected by setting simple parameters.
  • 👍 Set constraints for your synthetic data. You can specify inter-column constraints that your synthetic data must follow. The synthyverse models constraints explicitly rather than through oversampling, which makes constraint setting efficient and reliable.

Installation

The synthyverse is unique in its modular installation set-up. To avoid conflicting dependencies, we provide various installation templates. Each template installs only those dependencies which are required to access certain modules.

Templates provide installation for specific generators, the evaluation module, and more. Install multiple templates to get access to multiple modules of the synthyverse, e.g., multiple generators and evaluation.

We strongly advise installing only the templates you need for a specific run. Each additional template increases the risk of dependency conflicts, so use a separate virtual environment for each installation.
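One way to follow this advice is to keep one virtual environment per template. The sketch below uses only the Python standard library; the helper names (`env_python`, `plan_installs`, `create_env`) and the `venv-<template>` directory layout are illustrative assumptions, not part of the synthyverse.

```python
import sys
import venv
from pathlib import Path
from typing import Dict, List


def env_python(env_dir: Path) -> Path:
    """Return the path of the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def plan_installs(templates: List[str], root: Path) -> Dict[str, List[str]]:
    """Map each template to the pip command that installs it in its own environment.

    Nothing is executed here; the function only builds the commands.
    The directory naming scheme is a hypothetical convention.
    """
    plan = {}
    for name in templates:
        env_dir = root / f"venv-{name}"
        plan[name] = [
            str(env_python(env_dir)), "-m", "pip", "install",
            f"synthyverse[{name}]",
        ]
    return plan


def create_env(env_dir: Path) -> None:
    """Create the virtual environment (with pip available) if it is missing."""
    if not env_dir.exists():
        venv.EnvBuilder(with_pip=True).create(env_dir)
```

To actually install, call `create_env(root / f"venv-{name}")` for each template and then run the matching command with `subprocess.run(cmd, check=True)`. Keeping the planning step separate makes it easy to inspect the commands before anything is installed.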

Note that the core installation without any template doesn't install any modules.

Available Installation Templates

The following installation templates are available:

| Template Name | Category | Installation Command |
| --- | --- | --- |
| arf | Generator | pip install synthyverse[arf] |
| bn | Generator | pip install synthyverse[bn] |
| cdtd | Generator | pip install synthyverse[cdtd] |
| ctabgan | Generator | pip install synthyverse[ctabgan] |
| ctgan | Generator | pip install synthyverse[ctgan] |
| forestdiffusion | Generator | pip install synthyverse[forestdiffusion] |
| nrgboost | Generator | pip install synthyverse[nrgboost] |
| permutation | Generator | pip install synthyverse[permutation] |
| realtabformer | Generator | pip install synthyverse[realtabformer] |
| smote | Generator | pip install synthyverse[smote] |
| synthpop | Generator | pip install synthyverse[synthpop] |
| tabargn | Generator | pip install synthyverse[tabargn] |
| tabddpm | Generator | pip install synthyverse[tabddpm] |
| tabsyn | Generator | pip install synthyverse[tabsyn] |
| tvae | Generator | pip install synthyverse[tvae] |
| unmaskingtrees | Generator | pip install synthyverse[unmaskingtrees] |
| xgenboost | Generator | pip install synthyverse[xgenboost] |
| base | Generator | pip install synthyverse[base] |
| eval | Evaluation | pip install synthyverse[eval] |
| full | All | pip install synthyverse[full] |
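To compare the templates listed here with what an installed copy of the package declares, the standard library's `importlib.metadata` can read the `Provides-Extra` fields from the distribution metadata. This is a minimal sketch: the helper name `declared_extras` is hypothetical, and the result shows which extras the distribution declares, not which ones were actually installed.

```python
from importlib import metadata
from typing import List


def declared_extras(dist_name: str = "synthyverse") -> List[str]:
    """Return the extras declared by an installed distribution.

    Returns an empty list when the distribution is not installed
    or declares no extras.
    """
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # get_all returns None when the field is absent.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    print(declared_extras())
```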

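Note: You can install multiple templates by separating them with commas, e.g., pip install synthyverse[ctgan,eval]. In shells that treat square brackets as glob patterns, such as zsh, quote the argument: pip install "synthyverse[ctgan,eval]"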

General Installation Template

pip install synthyverse[template]

Installation Examples

pip install synthyverse[ctgan]
pip install synthyverse[arf,bn,ctgan,tvae]
pip install synthyverse[ctgan,eval]
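When install commands are generated from a script, the bracketed requirement string can be built and checked programmatically. The sketch below validates names against the templates listed in this README; the function `requirement_spec` and the constant `KNOWN_TEMPLATES` are illustrative names, not part of the synthyverse.

```python
from typing import Iterable

# Template names copied from the table of installation templates in this README.
KNOWN_TEMPLATES = {
    "arf", "bn", "cdtd", "ctabgan", "ctgan", "forestdiffusion", "nrgboost",
    "permutation", "realtabformer", "smote", "synthpop", "tabargn", "tabddpm",
    "tabsyn", "tvae", "unmaskingtrees", "xgenboost", "base", "eval", "full",
}


def requirement_spec(templates: Iterable[str]) -> str:
    """Build a pip requirement such as 'synthyverse[ctgan,eval]'."""
    # Drop duplicates while preserving the order given by the caller.
    names = list(dict.fromkeys(templates))
    if not names:
        raise ValueError("at least one template is required")
    unknown = [name for name in names if name not in KNOWN_TEMPLATES]
    if unknown:
        raise ValueError(f"unknown templates: {unknown}")
    return "synthyverse[" + ",".join(names) + "]"


if __name__ == "__main__":
    print("pip install", requirement_spec(["ctgan", "eval"]))
```

Passing the result as a single list element to `subprocess.run` avoids the shell entirely, so the brackets need no quoting.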

Usage

See the docs to learn how to use the synthyverse!

Tutorials
