
A toolkit for generating and evaluating synthetic data in terms of utility, privacy, and similarity


Synthetic Data Generation Toolkit

This repository provides a toolkit for generating synthetic data with seven different models and for evaluating the generated data on utility, similarity/fidelity, and privacy. It is tailored to tabular datasets with a binary classification target (e.g., True/False, Yes/No).
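Because the toolkit targets binary classification, it can help to confirm that your target column holds exactly two labels before running any of the models. The following is a minimal sketch using only the Python standard library. The helper names (`target_values`, `is_binary_target`) and the sample data are hypothetical and are not part of the synthius API.

```python
import csv
import io


def target_values(csv_text: str, target: str) -> set:
    """Return the distinct non-empty values found in the target column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or target not in reader.fieldnames:
        raise KeyError(f"column {target!r} not found")
    # Skip missing or blank cells so they are not counted as a label.
    return {
        row[target].strip()
        for row in reader
        if row[target] and row[target].strip()
    }


def is_binary_target(csv_text: str, target: str) -> bool:
    """True when the target column holds exactly two distinct labels."""
    return len(target_values(csv_text, target)) == 2


# Hypothetical sample data, for illustration only.
sample = "age,income,approved\n34,52000,Yes\n51,61000,No\n29,38000,Yes\n"
print(is_binary_target(sample, "approved"))
```

For a real dataset, read the file contents into a string first (or adapt the helper to accept an open file) and pass the name of your own target column.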

Models Included

The project implements the following models for synthetic data generation:

  1. CopulaGAN
  2. CTGAN
  3. Gaussian Copula
  4. TVAE
  5. Gaussian Multivariate
  6. WGAN
  7. ARF

Quick Start

Step 1: Install the Package

Install the package using pip:

pip install synthius
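To confirm that the installation succeeded, you can query the installed distribution's metadata from Python. This is a small sketch that relies only on the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical and not part of synthius.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is missing.
print(installed_version("synthius"))
```

If this prints `None`, the package is not visible to the interpreter you are running, which usually means pip installed it into a different environment.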

Step 2: Usage Example

To understand how to use this package, explore the three example Jupyter notebooks included in the repository:

  1. Generator

    • Demonstrates how to generate synthetic data using seven different models.
    • Update paths and configurations (e.g., file paths, target column) to fit your dataset.
    • Run the cells to generate synthetic datasets.
  2. AutoGluon

    • Evaluates the utility of the synthetic data.
    • Update the paths as needed to analyze your data.
  3. Evaluation

    • Provides examples of computing metrics for evaluating synthetic data, including:
      • Utility
      • Fidelity/Similarity
      • Privacy
    • Update paths and dataset-specific configurations and run the cells to compute the results.

These notebooks serve as practical examples of how to use the toolkit.
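To give a feel for what a fidelity/similarity check measures, the sketch below computes the total variation distance between the category frequencies of one real column and one synthetic column. This is an illustration under stated assumptions, not the metric implementation used by the toolkit. The function name and the sample values are hypothetical.

```python
from collections import Counter


def total_variation_distance(real, synthetic):
    """Half the sum of absolute differences between category frequencies.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    if not real or not synthetic:
        raise ValueError("both columns must be non-empty")
    real_freq = Counter(real)
    synth_freq = Counter(synthetic)
    categories = set(real_freq) | set(synth_freq)
    return 0.5 * sum(
        abs(real_freq[c] / len(real) - synth_freq[c] / len(synthetic))
        for c in categories
    )


# Hypothetical columns, for illustration only.
real_column = ["Yes", "No", "Yes", "Yes"]
synthetic_column = ["Yes", "No", "No", "Yes"]
print(total_variation_distance(real_column, synthetic_column))
```

A lower value means the synthetic column reproduces the real column's category proportions more closely. The Evaluation notebook remains the reference for the metrics the toolkit actually reports.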

Additional Setup for Mac Users

Mac users may encounter errors during installation. To resolve these issues, install the required dependencies and set up the environment:

  1. Install dependencies using Homebrew:

    brew install libomp llvm
    
  2. Set up the environment:

    export PATH="$(brew --prefix llvm)/bin:$PATH"
    export CC=$(brew --prefix llvm)/bin/clang
    export CXX=$(brew --prefix llvm)/bin/clang++
    export CXXFLAGS="-I$(brew --prefix llvm)/include -I$(brew --prefix libomp)/include"
    export LDFLAGS="-L$(brew --prefix llvm)/lib -L$(brew --prefix libomp)/lib -lomp"
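Before re-running `pip install`, you may want to check that the variables above are actually set in the current shell session. The sketch below does that from Python. The helper name `missing_build_vars` is hypothetical, and the list of variables simply mirrors the exports shown above.

```python
import os

# Variables exported in the setup step above.
REQUIRED_VARS = ("CC", "CXX", "CXXFLAGS", "LDFLAGS")


def missing_build_vars(env):
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# An empty list means every variable is set in this process's environment.
print(missing_build_vars(os.environ))
```

Run this from the same terminal session in which you ran the `export` commands, since exported variables are only inherited by processes started from that shell.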
    

Acknowledgments

Special thanks to all contributors and the libraries used in this project.
