Synthetic Data Generation with optional Differential Privacy

Gretel Synthetics

A permissive synthetic data library from Gretel.ai

Documentation

Try it out now

To try gretel-synthetics quickly, click the button below and follow the tutorials.

Open in Colab

Check out additional examples here.

Getting Started

This section will guide you through installation of gretel-synthetics and dependencies that are not directly installed by the Python package manager.

Dependency Requirements

By default, we do not install certain core requirements. Depending on which model(s) you plan to use, install the following dependencies separately from gretel-synthetics.

  • Torch: Used by Timeseries DGAN and ACTGAN (for ACTGAN, Torch is installed by SDV). We recommend version 2.0.
  • SDV (Synthetic Data Vault): Used by ACTGAN. We recommend version 0.17.x.

These dependencies can be installed by doing the following:

pip install "sdv<0.18" # for ACTGAN
pip install torch==2.0 # for Timeseries DGAN
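
If these installs are scripted from Python instead of typed into a shell, passing the requirement as its own list element keeps a specifier such as `sdv<0.18` from being interpreted as a shell redirect. The sketch below is only an illustration: `EXTRA_REQUIREMENTS` and `pip_install_command` are hypothetical names, not part of gretel-synthetics, and the pins simply mirror the recommendations above.

```python
import sys

# Hypothetical mapping from model name to the extra requirement recommended above.
EXTRA_REQUIREMENTS = {
    "actgan": "sdv<0.18",
    "timeseries_dgan": "torch==2.0",
}


def pip_install_command(model: str) -> list:
    """Build the pip command that installs the extra dependency for `model`."""
    try:
        requirement = EXTRA_REQUIREMENTS[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None
    # Invoking pip through the current interpreter targets the active environment.
    return [sys.executable, "-m", "pip", "install", requirement]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; because no shell is involved, the `<` needs no quoting.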

To install gretel-synthetics itself, either clone the repo and, from its root, run:

pip install -U .

or

pip install gretel-synthetics

Then install and launch Jupyter:

pip install jupyter
jupyter notebook

When the UI launches in your browser, navigate to examples/synthetic_records.ipynb and get generating!

If you want to install gretel-synthetics locally and use a GPU (recommended):

  1. Create a virtual environment (e.g. using conda)
conda create --name tf python=3.9
  2. Activate the virtual environment
conda activate tf
  3. Run the setup script ./setup-utils/setup-gretel-synthetics-tensorflow24-with-gpu.sh

The last step installs the software packages needed for GPU usage, along with tensorflow=2.8 and gretel-synthetics. Note that this script works only on Ubuntu 18.04; you might need to modify it for other OS versions.

Timeseries DGAN Overview

The Timeseries DGAN module contains a PyTorch implementation of a DoppelGANger model that is optimized for timeseries data. As with TensorFlow, you will need to install PyTorch manually:

pip install torch==2.0

This notebook shows basic usage on a small data set of smart home sensor readings.

ACTGAN Overview

ACTGAN (Anyway CTGAN) is an extension of the popular CTGAN implementation that provides some additional functionality to improve memory usage, autodetection and transformation of columns, and more.

To use this model, you will need to manually install SDV:

pip install "sdv<0.18"

Keep in mind that this will also install several dependencies like PyTorch that SDV relies on, which may conflict with PyTorch versions installed for use with other models like Timeseries DGAN.
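
One way to notice such a conflict is to inspect what actually ended up installed. The sketch below uses only the standard library; the helper names are hypothetical, and the version comparison is deliberately simplified (the `packaging` library handles the full version grammar).

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version: str) -> tuple:
    """Turn '0.17.2' or '2.0.1+cu118' into a comparable tuple of ints."""
    public = version.split("+")[0]  # drop local build tags such as '+cu118'
    parts = []
    for piece in public.split("."):
        if not piece.isdigit():
            break  # stop at pre-release markers such as 'rc1'
        parts.append(int(piece))
    return tuple(parts)


def sdv_is_supported(version: str) -> bool:
    """True when `version` satisfies the sdv<0.18 pin recommended for ACTGAN."""
    return release_tuple(version) < (0, 18)


if __name__ == "__main__":
    for name in ("torch", "sdv"):
        print(name, installed_version(name))
```

Running this after installing SDV shows which PyTorch version it pulled in, so it can be compared with the one intended for Timeseries DGAN.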

The ACTGAN interface is a superset of the CTGAN interface. To see the additional features, please take a look at the ACTGAN demo notebook in the examples directory of this repo.

