
Synthetic data generators that implement several synthesis methods.

Project description



What is Synthetic Data?

Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical properties of real data without containing any identifiable information, which helps protect individuals' privacy.
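
To make the idea concrete, here is a minimal sketch in plain Python. It is not how ydata-synthetic works (the package uses GANs); it swaps in a much simpler technique, fitting a two-column Gaussian and sampling from it, purely to illustrate what "replicating statistics without copying records" means. The "real" data, the column meanings, and all function names are hypothetical.

```python
import math
import random


def fit_gaussian_2d(rows):
    """Estimate means, variances and covariance of two numeric columns."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in rows) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)
    return mean_x, mean_y, var_x, var_y, cov_xy


def sample_synthetic(params, n_samples, rng):
    """Draw new rows from the fitted Gaussian; no real row is copied."""
    mean_x, mean_y, var_x, var_y, cov_xy = params
    # Cholesky factor of the 2x2 covariance matrix [[var_x, cov], [cov, var_y]].
    l11 = math.sqrt(var_x)
    l21 = cov_xy / l11
    l22 = math.sqrt(max(var_y - l21 ** 2, 0.0))
    synthetic = []
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        synthetic.append((mean_x + l11 * z1, mean_y + l21 * z1 + l22 * z2))
    return synthetic


def correlation(rows):
    """Pearson correlation between the two columns."""
    _, _, var_x, var_y, cov_xy = fit_gaussian_2d(rows)
    return cov_xy / math.sqrt(var_x * var_y)


rng = random.Random(0)

# Hypothetical "real" data: two correlated columns (say, income and spending).
real = []
for _ in range(2000):
    income = rng.gauss(50.0, 10.0)
    spending = 0.6 * income + rng.gauss(0.0, 4.0)
    real.append((income, spending))

synthetic = sample_synthetic(fit_gaussian_2d(real), 2000, rng)

print(f"real correlation:      {correlation(real):.2f}")
print(f"synthetic correlation: {correlation(synthetic):.2f}")
```

The two printed correlations should be close, while the synthetic rows are freshly sampled rather than copied from the real ones. A Gaussian fit only captures means and pairwise covariance, and sampling from a fitted model is not by itself a formal privacy guarantee; richer generators such as GANs aim to capture more complex structure.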

Why Synthetic Data?

Synthetic data can be used for many applications:

  • Protect privacy
  • Remove bias
  • Balance datasets
  • Augment datasets

ydata-synthetic

This repository contains material on Generative Adversarial Networks (GANs) for synthetic data generation, in particular for regular tabular data and time series. It consists of a set of GAN architectures implemented with TensorFlow 2.0. An example Jupyter Notebook is included to show how to use the different architectures.

Quickstart

pip install ydata-synthetic

Examples

Here you can find usage examples of the package and models to synthesize tabular data.

  • Credit Fraud dataset (notebook available on Colab)
  • Stock dataset (notebook available on Colab)

Project Resources

In this repo you can find the following GAN architectures:

Tabular data

Sequential data


Download files

Source distribution: ydata-synthetic-0.3.1.tar.gz (28.2 kB)

Built distribution: ydata_synthetic-0.3.1-py2.py3-none-any.whl (38.5 kB, Python 2 and Python 3)
