Synthetic data generation.
Welcome to LINDA
MELINDA is a Python library for creating synthetic tabular data.
It uses generative models to learn the statistical properties of your real data and then generates synthetic data that follows them.
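As a rough illustration of the "learn statistical properties, then sample" idea, the sketch below fits a multivariate Gaussian to two numeric columns and draws new rows from it. This is not how MELINDA's models work internally; the function names `fit_gaussian` and `sample_gaussian` are hypothetical, and only NumPy and pandas are used.

```python
import numpy as np
import pandas as pd


def fit_gaussian(data: pd.DataFrame):
    """Estimate the mean vector and covariance matrix of numeric columns."""
    values = data.to_numpy(dtype=float)
    return values.mean(axis=0), np.cov(values, rowvar=False)


def sample_gaussian(mean, cov, columns, n_samples, seed=0):
    """Draw synthetic rows from the fitted Gaussian."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mean, cov, size=n_samples)
    return pd.DataFrame(samples, columns=columns)


# Toy "real" data: two independent numeric columns.
rng = np.random.default_rng(42)
toy_real = pd.DataFrame({
    "x": rng.normal(0.0, 1.0, 500),
    "y": rng.normal(5.0, 2.0, 500),
})

# "Fit" by estimating statistics, then sample new rows from them.
mean, cov = fit_gaussian(toy_real)
toy_synthetic = sample_gaussian(mean, cov, toy_real.columns, n_samples=200)
print(toy_synthetic.describe())
```

A Gaussian only captures means and linear correlations, which is why richer generative models are typically used for real tabular data.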
Installation
git clone https://github.com/hse-cs/LINDA.git
cd LINDA
pip install -e .
or, if you use Poetry:
poetry install
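After installing, one quick way to check that the packages used in the example below are importable is the following sketch. It relies only on the standard library; the helper name `is_installed` is hypothetical, and `probaforms` is checked because the usage example imports from it.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


for name in ("melinda", "probaforms"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```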
Basic usage
The following code snippet creates an example of real data, fits a generative model, and samples synthetic data.
import numpy as np
import pandas as pd
from melinda.models import ProbaformsSynthesizer
from probaforms.models import CVAE
# generate an example of real data
n = 100
data_real = pd.DataFrame()
data_real['col_1'] = np.random.rand(n)
data_real['col_2'] = np.random.rand(n)
data_real['col_3'] = [str(i) for i in np.random.randint(0, 10, n)]
data_real['col_4'] = [str(i) for i in np.random.randint(0, 5, n)]
num_cols = ['col_1', 'col_2']
cat_cols = ['col_3', 'col_4']
lab_cols = None
# fit a generative model
model = CVAE(latent_dim=10, hidden=(10,), lr=0.001, n_epochs=10)
gen = ProbaformsSynthesizer(model, num_cols, cat_cols, lab_cols, cat_transform='OneHotEncoder')
gen.fit(data_real)
# sample synthetic data
data_synthetic = gen.sample(n_samples=10)
data_synthetic.head()
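A simple sanity check after sampling is to compare basic statistics of the real and synthetic data. The sketch below uses only pandas and NumPy; the helpers `compare_numeric` and `compare_categorical` are hypothetical and not part of MELINDA. It builds small stand-in frames so that it runs on its own, and it assumes the sampled categorical columns hold values comparable to the real ones.

```python
import numpy as np
import pandas as pd


def compare_numeric(real, synthetic, num_cols):
    """Side-by-side mean and standard deviation for numeric columns."""
    return pd.DataFrame({
        "real_mean": real[num_cols].mean(),
        "synthetic_mean": synthetic[num_cols].mean(),
        "real_std": real[num_cols].std(),
        "synthetic_std": synthetic[num_cols].std(),
    })


def compare_categorical(real, synthetic, col):
    """Relative frequency of each category in both datasets."""
    freq = pd.DataFrame({
        "real": real[col].value_counts(normalize=True),
        "synthetic": synthetic[col].value_counts(normalize=True),
    })
    # Categories absent from one dataset get frequency 0.
    return freq.fillna(0.0).sort_index()


# Stand-in frames; in practice pass data_real and data_synthetic instead.
rng = np.random.default_rng(0)
demo_real = pd.DataFrame({
    "col_1": rng.random(100),
    "col_3": rng.integers(0, 10, 100).astype(str),
})
demo_synth = pd.DataFrame({
    "col_1": rng.random(10),
    "col_3": rng.integers(0, 10, 10).astype(str),
})

numeric_report = compare_numeric(demo_real, demo_synth, ["col_1"])
categorical_report = compare_categorical(demo_real, demo_synth, "col_3")
print(numeric_report)
print(categorical_report)
```

With only 10 synthetic rows, as in the example above, the comparison will be noisy; sampling more rows gives a more meaningful picture.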