A Python-native synthd client library
- License: Apache v2.0
- Documentation: https://openquery-io.github.io/synthpy/
- Homepage: https://getsynth.com
What is this?
This is Synth! A fast and highly configurable NoSQL synthetic data
engine. It reconciles the two worlds of synthetic data and test data by
letting users generate realistic synthetic data for testing their
applications and ML models.
What can I do with this?
With Synth you can:
- Anonymize sensitive data easily. As simple as JSON-in/JSON-out. If you're not happy with the result, simply tweak the synthetic data model with a custom JSON metadata format and Synth will adjust everything on the fly, no additional ETL required.
- Augment your datasets with synthetic data. For those times when you already have some data but just not enough of it to do what you need to do. Synth can extrapolate from patterns it finds in your data, so you can create as much of it as you want.
- Create entirely new fake data declaratively. You can even add your own set of constraints and logic to create completely unseen scenarios.
How does it work?
It has two components:
- synthd: a persistent process that ingests raw (usually sensitive) training data and trains and builds synthetic data models from it. Think of it as a NoSQL datastore that never persists actual data, only anonymized model parameters.
- synthpy: our reference Python implementation for the synthd API. This lets you leverage synthd in custom scripts and test harnesses.
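As a rough illustration of the test-harness use case, the sketch below wraps the two client calls that appear in this README's Quickstart (`put_documents` and `get_documents`) in a small helper. The helper `synthetic_fixture` and the `StubSynth` class are hypothetical names invented for this example and are not part of synthpy; the stub only imitates the call shape so the snippet runs without a `synthd` daemon, and the return type of the real `get_documents` is assumed here to be a list of documents.

```python
from itertools import cycle, islice


class StubSynth:
    """Hypothetical in-memory stand-in for a synthpy client (not part of synthpy)."""

    def __init__(self):
        self._store = {}

    def put_documents(self, namespace, collection, batch):
        # Remember the submitted documents per (namespace, collection) pair.
        self._store.setdefault((namespace, collection), []).extend(batch)

    def get_documents(self, namespace, collection, size):
        # A real daemon would sample from a trained model; this stub simply
        # repeats copies of the stored documents until `size` is reached.
        docs = self._store.get((namespace, collection), [])
        return [dict(d) for d in islice(cycle(docs), size)]


def synthetic_fixture(client, namespace, collection, seed_documents, size):
    """Submit seed documents, then request `size` generated documents."""
    client.put_documents(namespace=namespace, collection=collection, batch=seed_documents)
    return client.get_documents(namespace=namespace, collection=collection, size=size)


# Illustrative seed data; the field names are made up for this sketch.
seed = [{"name": "Ada", "age": 36}, {"name": "Linus", "age": 28}]
users = synthetic_fixture(StubSynth(), "app", "users", seed, size=5)
print(len(users))
```

Because the helper only relies on duck typing, a test suite could pass the stub in unit tests and a real client object in integration tests, assuming the real client accepts the same keyword arguments as in the Quickstart.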
Quickstart
Here is an end-to-end example using the Python client, synthpy.
import json

from synthpy import Synth

# Assuming `synthd` is running on `localhost` with default settings
client = Synth("localhost:8182")

# Load the raw JSON documents to train on
with open("my_users_data.json", "r") as data_f:
    documents = json.load(data_f)

# Submit your JSON documents to `synthd` for training
client.put_documents(namespace="app", collection="users", batch=documents)

# Generate 100 new synthetic users
synthetic_users = client.get_documents(namespace="app", collection="users", size=100)
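The example above reads its training data from `my_users_data.json`. The following is a minimal sketch of preparing and sanity-checking such a file before submitting it. It assumes the `batch` argument expects a list of JSON objects, which this README does not state explicitly; `load_documents` is a hypothetical helper and the sample fields are invented for illustration.

```python
import json
import os
import tempfile


def load_documents(path):
    """Load a JSON file and check that it holds a list of JSON objects."""
    with open(path, "r") as data_f:
        documents = json.load(data_f)
    if not isinstance(documents, list):
        raise ValueError("expected a top-level JSON array of documents")
    for index, doc in enumerate(documents):
        if not isinstance(doc, dict):
            raise ValueError(f"document {index} is not a JSON object")
    return documents


# Illustrative sample data; the field names are made up for this sketch.
sample_users = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Linus", "email": "linus@example.com"},
]

# Write the sample to a temporary file, then load it back through the check.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_users_data.json")
    with open(path, "w") as out_f:
        json.dump(sample_users, out_f)
    documents = load_documents(path)

print(len(documents))
```

Failing early on a malformed file keeps shape errors in the script itself instead of surfacing later as an error from the daemon.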
Want to know more?
As of now, only the Python client for Synth is free and open-source.
But it is also on our roadmap to open-source big chunks of the daemon,
synthd, where the real magic happens! So stay tuned!
In the meantime, head over to our documentation or hit us up if you want to give Synth a try!