GENESIS Core Lib

What is GENESIS Core Lib?

GENESIS Core Lib is a synthetic data generation library for Python 3.12+. It generates tabular and time series data with generative models: two VAEs (TabularVAE and TimeSeriesVAE) and CTGAN. Models are trained on the fly and adapt automatically to the characteristics of the input data, trained models can be saved and reused, generation can be steered with custom mathematical functions, and the output is scored with integrated quality evaluation metrics. The library is designed with privacy preservation in mind, is optimized for both CPU and GPU processing, and exposes its functionality through a Job API aimed at data augmentation, privacy preservation, and ML model testing.

Why use GENESIS Core Lib?

GENESIS Core Lib specializes in industrial sensor data and time series applications. Models such as TimeSeriesVAE are designed to preserve the temporal patterns and statistical properties that real-world scenarios depend on. Synthetic data is available without the cost and delay of deploying physical sensors, which supports rapid prototyping, algorithm development, and testing, and sharing synthetic data instead of real data helps maintain privacy. Fidelity is checked with evaluation metrics that include Dynamic Time Warping and correlation preservation. Typical application areas are manufacturing IoT, environmental monitoring, energy utilities, and medical devices, where realistic temporal data is critical.
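
Dynamic Time Warping, named above as one of the fidelity metrics, scores how closely two series match when one is allowed to stretch or shift in time. The sketch below is a textbook implementation for illustration only. It is not taken from GENESIS Core Lib, and the function name `dtw_distance` and the sample series are assumptions.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with absolute-difference cost.

    Illustrative only: this is not the GENESIS Core Lib implementation.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed moves.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


real = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, delayed by one step
flat = [1.0, 1.0, 1.0, 1.0, 1.0]          # no temporal pattern at all

print(dtw_distance(real, shifted))  # 0.0
print(dtw_distance(real, flat))     # 3.0
```

A time-shifted copy scores 0.0 because warping absorbs the delay, while the flat series is penalized for missing the peak and both troughs. That tolerance to small shifts is why the metric is a common choice for comparing real and synthetic sensor traces.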

GENESIS Core Lib is an extensible library for generating high-quality synthetic data with machine learning models. It is suited to data augmentation, privacy preservation, and ML model testing.

✨ Key Features

  • Generative AI Architectures: Advanced VAEs (TabularVAE, TimeSeriesVAE) and CTGAN for tabular and time series data
  • Adaptive Training: On-the-fly model training with automatic adaptation to your data characteristics
  • Model Persistence: Save and reuse trained generative models for consistent data generation
  • Behavior Control: Manipulate generation patterns with custom mathematical functions
  • Integrated Evaluation: Built-in quality assessment metrics for comprehensive data evaluation
  • High Performance: Optimized for both CPU and GPU processing
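
To give a feel for what a quality assessment metric can look like, the sketch below measures how well pairwise column correlations survive in a synthetic table. It is a generic illustration written against plain Python lists, not the library's built-in evaluator. The names `pearson` and `correlation_gap` and the sample tables are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_gap(real, synthetic):
    """Mean absolute difference between pairwise correlations of two tables.

    Both tables are dicts mapping column name -> list of numbers and must
    share at least two column names. 0 means the correlation structure is
    fully preserved; 2 is the worst possible value.
    """
    names = sorted(real)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    gaps = [
        abs(pearson(real[a], real[b]) - pearson(synthetic[a], synthetic[b]))
        for a, b in pairs
    ]
    return sum(gaps) / len(gaps)


real_table = {"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]}
faithful = {"x": [1, 2, 3, 4], "y": [3, 5, 7, 9]}   # y still rises with x
inverted = {"x": [1, 2, 3, 4], "y": [8, 6, 4, 2]}   # relationship flipped

print(correlation_gap(real_table, faithful))  # close to 0
print(correlation_gap(real_table, inverted))  # close to 2
```

A single number like this is easy to track across training runs, but it only captures linear relationships, so it complements rather than replaces distribution-level and temporal checks.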

🛠️ Quick Start

Quick Install

pip install sdg-core-lib

🚀 Try it

from sdg_core_lib import Job

# Inline configuration as a JSON-style Python dict (no file needed)
config = {
    "n_rows": 1000,
    "model": {
        "algorithm_name": "sdg_core_lib.data_generator.models.VAEs.implementation.TabularVAE.TabularVAE",
        "model_name": "customer_synthetic_model"
    },
    "dataset": {
        "dataset_type": "table",
        "data": [
            {
                "column_data": [13.71, 13.4, 13.27, 13.17, 14.13, 13.88, 13.24, 13.73],
                "column_name": "alcohol",
                "column_type": "continuous",
                "column_datatype": "float64"
            },
            {
                "column_data": [5.65, 3.91, 4.28, 2.59, 4.1, 3.9, 3.8, 4.2],
                "column_name": "malic_acid",
                "column_type": "continuous",
                "column_datatype": "float64"
            },
            {
                "column_data": [1.28, 1.05, 1.02, 1.03, 1.71, 1.23, 1.07, 1.5],
                "column_name": "ash",
                "column_type": "continuous",
                "column_datatype": "float64"
            }
        ]
    },
    "save_filepath": "./models"
}

# Create and run a synthetic data generation job
job = Job(
    n_rows=config["n_rows"],
    model_info=config["model"],
    dataset=config["dataset"],
    save_filepath=config.get("save_filepath", "./models")
)

# Train the model on the fly and generate synthetic data
results, metrics, model, schema = job.train()
print(f"Generated {len(results)} synthetic rows")
print(f"Quality metrics: {metrics}")

📖 See Quick Start Guide for detailed examples
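
The "dataset" payload in the example above lists every column by hand. When the data already lives in memory, a small helper can build the same structure. The sketch below is an assumption-laden convenience, not part of the library: `build_table_dataset` is a hypothetical name, the keys are copied from the example, and it only emits "continuous" / "float64" columns because those are the only values the example shows.

```python
def build_table_dataset(columns):
    """Build a "dataset" dict shaped like the one in the Try it example.

    `columns` maps column name -> list of numbers. Hypothetical helper:
    only continuous float64 columns are handled, since other column types
    are not shown in the example.
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        # Sanity check added for this sketch: a table needs equal-length columns.
        raise ValueError("all columns must have the same number of rows")
    return {
        "dataset_type": "table",
        "data": [
            {
                "column_data": [float(v) for v in values],
                "column_name": name,
                "column_type": "continuous",
                "column_datatype": "float64",
            }
            for name, values in columns.items()
        ],
    }


dataset = build_table_dataset(
    {
        "alcohol": [13.71, 13.4, 13.27, 13.17],
        "malic_acid": [5.65, 3.91, 4.28, 2.59],
    }
)
print([column["column_name"] for column in dataset["data"]])
```

The resulting dict can be passed as the `dataset` argument of `Job` in place of the hand-written one.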

📚 Documentation

📖 User Documentation

Complete guide for users including:

  • Core concepts and the Job API
  • Data Types and Datasets
  • Fantastic Models and how to use them
  • How to handle raw data with Processors
  • How to control generation of synthetic data with Functions
  • Evaluate your work with Evaluators

🔧 Developer Documentation

Technical documentation for developers:

  • Architecture overview and design patterns
  • Extension points and customization
  • Development setup and testing
  • Code organization and standards

Quick Start Guide

Get started immediately with:

  • Installation instructions
  • Basic examples and tutorials
  • Common use cases
  • Troubleshooting tips

📋 Step-by-Step Tutorial

Hands-on tutorial covering:

  • Complete project workflow
  • Real-world examples
  • Advanced techniques
  • Performance optimization

Roadmap

A detailed Roadmap can be found here.

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📄 License

This project is licensed under the GNU Affero General Public License v3.0. See the LICENSE file for details.

🙏 Acknowledgments

  • Built with TensorFlow and Keras for deep learning models
  • Statistical evaluation using scipy and numpy
  • Inspired by state-of-the-art synthetic data generation research

GENESIS Core Lib - Generating Tomorrow's Data, Today 🚀
