GENESIS Core Lib
🧬 Advanced Synthetic Data Generation Library for Python 3.12+
GENESIS Core Lib is an extensible library for generating synthetic data with machine learning models. It is intended for data augmentation, privacy preservation, and ML model testing.
✨ Key Features
- 🎯 Multiple Model Types: VAEs (TabularVAE, TimeSeriesVAE) and CTGAN
- 📊 Data Type Support: Tabular data, time series with group_index, and custom datasets
- 🔧 Function-Based Generation: Mathematical functions for controlled data generation
- 📈 Quality Evaluation: Built-in metrics for data quality assessment
- 🚀 High Performance: Optimized for both CPU and GPU processing
- 🔒 Privacy Focused: Designed with privacy preservation in mind
🛠️ Installation
Quick Install
    pip install sdg-core-lib
Development Install
    git clone https://github.com/emiliocimino/generator_core_lib.git
    cd generator_core_lib
    pip install -e ".[dev]"
🚀 Quick Start
    import json

    from sdg_core_lib import Job

    # Load your dataset configuration
    with open("config.json", "r") as f:
        config = json.load(f)

    # Create a synthetic data generation job from the configuration
    job = Job(
        n_rows=config["n_rows"],
        model_info=config["model"],
        dataset=config["dataset"],
        save_filepath=config.get("save_filepath", "./models"),
    )

    # Train the model and generate synthetic data
    results, metrics, model, schema = job.train()
    print(f"Generated {len(results)} synthetic rows")
    print(f"Quality metrics: {metrics}")
📖 See Quick Start Guide for detailed examples
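The Quick Start snippet reads four keys from `config.json`: `n_rows`, `model`, `dataset`, and the optional `save_filepath`. A minimal sketch of a loader that checks for those keys before a `Job` is built might look like the following. The helper name `load_job_config` and its error message are hypothetical and not part of the library, and the contents of the `model` and `dataset` entries are left opaque because this README does not specify them.

```python
import json
from pathlib import Path

# Keys the Quick Start snippet reads unconditionally.
REQUIRED_KEYS = ("n_rows", "model", "dataset")


def load_job_config(path):
    """Load a JSON config and verify the keys used in the Quick Start.

    Hypothetical helper for illustration; not part of sdg_core_lib.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing required keys: {missing}")
    # Mirror the default used in the Quick Start snippet.
    config.setdefault("save_filepath", "./models")
    return config
```

The returned dictionary can then be passed to `Job` in the same way as in the Quick Start.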
🔧 Function-Based Generation
    from sdg_core_lib import Job

    # Generate data using mathematical functions
    functions = [
        {
            "feature": "linear_data",
            "function_name": "LinearFunction",
            # m and q are presumably the slope and intercept of y = m*x + q
            "parameters": {
                "m": 2.0,
                "q": 1.0,
                "min_value": 0.0,
                "max_value": 100.0,
            },
        }
    ]

    job = Job(n_rows=100, functions=functions)
    synthetic_data = job.generate_from_functions()
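To make the parameters above concrete, here is a rough sketch of what a linear function generator could compute. It assumes that `m` and `q` are the slope and intercept of y = m*x + q and that `min_value` and `max_value` bound the input domain, which is sampled at evenly spaced points. None of this is confirmed by the README, and `linear_feature` is a hypothetical stand-in, not the library's `LinearFunction`.

```python
def linear_feature(n_rows, m, q, min_value, max_value):
    """Return n_rows values of y = m * x + q for evenly spaced x.

    Assumption: x spans [min_value, max_value] inclusive. The real
    LinearFunction in sdg_core_lib may sample or clip differently.
    """
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    if n_rows == 1:
        xs = [min_value]
    else:
        step = (max_value - min_value) / (n_rows - 1)
        xs = [min_value + i * step for i in range(n_rows)]
    return [m * x + q for x in xs]


values = linear_feature(n_rows=5, m=2.0, q=1.0, min_value=0.0, max_value=100.0)
print(values)  # [1.0, 51.0, 101.0, 151.0, 201.0]
```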
📚 Documentation
📖 User Documentation
Complete guide for users including:
- Core concepts and architecture
- Data types (tabular, time series, custom)
- Model configurations (VAEs, CTGAN)
- API reference and examples
- Best practices and troubleshooting
🔧 Developer Documentation
Technical documentation for developers:
- Architecture overview and design patterns
- Extension points and customization
- Development setup and testing
- Code organization and standards
⚡ Quick Start Guide
Get started immediately with:
- Installation instructions
- Basic examples and tutorials
- Common use cases
- Troubleshooting tips
📋 Step-by-Step Tutorial
Hands-on tutorial covering:
- Complete project workflow
- Real-world examples
- Advanced techniques
- Performance optimization
🏗️ Architecture
GENESIS Core Lib follows a modular architecture:
- Data Generator: ML models (TabularVAE, TimeSeriesVAE, CTGAN)
- Dataset: Data abstraction (Table, TimeSeries) with proper column structure
- Preprocess: Data transformation and normalization strategies
- Postprocess: Function application and data modification
- Evaluate: Quality assessment and statistical metrics
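The stages listed above can be pictured as a simple chain. The sketch below is illustrative only: it uses min-max scaling as the preprocessing step, a placeholder generator that resamples the scaled values, a rounding function as the postprocessing step, and a mean-difference score as the evaluation. All function names are hypothetical, and the library's actual models (TabularVAE, TimeSeriesVAE, CTGAN) and metrics are not reproduced here.

```python
import random
import statistics


def preprocess(column):
    """Min-max scale a column to [0, 1]; also return (low, span) for inversion."""
    low, high = min(column), max(column)
    span = (high - low) or 1.0  # guard against constant columns
    return [(v - low) / span for v in column], (low, span)


def generate(scaled, n_rows, seed=0):
    """Placeholder for a trained model: resample scaled values with replacement."""
    rng = random.Random(seed)
    return [rng.choice(scaled) for _ in range(n_rows)]


def inverse_preprocess(scaled, bounds):
    """Undo the min-max scaling to return to the original units."""
    low, span = bounds
    return [v * span + low for v in scaled]


def postprocess(values, fn):
    """Apply a user-supplied function to every generated value."""
    return [fn(v) for v in values]


def evaluate(real, synthetic):
    """Toy quality score: absolute difference between the column means."""
    return abs(statistics.fmean(real) - statistics.fmean(synthetic))


real = [10.0, 12.0, 15.0, 20.0, 30.0]
scaled, bounds = preprocess(real)
raw = inverse_preprocess(generate(scaled, n_rows=100), bounds)
synthetic = postprocess(raw, lambda v: round(v, 2))
print(len(synthetic), evaluate(real, synthetic))
```

Keeping each stage as a separate function mirrors the modular layout described above: any one stage can be swapped (for example, a different normalization strategy) without touching the others.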
🤝 Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Setup
    # Clone repository
    git clone https://github.com/emiliocimino/generator_core_lib.git
    cd generator_core_lib

    # Create virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate

    # Install in development mode
    pip install -e ".[dev]"
Running Tests
    # Run all tests
    pytest

    # Run with coverage
    pytest --cov=sdg_core_lib

    # Run specific test file
    pytest tests/test_job.py
📄 License
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.
🙏 Acknowledgments
- Built with TensorFlow and Keras for deep learning models
- Statistical evaluation using scipy and numpy
- Inspired by state-of-the-art synthetic data generation research
📞 Support
- 📖 Documentation
- 🐛 Issues
- 💬 Discussions
GENESIS Core Lib - Generating Tomorrow's Data, Today 🚀
File details
Details for the file sdg_core_lib-0.1.8.dev4.tar.gz.
File metadata
- Download URL: sdg_core_lib-0.1.8.dev4.tar.gz
- Upload date:
- Size: 34.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 28005984beeb1238565104e9e06a406fc5e9cf862e3572b3597e472071a1c47a |
| MD5 | 717b897d235e491b854a4c271aeb485a |
| BLAKE2b-256 | c7ff9d21d1eb118317a95a2f77017ade8279641f6054250dba8e42a4883d2a66 |
File details
Details for the file sdg_core_lib-0.1.8.dev4-py3-none-any.whl.
File metadata
- Download URL: sdg_core_lib-0.1.8.dev4-py3-none-any.whl
- Upload date:
- Size: 64.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d7e5cb83435801068eec98ffd5d030c385939d6816293b76e883cb37e192be8 |
| MD5 | a67491f226692c10d88e82c5a6226200 |
| BLAKE2b-256 | ab8bab30d333457c8369e9ec2a38cd16769bda9b107efba95391080c95de7080 |