Synthetic Data Generation

SDG Hub

Composable blocks and flows for synthetic data generation

SDG Hub is a Python framework for building synthetic data generation pipelines. Chain LLM, parsing, transform, filtering, and agent blocks into YAML-defined flows -- then generate training data at scale.
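To make the block-and-flow idea concrete, here is a minimal conceptual sketch in plain Python. It does not use the SDG Hub API: the names `template_block`, `length_filter_block`, and `run_flow` are hypothetical, and the "LLM" step is a simple string template standing in for a real model call. It only illustrates how small blocks that each take and return rows can be chained into a pipeline.

```python
from typing import Callable, Dict, List

# A row is one record; a block maps a list of rows to a new list of rows.
Row = Dict[str, str]
Block = Callable[[List[Row]], List[Row]]


def length_filter_block(rows: List[Row]) -> List[Row]:
    """Hypothetical filtering block: drop rows whose question is too short."""
    return [row for row in rows if len(row["question"]) >= 10]


def template_block(rows: List[Row]) -> List[Row]:
    """Hypothetical stand-in for an LLM block: add a 'response' column."""
    return [{**row, "response": f"Answer to: {row['question']}"} for row in rows]


def run_flow(blocks: List[Block], rows: List[Row]) -> List[Row]:
    """Apply each block in order, feeding its output to the next block."""
    for block in blocks:
        rows = block(rows)
    return rows


seed = [{"question": "What is synthetic data?"}, {"question": "Why?"}]
generated = run_flow([length_filter_block, template_block], seed)
print(generated)
```

In SDG Hub itself the sequence of blocks is declared in a YAML flow rather than a Python list, but the underlying idea of passing a dataset through an ordered chain of blocks is the same.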

Get Started

pip install sdg-hub

from sdg_hub import FlowRegistry, Flow

# Discover and load a built-in flow
FlowRegistry.discover_flows()
flow = Flow.from_yaml(FlowRegistry.get_flow_path("MCP Server Distillation"))

# Configure the model, then run the flow.
# `dataset` is the input dataset you supply; it is not defined in this snippet.
flow.set_model_config(model="openai/gpt-4o")
result = flow.generate(dataset)

See the Quick Start for a full walkthrough, or browse all built-in flows.

Documentation

Full documentation is available at ai-innovation.team/sdg_hub:

  • Installation -- setup, optional dependencies, development install
  • Quick Start -- end-to-end walkthrough from loading a flow to generating data
  • Core Concepts -- blocks, flows, registries, and dataset handling
  • Block Reference -- LLM, parsing, transform, filtering, agent, and custom blocks
  • Flow Reference -- YAML schema, built-in flows, custom flows
  • API Reference -- auto-generated from source
  • Contributing -- development setup and contribution guidelines

License

Apache License 2.0 -- see LICENSE.


Built by the Red Hat AI Innovation Team
