Synthetic Data Generation

Project description

SDG Hub

Composable blocks and flows for synthetic data generation

Supports Python 3.10+.


SDG Hub is a Python framework for building synthetic data generation pipelines. Chain LLM, parsing, transform, filtering, and agent blocks into YAML-defined flows -- then generate training data at scale.
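To make the idea of chaining blocks concrete, here is a minimal conceptual sketch in plain Python. It does not use the sdg_hub API: the names `transform_block`, `filter_block`, and `run_flow` are hypothetical stand-ins, and a real flow is defined in YAML and loaded through the library as shown under Get Started.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
# Hypothetical block type for illustration only; not sdg_hub's block class.
Block = Callable[[List[Row]], List[Row]]


def transform_block(rows: List[Row]) -> List[Row]:
    """Stand-in transform step: add an upper-cased copy of each question."""
    return [{**row, "question_upper": row["question"].upper()} for row in rows]


def filter_block(rows: List[Row]) -> List[Row]:
    """Stand-in filter step: drop rows whose question is under 10 characters."""
    return [row for row in rows if len(row["question"]) >= 10]


def run_flow(blocks: List[Block], rows: List[Row]) -> List[Row]:
    """Apply each block in order, feeding one block's output to the next."""
    for block in blocks:
        rows = block(rows)
    return rows


seed = [{"question": "What is synthetic data?"}, {"question": "Why?"}]
output = run_flow([transform_block, filter_block], seed)
print(output)
```

The point of the sketch is only the shape of the pipeline: each block takes a dataset and returns a dataset, so blocks can be reordered or swapped without touching their neighbours.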

Get Started

Install the package:

pip install sdg-hub

Then load and run a flow in Python:

from sdg_hub import FlowRegistry, Flow

# Discover and load a built-in flow
FlowRegistry.discover_flows()
flow = Flow.from_yaml(FlowRegistry.get_flow_path("MCP Server Distillation"))

# Configure and run
flow.set_model_config(model="openai/gpt-4o")
# `dataset` is your input data, with the columns the chosen flow expects
result = flow.generate(dataset)

See the Quick Start for a full walkthrough, or browse all built-in flows.

Documentation

Full documentation at ai-innovation.team/sdg_hub

  • Installation -- setup, optional dependencies, development install
  • Quick Start -- end-to-end walkthrough from loading a flow to generating data
  • Core Concepts -- blocks, flows, registries, and dataset handling
  • Block Reference -- LLM, parsing, transform, filtering, agent, and custom blocks
  • Flow Reference -- YAML schema, built-in flows, custom flows
  • API Reference -- auto-generated from source
  • Contributing -- development setup and contribution guidelines

License

Apache License 2.0 -- see LICENSE.


Built by the Red Hat AI Innovation Team

Download files

Source Distribution

sdg_hub-0.9.3.tar.gz (8.7 MB)

Uploaded Source

Built Distribution

sdg_hub-0.9.3-py3-none-any.whl (258.4 kB)

Uploaded Python 3
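The wheel file name encodes compatibility information. As a rough illustration, the sketch below splits a name into the fields defined by the wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper `parse_wheel_filename_sketch` is a hypothetical name written for this example; for real tooling, `packaging.utils.parse_wheel_filename` is the more robust choice.

```python
from typing import Dict, Optional


def parse_wheel_filename_sketch(filename: str) -> Dict[str, Optional[str]]:
    """Split a wheel file name into its dash-separated fields.

    Hypothetical helper for illustration; it performs no validation
    beyond counting the fields.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename_sketch("sdg_hub-0.9.3-py3-none-any.whl")
print(info)
```

Here `py3` means any Python 3 interpreter, `none` means no ABI requirement, and `any` means any platform, which is the usual signature of a pure-Python wheel.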

File details

Details for the file sdg_hub-0.9.3.tar.gz.

File metadata

  • Download URL: sdg_hub-0.9.3.tar.gz
  • Upload date:
  • Size: 8.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for sdg_hub-0.9.3.tar.gz
Algorithm    Hash digest
SHA256       660bc6005baa7bf74b8a6c17df2071e4a46ca7e71b58069bd6a099ad305d057f
MD5          29cba39391c66be50314e73881b4360e
BLAKE2b-256  f05d2bd74c6c8ffa5bb27271859229c80954bc00e60108c3f1ad92597459e192
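One way to use these digests is to check a downloaded file before installing it. The following is a minimal sketch using only the standard library. The function names `sha256_of_file` and `matches_expected` are made up for this example, and the local path in the usage note is an assumption about where you saved the download.

```python
import hashlib
import hmac

# SHA256 digest listed above for sdg_hub-0.9.3.tar.gz
EXPECTED_SHA256 = "660bc6005baa7bf74b8a6c17df2071e4a46ca7e71b58069bd6a099ad305d057f"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one, ignoring hex case."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

With the archive saved in the current directory (an assumed location), `matches_expected("sdg_hub-0.9.3.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download. pip can enforce the same check at install time through its hash-checking mode (`--require-hashes`).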

Provenance

The following attestation bundles were made for sdg_hub-0.9.3.tar.gz:

Publisher: pypi.yml on Red-Hat-AI-Innovation-Team/sdg_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sdg_hub-0.9.3-py3-none-any.whl.

File metadata

  • Download URL: sdg_hub-0.9.3-py3-none-any.whl
  • Upload date:
  • Size: 258.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for sdg_hub-0.9.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       8a24620739cce797b173efb79d2fbc33efd7c1bd6b97439fee14e56d53e63aae
MD5          7876407b7af045cf5d776a98002b9d29
BLAKE2b-256  b5c6041e9b0e5403bdc4462267369185f848189c48dcf20d3e3cf3e92ae28cfd

Provenance

The following attestation bundles were made for sdg_hub-0.9.3-py3-none-any.whl:

Publisher: pypi.yml on Red-Hat-AI-Innovation-Team/sdg_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
