Rulesgen library with an optional FastAPI app for safe rule parsing and execution.

rulesgen

rulesgen is a secure rule-processing service for synthetic data workflows. It accepts rule input as either natural_language or a restricted DSL. For natural_language requests, it produces an untrusted semantic_frame plus a DSL candidate. It validates DSL into a compiled_rule, supports local execution_preview, and can run full dataset generation as a tracked job.

Natural-language output is never trusted directly. A rule only becomes executable after validation and compilation succeed, and diagnostics are part of the contract at every stage.

Documentation

End-user documentation is published on TDspora.

The source for all public pages, including pages that may not have been published to TDspora yet, lives in docs/public/.

The canonical contributor and agent vocabulary lives in docs/agent-harness/glossary.md.

Public publishing happens through the tdm-docs Docusaurus site, which imports docs/public/ during its build. When updating public docs here, rebuild the tdm-docs site with this repository available as the RULESGEN_DOCS_SOURCE input.

Quick Local Start

The fastest local path is Docker Compose:

./scripts/run_stack.sh

If no LLM provider credential is present, the script falls back to RULESGEN_LLM_GATEWAY_BACKEND=stub so the API can still run locally.

Verify readiness:

curl -s http://127.0.0.1:8000/health/ready

Then follow the local Example Workflows documentation.
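
When the stack is started from a script or test harness, the readiness check can be polled instead of run once by hand. The following is a minimal sketch, not part of rulesgen: it assumes that HTTP 200 from /health/ready means the service is ready and does not inspect the response body. The names http_status and wait_until_ready are hypothetical helpers.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Endpoint taken from the curl command above.
READY_URL = "http://127.0.0.1:8000/health/ready"


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return 0


def wait_until_ready(
    url: str = READY_URL,
    attempts: int = 30,
    delay_seconds: float = 1.0,
    fetch_status: Callable[[str], int] = http_status,
) -> bool:
    """Poll url until it answers 200 (assumed to mean ready) or attempts run out."""
    for attempt in range(attempts):
        if fetch_status(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling wait_until_ready() after ./scripts/run_stack.sh returns True once the endpoint responds with 200, or False after the attempts are exhausted. The fetch_status parameter exists only so the loop can be exercised without a running server.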

What Is Included

  • FastAPI HTTP service for health, rules, datasets, jobs, and artifacts.
  • Python library API for parsing, compilation, preview, generation, and artifact copying.
  • Restricted DSL compilation based on Python AST validation.
  • Local preview execution for row-phase helpers.
  • Subprocess dataset execution for local generation.
  • Optional Alibaba OpenSandbox integration for dataset generation.
  • Prompt-injection and jailbreak guardrails for natural-language input.
  • LLM gateway support for provider-backed and stub translation paths.
  • Databricks Foundation Model APIs support through the Databricks extra.
  • Filesystem-backed repositories for rules, jobs, prompt audits, uploads, and generated artifacts.
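
To illustrate what "restricted DSL compilation based on Python AST validation" can look like in general, here is a small sketch using the standard library ast module. It is not the rulesgen validator: the allowlist, the validate_expression function, and the diagnostic strings are all assumptions made for illustration, and the real DSL grammar is defined by the project.

```python
import ast

# Hypothetical allowlist for illustration; the real rulesgen DSL grammar may differ.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not, ast.USub,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Name, ast.Load, ast.Constant,
)


def validate_expression(source: str, allowed_names: set[str]) -> list[str]:
    """Return diagnostics for source; an empty list means the expression passed."""
    try:
        # mode="eval" accepts a single expression and rejects statements.
        tree = ast.parse(source, mode="eval")
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    diagnostics = []
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            # Calls, attribute access, lambdas, comprehensions, and so on land here.
            diagnostics.append(f"disallowed syntax: {type(node).__name__}")
        elif isinstance(node, ast.Name) and node.id not in allowed_names:
            diagnostics.append(f"unknown name: {node.id}")
    return diagnostics
```

The sketch only parses and inspects the tree; it never evaluates the input. An allowlist is used rather than a denylist so that any syntax not explicitly permitted is rejected by default.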

API and Library Entry Points

Use the HTTP API when running the service:

  • GET /health/live
  • GET /health/ready
  • POST /rules/parse
  • POST /rules/compile
  • POST /rules/preview
  • POST /rules/execute
  • POST /datasets/uploads
  • POST /datasets/generate
  • POST /jobs
  • GET /jobs/{job_id}
  • GET /jobs/{job_id}/dataset
  • GET /jobs/{job_id}/artifacts/{artifact_id}

Use the Python library when embedding rulesgen in another process:

  • parse_rule
  • compile_rule
  • preview_rule
  • execute_generation_plan
  • download_job_dataset
  • download_job_artifact

See API Reference and Python Library for request shapes and examples.
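
The job routes listed above take path parameters, so a client has to build URLs from identifiers. The helpers below are a hypothetical sketch of that step only. They assume the local base address from Quick Local Start and percent-encode each identifier; request and response shapes are not shown because they are defined in the API Reference.

```python
from urllib.parse import quote

# Local address from Quick Local Start; change it for other deployments.
BASE_URL = "http://127.0.0.1:8000"


def job_url(job_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET /jobs/{job_id}."""
    return f"{base_url.rstrip('/')}/jobs/{quote(job_id, safe='')}"


def job_dataset_url(job_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET /jobs/{job_id}/dataset."""
    return f"{job_url(job_id, base_url)}/dataset"


def job_artifact_url(job_id: str, artifact_id: str, base_url: str = BASE_URL) -> str:
    """URL for GET /jobs/{job_id}/artifacts/{artifact_id}."""
    return f"{job_url(job_id, base_url)}/artifacts/{quote(artifact_id, safe='')}"
```

Passing safe='' to quote encodes slashes as well, so an identifier cannot add extra path segments to the URL.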

Configuration

Runtime settings use the RULESGEN_ prefix. In Docker Compose, configuration comes from compose.yaml, compose.opensandbox.yaml, and the shell environment. In host-run mode, configuration comes from .env and the shell environment.

Start with .env.example, then review Configuration for the full settings guide.
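
When debugging configuration, it can help to see which RULESGEN_ variables the current process actually receives. This is a small sketch, not part of the library: rulesgen_settings is a hypothetical helper that only inspects an environment mapping and does not read .env or the Compose files.

```python
import os
from typing import Mapping

PREFIX = "RULESGEN_"


def rulesgen_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return only the RULESGEN_-prefixed variables, sorted by name."""
    return {
        name: value
        for name, value in sorted(environ.items())
        if name.startswith(PREFIX)
    }
```

For example, after the stub fallback described in Quick Local Start, the result would be expected to include RULESGEN_LLM_GATEWAY_BACKEND with the value stub. Values may contain credentials, so avoid printing them to shared logs.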

Development

Install development dependencies:

uv sync --extra api --extra dev

Useful checks:

uv run pytest
uv run ruff check .
uv run ruff format --check .
uv run mypy src
uv run pip-audit

Optional extras:

  • api: FastAPI, Uvicorn, and multipart upload support.
  • dev: linting, formatting, type checking, tests, audit, and doc-fence tooling.
  • guardrails: ML-backed prompt-injection and jailbreak detection.
  • guardrails-onnx: ML guardrails with ONNX runtime support.
  • databricks: Databricks Foundation Model APIs gateway.

Examples:

uv sync --extra api --extra dev --extra databricks
uv sync --extra api --extra dev --extra guardrails --extra databricks
pip install 'rulesgen[api,dev,databricks]'
pip install 'rulesgen[guardrails-onnx,databricks]'

Design and Contributor Docs

Repository-level design and contributor material remains in this repository.

Release Process

Pushes to main run CI, build the wheel and sdist, create the GitHub Release through semantic-release, attach release artifacts, publish distributions to PyPI, and build and push a Docker image.

Before enabling automated releases, configure these repository secrets:

  • DEPLOY_KEY
  • PYPI_TOKEN
  • DOCKER_HUB_USER
  • DOCKER_HUB_TOKEN

pyproject.toml:project.version and CHANGELOG.md are owned by semantic-release and should not be hand-edited.

License

This project is licensed under the Apache License 2.0. See LICENSE and NOTICE.

Contributing

See CONTRIBUTING.md. This project follows CODE_OF_CONDUCT.md. Report security issues according to SECURITY.md.
