Rulesgen library with an optional FastAPI app for safe rule parsing and execution.
rulesgen
rulesgen is a secure rule-processing service for synthetic data workflows.
It accepts rule input as either natural_language or a restricted DSL,
translates natural_language requests into an untrusted semantic_frame plus
DSL candidate, validates DSL into a compiled_rule, supports local
execution_preview, and can execute full dataset generation as a tracked
job.
Natural-language output is never trusted directly. A rule only becomes
executable after validation and compilation succeed, and diagnostics are
part of the contract at every stage.
Documentation
End-user documentation is published on TDspora:
The source for all public pages, including pages that may not have been
published to TDspora yet, lives in docs/public/:
- Overview
- Quick Start
- Example Workflows
- API Reference
- Python Library
- Configuration
- Run Modes
- Safety Guardrails
- Databricks Models
- Repository Docs
The canonical contributor and agent vocabulary lives in
docs/agent-harness/glossary.md.
Public publishing happens through the tdm-docs Docusaurus site, which
imports docs/public/ during its build. When updating public docs here,
rebuild the tdm-docs site with this repository available as the
RULESGEN_DOCS_SOURCE input.
Quick Local Start
The fastest local path is Docker Compose:
./scripts/run_stack.sh
If no LLM provider credential is present, the script falls back to
RULESGEN_LLM_GATEWAY_BACKEND=stub so the API can still run locally.
Verify readiness:
curl -s http://127.0.0.1:8000/health/ready
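When the stack is started from a script or test harness, it can help to poll the readiness endpoint instead of running curl by hand. The following is a minimal sketch using only the Python standard library. It assumes that an HTTP 200 response from /health/ready means the service is ready; the helper names `fetch_status` and `wait_until_ready` are illustrative and are not part of rulesgen.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

READY_URL = "http://127.0.0.1:8000/health/ready"


def fetch_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses still carry a status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_ready(
    url: str = READY_URL,
    attempts: int = 30,
    delay: float = 1.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it answers 200 (assumed to mean ready) or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_ready() else "not ready")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running service.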
Then follow the local Example Workflows documentation.
What Is Included
- FastAPI HTTP service for health, rules, datasets, jobs, and artifacts.
- Python library API for parsing, compilation, preview, generation, and artifact copying.
- Restricted DSL compilation based on Python AST validation.
- Local preview execution for row-phase helpers.
- Subprocess dataset execution for local generation.
- Optional Alibaba OpenSandbox integration for dataset generation.
- Prompt-injection and jailbreak guardrails for natural-language input.
- LLM gateway support for provider-backed and stub translation paths.
- Databricks Foundation Model APIs support through the Databricks extra.
- Filesystem-backed repositories for rules, jobs, prompt audits, uploads, and generated artifacts.
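To illustrate what "restricted DSL compilation based on Python AST validation" can look like in general, here is a minimal allow-list sketch built on the standard `ast` module. It is not the rulesgen implementation: the node allow-list, the `validate_expression` name, and the diagnostic strings are all assumptions chosen for the example.

```python
import ast
from typing import List, Set

# Hypothetical allow-list: boolean logic, arithmetic, comparisons,
# plain names, and literal constants. Everything else is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or,
    ast.UnaryOp, ast.Not, ast.USub,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Name, ast.Load, ast.Constant,
)


def validate_expression(source: str, allowed_names: Set[str]) -> List[str]:
    """Return diagnostics for `source`; an empty list means it passed."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    diagnostics: List[str] = []
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            diagnostics.append(f"disallowed construct: {type(node).__name__}")
        elif isinstance(node, ast.Name) and node.id not in allowed_names:
            diagnostics.append(f"unknown name: {node.id}")
    return diagnostics


if __name__ == "__main__":
    print(validate_expression("age >= 18 and score < 100", {"age", "score"}))
    print(validate_expression("__import__('os').system('id')", {"age"}))
```

The design choice worth noting is that the check is an allow-list rather than a deny-list: any syntax not explicitly permitted, such as calls or attribute access, produces a diagnostic and the expression never reaches execution.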
API and Library Entry Points
Use the HTTP API when running the service:
- GET /health/live
- GET /health/ready
- POST /rules/parse
- POST /rules/compile
- POST /rules/preview
- POST /rules/execute
- POST /datasets/uploads
- POST /datasets/generate
- POST /jobs
- GET /jobs/{job_id}
- GET /jobs/{job_id}/dataset
- GET /jobs/{job_id}/artifacts/{artifact_id}
Use the Python library when embedding rulesgen in another process:
- parse_rule
- compile_rule
- preview_rule
- execute_generation_plan
- download_job_dataset
- download_job_artifact
See API Reference and Python Library for request shapes and examples.
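For clients that track a job over HTTP, the job-related GET paths listed above can be assembled from a base URL and a job ID. The sketch below only builds URLs and makes no requests; the `job_urls` helper and its dictionary keys are hypothetical, and request and response shapes are left to the API Reference.

```python
from typing import Dict, Optional
from urllib.parse import quote


def job_urls(
    base_url: str, job_id: str, artifact_id: Optional[str] = None
) -> Dict[str, str]:
    """Build URLs for the job-related GET endpoints of a running service."""
    base = base_url.rstrip("/")
    # Percent-encode IDs so unexpected characters cannot alter the path.
    job = f"{base}/jobs/{quote(job_id, safe='')}"
    urls = {"job": job, "dataset": f"{job}/dataset"}
    if artifact_id is not None:
        urls["artifact"] = f"{job}/artifacts/{quote(artifact_id, safe='')}"
    return urls


if __name__ == "__main__":
    for name, url in job_urls("http://127.0.0.1:8000", "example-job").items():
        print(name, url)
```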
Configuration
Runtime settings use the RULESGEN_ prefix. In Docker Compose, configuration
comes from compose.yaml, compose.opensandbox.yaml, and the shell
environment. In host-run mode, configuration comes from .env and the shell
environment.
Start with .env.example, then review
Configuration for the full settings guide.
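When debugging which settings a process will see, it can be useful to list every RULESGEN_-prefixed variable in the environment. This is a small sketch for inspection only; it does not reproduce how rulesgen loads or validates settings, and the `collect_settings` name is an assumption.

```python
import os
from typing import Dict, Mapping

PREFIX = "RULESGEN_"


def collect_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return prefixed variables keyed by their lower-cased, unprefixed name."""
    return {
        key[len(PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


if __name__ == "__main__":
    for name, value in sorted(collect_settings().items()):
        print(f"{name} = {value}")
```

With RULESGEN_LLM_GATEWAY_BACKEND=stub exported in the shell, the result contains the key `llm_gateway_backend` with the value `stub`.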
Development
Install development dependencies:
uv sync --extra api --extra dev
Useful checks:
uv run pytest
uv run ruff check .
uv run ruff format --check .
uv run mypy src
uv run pip-audit
Optional extras:
- api: FastAPI, Uvicorn, and multipart upload support.
- dev: linting, formatting, type checking, tests, audit, and doc-fence tooling.
- guardrails: ML-backed prompt-injection and jailbreak detection.
- guardrails-onnx: ML guardrails with ONNX runtime support.
- databricks: Databricks Foundation Model APIs gateway.
Examples:
uv sync --extra api --extra dev --extra databricks
uv sync --extra api --extra dev --extra guardrails --extra databricks
pip install 'rulesgen[api,dev,databricks]'
pip install 'rulesgen[guardrails-onnx,databricks]'
Design and Contributor Docs
Repository-level design and contributor material remains here:
Release Process
Pushes to main run CI, build the wheel and sdist, create the GitHub Release
through semantic-release, attach release artifacts, publish distributions to
PyPI, and build and push a Docker image.
Before enabling automated releases, configure these repository secrets:
- DEPLOY_KEY
- PYPI_TOKEN
- DOCKER_HUB_USER
- DOCKER_HUB_TOKEN
pyproject.toml:project.version and CHANGELOG.md are owned by
semantic-release and should not be hand-edited.
License
This project is licensed under the Apache License 2.0. See
LICENSE and NOTICE.
Contributing
See CONTRIBUTING.md. This project follows
CODE_OF_CONDUCT.md. Report security issues according
to SECURITY.md.