A unified, production-ready AI SDK that enforces structured outputs and anti-hallucination prompting via the RACTO principle. One package for OpenAI, Gemini, and Anthropic — with streaming, tool calling, embeddings, and strict Pydantic validation.
RactoGateway
One Python package for all production-grade LLM solutions.
RactoGateway is a unified AI SDK that gives you a single, clean interface to OpenAI, Google Gemini, and Anthropic Claude — with built-in anti-hallucination prompting, strict Pydantic validation, streaming, tool calling, and embeddings. No more messy JSON dicts. No more provider lock-in. No more inconsistent response formats.
Why RactoGateway?
Every LLM provider has a different SDK, different request format, different response structure, and different tool-calling schema. Building production AI applications means writing glue code, parsing deeply nested objects, and manually stripping markdown fences from JSON responses.
RactoGateway solves this by providing:
- RACTO Prompt Engine: a structured prompt framework (Role, Aim, Constraints, Tone, Output) that compiles into optimized, anti-hallucination system prompts
- Three Developer Kits: opd (OpenAI), god (Google), and anth (Anthropic), each with chat(), achat(), stream(), and astream(); opd and god also provide embed() and aembed()
- Strict Pydantic models for every input and output, with no raw dicts anywhere
- Automatic JSON parsing: responses are cleaned of markdown fences and auto-parsed
- Unified tool calling: define tools once as Python functions, use them with any provider
- Streaming with typed chunks: every StreamChunk has .delta.text, .accumulated_text, .is_final, and .usage
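The automatic JSON parsing step can be pictured with a small standard-library sketch. This is not RactoGateway's implementation; the helper names `strip_fences` and `parse_if_json` are hypothetical, and the sketch assumes the model wraps its answer in at most one surrounding fence.

```python
import json
import re

# Matches one fenced block, optionally tagged with a language (e.g. ```json).
_FENCE = re.compile(r"^\s*```[\w-]*\s*\n(.*?)\n?\s*```\s*$", re.DOTALL)


def strip_fences(text: str) -> str:
    """Return the text inside a single surrounding markdown fence, if any."""
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text.strip()


def parse_if_json(text: str):
    """Return (cleaned_text, parsed); parsed is None when the text is not JSON."""
    cleaned = strip_fences(text)
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return cleaned, None
    # Only dicts and lists count as structured output in this sketch.
    return cleaned, parsed if isinstance(parsed, (dict, list)) else None
```

Plain text passes through unchanged with `parsed` set to None, which mirrors the content/parsed split described for responses.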
Installation
# Core package (includes RACTO prompt engine and tool registry)
pip install ractogateway
# With a specific provider (quoted so shells such as zsh do not expand the brackets)
pip install "ractogateway[openai]"
pip install "ractogateway[google]"
pip install "ractogateway[anthropic]"
# All providers
pip install "ractogateway[all]"
# Development (all providers + testing + linting)
pip install "ractogateway[dev]"
Requirements: Python 3.10+, Pydantic 2.0+
Quick Start
1. Define a RACTO Prompt
Every prompt is a validated Pydantic model with five required fields:
from ractogateway import RactoPrompt

prompt = RactoPrompt(
    role="You are a senior Python code reviewer at a Fortune 500 company.",
    aim="Review the given code for bugs, security vulnerabilities, and PEP-8 violations.",
    constraints=[
        "Only report issues you are certain about.",
        "Do not suggest stylistic preferences.",
        "If no issues are found, say so explicitly.",
        "Never fabricate code examples that you cannot verify.",
    ],
    tone="Professional and concise",
    output_format="json",
)
2. Use a Developer Kit
from ractogateway import openai_developer_kit as opd

kit = opd.OpenAIDeveloperKit(
    model="gpt-4o",
    api_key="sk-...",  # or set OPENAI_API_KEY env var
    default_prompt=prompt,
)

# Synchronous chat
response = kit.chat(opd.ChatConfig(user_message="Review this function:\ndef add(a, b): return a + b"))
print(response.content)  # cleaned text
print(response.parsed)   # auto-parsed JSON dict (if response was JSON)
print(response.usage)    # {"prompt_tokens": 42, "completion_tokens": 18, "total_tokens": 60}
3. Stream Responses
for chunk in kit.stream(opd.ChatConfig(user_message="Explain Python generators")):
    print(chunk.delta.text, end="", flush=True)
    if chunk.is_final:
        print(f"\n\nTokens used: {chunk.usage}")
4. Async Support
import asyncio

async def main():
    response = await kit.achat(opd.ChatConfig(user_message="What is SOLID?"))
    print(response.content)

    async for chunk in kit.astream(opd.ChatConfig(user_message="Explain SOLID")):
        print(chunk.delta.text, end="", flush=True)

asyncio.run(main())
Developer Kits
RactoGateway provides three developer kits — one per provider. Each is a self-contained module with the kit class, all input models, and all output models.
from ractogateway import openai_developer_kit as opd # OpenAI / Azure OpenAI
from ractogateway import google_developer_kit as god # Google Gemini
from ractogateway import anthropic_developer_kit as anth # Anthropic Claude
Method Reference
| Method | opd | god | anth | Description |
|---|---|---|---|---|
| chat(config) | Yes | Yes | Yes | Synchronous chat completion |
| achat(config) | Yes | Yes | Yes | Async chat completion |
| stream(config) | Yes | Yes | Yes | Sync streaming (yields StreamChunk) |
| astream(config) | Yes | Yes | Yes | Async streaming (yields StreamChunk) |
| embed(config) | Yes | Yes | -- | Sync embeddings |
| aembed(config) | Yes | Yes | -- | Async embeddings |
Anthropic does not offer a native embedding API. Use the OpenAI or Google kit for embeddings.
Kit Constructors
# OpenAI
kit = opd.OpenAIDeveloperKit(
    model="gpt-4o",                            # required
    api_key="sk-...",                          # or OPENAI_API_KEY env var
    base_url="https://custom-proxy.com/v1",    # optional (Azure, proxies)
    embedding_model="text-embedding-3-small",  # default
    default_prompt=prompt,                     # optional
)

# Google Gemini
kit = god.GoogleDeveloperKit(
    model="gemini-2.0-flash",                  # required
    api_key="AIza...",                         # or GEMINI_API_KEY env var
    embedding_model="text-embedding-004",      # default
    default_prompt=prompt,                     # optional
)

# Anthropic Claude
kit = anth.AnthropicDeveloperKit(
    model="claude-sonnet-4-5-20250929",        # required
    api_key="sk-ant-...",                      # or ANTHROPIC_API_KEY env var
    default_prompt=prompt,                     # optional
)
Input Models
All inputs are strictly validated Pydantic models. No raw dicts. No positional argument sprawl.
ChatConfig
The single input for chat(), achat(), stream(), and astream().
config = opd.ChatConfig(
    user_message="Explain monads in simple terms.",  # required, min 1 char
    prompt=prompt,                                   # optional (falls back to kit default)
    temperature=0.3,                                 # 0.0–2.0, default 0.0
    max_tokens=2048,                                 # default 4096
    tools=my_tool_registry,                          # optional ToolRegistry
    response_model=MyPydanticModel,                  # optional output validation
    history=[                                        # optional multi-turn context
        opd.Message(role=opd.MessageRole.USER, content="What is FP?"),
        opd.Message(role=opd.MessageRole.ASSISTANT, content="Functional programming is..."),
    ],
    extra={"top_p": 0.9, "seed": 42},                # provider-specific pass-through
)
EmbeddingConfig
The input for embed() and aembed().
config = opd.EmbeddingConfig(
    texts=["Hello world", "Goodbye world"],  # required, min 1 text
    model="text-embedding-3-large",          # optional (overrides kit default)
    dimensions=512,                          # optional (for models that support it)
)
Output Models
LLMResponse
Returned by chat() and achat(). Unified across all providers.
| Field | Type | Description |
|---|---|---|
| content | str \| None | Cleaned text (markdown fences stripped) |
| parsed | dict \| list \| None | Auto-parsed JSON (if response was valid JSON) |
| tool_calls | list[ToolCallResult] | Tool calls requested by the model |
| finish_reason | FinishReason | STOP, TOOL_CALL, LENGTH, CONTENT_FILTER, ERROR |
| usage | dict[str, int] | prompt_tokens, completion_tokens, total_tokens |
| raw | Any | The unmodified provider response (escape hatch) |
StreamChunk
Yielded by stream() and astream(). One per streaming event.
| Field | Type | Description |
|---|---|---|
| delta.text | str | Incremental text for this chunk |
| accumulated_text | str | Full text accumulated so far |
| is_final | bool | True only on the last chunk |
| finish_reason | FinishReason \| None | Set on final chunk only |
| tool_calls | list[ToolCallResult] | Populated on final chunk only |
| usage | dict[str, int] | Populated on final chunk only |
| raw | Any | Raw provider streaming event |
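The relationship between delta.text, accumulated_text, and is_final can be sketched without any provider. The `Delta` and `Chunk` classes below are hypothetical stand-ins that model only three of the fields, and `to_chunks` is an illustrative helper, not part of the package.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Delta:
    text: str


@dataclass
class Chunk:
    # Hypothetical stand-in for StreamChunk; only three fields are modeled.
    delta: Delta
    accumulated_text: str
    is_final: bool


def to_chunks(pieces: Iterable[str]) -> Iterator[Chunk]:
    """Wrap raw text pieces so each chunk also carries the running total."""
    items = list(pieces)
    accumulated = ""
    for position, piece in enumerate(items):
        accumulated += piece
        yield Chunk(Delta(piece), accumulated, position == len(items) - 1)


chunks = list(to_chunks(["Gen", "era", "tors"]))
```

Joining every delta.text reproduces the accumulated_text of the final chunk, so a consumer can rely on either field depending on whether it renders incrementally or only needs the full text.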
EmbeddingResponse
Returned by embed() and aembed().
| Field | Type | Description |
|---|---|---|
| vectors | list[EmbeddingVector] | Each has .index, .text, .embedding |
| model | str | Model used for embedding |
| usage | dict[str, int] | Token usage |
RACTO Prompt Engine
The RACTO principle structures every prompt into five unambiguous sections:
| Letter | Field | Purpose |
|---|---|---|
| R | role | Who the model is |
| A | aim | What it must accomplish |
| C | constraints | Hard rules it must never violate |
| T | tone | Communication style |
| O | output_format | Exact shape of the response |
Compiled Output
prompt.compile() produces a clearly delimited system prompt:
[ROLE]
You are a senior Python code reviewer at a Fortune 500 company.
[AIM]
Review the given code for bugs, security vulnerabilities, and PEP-8 violations.
[CONSTRAINTS]
- Only report issues you are certain about.
- Do not suggest stylistic preferences.
- If no issues are found, say so explicitly.
- Never fabricate code examples that you cannot verify.
[TONE]
Professional and concise
[OUTPUT]
Respond ONLY with valid JSON. Do NOT wrap the response in markdown code
fences (```json … ```) or add any commentary before or after the JSON object.
[GUARDRAILS]
- If you are unsure or lack sufficient information, state it explicitly rather than guessing.
- Do NOT fabricate facts, citations, URLs, statistics, or code that you cannot verify.
- Stick strictly to what is asked. Do not add unrequested information.
- If the answer requires assumptions, list each assumption explicitly before proceeding.
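A compile step that produces this kind of delimited layout can be sketched with a plain dataclass. `PromptSketch` is a hypothetical stand-in, not the RactoPrompt class: it takes the output instruction as ready-made text and leaves out the optional fields and the [GUARDRAILS] block.

```python
from dataclasses import dataclass


@dataclass
class PromptSketch:
    # Hypothetical stand-in for RactoPrompt; not the library's class.
    role: str
    aim: str
    constraints: list[str]
    tone: str
    output: str

    def compile(self) -> str:
        """Join the five sections into one delimited system prompt."""
        sections = [
            ("ROLE", self.role),
            ("AIM", self.aim),
            ("CONSTRAINTS", "\n".join(f"- {c}" for c in self.constraints)),
            ("TONE", self.tone),
            ("OUTPUT", self.output),
        ]
        return "\n".join(f"[{name}]\n{body}" for name, body in sections)


compiled = PromptSketch(
    role="You are a code reviewer.",
    aim="Review the code.",
    constraints=["Only report real issues."],
    tone="Concise",
    output="Respond ONLY with valid JSON.",
).compile()
```

Keeping each section under its own bracketed header is what makes the compiled prompt easy to inspect and diff when a single field changes.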
Advanced: Pydantic Schema as Output Format
Pass a Pydantic model as output_format and the full JSON Schema is embedded directly in the prompt:
from pydantic import BaseModel

class CodeReview(BaseModel):
    issues: list[str]
    severity: str
    suggestion: str

prompt = RactoPrompt(
    role="You are a code reviewer.",
    aim="Review the code.",
    constraints=["Only report real issues."],
    tone="Concise",
    output_format=CodeReview,  # JSON Schema embedded in prompt
)
Optional Fields
| Field | Type | Description |
|---|---|---|
| context | str | Domain background injected between AIM and CONSTRAINTS |
| examples | list[dict] | Few-shot input/output pairs for steering |
| anti_hallucination | bool | Append [GUARDRAILS] block (default True) |
Tool Calling
Define tools as Python functions — never write nested JSON dicts.
Register Tools
from ractogateway import ToolRegistry

registry = ToolRegistry()

@registry.register
def get_weather(city: str, unit: str = "celsius") -> str:
    """Get the current weather for a city.

    :param city: The city name
    :param unit: Temperature unit (celsius or fahrenheit)
    """
    # Your implementation here
    return f"Weather in {city}: 22°{unit[0].upper()}"
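One way a registry could turn a function like get_weather into a provider-neutral schema is to read its signature with the standard `inspect` module. The sketch below is an assumption about the general technique, not RactoGateway's actual registry code; `describe_tool` and the small type map are hypothetical, and the `:param` descriptions in the docstring are ignored.

```python
import inspect

# Maps Python annotations to JSON Schema type names (small subset).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn) -> dict:
    """Build a JSON-Schema-style description from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(fn) or ""
    return {
        "name": fn.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str, unit: str = "celsius") -> str:
    """Get the current weather for a city."""
    return f"Weather in {city}: 22°{unit[0].upper()}"


schema = describe_tool(get_weather)
```

Parameters without a default end up in `required`, which is why `unit` stays optional here while `city` does not.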
Use with Any Kit
config = opd.ChatConfig(
    user_message="What's the weather in Tokyo?",
    tools=registry,
)
response = kit.chat(config)

if response.tool_calls:
    for tc in response.tool_calls:
        print(f"Call: {tc.name}({tc.arguments})")
        # Execute the function
        fn = registry.get_callable(tc.name)
        result = fn(**tc.arguments)
Register Pydantic Models as Tools
from pydantic import BaseModel, Field

class SearchQuery(BaseModel):
    """Search the knowledge base."""
    query: str = Field(description="The search query")
    max_results: int = Field(default=5, description="Maximum results to return")

registry.register(SearchQuery)
Validated Response Models
Force the LLM output through a Pydantic model for guaranteed structure:
class SentimentResult(BaseModel):
    sentiment: str     # "positive", "negative", "neutral"
    confidence: float  # 0.0 to 1.0
    reasoning: str

config = opd.ChatConfig(
    user_message="Analyze sentiment: 'This product is amazing!'",
    response_model=SentimentResult,
)
response = kit.chat(config)
print(response.parsed)
# {"sentiment": "positive", "confidence": 0.95, "reasoning": "Strong positive adjective 'amazing'"}
Switching Providers
Same ChatConfig, different kit. That's it.
from ractogateway import openai_developer_kit as opd
from ractogateway import google_developer_kit as god
from ractogateway import anthropic_developer_kit as anth
from ractogateway import RactoPrompt
prompt = RactoPrompt(
    role="You are a helpful assistant.",
    aim="Answer the user's question accurately.",
    constraints=["Be concise.", "Cite sources when possible."],
    tone="Friendly and professional",
    output_format="text",
)
config = opd.ChatConfig(user_message="What is quantum computing?")
# OpenAI
okit = opd.OpenAIDeveloperKit(model="gpt-4o", default_prompt=prompt)
print(okit.chat(config).content)
# Google Gemini
gkit = god.GoogleDeveloperKit(model="gemini-2.0-flash", default_prompt=prompt)
print(gkit.chat(config).content)
# Anthropic Claude
akit = anth.AnthropicDeveloperKit(model="claude-sonnet-4-5-20250929", default_prompt=prompt)
print(akit.chat(config).content)
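Because every kit exposes the same method names, calling code can be written against a shared interface instead of a concrete provider. The sketch below illustrates that idea with `typing.Protocol` and a stub class; `ChatKit`, `EchoKit`, and `ask_all` are hypothetical, and plain strings stand in for ChatConfig and the response object.

```python
from typing import Protocol


class ChatKit(Protocol):
    """Anything with a chat(message) -> str method; hypothetical interface."""

    def chat(self, message: str) -> str: ...


class EchoKit:
    # Stub provider used only for illustration; makes no network calls.
    def __init__(self, name: str) -> None:
        self.name = name

    def chat(self, message: str) -> str:
        return f"[{self.name}] {message}"


def ask_all(kits: list[ChatKit], message: str) -> list[str]:
    """Send one message to every kit through the shared interface."""
    return [kit.chat(message) for kit in kits]


answers = ask_all([EchoKit("openai"), EchoKit("google")], "What is quantum computing?")
```

With the real kits, the same pattern would let one function compare providers or fall back from one to another without branching on the provider type.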
Architecture
src/ractogateway/
├── __init__.py # Top-level: RactoPrompt, Gateway, tool, ToolRegistry
├── py.typed # PEP 561 typed package marker
│
├── _models/ # Shared Pydantic input/output models
│ ├── chat.py # ChatConfig, Message, MessageRole
│ ├── stream.py # StreamChunk, StreamDelta
│ └── embedding.py # EmbeddingConfig, EmbeddingResponse, EmbeddingVector
│
├── prompts/ # RACTO Prompt Engine
│ └── engine.py # RactoPrompt model + compile()
│
├── tools/ # Tool Registry
│ └── registry.py # @tool decorator, ToolRegistry, ToolSchema
│
├── adapters/ # Internal provider adapters (Adapter Pattern)
│ ├── base.py # BaseLLMAdapter ABC, LLMResponse, FinishReason
│ ├── openai_kit.py # OpenAILLMKit
│ ├── google_kit.py # GoogleLLMKit
│ └── anthropic_kit.py # AnthropicLLMKit
│
├── gateway/ # Unified Gateway Runner
│ └── runner.py # Gateway orchestrator class
│
├── openai_developer_kit/ # OpenAI Developer Kit (import as opd)
│ └── kit.py # OpenAIDeveloperKit class
│
├── google_developer_kit/ # Google Developer Kit (import as god)
│ └── kit.py # GoogleDeveloperKit class
│
└── anthropic_developer_kit/ # Anthropic Developer Kit (import as anth)
└── kit.py # AnthropicDeveloperKit class
Design Principles
- Lazy provider imports: the openai, google-genai, and anthropic SDKs are imported only when you instantiate a kit, so import ractogateway never fails because of a missing optional dependency.
- Composition over inheritance: developer kits compose internal adapters rather than extending them, keeping the public API surface clean.
- Pydantic everywhere: every input is a validated model and every output is a typed model. No dict[str, Any] at the API boundary.
- Sync + async parity: every method has both a synchronous and an asynchronous variant.
- Provider-agnostic tool schemas: define tools once, use them with any provider. The internal adapters handle the translation.
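The lazy-import principle can be illustrated with `importlib`: the SDK is looked up inside the constructor, so importing the package itself never touches an optional dependency. `LazyKit` is a hypothetical class written for this sketch, and the standard-library `json` module stands in for a provider SDK.

```python
import importlib


class LazyKit:
    """Hypothetical kit that defers its SDK import until construction."""

    def __init__(self, sdk_module: str, extra: str) -> None:
        try:
            # The import happens here, not at module import time.
            self._sdk = importlib.import_module(sdk_module)
        except ImportError as exc:
            raise ImportError(
                f"{sdk_module!r} is not installed; try: pip install 'ractogateway[{extra}]'"
            ) from exc


kit = LazyKit("json", "openai")  # stdlib module stands in for a provider SDK

try:
    LazyKit("no_such_sdk_module_xyz", "openai")
except ImportError as exc:
    hint = str(exc)  # carries the install suggestion for the missing extra
```

Raising at construction time keeps the failure close to the code that actually needs the provider and leaves room for an actionable install hint.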
Environment Variables
| Variable | Provider | Description |
|---|---|---|
| OPENAI_API_KEY | OpenAI | API key (used when api_key not passed to constructor) |
| GEMINI_API_KEY | Google | API key (used when api_key not passed to constructor) |
| ANTHROPIC_API_KEY | Anthropic | API key (used when api_key not passed to constructor) |
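The fallback described in the table (constructor argument first, environment variable second) amounts to a few lines. The helper `resolve_api_key` and the `DEMO_API_KEY` variable are hypothetical and exist only for this sketch; how the kits behave when neither source is set is not stated above, so the error raised here is an assumption.

```python
import os
from typing import Optional


def resolve_api_key(explicit: Optional[str], env_var: str) -> str:
    """Prefer the constructor argument, then fall back to the environment."""
    key = explicit or os.environ.get(env_var)
    if not key:
        # Assumed behavior: fail early rather than send an unauthenticated request.
        raise ValueError(f"No API key: pass api_key or set {env_var}")
    return key


os.environ["DEMO_API_KEY"] = "from-env"  # demo variable, not a real provider key
from_arg = resolve_api_key("from-arg", "DEMO_API_KEY")
from_env = resolve_api_key(None, "DEMO_API_KEY")
```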
Contributing
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
# Clone and install in development mode
git clone https://github.com/IAMPathak2702/RactoGateway.git
cd RactoGateway
pip install -e ".[dev]"
# Run tests
pytest
# Lint and format
ruff check src/ tests/
ruff format src/ tests/
# Type checking
mypy src/
License
Apache License 2.0 — see LICENSE for details.
Copyright 2025 Ved Prakash Pathak
Author
Ved Prakash Pathak
- GitHub: @IAMPathak2702
- Email: vp.ved.vpp@gmail.com