roscoe — Ready-to-run Orchestration SDK: Configurable, Observable, Extensible.
Project description
roscoe
Ready-to-run Orchestration SDK — Configurable, Observable, Extensible
roscoe is a Python SDK for building LLM-powered agents that ship with production plumbing built in — retries, cost tracking, audit logging, rate limiting, human approval, memory, monitoring, and evals. You write tools (plain Python functions) and a YAML config; roscoe handles everything else.
Switch LLM providers by editing one config block. Your code never changes.
pip install roscoe
Quick start
Option A: Scaffold with the CLI
roscoe init my-agent
A GUI wizard opens — pick your provider, toggle middleware, configure memory. Hit Create Project and you get a ready-to-run folder:
my-agent/
├── agent_config.yaml # fully commented — every option explained
├── main.py # 6 lines to run your agent
├── tools/my_tools.py # your @tool functions go here
├── prompts/system.txt # agent personality + instructions
├── evals/test_cases.json
└── .env.example # all possible credential placeholders
cd my-agent
cp .env.example .env # fill in your API key
python main.py
Use --quick to skip the wizard, or --cli for a terminal-based wizard.
Option B: Code it directly
from roscoe import AgentRunner
from roscoe.tools import tool
@tool
def get_price(sku: str) -> dict:
"""Fetch the price for a product SKU."""
return {"sku": sku, "price": 1999}
agent = AgentRunner.from_config("agent_config.yaml", tools=[get_price])
result = agent.run("What is the price of SKU-001?")
print(result.output) # the LLM's answer
print(result.status) # "success" | "error" | "paused"
print(result.cost_usd) # estimated cost in USD
print(result.total_tokens) # token usage
print(result.run_id) # UUID tying this run to the audit trail
Features
Provider-agnostic
Swap LLM providers by editing the model: block in your YAML config. Your Python code
stays identical.
| Provider | Config key | Example model |
|---|---|---|
| OpenAI | openai |
gpt-4o-mini |
| OpenRouter | openai + base_url |
any of 100+ models |
| Azure OpenAI | azure_openai |
gpt-4o (via deployment) |
| Anthropic | anthropic |
claude-sonnet-4-5 |
| Google Gemini | gemini |
gemini-1.5-pro |
| Ollama (local) | ollama |
llama3.1 (free, no key) |
Register custom providers: ProviderFactory.register("my_provider", MyProviderClass).
# Just change this block — nothing else
model:
provider: openai
model: gpt-4o-mini
api_key: ${OPENAI_API_KEY}
temperature: 0.1
Automatic middleware
All middleware is configured in YAML and runs automatically on every agent.run() call.
Zero boilerplate code.
middleware:
retry:
max_attempts: 3 # exponential backoff on transient errors
rate_limiter:
enabled: true
requests_per_minute: 60 # token-bucket per provider
cost_tracking:
enabled: true # USD estimate in result.cost_usd
audit:
enabled: true # JSONL log at logs/audit.jsonl
- Retry — handles rate limits, timeouts, 500s with exponential backoff.
- Rate limiting — token-bucket algorithm, one bucket per provider. Prevents thrashing API limits when multiple agents share a provider. Ollama is skipped (no external limit).
- Cost tracking — reads
usage_metadatafrom LangChain, prices via built-in rate table. Extensible:from roscoe.middleware.cost_tracker import COST_TABLE COST_TABLE["openai"]["gpt-4.1"] = {"input": 0.002, "output": 0.008}
- Audit logging — non-blocking JSONL, one line per run: run_id, agent, provider,
model, tokens, cost, status, latency. Feed to
roscoe monitorfor dashboards.
Memory
Three types, all configured in YAML:
memory:
conversation:
enabled: true
window_size: 10 # last N messages per session_id
persistent:
enabled: true
backend: sqlite
connection: ./facts.db # long-term facts per user_id
- Conversation — short-term, per
session_id, windowed. Passsession_idtoagent.run()to keep context across turns. - Persistent — long-term facts per
user_idin sqlite. The agent remembers "My name is Rhea" across sessions. - Knowledge / RAG — vector retrieval via FAISS (if installed) or a zero-dependency
keyword retriever. Set up in code:
from roscoe.memory.knowledge import KnowledgeMemory km = KnowledgeMemory.from_texts(["policy doc text..."], metadatas=[{"source": "hr.pdf"}])
Connectors
Pre-built tool bundles for enterprise systems. Each connector gives you LangChain tools
you hand straight to AgentRunner:
from roscoe.connectors import GitHubConnector
gh = GitHubConnector({"token": "ghp_..."})
agent = AgentRunner.from_config("agent.yaml", tools=gh.tools)
# Mix with your own tools:
agent = AgentRunner.from_config("agent.yaml", tools=[my_tool] + gh.tools)
| Connector | Tools | Auth |
|---|---|---|
| REST (any API) | GET, POST, PUT, DELETE | Bearer / Basic / API key |
| Jira | search, create, update, transition issues | Email + API token |
| ServiceNow | create, query, update incidents | Username + password |
| Outlook | send email, read inbox, create event, availability | MS Graph (OAuth2) |
| SharePoint | list files, download, upload, search | MS Graph (OAuth2) |
| GitHub | list repos, issues, PRs, create issue | Personal access token |
| Notion | search, pages, databases, blocks | Integration token |
| Google Workspace | Gmail send/read, Calendar, Tasks, Drive search | Service account |
| Snowflake | execute SQL queries | pip install roscoe[snowflake] |
Human-in-the-loop
Make sensitive tools require approval before they run:
middleware:
human_approval:
require_approval_for: ["send_email", "delete_record"]
result = agent.run("Send an email to bob@acme.com")
if result.status == "paused":
print(result.pending_action) # inspect what the agent wants to do
# approve — tool runs as planned
result = agent.resume(result.run_id, "approve")
# reject — tool is blocked, agent gets a rejection message
result = agent.resume(result.run_id, "reject")
# modify — change the arguments before running
result = agent.resume(result.run_id, "modify", payload={"to": "correct@acme.com"})
The run stops before the gated tool executes and returns status="paused". Call
resume() to continue. Wire this to a Slack button, a web UI, or a CLI prompt.
Monitoring
Offline aggregation of your audit logs — no live server required.
roscoe monitor --path logs/audit.jsonl
Outputs a text dashboard with:
- Total runs, cost per day, cost per agent
- Latency percentiles (p50 / p95 / p99) per agent
- Error rate breakdown
- Token usage summary
Alerts — configure thresholds for daily cost, error rate, and latency:
from roscoe.monitoring.alerts import check_and_notify
from roscoe.monitoring.notifier import build_notifier
notifier = build_notifier("slack", {"webhook_url": "https://hooks.slack.com/..."})
check_and_notify(metrics, alert_config, notifier)
Exporters — push metrics to Prometheus Pushgateway or Azure Monitor:
from roscoe.monitoring.exporters.prometheus import PrometheusPushgatewayExporter
exporter = PrometheusPushgatewayExporter(gateway_url="http://localhost:9091")
exporter.push(metrics)
Evals
Test your agent with a dataset of cases and score the results:
roscoe eval --dataset evals/test_cases.json --config agent_config.yaml
Scorers:
- Tool usage — deterministic, order-aware (did the agent call the right tools?).
- Output quality — LLM-as-judge, 0–10 scale. Add
--judgeto enable. - Hallucination — LLM-as-judge, checks output against provided context docs.
Regression diffing — compare two eval runs:
from roscoe.evals.regression import compare_runs
diff = compare_runs(report_a, report_b)
print(diff.improved) # cases that got better
print(diff.regressed) # cases that got worse
Test case format (evals/test_cases.json):
{
"cases": [
{
"id": "weather-london",
"input": "What's the weather in London?",
"expected_tools": ["get_weather"],
"expected_output": "Should mention London weather",
"context_docs": ["London is currently 15°C and rainy."]
}
]
}
Templates
Six pre-built templates, each with tools, system prompt, config, and approval gates pre-configured:
roscoe init my-hr-bot --template hr_agent
roscoe init my-it-bot --template it_support_agent
roscoe init my-legal --template legal_agent
roscoe init my-kb --template knowledge_base_agent
roscoe init my-ea --template exec_assistant_agent
roscoe init my-gws --template google_workspace_agent
| Template | Use case | Connector | Approval gate |
|---|---|---|---|
hr_agent |
Leave, payslips, personal details | REST API | submit_leave_request |
it_support_agent |
Tickets, escalation, KB search | ServiceNow | escalate_ticket |
legal_agent |
Contract search, clause extraction, risk flags | Knowledge (RAG) | — |
knowledge_base_agent |
Q&A over Notion / SharePoint / docs | Notion + Knowledge | — (read-only) |
exec_assistant_agent |
Email, calendar, availability | Outlook (MS Graph) | send_email, create_event |
google_workspace_agent |
Gmail, Calendar, Tasks, Drive | Google Workspace | send_email, create_event, create_task |
Architecture
roscoe runs its own async ReAct loop (no LangGraph dependency). The loop is ~100 lines
in roscoe/core/executor.py:
User message
→ model.invoke(messages)
→ tool_calls in response?
→ approval gate check (pause if gated)
→ execute tools
→ append results to messages
→ loop back to model
→ no tool_calls? done → AgentResult
Built on LangChain (models, tools, messages) but not LangGraph. This keeps the agent loop small, transparent, and easy to debug.
CLI reference
roscoe init <name> # scaffold with GUI wizard
roscoe init <name> --quick # scaffold with defaults (no wizard)
roscoe init <name> --cli # scaffold with terminal wizard
roscoe init <name> --template <t> # scaffold from a template
roscoe monitor # dashboard from logs/audit.jsonl
roscoe monitor --path /path/to/audit.jsonl # custom audit log path
roscoe eval --dataset cases.json --config agent.yaml # tool-usage scoring
roscoe eval --dataset cases.json --config agent.yaml --judge # + LLM-as-judge
roscoe eval --dataset cases.json --config agent.yaml --tools module:attr # custom tools
roscoe --version # print version
Install
pip install roscoe # core
pip install "roscoe[snowflake]" # + Snowflake driver
pip install "roscoe[azure]" # + Azure Monitor exporter
pip install "roscoe[dev]" # + pytest (for contributors)
From source
git clone https://github.com/rhealaloo45/roscoe.git
cd roscoe
pip install -e ".[dev]"
pytest -q # 121 tests, all passing
Writing tools
A tool is a plain Python function with type hints and a docstring:
from roscoe.tools import tool
@tool
def search_docs(query: str) -> list[dict]:
"""Search internal documents. Use when the user asks about company policies."""
# your logic here
return [{"title": "Remote Work Policy", "snippet": "..."}]
- The docstring is what the LLM reads to decide when to call the tool.
- Type hints are used to generate the JSON schema automatically.
- Return dicts or primitives — the LLM reads the return value.
- Add the tool to the
TOOLSlist intools/my_tools.py, or pass it directly toAgentRunner.from_config().
Configuration reference
Full agent_config.yaml with all options (also generated by roscoe init with inline
comments):
agent_name: my-agent
system_prompt_file: prompts/system.txt
# system_prompt: |
# Inline prompt alternative
model:
provider: openai # openai | azure_openai | anthropic | gemini | ollama
model: gpt-4o-mini
api_key: ${OPENAI_API_KEY} # resolved from environment
temperature: 0.1
# base_url: https://openrouter.ai/api/v1 # for OpenRouter / custom endpoints
# max_tokens: 4096
memory:
conversation:
enabled: true
window_size: 10
persistent:
enabled: false
backend: sqlite
connection: ./facts.db
middleware:
retry:
max_attempts: 3
rate_limiter:
enabled: true
requests_per_minute: 60
cost_tracking:
enabled: true
audit:
enabled: true
# human_approval:
# require_approval_for: ["send_email", "delete_record"]
Environment variables use ${VAR_NAME} syntax and are resolved at config load time.
License
MIT — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file roscoe-0.1.1.tar.gz.
File metadata
- Download URL: roscoe-0.1.1.tar.gz
- Upload date:
- Size: 83.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
039804fe841d866663416520a2913ee3013a7dbd1ccef663ceeac41b8ebc7d28
|
|
| MD5 |
111a87b090a2127ff05b283f53a2cb15
|
|
| BLAKE2b-256 |
cb47bc97eaaada75aa550ff31c8ec0b345968c342107cf8a6f6cb936cd07dab7
|
File details
Details for the file roscoe-0.1.1-py3-none-any.whl.
File metadata
- Download URL: roscoe-0.1.1-py3-none-any.whl
- Upload date:
- Size: 110.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
12e377ecfef7fa8ba1b01cf056fff4ee9e59028f7d8f1e710c31f0a9c902859c
|
|
| MD5 |
16d5a6c7dcb6e538b78b8e96b790c9ee
|
|
| BLAKE2b-256 |
f2b55267ea6c2754b55ee3fa2f662eb9ec5fdf0be0cece48fa94d48f92d2956a
|