roscoe — Ready-to-run Orchestration SDK: Configurable, Observable, Extensible.

These details have not been verified by PyPI

Project links

Project description

roscoe

Ready-to-run Orchestration SDK — Configurable, Observable, Extensible

roscoe is a Python SDK for building LLM-powered agents that ship with production plumbing built in — retries, cost tracking, audit logging, rate limiting, human approval, memory, monitoring, and evals. You write tools (plain Python functions) and a YAML config; roscoe handles everything else.

Switch LLM providers by editing one config block. Your code never changes.

pip install roscoe

Quick start

Option A: Scaffold with the CLI

roscoe init my-agent

A GUI wizard opens — pick your provider, toggle middleware, configure memory. Hit Create Project and you get a ready-to-run folder:

my-agent/
├── agent_config.yaml    # fully commented — every option explained
├── main.py              # 6 lines to run your agent
├── tools/my_tools.py    # your @tool functions go here
├── prompts/system.txt   # agent personality + instructions
├── evals/test_cases.json
└── .env.example         # all possible credential placeholders

cd my-agent
cp .env.example .env     # fill in your API key
python main.py

Use --quick to skip the wizard, or --cli for a terminal-based wizard.

Option B: Code it directly

from roscoe import AgentRunner
from roscoe.tools import tool


@tool
def get_price(sku: str) -> dict:
    """Fetch the price for a product SKU."""
    return {"sku": sku, "price": 1999}


agent = AgentRunner.from_config("agent_config.yaml", tools=[get_price])
result = agent.run("What is the price of SKU-001?")

print(result.output)        # the LLM's answer
print(result.status)        # "success" | "error" | "paused"
print(result.cost_usd)      # estimated cost in USD
print(result.total_tokens)  # token usage
print(result.run_id)        # UUID tying this run to the audit trail

Features

Provider-agnostic

Swap LLM providers by editing the model: block in your YAML config. Your Python code stays identical.

Provider	Config key	Example model
OpenAI	`openai`	`gpt-4o-mini`
OpenRouter	`openai` + `base_url`	any of 100+ models
Azure OpenAI	`azure_openai`	`gpt-4o` (via deployment)
Anthropic	`anthropic`	`claude-sonnet-4-5`
Google Gemini	`gemini`	`gemini-1.5-pro`
Ollama (local)	`ollama`	`llama3.1` (free, no key)

# Just change this block — nothing else
model:
  provider: openai
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}
  temperature: 0.1

Automatic middleware

All middleware is configured in YAML and runs automatically on every agent.run() call. Zero boilerplate code.

middleware:
  retry:
    max_attempts: 3              # exponential backoff on transient errors
  rate_limiter:
    enabled: true
    requests_per_minute: 60      # token-bucket per provider
  cost_tracking:
    enabled: true                # USD estimate in result.cost_usd
  audit:
    enabled: true                # JSONL log at logs/audit.jsonl

Retry — handles rate limits, timeouts, 500s with exponential backoff.
Rate limiting — token-bucket algorithm, one bucket per provider. Prevents thrashing API limits when multiple agents share a provider. Ollama is skipped (no external limit).

Cost tracking — reads usage_metadata from LangChain, prices via built-in rate table. Extensible:

from roscoe.middleware.cost_tracker import COST_TABLE
COST_TABLE["openai"]["gpt-4.1"] = {"input": 0.002, "output": 0.008}

Audit logging — non-blocking JSONL, one line per run: run_id, agent, provider, model, tokens, cost, status, latency. Feed to roscoe monitor for dashboards.

Memory

Three types, all configured in YAML:

memory:
  conversation:
    enabled: true
    window_size: 10         # last N messages per session_id
  persistent:
    enabled: true
    backend: sqlite
    connection: ./facts.db  # long-term facts per user_id

Conversation — short-term, per session_id, windowed. Pass session_id to agent.run() to keep context across turns.
Persistent — long-term facts per user_id in sqlite. The agent remembers "My name is Rhea" across sessions.

Knowledge / RAG — vector retrieval via FAISS (if installed) or a zero-dependency keyword retriever. Set up in code:

from roscoe.memory.knowledge import KnowledgeMemory
km = KnowledgeMemory.from_texts(["policy doc text..."], metadatas=[{"source": "hr.pdf"}])

Connectors

Pre-built tool bundles for enterprise systems. Each connector gives you LangChain tools you hand straight to AgentRunner:

from roscoe.connectors import GitHubConnector

gh = GitHubConnector({"token": "ghp_..."})
agent = AgentRunner.from_config("agent.yaml", tools=gh.tools)

# Mix with your own tools:
agent = AgentRunner.from_config("agent.yaml", tools=[my_tool] + gh.tools)

Connector	Tools	Auth
REST (any API)	GET, POST, PUT, DELETE	Bearer / Basic / API key
Jira	search, create, update, transition issues	Email + API token
ServiceNow	create, query, update incidents	Username + password
Outlook	send email, read inbox, create event, availability	MS Graph (OAuth2)
SharePoint	list files, download, upload, search	MS Graph (OAuth2)
GitHub	list repos, issues, PRs, create issue	Personal access token
Notion	search, pages, databases, blocks	Integration token
Google Workspace	Gmail send/read, Calendar, Tasks, Drive search	Service account
Snowflake	execute SQL queries	`pip install roscoe[snowflake]`

Human-in-the-loop

Make sensitive tools require approval before they run:

middleware:
  human_approval:
    require_approval_for: ["send_email", "delete_record"]

result = agent.run("Send an email to bob@acme.com")

if result.status == "paused":
    print(result.pending_action)  # inspect what the agent wants to do

    # approve — tool runs as planned
    result = agent.resume(result.run_id, "approve")

    # reject — tool is blocked, agent gets a rejection message
    result = agent.resume(result.run_id, "reject")

    # modify — change the arguments before running
    result = agent.resume(result.run_id, "modify", payload={"to": "correct@acme.com"})

The run stops before the gated tool executes and returns status="paused". Call resume() to continue. Wire this to a Slack button, a web UI, or a CLI prompt.

Monitoring

Offline aggregation of your audit logs — no live server required.

roscoe monitor --path logs/audit.jsonl

Outputs a text dashboard with:

Total runs, cost per day, cost per agent
Latency percentiles (p50 / p95 / p99) per agent
Error rate breakdown
Token usage summary

Alerts — configure thresholds for daily cost, error rate, and latency:

from roscoe.monitoring.alerts import check_and_notify
from roscoe.monitoring.notifier import build_notifier

notifier = build_notifier("slack", {"webhook_url": "https://hooks.slack.com/..."})
check_and_notify(metrics, alert_config, notifier)

Exporters — push metrics to Prometheus Pushgateway or Azure Monitor:

from roscoe.monitoring.exporters.prometheus import PrometheusPushgatewayExporter
exporter = PrometheusPushgatewayExporter(gateway_url="http://localhost:9091")
exporter.push(metrics)

Evals

Test your agent with a dataset of cases and score the results:

roscoe eval --dataset evals/test_cases.json --config agent_config.yaml

Scorers:

Tool usage — deterministic, order-aware (did the agent call the right tools?).
Output quality — LLM-as-judge, 0–10 scale. Add --judge to enable.
Hallucination — LLM-as-judge, checks output against provided context docs.

Regression diffing — compare two eval runs:

from roscoe.evals.regression import compare_runs
diff = compare_runs(report_a, report_b)
print(diff.improved)    # cases that got better
print(diff.regressed)   # cases that got worse

Test case format (evals/test_cases.json):

{
  "cases": [
    {
      "id": "weather-london",
      "input": "What's the weather in London?",
      "expected_tools": ["get_weather"],
      "expected_output": "Should mention London weather",
      "context_docs": ["London is currently 15°C and rainy."]
    }
  ]
}

Templates

Six pre-built templates, each with tools, system prompt, config, and approval gates pre-configured:

roscoe init my-hr-bot --template hr_agent
roscoe init my-it-bot --template it_support_agent
roscoe init my-legal --template legal_agent
roscoe init my-kb --template knowledge_base_agent
roscoe init my-ea --template exec_assistant_agent
roscoe init my-gws --template google_workspace_agent

Template	Use case	Connector	Approval gate
`hr_agent`	Leave, payslips, personal details	REST API	submit_leave_request
`it_support_agent`	Tickets, escalation, KB search	ServiceNow	escalate_ticket
`legal_agent`	Contract search, clause extraction, risk flags	Knowledge (RAG)	—
`knowledge_base_agent`	Q&A over Notion / SharePoint / docs	Notion + Knowledge	— (read-only)
`exec_assistant_agent`	Email, calendar, availability	Outlook (MS Graph)	send_email, create_event
`google_workspace_agent`	Gmail, Calendar, Tasks, Drive	Google Workspace	send_email, create_event, create_task

Architecture

roscoe runs its own async ReAct loop (no LangGraph dependency). The loop is ~100 lines in roscoe/core/executor.py:

User message
  → model.invoke(messages)
  → tool_calls in response?
    → approval gate check (pause if gated)
    → execute tools
    → append results to messages
    → loop back to model
  → no tool_calls? done → AgentResult

Built on LangChain (models, tools, messages) but not LangGraph. This keeps the agent loop small, transparent, and easy to debug.

CLI reference

roscoe init <name>                              # scaffold with GUI wizard
roscoe init <name> --quick                      # scaffold with defaults (no wizard)
roscoe init <name> --cli                        # scaffold with terminal wizard
roscoe init <name> --template <t>               # scaffold from a template

roscoe monitor                                  # dashboard from logs/audit.jsonl
roscoe monitor --path /path/to/audit.jsonl      # custom audit log path

roscoe eval --dataset cases.json --config agent.yaml          # tool-usage scoring
roscoe eval --dataset cases.json --config agent.yaml --judge  # + LLM-as-judge
roscoe eval --dataset cases.json --config agent.yaml --tools module:attr  # custom tools

roscoe --version                                # print version

Install

pip install roscoe                # core
pip install "roscoe[snowflake]"   # + Snowflake driver
pip install "roscoe[azure]"       # + Azure Monitor exporter
pip install "roscoe[dev]"         # + pytest (for contributors)

From source

git clone https://github.com/rhealaloo45/roscoe.git
cd roscoe
pip install -e ".[dev]"
pytest -q    # 121 tests, all passing

Writing tools

A tool is a plain Python function with type hints and a docstring:

from roscoe.tools import tool


@tool
def search_docs(query: str) -> list[dict]:
    """Search internal documents. Use when the user asks about company policies."""
    # your logic here
    return [{"title": "Remote Work Policy", "snippet": "..."}]

The docstring is what the LLM reads to decide when to call the tool.
Type hints are used to generate the JSON schema automatically.
Return dicts or primitives — the LLM reads the return value.
Add the tool to the TOOLS list in tools/my_tools.py, or pass it directly to AgentRunner.from_config().

Configuration reference

Full agent_config.yaml with all options (also generated by roscoe init with inline comments):

agent_name: my-agent

system_prompt_file: prompts/system.txt
# system_prompt: |
#   Inline prompt alternative

model:
  provider: openai                 # openai | azure_openai | anthropic | gemini | ollama
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}       # resolved from environment
  temperature: 0.1
  # base_url: https://openrouter.ai/api/v1   # for OpenRouter / custom endpoints
  # max_tokens: 4096

memory:
  conversation:
    enabled: true
    window_size: 10
  persistent:
    enabled: false
    backend: sqlite
    connection: ./facts.db

middleware:
  retry:
    max_attempts: 3
  rate_limiter:
    enabled: true
    requests_per_minute: 60
  cost_tracking:
    enabled: true
  audit:
    enabled: true
  # human_approval:
  #   require_approval_for: ["send_email", "delete_record"]

Environment variables use ${VAR_NAME} syntax and are resolved at config load time.

License

MIT — see LICENSE.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.2

Jun 28, 2026

0.1.1

Jun 28, 2026

0.1.0

Jun 28, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

roscoe-0.1.2.tar.gz (83.5 kB view details)

Uploaded Jun 28, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

roscoe-0.1.2-py3-none-any.whl (111.0 kB view details)

Uploaded Jun 28, 2026 Python 3

File details

Details for the file roscoe-0.1.2.tar.gz.

File metadata

Download URL: roscoe-0.1.2.tar.gz
Upload date: Jun 28, 2026
Size: 83.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for roscoe-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`23c6992775ae36abf0a8887aa5b109745578f51adae5e872771268b3e2688b5b`
MD5	`6e0d503de8f6ea923cd8174abdd120e0`
BLAKE2b-256	`59681b6bd92d97e5de03d4e9a05f00d6e82e133a309a1dba48f8abbb5e2d3a85`

See more details on using hashes here.

File details

Details for the file roscoe-0.1.2-py3-none-any.whl.

File metadata

Download URL: roscoe-0.1.2-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 111.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for roscoe-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`054da7f77621e401908f1c50edea1b6e7c3e977ffe94cf9cb074e7d8be863259`
MD5	`22170badcd857302605103f9bedb71af`
BLAKE2b-256	`160ef5ad1398c3274b533032b71917437b44f76ab88df5448b0e98657cb0ff55`

See more details on using hashes here.

roscoe 0.1.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

roscoe

Quick start

Option A: Scaffold with the CLI

Option B: Code it directly

Features

Provider-agnostic

Automatic middleware

Memory

Connectors

Human-in-the-loop

Monitoring

Evals

Templates

Architecture

CLI reference

Install

From source

Writing tools

Configuration reference

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes