BareBear
Every step leaves a print.

Minimal agent framework. Explicit state, policy-first tools, honest uncertainty.

License: MIT · Python 3.9+

What is BareBear?

BareBear is a Python framework for building LLM agents that you can actually trust in production. It gives you 7 primitives, a policy engine, and a run receipt — nothing else. Every tool call is checked against policy before execution. Every run produces a traceable report. The agent knows what it doesn't know, and tells you.

Why?

Most agent frameworks give you magic orchestration, invisible state, and tools that fire without guardrails. That works great in demos. It falls apart when a customer asks why the agent sent that email, or when your bill spikes because the loop ran 200 steps.

BareBear is built on different assumptions:

  • Tools are dangerous. Every tool declares its risk level and side effects. Policy checks happen before execution, not after.
  • State should be visible. No hidden memory, no magic context. State is a dict you can inspect, snapshot, and diff.
  • Autonomy needs bounds. Step limits, cost caps, token budgets, approval gates — all declared upfront in policy.
  • Runs should be receipts. Every run produces a Report with steps, costs, assumptions, and uncertainties. You can audit it, log it, replay it.
  • Uncertainty is data, not a bug. If the agent isn't sure, it says so. Assumptions are tracked explicitly.
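To make the "visible state" idea concrete, here is a minimal sketch of snapshotting and diffing a plain dict. It is not BareBear's State class: the helper names `snapshot` and `diff` are hypothetical and exist only for this illustration.

```python
import copy


def snapshot(state: dict) -> dict:
    """Return an independent copy of the state at this moment."""
    return copy.deepcopy(state)


def diff(before: dict, after: dict) -> dict:
    """Describe what changed between two snapshots."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


state = {"customer": "alice", "tier": "free"}
before = snapshot(state)

# Mutate the live state, then compare it with the earlier snapshot.
state["tier"] = "pro"
state["ticket_id"] = 42
changes = diff(before, state)
print(changes)
```

Because the state is ordinary data, the diff can be logged or attached to a run record without any framework support.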

Quickstart

pip install barebear

from barebear import Bear, Task, Policy, Tool, MockModel

def greet(name: str) -> str:
    return f"Hello, {name}!"

bear = Bear(
    model=MockModel(),
    tools=[Tool("greet", fn=greet, description="Greet someone by name")],
    policy=Policy(max_steps=5, max_cost_usd=0.05),
)

result = bear.run(Task(goal="Greet the user Alice"))
print(result.summary())

==================================================
  BAREBEAR RUN REPORT
==================================================
  Task ID:    a1b2c3d4
  Status:     completed
  Steps:      2
  Tokens:     160
  Cost:       $0.0000
  Duration:   0.00s
--------------------------------------------------
  STEPS:
    1. [tool_call] Called greet (tool: greet)
    2. [response] Task completed successfully.
--------------------------------------------------
==================================================

No API key needed — MockModel auto-calls available tools and produces a final response. Swap in OpenAIModel("gpt-4o-mini") or OpenRouterModel("meta-llama/llama-4-scout") when you're ready for the real thing.

The 7 Primitives

| Primitive | What it does |
| --- | --- |
| Bear | The agent. Holds model, tools, policy, state. Runs tasks. |
| Task | A unit of work: a goal string, input dict, optional context. |
| State | Explicit key-value store with snapshot history and change tracking. |
| Tool | A callable with declared risk, side effects, and approval requirements. |
| Policy | Constraints: step limits, cost caps, blocked tools, approval lists. |
| Checkpoint | A saved pause-point for human approval before high-risk actions. |
| Report | The run receipt: every step, token count, cost, assumptions, uncertainties. |

That's it. No chains, no graphs, no planners, no routers. Just the bones.

Features

Policy-first tool execution

Every tool call passes through policy before it runs. Block tools, require approval, or reject external side effects — all in one declaration.

policy = Policy(
    max_steps=10,
    blocked_tools=["delete_account"],
    require_approval_for=["send_email"],
    allow_external_side_effects=False,
)
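The idea of checking every call before it runs can be sketched in a few lines of standalone Python. This is not BareBear's policy engine: `SketchTool`, `SketchPolicy`, `check`, the returned strings, and the order of the checks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTool:
    name: str
    side_effects: str = "none"  # "none", "internal", or "external"


@dataclass
class SketchPolicy:
    blocked_tools: List[str] = field(default_factory=list)
    require_approval_for: List[str] = field(default_factory=list)
    allow_external_side_effects: bool = True


def check(policy: SketchPolicy, tool: SketchTool) -> str:
    """Decide what happens to a tool call before the tool function runs."""
    if tool.name in policy.blocked_tools:
        return "blocked"
    if tool.side_effects == "external" and not policy.allow_external_side_effects:
        return "blocked"
    if tool.name in policy.require_approval_for:
        return "needs_approval"
    return "allowed"


policy = SketchPolicy(
    blocked_tools=["delete_account"],
    require_approval_for=["send_email"],
    allow_external_side_effects=False,
)

for tool in [
    SketchTool("greet"),
    SketchTool("delete_account"),
    SketchTool("send_email", side_effects="internal"),
    SketchTool("post_webhook", side_effects="external"),
]:
    print(tool.name, "->", check(policy, tool))
```

The point of the design is that the decision depends only on declarations (the tool's name and side-effect level), so it can be made without ever invoking the tool.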

Budget tracking

Step counts, tool calls, token usage, and dollar cost — all tracked against limits you set. If a run hits its budget, it stops cleanly with a budget_exceeded status.

policy = Policy(max_steps=8, max_cost_usd=0.10, max_tokens=50000)
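As a rough illustration of how a run can stop cleanly at a limit, the sketch below tracks steps, tokens, and cost against caps. It is a standalone approximation, not BareBear's internals: `SketchBudget` and `run` are hypothetical names, and checking the budget before each step (so the last step may overshoot a cap slightly) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SketchBudget:
    max_steps: int
    max_cost_usd: float
    max_tokens: int
    steps: int = 0
    cost_usd: float = 0.0
    tokens: int = 0

    def record(self, tokens: int, cost_usd: float) -> None:
        """Add one step's usage to the running totals."""
        self.steps += 1
        self.tokens += tokens
        self.cost_usd += cost_usd

    def exceeded(self) -> bool:
        """True once any limit has been reached."""
        return (
            self.steps >= self.max_steps
            or self.tokens >= self.max_tokens
            or self.cost_usd >= self.max_cost_usd
        )


def run(budget: SketchBudget, step_costs) -> str:
    """Consume (tokens, cost) pairs until they run out or the budget does."""
    for tokens, cost in step_costs:
        if budget.exceeded():  # checked before every step
            return "budget_exceeded"
        budget.record(tokens, cost)
    return "completed"


budget = SketchBudget(max_steps=8, max_cost_usd=0.10, max_tokens=50000)
status = run(budget, [(100, 0.001)] * 20)
print(status, budget.steps)
```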

Checkpoint / approval gates

High-risk tools that require approval pause the run and create a Checkpoint before they execute. You approve or reject, then resume.

bear = Bear(
    model=model,
    tools=[
        Tool("send_email", fn=send_email, risk="high",
             side_effects="external", requires_approval=True),
    ],
    policy=Policy(require_approval_for=["send_email"]),
)

result = bear.run(task)

if result.status == "paused":
    checkpoint = bear.checkpoints.get(result.checkpoint_id)
    result = bear.resume(checkpoint, approved=True)
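The pause-and-resume flow can be pictured with a small standalone sketch in which the pending call is stored rather than executed. This is not how BareBear implements checkpoints: `SketchCheckpoint`, `SketchRunner`, and the returned dict shapes are hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class SketchCheckpoint:
    id: str
    tool_name: str
    args: dict


class SketchRunner:
    def __init__(self, tools: Dict[str, Callable], needs_approval: Set[str]):
        self.tools = tools
        self.needs_approval = needs_approval
        self.checkpoints: Dict[str, SketchCheckpoint] = {}

    def call(self, tool_name: str, **args) -> dict:
        """Run the tool, or store the pending call and pause if approval is needed."""
        if tool_name in self.needs_approval:
            cp = SketchCheckpoint(uuid.uuid4().hex[:8], tool_name, args)
            self.checkpoints[cp.id] = cp
            return {"status": "paused", "checkpoint_id": cp.id}
        return {"status": "completed", "result": self.tools[tool_name](**args)}

    def resume(self, checkpoint_id: str, approved: bool) -> dict:
        """Execute the stored call only if a human approved it."""
        cp = self.checkpoints.pop(checkpoint_id)
        if not approved:
            return {"status": "rejected", "result": None}
        return {"status": "completed", "result": self.tools[cp.tool_name](**cp.args)}


sent = []  # records real side effects so we can see when they happen


def send_email(to: str) -> str:
    sent.append(to)
    return f"sent to {to}"


runner = SketchRunner({"send_email": send_email}, needs_approval={"send_email"})
first = runner.call("send_email", to="alice@example.com")
print(first["status"], sent)
```

Nothing is sent until `resume` is called with `approved=True`, which is the property an approval gate is meant to guarantee.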

Side-effect staging

Tools declare side_effects="none", "internal", or "external". Policy can block external side effects entirely, so your agent can propose actions without executing them.

Tool("propose_patch", fn=propose, description="Suggest a code change",
     side_effects="none")
Tool("apply_patch", fn=apply, description="Apply the change",
     side_effects="external")
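A minimal sketch of staging, assuming external calls are recorded as proposals instead of being executed when the policy disallows them. The `stage` function and the tuple layout are hypothetical and not part of BareBear.

```python
def stage(calls, allow_external: bool):
    """Execute safe calls; collect external ones as proposals when disallowed."""
    executed, proposed = [], []
    for name, side_effects, fn in calls:
        if side_effects == "external" and not allow_external:
            proposed.append(name)  # recorded, never run
        else:
            executed.append((name, fn()))
    return executed, proposed


calls = [
    ("propose_patch", "none", lambda: "diff: -foo +bar"),
    ("apply_patch", "external", lambda: "applied"),
]

executed, proposed = stage(calls, allow_external=False)
print("executed:", executed)
print("proposed:", proposed)
```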

Run receipts

Every bear.run() returns a Report — a full trace you can print, serialise to JSON, or store for auditing.

result = bear.run(task)

print(result.summary())       # human-readable receipt
print(result.to_json())       # full JSON trace
print(result.total_cost_usd)  # what it cost
print(result.steps)            # list of every step taken

Honest uncertainty

Bears track what they don't know. Assumptions and missing information are first-class data in the report, not buried in log noise.

result = bear.run(task)
print(result.assumptions)     # ["Customer tier assumed from email domain"]
print(result.uncertainties)   # ["Could not verify account status"]
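One way to picture a receipt that carries assumptions and uncertainties alongside steps and cost is a plain dataclass that serialises to JSON. The sketch below is an illustration only: `SketchReport` and its fields mirror the attribute names shown above but are not BareBear's Report class.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SketchReport:
    status: str = "running"
    steps: List[dict] = field(default_factory=list)
    total_cost_usd: float = 0.0
    assumptions: List[str] = field(default_factory=list)
    uncertainties: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the whole receipt so it can be stored or audited later."""
        return json.dumps(asdict(self), indent=2)


report = SketchReport()
report.steps.append({"type": "tool_call", "tool": "lookup_customer"})
report.assumptions.append("Customer tier assumed from email domain")
report.uncertainties.append("Could not verify account status")
report.status = "completed"

# Round-trip through JSON to show the receipt survives storage intact.
restored = json.loads(report.to_json())
print(restored["assumptions"], restored["uncertainties"])
```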

Examples

All examples run with MockModel by default — no API keys needed. Pass --live to use OpenAI.

export PYTHONPATH=src

| Example | What it shows | Run it |
| --- | --- | --- |
| Research Assistant | Tool use, multi-step reasoning, report generation | python examples/research_assistant/run.py |
| Email Approval | requires_approval, checkpoint gates, side-effect policy | python examples/email_approval/run.py |
| File Patcher | Side-effect staging (propose allowed, apply blocked) | python examples/file_patcher/run.py |
| Ticket Triage | Multi-step classification, budget tracking across calls | python examples/ticket_triage/run.py |

Architecture

┌─────────────────────────────────────────────────┐
│                   bear.run(task)                 │
├─────────────────────────────────────────────────┤
│                                                 │
│   Task ──► System Prompt + User Message         │
│                     │                           │
│                     ▼                           │
│              ┌─────────────┐                    │
│              │  Run Loop   │◄── Budget check    │
│              └──────┬──────┘    each step       │
│                     │                           │
│          ┌──────────┴──────────┐                │
│          ▼                     ▼                │
│     Text response         Tool calls            │
│     ──► Report            ──► Policy check      │
│                               │                 │
│                    ┌──────────┼──────────┐      │
│                    ▼          ▼          ▼      │
│                 Allowed    Blocked    Approval   │
│                 ──► Execute ──► Skip  ──► Pause  │
│                     │                    │      │
│                     ▼                    ▼      │
│                Tool result          Checkpoint   │
│                ──► Loop              (resume)    │
│                                                 │
├─────────────────────────────────────────────────┤
│  State (explicit, snapshotable, diffable)       │
│  Budget (steps, tokens, cost — all tracked)     │
│  Report (full receipt of everything that happened)│
└─────────────────────────────────────────────────┘

See docs/architecture.md for the full breakdown.
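The branches in the diagram can be traced with a compact, standalone loop driven by scripted model actions. This is a reading of the diagram, not BareBear's run loop: `run_loop`, the action dict shape, and the `no_response` status are hypothetical.

```python
def run_loop(actions, tools, blocked, needs_approval, max_steps):
    """Walk scripted model actions through the diagram's branches."""
    steps = []
    for action in actions:
        if len(steps) >= max_steps:       # budget check each step
            return "budget_exceeded", steps
        if action["type"] == "text":      # text response ends the run
            steps.append(("response", action["text"]))
            return "completed", steps
        name = action["tool"]
        if name in blocked:               # blocked: skip and keep looping
            steps.append(("skipped", name))
        elif name in needs_approval:      # approval: pause for a human
            steps.append(("paused", name))
            return "paused", steps
        else:                             # allowed: execute, then loop
            result = tools[name](**action.get("args", {}))
            steps.append(("tool_call", result))
    return "no_response", steps           # script ended without a text reply


tools = {"greet": lambda name: f"Hello, {name}!"}
actions = [
    {"type": "tool", "tool": "greet", "args": {"name": "Alice"}},
    {"type": "text", "text": "Task completed successfully."},
]

status, steps = run_loop(actions, tools, blocked=set(), needs_approval=set(), max_steps=5)
print(status, steps)
```

Running the same script with `needs_approval={"greet"}` returns `paused` before the tool executes, and with `max_steps=1` it returns `budget_exceeded` after the first step.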

Philosophy

Most frameworks optimise for demos. BareBear optimises for reality.

Read the manifesto.

Contributing

See CONTRIBUTING.md. We welcome PRs, bug reports, and honest feedback.

License

MIT — Richey Malhotra, 2026.
