Skip to main content

Full-stack agent framework — voice, memory, tools, security — runs on your hardware, no cloud required

Project description

harombe

The full-stack agent framework that runs entirely on your hardware.

Manages the complete agent lifecycle — tool execution in containers, network-level security, voice I/O, persistent memory with semantic search, and multi-node cluster routing. No cloud required.

CI License Python 3.11+

Quick Start

pip install harombe
harombe init          # detects hardware, writes config
ollama pull qwen2.5:7b
harombe chat          # autonomous agent with tools

Or use it as a library:

import asyncio
from harombe.agent.loop import Agent
from harombe.llm.ollama import OllamaClient
from harombe.tools.registry import get_enabled_tools

async def main():
    llm = OllamaClient(model="qwen2.5:7b")
    tools = get_enabled_tools(shell=True, filesystem=True, web_search=True)
    agent = Agent(llm=llm, tools=tools, system_prompt="You are a helpful assistant.")
    response = await agent.run("Find all Python files in src/ and count them.")
    print(response)

asyncio.run(main())

See examples/ for more.

What is harombe?

Harombe manages the full stack for AI agents — hardware detection, tool execution (in containers), security (network-level), I/O (voice + CLI + API), memory (SQLite + vector search), and networking (multi-node clusters with service discovery). Think of it as an operating system for AI agents — as a pip install.

Security

Harombe's security is infrastructure-level, not bolt-on:

  • Every tool runs in its own Docker container with resource limits
  • Per-container network egress filtering (iptables, DNS allowlists)
  • Audit logging with automatic credential redaction
  • Human-in-the-loop approval gates with risk classification
  • Secret management via Vault/SOPS
  • ZKP-based audit proofs (experimental)

What harombe manages

Responsibility What it does
Execution ReAct agent loop, tool calling (shell, filesystem, web search, browser, code exec)
Voice I/O Whisper STT + Piper TTS, push-to-talk, VAD, WebSocket streaming
Memory SQLite conversations, ChromaDB vectors, cross-session semantic search
Security Container isolation, network filtering, audit logging, HITL gates, anomaly detection
Networking Multi-node clusters, mDNS discovery, complexity-based routing, circuit breakers
Privacy PII detection, data sanitization, local/hybrid/cloud routing modes
Extensibility MCP server + client, container-isolated plugins with auto-generated MCP scaffolds

Architecture

┌─────────────────────────────────────┐
│  Layer 6: Clients                   │  Voice, iOS, Web, CLI
├─────────────────────────────────────┤
│  Layer 5: Privacy Router            │  Hybrid local/cloud AI
│  PII detection, context sanitizer   │  local-only / hybrid / cloud
├─────────────────────────────────────┤
│  Layer 4: Agent & Memory            │  ReAct loop, tools, memory
├─────────────────────────────────────┤
│  Layer 3: Security                  │  Defense-in-depth
│  MCP Gateway, container isolation   │  Credential vault, audit log
│  Per-tool egress, secret scanning   │  HITL gates, browser pre-auth
├─────────────────────────────────────┤
│  Layer 2: Orchestration             │  Smart routing, health monitoring
│  Cluster config, mDNS discovery     │  Circuit breakers, metrics
├─────────────────────────────────────┤
│  Layer 1: Runtimes                  │  llama.cpp, Whisper, TTS, embeddings
└─────────────────────────────────────┘

Each layer only talks to its neighbors. Security (Layer 3) wraps every tool invocation — there is no path from the agent to a tool that bypasses the gateway. See ARCHITECTURE.md for full design documentation.

Security Deep Dive

Harombe enforces the Capability-Container Pattern: agents never execute tools directly. Every tool call goes through the MCP Gateway, which routes it to an isolated Docker container.

Agent  ──→  MCP Gateway  ──→  [ Container: shell ]
                          ──→  [ Container: browser ]
                          ──→  [ Container: web_search ]

Container Isolation — Each tool runs in its own Docker container with CPU/memory limits, read-only filesystem mounts, and no host network access.

Network Egress — Per-container iptables rules and DNS allowlists. A web_search container can reach DuckDuckGo; a filesystem container can reach nothing.

Audit Logging — Every tool call, approval decision, and security event is logged to SQLite with automatic redaction of API keys, passwords, and tokens.

HITL Gates — Operations are risk-classified (LOW/MEDIUM/HIGH/CRITICAL). High-risk operations require explicit human approval with default-deny on timeout.

Anomaly Detection — Per-agent Isolation Forest models learn baseline behavior and flag deviations. Integrated with SIEM (Splunk, Elasticsearch, Datadog).

See docs/security-quickstart.md for setup instructions.

Features

Agent Core

ReAct agent loop with autonomous planning, tool calling, and multi-step execution. Tools: shell, read/write files, web search, browser automation. Configurable step limits and confirmation gates.

Voice

Whisper STT (tiny to large-v3) + Piper TTS. Push-to-talk CLI (harombe voice), REST + WebSocket API, voice activity detection. Cross-platform audio I/O.

Memory & RAG

SQLite conversation persistence with token-based context windowing. ChromaDB vector store with sentence-transformers embeddings (local, no API calls). Semantic search across sessions. RAG-enabled agents auto-inject relevant context.

Distributed Clusters

Define nodes in YAML, route queries by complexity. mDNS auto-discovery, health monitoring, circuit breakers, load balancing. Works with any hardware mix: Apple Silicon, NVIDIA, AMD, CPU.

MCP Protocol

JSON-RPC 2.0 MCP server and client. Gateway-mediated tool execution with per-tool container isolation. Compatible with the broader MCP ecosystem.

Plugins

Container-isolated plugins with auto-generated MCP scaffolds. ZKP audit proofs, compliance reporting, and container-based extensions ship as built-in plugins. See src/harombe/plugins/ for examples.

Privacy

PII detection and data sanitization. Three routing modes: local-only (nothing leaves your machine), hybrid (sensitive data stays local, general queries can use cloud), and cloud. Configurable per-agent.

Examples

# Example Description
01 simple_agent.py Basic single-node agent with all tools
02 api_usage.py Programmatic agent creation and tool usage
03 data_pipeline.py Data processing with autonomous agents
04 code_review.py Automated code review workflows
05 research_agent.py Research automation with web search
06 memory_conversation.py Persistent conversation history
07 cluster_routing.py Task-based routing across nodes
08 semantic_memory.py Semantic search and RAG
09 voice_assistant.py Voice-enabled assistant (STT + TTS)

Configuration

Configuration lives at ~/.harombe/harombe.yaml. Minimal example:

model:
  name: qwen2.5:7b
  temperature: 0.7

tools:
  shell: true
  filesystem: true
  web_search: true
  confirm_dangerous: true

memory:
  enabled: true
  storage_path: ~/.harombe/memory.db

All fields have sensible defaults — you can run with no config at all. See harombe.yaml.example for the full reference.

Status

Production Experimental Planned
ReAct agent loop Hardware security modules iOS/Web clients
Tool execution (shell, fs, web) Distributed cryptography Multi-modal vision
Container isolation + egress
Code execution sandbox
ZKP audit proofs
Audit logging + secret management
HITL approval gates
Conversation memory + RAG
Voice (STT + TTS)
Multi-node clusters
MCP server + client
Privacy router + PII detection
Anomaly detection + SIEM

2400+ tests. Python 3.11-3.13. CI on Ubuntu + macOS. See docs/roadmap.md for full phase history.

Development

git clone https://github.com/smallthinkingmachines/harombe.git
cd harombe && pip install -e ".[dev]"
pytest

See docs/DEVELOPMENT.md for detailed setup and docs/CONTRIBUTING.md for contribution guidelines.

License

Apache 2.0 — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

harombe-0.3.0.tar.gz (903.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

harombe-0.3.0-py3-none-any.whl (381.7 kB view details)

Uploaded Python 3

File details

Details for the file harombe-0.3.0.tar.gz.

File metadata

  • Download URL: harombe-0.3.0.tar.gz
  • Upload date:
  • Size: 903.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for harombe-0.3.0.tar.gz
Algorithm Hash digest
SHA256 9b41cc48aa1f7b107a752b9e527b3b774bf264f9ba24f873822ab016e94c716b
MD5 fd7ec51c8ee96b51df2af956b44ef78f
BLAKE2b-256 cac05370cc748adf97a8364aec1feedf7fc0cd346487aedde9cd369f68354d97

See more details on using hashes here.

Provenance

The following attestation bundles were made for harombe-0.3.0.tar.gz:

Publisher: publish.yml on smallthinkingmachines/harombe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file harombe-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: harombe-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 381.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for harombe-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 9219d77a8887773a8f6fffb619900a3e49bf8fd3269de4c45cd04fbbedf26618
MD5 e19607d9b10e369e13088cb2f768f72a
BLAKE2b-256 1f6e2efc746a9d5bb484aaf0dcd6782b60ae98e0ba21998251df8a8f202fe12f

See more details on using hashes here.

Provenance

The following attestation bundles were made for harombe-0.3.0-py3-none-any.whl:

Publisher: publish.yml on smallthinkingmachines/harombe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page