Skip to main content

The Hummingbird Coding Agent โ€” minimalist, powerful, model-agnostic, and composable.

Project description

๐Ÿฆ hummcode

The Hummingbird Coding Agent โ€” minimalist, powerful, model-agnostic, and composable.

PyPI Python 3.10+ License: MIT Observability: Langfuse

hummcode is an open-source AI coding agent built from first principles. No framework magic. No black boxes. Just a clean, composable core โ€” a fast, precise tool you can fully read, understand, and extend.

Inspired by the hummingbird: tiny, aerodynamic, capable of hovering with perfect surgical precision, yet powerful enough to outperform much larger systems.


Table of Contents


Why hummcode?

Most coding agents are black boxes sitting on top of heavyweight frameworks. You can't see inside them, can't trust what they do with your files, and can't change how they think.

hummcode was built differently โ€” studying reference implementations like Geoffry Huntley's Go coding agent, the Pi coding agent, and TeenyCode โ€” then synthesising the best ideas into a clean, pip-installable Python agent.

Every line has a reason. Every decision has an alternative it beat.

Principle What it means
Minimal at core A while True loop + a small set of composable primitive tools. No 400-line framework initialisation.
Model agnostic LiteLLM under the hood. Swap Claude, GPT-4o, Mistral, or Ollama by changing one env variable.
Context is king Tree-based memory with sliding-window + LLM summarisation. The agent never crashes on token limits.
Observable by default Every token, tool call, and cost traced to Langfuse from day one โ€” not bolted on later.
Permission-first Dangerous operations (file edits, bash) require explicit approval. No silent mutations.
Language agnostic The agent reads and edits plain text files. It works on Python, Go, Rust, TypeScript, or any language.

How It Works

Traditional coding agents run a single LLM call and hope for the best. hummcode runs a structured agent loop:

   Simple Chatbot                     hummcode Agent
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”           โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ User โ†’ LLM       โ”‚           โ”‚ User โ†’ CLI Router          โ”‚
โ”‚ โ†’ text response  โ”‚           โ”‚   โ†’ SessionTree (memory)   โ”‚
โ”‚ (no tools)       โ”‚           โ”‚   โ†’ Context Compaction     โ”‚
โ”‚ (no memory)      โ”‚           โ”‚   โ†’ Inner Execution Loop:  โ”‚
โ”‚ (no permissions) โ”‚           โ”‚     LLM โ†’ tool_calls?      โ”‚
โ”‚                  โ”‚           โ”‚     Yes โ†’ Permission Gate  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜           โ”‚       โ†’ ToolRegistry       โ”‚
                               โ”‚       โ†’ Result โ†’ Loop      โ”‚
                               โ”‚     No  โ†’ Final answer     โ”‚
                               โ”‚   โ†’ TUI (chat + tool pane) โ”‚
                               โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
Simple Chatbot hummcode
Memory Flat list, crashes at limit Tree-based with rewind + compaction
Tools None read, list, edit (surgical), bash, oracle
Permissions Silent โ€” does whatever Gated โ€” y/n/all before any file or shell action
Model Hardcoded Agnostic โ€” swap mid-session with /model
Observability print() statements Full Langfuse trace โ€” tokens, cost, latency
UI Terminal print Dual-pane Textual TUI + headless CLI mode
Multi-model No Oracle pattern โ€” delegate sub-tasks to cheaper model

Technical Architecture

hummcode is built around a core insight: a coding agent is a loop, not a chain. The LLM doesn't finish when it generates text โ€” it finishes when it has no more tools to call. Every architectural decision flows from managing that loop safely, cheaply, and transparently.

System Overview

flowchart TD
    UI[User Input\nTUI or --cli] --> ROUTER[CLI Command Router\n/model /clear /rewind /key]
    ROUTER -->|slash command| SYSTEM[System Response\nyield event]
    ROUTER -->|prompt| TREE[SessionTree\nadd_message]
    TREE --> COMPACT{Token Limit\nExceeded?}
    COMPACT -->|yes| COMPACTOR[Sliding Window Compaction\nOracle summarises old nodes]
    COMPACT -->|no| LOOP
    COMPACTOR --> LOOP

    subgraph LOOP [Inner Agent Execution Loop]
        LLM[LLMClient.generate\nLiteLLM โ†’ any provider]
        CHECK{tool_calls\nin response?}
        GATE[PermissionManager\ny / n / all]
        REG[ToolRegistry.execute\nasync dispatcher]
        RESULT[Append tool result\nto SessionTree]
        LLM --> CHECK
        CHECK -->|yes| GATE
        GATE -->|approved| REG
        REG --> RESULT
        RESULT --> LLM
        CHECK -->|no| DONE[yield final message]
    end

    DONE --> DISPLAY[TUI Chat Pane\nor stdout]
    RESULT --> TOOLLOG[TUI Tool Log Pane]
    LLM -->|every call| LANGFUSE[Langfuse\nTokens ยท Cost ยท Latency]

The system is divided into four logical layers connected by unidirectional data flow. The input layer (top) handles both typed prompts and slash commands, routing each to the correct handler before any LLM token is spent. The memory layer manages the SessionTree and decides whether compaction is needed before inference. The execution loop (centre) is the agent's heartbeat โ€” it runs until the LLM produces a response with no tool calls. The output layer routes events to the correct UI pane or stdout.

The critical observation is that the LLM is called inside the loop, not once per user turn. A single user prompt may trigger 5โ€“6 LLM calls as the agent reads files, checks outputs, and refines its approach. The ToolRegistry and PermissionManager sit between each call and the filesystem, ensuring no action is taken without a valid tool schema and, for dangerous operations, explicit user consent.


Agent Execution Loop

sequenceDiagram
    participant User
    participant Core as HummcodeAgent
    participant Memory as SessionTree
    participant LLM as LLMClient (LiteLLM)
    participant Perm as PermissionManager
    participant Reg as ToolRegistry
    participant Langfuse

    User->>Core: "Refactor auth.py to use JWT"
    Core->>Memory: add_message(user)
    Core->>Memory: compact()? token check
    Memory-->>Core: context list (walk-to-root)

    loop Inner Execution Loop
        Core->>LLM: generate(context, tools)
        LLM-->>Langfuse: trace tokens + cost + latency
        LLM-->>Core: ai_message

        alt tool_calls present
            Core->>Memory: add_message(ai_message)
            Core->>Perm: check_permission(tool, details)
            Perm-->>User: Modal or [y/n/all] prompt
            User-->>Perm: approved
            Perm-->>Core: True
            Core->>Reg: execute(tool_name, args)
            Reg-->>Core: result string
            Core->>Memory: add_message(tool_result)
            Note over Core: Loop restarts โ€” result fed back to LLM
        else no tool_calls
            Core->>Memory: add_message(ai_message)
            Core-->>User: yield final message
            Note over Core: Inner loop exits
        end
    end

The sequence shows why the inner loop is essential. When the agent reads auth.py, it generates a tool call for read_file. The result is appended to the tree and the LLM is called again โ€” now with the file contents in context. It then generates a tool call for edit_file. The permission gate pauses execution. Once approved, the edit is applied, the result is appended, and the LLM is called a third time to confirm the change looks correct. Only then does it produce a plain-text final response and exit the loop.

Why this matters: Without an inner loop, the agent could only call one tool per user turn. Real coding tasks require sequences: list files โ†’ read file โ†’ edit file โ†’ run tests โ†’ fix errors. The inner loop handles this naturally, and the SessionTree keeps the full chain of evidence in context for every subsequent LLM call.


Tool Execution & Permission Flow

sequenceDiagram
    participant LLM as LLM Response
    participant Core as Agent Loop
    participant Perm as PermissionManager
    participant User as User (TUI Modal or CLI)
    participant Reg as ToolRegistry
    participant FS as Filesystem / Shell

    LLM->>Core: tool_call: execute_bash("pytest tests/")
    Core->>Perm: check_permission("execute_bash", "pytest tests/")

    alt auto_approve = True (user chose "all" earlier)
        Perm-->>Core: True (auto, no prompt)
    else auto_approve = False
        Perm->>User: "execute_bash: pytest tests/" [y/n/all]
        alt User chooses "y"
            User-->>Perm: yes
            Perm-->>Core: True
        else User chooses "n"
            User-->>Perm: no
            Perm-->>Core: False
            Core-->>LLM: "Error: user denied permission. Ask what to do next."
        else User chooses "a"
            User-->>Perm: all
            Perm->>Perm: auto_approve = True
            Perm-->>Core: True
        end
    end

    Core->>Reg: execute("execute_bash", args)
    Reg->>FS: subprocess.run("pytest tests/", timeout=120)
    FS-->>Reg: stdout + stderr + returncode
    Reg-->>Core: result string
    Core->>Core: add_message(tool_result)

The permission flow shows three distinct paths. The auto-approve fast path (user already chose "all") has zero overhead โ€” it never prompts again for the rest of the session. The deny path is critical: rather than crashing or silently skipping, the agent receives the denial as a tool result and adjusts its plan. The approve path feeds the result back into the loop.

Why this matters: Coding agents that execute bash without gates are dangerous. The PermissionManager is not an optional safety add-on โ€” it's a first-class architectural component. Its stateful auto_approve flag eliminates consent fatigue once you've established trust for a session, without compromising safety for users who haven't.


Memory Tree & Compaction

graph TD
    ROOT["๐ŸŸข ROOT\nsession start"]
    N1["Node: User โ€” refactor auth\nid: a1"]
    N2["Node: Claude โ€” I'll read auth.py\nid: a2"]
    N3["Node: Tool โ€” read_file result\nid: a3"]
    N4["Node: Claude โ€” edit line 42\nid: a4"]
    N5_DEAD["Node: Tool denied โŒ\nid: a5  dead branch"]
    N5["Node: Tool edit approved โœ…\nid: a5-prime"]
    N6["Node: User โ€” now add tests\nid: b1"]
    SUMMARY["๐Ÿ”ต SUMMARY NODE\nLLM-generated compaction\nof old context"]

    ROOT --> N1
    N1 --> N2
    N2 --> N3
    N3 --> N4
    N4 --> N5_DEAD
    N4 --> N5
    N5 --> N6
    SUMMARY -.->|replaces old nodes above threshold| N6

    style N5_DEAD fill:#ff6b6b,color:#fff
    style N5 fill:#51cf66,color:#fff
    style SUMMARY fill:#339af0,color:#fff
    style ROOT fill:#a7f3d0,color:#333

The memory tree shows three key properties. First, every message is a Node with a parent_id โ€” not a position in an array. Second, the dead branch (denied tool call) still exists in self.nodes but is unreachable from the active leaf, so it is never included in get_llm_context(). Third, when token count crosses the threshold, compact() summarises all nodes outside the sliding window into a single blue Summary Node, which becomes the new root. The sliding window nodes are relinked to point to this summary as their new parent.

Why a tree over a flat list:

Aspect Flat List (messages = []) SessionTree (hummcode)
Failed attempts Permanently in context โ€” confuses the LLM Excluded if on a dead branch
Rewinding Impossible without manual splicing memory.rewind(node_id) โ€” one line
Token overflow Crashes with 400 Bad Request compact() summarises old nodes seamlessly
Branching Cannot explore two approaches simultaneously Natural โ€” just move the leaf pointer
Auditability Context is whatever was appended Full tree preserved; every node inspectable

Subsystem Breakdown

1. Core Event Loop (core.py)

HummcodeAgent is the orchestrator. Its async process_prompt() method is a generator โ€” it yields typed events (status, tool_result, message, system) rather than printing them directly.

Why generators, not print: The TUI and CLI consume the same agent. The TUI routes status events to the right-hand tool pane and message events to the left chat pane. The CLI prints everything to stdout. The agent brain knows nothing about how it's being displayed โ€” it just yields typed events. Swapping the TUI for a web API requires zero changes to core.py.


2. LLM Client (llm.py)

LLMClient wraps LiteLLM's completion() call into a single generate(messages, tools, model) method. The default_model is loaded from DEFAULT_MODEL env var at instantiation. It can be overridden mid-session via /model.

Why LiteLLM: LiteLLM provides a single unified API surface for 100+ providers. Switching from anthropic/claude-sonnet-4-5-20250929 to openai/gpt-4o to ollama/llama3 requires changing one string. The llm.py wrapper also creates a single choke point for future features: retry logic, fallback models, and cost budgeting can all be added here without touching core.py.


3. Tool Registry (tools/registry.py) โญ

ToolRegistry exposes two static methods: get_tools() returns the LiteLLM-compatible JSON schema array; async execute() parses arguments, applies permission gates, dispatches to the correct Python function, and returns a result string.

Why a registry, not inline if/elif: Without the registry, core.py contained 80+ lines of tool dispatch logic. With it, the inner loop collapses to three lines regardless of how many tools exist. Adding a new tool is: write the function, register the schema and route. The core loop never changes.


4. Tool Primitives (tools/file_ops.py, tools/shell.py) โญ

Four composable primitive tools โ€” each does exactly one thing:

Tool What it does Why it's designed this way
read_file Read a file's full contents No truncation โ€” the LLM decides what's relevant. Truncating at 1,000 lines hides the bug on line 1,001.
list_files Recursive listing, skipping noisy dirs node_modules can contain 50,000 files. Skip them or lose the whole context window.
edit_file Surgical search-and-replace 500-line file, 5-line change = 5 tokens of diff. Full rewrite = 500 tokens + regression risk. Uniqueness validation prevents wrong-occurrence edits.
execute_bash Any shell command, 120s timeout Returns non-zero exit as a string so the LLM can self-correct, not crash.

5. Oracle Pattern (tools/oracle.py) โญ

ask_oracle delegates isolated questions to a secondary model. The oracle receives only the question โ€” not the full SessionTree history. Defaults to ORACLE_MODEL env var, falls back to the main model if not set.

Why this matters: The main agent's context window fills with tool results and conversation history. Delegating isolated lookups (summarise this document, look up this API) to a cheap secondary model keeps the main context lean. If you only have one API key, the oracle falls back gracefully โ€” it never crashes.


6. Permission Manager (tools/permissions.py)

PermissionManager holds an auto_approve flag and an optional ask_callback. In CLI mode: input() blocks for a keystroke. In TUI mode: HummcodeApp injects an ask_callback that pops a Textual ModalScreen and awaits a button click without freezing the UI.

Why a callback pattern: The agent brain doesn't know it's talking to a TUI. Replacing the TUI modal with a web API permission endpoint is a one-line change: set a different ask_callback. Zero changes to core.py or registry.py.


7. Memory System (memory.py) โญ

SessionTree manages a dict of Node objects and a current_leaf_id pointer. get_llm_context() walks parent_id from leaf to root and reverses โ€” producing the correct linear history for LiteLLM regardless of how many branches or rewinds have occurred. See Memory System for full detail.


8. Terminal UI (ui/tui.py)

HummcodeApp extends Textual's App. The @work decorator runs on_input_submitted as a background async worker โ€” the UI stays responsive while the LLM thinks. Events from process_prompt() are routed: message and system to the chat pane; status and tool_result to the tool log pane.

Why separate panes: Tool noise (Thinking..., [๐Ÿ”ง list_files], Result:...) would overwhelm the chat conversation if mixed together. Routing them to the right pane keeps the left side a clean, readable conversation history.


Key Design Decisions

Decision Alternative Considered Reason
Tree-based session memory Flat messages = [] list Dead branches excluded from context; rewind without list surgery; compaction without index gymnastics
Surgical search-and-replace edit Full file rewrite Tokens scale with change size, not file size; no risk of unchanged-line regression
ToolRegistry dispatcher Inline if/elif in core loop Core loop stays 3 lines regardless of tool count; tools are independently testable
LiteLLM abstraction Direct Anthropic/OpenAI SDK Single API surface for 100+ providers; model swap is a one-string change
async generator (yield events) Direct print() calls Brain is display-agnostic; same agent powers TUI, CLI, and future web API
Permission ask_callback injection Direct input() calls in agent TUI modal and CLI prompt are interchangeable; zero changes to agent or registry
Oracle fallback to main model Crash if ORACLE_MODEL not set Works out of the box with one key; cheap oracle model is an upgrade, not a requirement
Skip noisy dirs in list_files List everything node_modules/.git blow up context window before a single source file is read

Features

Feature TUI --cli
Model-agnostic LLM (100+ providers via LiteLLM) โœ… โœ…
Tree-based memory with rewind โœ… โœ…
Sliding window + LLM compaction โœ… โœ…
read_file, list_files tools โœ… โœ…
Surgical edit_file (search-and-replace) โœ… โœ…
execute_bash with timeout โœ… โœ…
Oracle pattern (secondary model) โœ… โœ…
Permission gates (y/n/all) โœ… Modal โœ… stdin
Langfuse observability โœ… โœ…
Slash commands (/model, /key, /rewindโ€ฆ) โœ… โœ…
Dual-pane TUI (chat + tool log) โœ… โ€”
Extensible CSS themes โœ… โ€”
BYOK (keys never leave your machine) โœ… โœ…

Installation

Requirements: Python 3.10+

pip install hummcode

Quick Start

1. Create a .env file in your working directory:

# At least one LLM provider key is required
ANTHROPIC_API_KEY=sk-ant-...

# Optional: other providers (model switching, oracle)
OPENAI_API_KEY=sk-...

# Default model (LiteLLM format)
DEFAULT_MODEL=anthropic/claude-sonnet-4-5-20250929

# Optional: dedicated cheaper model for oracle + compaction
ORACLE_MODEL=anthropic/claude-haiku-4-5

# Optional: Langfuse observability
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com

2. Launch:

# Rich TUI (default)
hummcode

# Headless CLI
hummcode --cli

3. Try your first tasks:

You: List the files in this project, then read the pyproject.toml
You: Create a new file called hello.py that prints "Hello from hummcode!"
You: Run it with bash and show me the output
You: /model openai/gpt-4o
You: Now rewrite hello.py in TypeScript

Configuration

Variable Required Description Default
ANTHROPIC_API_KEY Yes* Anthropic API key โ€”
OPENAI_API_KEY No OpenAI API key โ€”
DEFAULT_MODEL No Primary model string (LiteLLM format) anthropic/claude-sonnet-4-5-20250929
ORACLE_MODEL No Model for oracle calls and compaction Falls back to DEFAULT_MODEL
LANGFUSE_PUBLIC_KEY No Langfuse public key โ€”
LANGFUSE_SECRET_KEY No Langfuse secret key โ€”
LANGFUSE_BASE_URL No Langfuse host https://cloud.langfuse.com

*At least one provider key required. Works with Anthropic, OpenAI, Mistral, Ollama, and any other LiteLLM-supported provider.

Supported model string examples:

anthropic/claude-sonnet-4-5-20250929
anthropic/claude-haiku-4-5
openai/gpt-4o
openai/gpt-4o-mini
ollama/llama3
mistral/mistral-large-latest

โš ๏ธ Security note: litellm versions 1.82.7 and 1.82.8 contained a supply chain backdoor (March 2026). hummcode pins >=1.83.0 in pyproject.toml. Always install from PyPI โ€” never from an unofficial mirror.


Running hummcode

TUI Mode (default)

hummcode
โ”Œโ”€ Hummcode โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ Active: claude-sonnet-4-5 โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                                                                                      โ”‚
โ”‚  Chat                                         โ”‚  Tool Log                            โ”‚
โ”‚  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€    โ”‚  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€  โ”‚
โ”‚  ๐Ÿฆ Welcome to Hummcode!                      โ”‚                                      โ”‚
โ”‚  The Hummingbird Coding Agent built           โ”‚  Thinking...                         โ”‚
โ”‚  from first principles.                       โ”‚                                      โ”‚
โ”‚                                               โ”‚  [๐Ÿ”ง] list_files('.')                โ”‚
โ”‚  You: list files and read pyproject.toml      โ”‚  Result: .env .gitignore src/...     โ”‚
โ”‚                                               โ”‚                                      โ”‚
โ”‚  Hummcode: Here are your project files.       โ”‚  [๐Ÿ”ง] read_file('pyproject.toml')    โ”‚
โ”‚  Your pyproject.toml configures...            โ”‚  Result: [project] name=hummco...    โ”‚
โ”‚                                               โ”‚                                      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
  Type a message, or type / for commands...
  Commands: /info | /list | /model | /key | /clear | /rewind

Headless CLI Mode

hummcode --cli

Standard terminal I/O โ€” useful for SSH sessions, scripting, or piping output.


CLI Commands

Type any command directly into the input bar. Tab-autocomplete works after typing /.

Command What it does Why it exists
/info Show hummcode version and tagline Sanity check โ€” confirm which version is running
/list List all slash commands Discoverability โ€” no hidden commands
/model <provider/name> Switch LLM mid-session Test a cheaper model on a simple task; upgrade for a complex one
/key <NAME> <value> Save an API key to .env live Add a new provider without restarting
/clear Reset SessionTree, start fresh Clear context when switching to an unrelated task
/rewind Undo last turn โ€” move tree pointer back Abandon a bad approach without the failed attempt poisoning the context
exit, quit, :q Exit hummcode Standard exit conventions

Tools Reference

hummcode exposes a small, composable set of primitive tools. The LLM decides which to call and when.

read_file

Read the full contents of a file by relative path.

Args:    path (str)
Returns: file contents as string, or descriptive error

Why no truncation: The LLM is better at deciding what's relevant in a file than a truncation heuristic. Truncating at 1,000 lines silently hides the bug on line 1,001.


list_files

Recursive directory listing. Automatically skips .git, .venv, __pycache__, node_modules, build, dist.

Args:    path (str, optional) โ€” defaults to "."
Returns: sorted newline-separated list of relative paths

Why skip those directories: A node_modules folder can contain 50,000 files. Including it consumes the entire context window before the agent reads a single source file.


edit_file โญ

Surgical search-and-replace. The LLM provides the exact old_str to replace with new_str. Validates uniqueness before writing โ€” refuses if the string appears 0 or 2+ times.

Args:       path (str), old_str (str), new_str (str)
Returns:    success message, or descriptive error
Permission: required โ€” prompts before any write

Passing an empty old_str creates a new file or appends to an existing one.

Why surgical over full rewrite: A full-file rewrite for a 500-line file costs 500 lines of tokens. A surgical edit for changing one function signature costs 5 lines. Uniqueness validation prevents the LLM from accidentally editing the wrong occurrence.


execute_bash โญ

Run any bash command. Returns combined stdout + stderr. Prefixes with Command failed (exit code N) on non-zero exit, so the LLM can self-correct. Hard timeout: 120 seconds.

Args:       command (str)
Returns:    combined output string
Permission: required โ€” prompts before any execution

Why 120 seconds: Long enough for pytest, cargo build, or npm install. Short enough to prevent a hung process from blocking the session indefinitely.


ask_oracle

Delegate an isolated question or sub-task to a secondary model. The oracle receives only the question โ€” not the full conversation history.

Args:       question (str), model (str, optional)
Returns:    oracle's answer string
Permission: not required โ€” only makes an API call

Why this matters: If the agent has accumulated 30,000 tokens of context, asking it to also summarise a README wastes expensive capacity. The oracle handles isolated lookups cheaply. Falls back to the main model if ORACLE_MODEL is not set โ€” never crashes.


Memory System

The Problem with Flat Lists

Every coding agent tutorial starts with messages = []. It works for five turns. Then:

  • The agent reads a large file โ†’ token count spikes
  • The agent tries five different bug fixes โ†’ all five failed attempts stay visible to the LLM
  • The token limit is hit โ†’ 400 Bad Request

hummcode solves all three with the SessionTree.

Tree-Based Memory

Every message becomes a Node:

@dataclass
class Node:
    data: Dict[str, Any]      # The message: role, content, tool_calls, etc.
    parent_id: Optional[str]  # Links to the previous message
    id: str                   # UUID โ€” unique identifier

get_llm_context() walks parent_id from the active leaf to the root and reverses the list โ€” producing the correct linear history for LiteLLM regardless of branches or rewinds.

Rewinding

/rewind

Moves current_leaf_id back to the node recorded before your last prompt. The failed branch still exists in self.nodes but is unreachable from the active path โ€” the LLM never sees it again.

Context Compaction

When accumulated tokens cross max_tokens (default: 40,000):

  1. All nodes outside the sliding window (default: last 10 turns) are collected
  2. Their content is sent to ORACLE_MODEL with a summarisation prompt
  3. A new Summary Node is created from the oracle's summary
  4. The oldest sliding-window node has its parent_id relinked to the Summary Node
  5. All old nodes become unreachable โ€” cleanly dropped from future context

The tree is surgically trimmed. The LLM sees compact summary of the past, full detail for the last 10 turns.


Observability

hummcode integrates Langfuse automatically when your keys are set. Because LiteLLM's success_callback is used, every generate() call is traced with zero extra code in the agent logic:

litellm.success_callback = ["langfuse"]

Each trace includes: full prompt + response, token counts (input + output), cost in USD, tool call chains, and latency per call.

No Langfuse account? Omit the keys from .env. hummcode works identically โ€” observability is additive, never required.


UI Themes

All styling lives in .tcss (Textual CSS) files in src/hummcode/ui/themes/. Zero style lives in Python code โ€” making themes fully portable and community-contributable.

Default: "Executive Cyber" theme

Element Colour
Background Deep charcoal #1e1e1e
Header / status bar Slate dark #0f172a
Chat accent (You) Cyan #38bdf8
Agent responses Emerald #a7f3d0
Tool log Slate #475569
Permission warnings Amber #fbbf24

To create a custom theme:

cp src/hummcode/ui/themes/default.tcss src/hummcode/ui/themes/mytheme.tcss
# Edit colours in mytheme.tcss
# Update CSS_PATH in tui.py to point to mytheme.tcss

Community theme contributions are welcome โ€” open a PR with your .tcss file.


Project Structure

hummcode/
โ”œโ”€โ”€ pyproject.toml               # Package config, deps, CLI entry point
โ”œโ”€โ”€ PLAN.md                      # Living architecture document
โ”œโ”€โ”€ AGENTS.md                    # Agent system prompt and behavioural rules
โ”œโ”€โ”€ .env                         # Your API keys (gitignored)
โ”œโ”€โ”€ .env.example                 # Template โ€” copy to .env to start
โ”œโ”€โ”€ .gitignore
โ””โ”€โ”€ src/
    โ””โ”€โ”€ hummcode/
        โ”œโ”€โ”€ __init__.py
        โ”œโ”€โ”€ core.py              # HummcodeAgent, async process_prompt(), main()
        โ”œโ”€โ”€ llm.py               # LLMClient โ€” LiteLLM wrapper
        โ”œโ”€โ”€ memory.py            # Node, SessionTree, compact()
        โ”œโ”€โ”€ tools/
        โ”‚   โ”œโ”€โ”€ __init__.py
        โ”‚   โ”œโ”€โ”€ registry.py      # ToolRegistry: schemas + async dispatcher
        โ”‚   โ”œโ”€โ”€ file_ops.py      # read_file, list_files, edit_file
        โ”‚   โ”œโ”€โ”€ shell.py         # execute_bash
        โ”‚   โ”œโ”€โ”€ oracle.py        # ask_oracle (secondary model pattern)
        โ”‚   โ””โ”€โ”€ permissions.py   # PermissionManager (async + callback)
        โ””โ”€โ”€ ui/
            โ”œโ”€โ”€ __init__.py
            โ”œโ”€โ”€ tui.py           # HummcodeApp, PermissionModal
            โ””โ”€โ”€ themes/
                โ””โ”€โ”€ default.tcss # "Executive Cyber" dark theme

Key dependencies:

Package Version Purpose
litellm >=1.83.0 Model-agnostic LLM API (100+ providers)
langfuse <3.0.0 Observability and tracing
pydantic Latest Tool input schema validation
textual Latest Terminal UI framework
python-dotenv Latest .env file loading

Development Setup

git clone https://github.com/0xchamin/hummcode.git
cd hummcode

python3 -m venv .venv
source .venv/bin/activate     # Windows: .venv\Scripts\activate

pip install -e .

cp .env.example .env          # fill in your keys

hummcode                      # TUI
hummcode --cli                # headless

Roadmap

  • /save / /load โ€” Persist and restore session trees to JSON
  • Command autocomplete dropdown โ€” Slack-style popup when typing /
  • Streaming LLM responses โ€” Token-by-token streaming into the chat pane
  • Token + cost display โ€” Live counter in TUI header
  • Settings modal โ€” In-TUI model and key management
  • Multi-agent swarms โ€” Parallel sub-agents on independent tree branches
  • /branch command โ€” Explicit tree fork for A/B approach comparison
  • Session persistence โ€” Resume work across terminal restarts

Acknowledgements

hummcode stands on the shoulders of brilliant reference implementations:

  • Geoffry Huntley โ€” foundational coding agent architecture in Go and the progressive enhancement approach (chat โ†’ read โ†’ edit โ†’ bash)
  • Yang Shun / TeenyCode โ€” the principle that a full coding agent can be under 200 lines
  • Amp's "How to Build an Agent" โ€” composable primitive tool philosophy
  • Pi Coding Agent โ€” tree-based session memory and "Context is King"
  • LiteLLM โ€” model-agnostic LLM abstraction
  • Textual โ€” the TUI framework that makes terminal apps feel like desktop apps

License

MIT ยฉ Chamin Hewage / BlackEagleLabs.ai


Built with precision. Like a hummingbird. ๐Ÿฆ

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hummcode-0.1.0.tar.gz (34.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hummcode-0.1.0-py3-none-any.whl (27.5 kB view details)

Uploaded Python 3

File details

Details for the file hummcode-0.1.0.tar.gz.

File metadata

  • Download URL: hummcode-0.1.0.tar.gz
  • Upload date:
  • Size: 34.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for hummcode-0.1.0.tar.gz
Algorithm Hash digest
SHA256 cbc4c733926970d8a2c88137432e7e7ac50cf089bbf8d8ee236c19e6e23cbdc9
MD5 7defa55acb97c6b771fc15f5bfb79508
BLAKE2b-256 f4f7c3b30bed57684c4ae43dc8b7a490ac192290e25e4c8925ef99c9f802368c

See more details on using hashes here.

File details

Details for the file hummcode-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: hummcode-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 27.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for hummcode-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d0a415345699f4e0d1aa9f2efccdcb7d21f5a0cc8d7970a39d969025d0e9a38a
MD5 e33d96f18ed3e1ea93588d8b18a8765d
BLAKE2b-256 f25941f02a254742cc94d1bb865e79fb59ca4e8b23f11ad50efba74327d8041a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page