A local CLI agentic coding tool written in Python. Simple enough that anyone can customize it.
```
██████╗ ██╗███████╗██████╗  █████╗ ████████╗ ██████╗██╗  ██╗
██╔══██╗██║██╔════╝██╔══██╗██╔══██╗╚══██╔══╝██╔════╝██║  ██║
██║  ██║██║███████╗██████╔╝███████║   ██║   ██║     ███████║
██║  ██║██║╚════██║██╔═══╝ ██╔══██║   ██║   ██║     ██╔══██║
██████╔╝██║███████║██║     ██║  ██║   ██║   ╚██████╗██║  ██║
╚═════╝ ╚═╝╚══════╝╚═╝     ╚═╝  ╚═╝   ╚═╝    ╚═════╝╚═╝  ╚═╝
```
A local AI agent harness written in Python and built on Ollama, with tool calling, streaming, and a persistent memory system.
Dispatch does not intend to compete with Claude Code, DeepAgents, OpenCode or other famous CLIs.
It is a tool I built for the love of the game: an easy-to-understand, easy-to-modify, lightweight, local CLI agent that you can study to understand how popular agentic systems and famous agentic AI CLIs work.
Index
- Index - This section
- Installation - Setup and prerequisites
- Project Structure - Directory tree overview
- How It Works - Boot and main loop explained
- Tool Registry - Available tools
- Slash Command Registry - All slash commands
- What's Next - Planned features
- How to build on top of Dispatch - Adding custom tools/commands
- License
Installation
Developed and tested on macOS and Linux. Windows should work but is untested.
Prerequisites
Before installing Dispatch, ensure you have:
1. Ollama installed and running
Download from ollama.com or install via package manager:

```shell
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows
# Download installer from https://ollama.com/download
```

Start the server:

```shell
ollama serve
```
2. A model that supports tool calling
```shell
# Recommended options:
ollama pull qwen3.5:9b   # Fast, excellent tool calling
ollama pull gemma4:e4b   # Strong reasoning
```
Install Dispatch
Option A: From PyPI (easiest)
```shell
pip install dispatch-agent
dispatch
```
Option B: From source
```shell
git clone https://github.com/santiagomora2/dispatch.git
cd dispatch
uv tool install --editable .
```

(Requires uv. Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`.)
Configure
Update config.json in the dispatch directory to match your Ollama model:
```json
{
  "model": "qwen3.5:9b",
  "context_limit": 6000,
  "mode": "auto",
  "version": "0.1.1"
}
```
Then run:

```shell
dispatch
```
Dispatch operates on your current directory while keeping memory and config at the project root.
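A minimal sketch of how such a config file can be loaded with fallback defaults (the key names come from the example above; the exact loading code in Dispatch may differ):

```python
import json
from pathlib import Path

# Defaults mirror the example config.json above (assumed, not Dispatch's exact values).
DEFAULTS = {"model": "qwen3.5:9b", "context_limit": 6000, "mode": "auto"}

def load_config(path: str = "config.json") -> dict:
    """Read config.json, falling back to defaults for any missing keys."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg
```

Merging over defaults means a partially filled `config.json` still yields a complete configuration.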
Project Structure
```
dispatch/
├── agent/
│   ├── cmd/
│   │   ├── __init__.py        # registry + dispatch
│   │   ├── arg_completers.py  # arg_completer functions for commands
│   │   ├── files.py           # file commands (/tree, /ls, etc.)
│   │   ├── memory.py          # memory commands (/note, /forget, etc.)
│   │   ├── plan.py            # plan command
│   │   └── session.py         # session commands (/clear, /compact, /model, etc.)
│   ├── tools/
│   │   ├── __init__.py        # registry + dispatch + get_schemas
│   │   ├── files.py           # file tools (read_file, patch_file, tree, etc.)
│   │   ├── memory.py          # memory tools (add_fact, forget_fact, etc.)
│   │   ├── session.py         # compact conversation (not callable, handled in main loop)
│   │   ├── shell.py           # shell tools (run_shell)
│   │   └── web.py             # web search tools (web_search, fetch_url)
│   ├── plans/                 # directory where the agent's plans and statuses are logged
│   ├── __init__.py
│   ├── agent.py               # main loop
│   ├── completer.py           # slash command auto-completer
│   ├── fancy_banner.py        # fancy welcome banner, ways to say goodbye
│   ├── main.py                # CLI entrypoint (typer)
│   ├── paths.py               # ROOT-anchored paths
│   └── system_prompt.py       # system prompt
├── README.md
├── config.json                # model, context_limit, mode
├── memory.md                  # persistent agent memory
├── pyproject.toml             # entry point: `dispatch` command
├── session.json               # last compact summary
└── uv.lock
```
How It Works
Boot
- `dispatch` is invoked from anywhere in the terminal
- `main.py` calls `run()` in `agent.py`
- `agent.py` loads `config.json` (model, context limit, mode)
- `memory.md` is read and injected into the system prompt
- The message history is initialized with the system prompt
- The main loop starts
Main Loop
```
START
│
├── pending_tool_response = False?
│     YES ──> get user input
│             ├── "/"    ──> run slash command ──> back to START
│             ├── empty  ──> back to START
│             └── normal ──> append to messages
│
├── token estimate > 80% of limit?
│     YES ──> compact conversation (summarize history) ──> continue
│
├── call ollama.chat(stream=True, tools=get_schemas())
│     └── stream chunks to terminal as they arrive
│           ├── text content ──> print immediately
│           └── tool_calls   ──> accumulate, execute after stream ends
│
├── tool_calls found?
│     YES ──> for each call:
│             ├── print [dim] tool: name(args)
│             ├── dispatch(name, args)
│             ├── append result to messages (role=tool)
│             └── set pending_tool_response = True ──> back to START (skip user input)
│
│     NO ──> response already streamed
└── wait for next user input ──> back to START
```
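The streaming step can be sketched as below. Here `chunks` stands in for the iterator that `ollama.chat(stream=True, ...)` returns, and the chunk shape (a `message` dict with optional `content` and `tool_calls`) is an assumption about the client's response format, not Dispatch's exact code:

```python
def consume_stream(chunks):
    """Print text as it arrives; collect tool calls to execute after the stream ends."""
    text_parts, tool_calls = [], []
    for chunk in chunks:
        msg = chunk.get("message", {})
        content = msg.get("content")
        if content:
            print(content, end="", flush=True)  # stream text immediately
            text_parts.append(content)
        # Tool calls are accumulated, never executed mid-stream
        tool_calls.extend(msg.get("tool_calls") or [])
    return "".join(text_parts), tool_calls
```

Deferring execution until the stream closes is what keeps the printed response and the tool results from interleaving in the terminal.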
Tool Registry
Every tool is a decorated Python function in `tools/`:

```python
@tool(schema_dict, lazy=False)
def read_file(path: str):
    ...
```
- The `@tool` decorator registers the function and its JSON schema into `TOOLS = {}`, or into `LAZY = {}` if `lazy=True` (tool disabled).
- The modules are imported at the bottom of `tools/__init__.py` so the decorators run on startup.
- `get_schemas()` returns all schemas to pass to Ollama.
- `dispatch(name, args)` looks up and calls the function, always returning `{"error": "..."}` on failure instead of raising.
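A minimal sketch of that registry pattern (illustrative only, not Dispatch's exact code):

```python
TOOLS, LAZY = {}, {}

def tool(schema: dict, lazy: bool = False):
    """Register a function plus its JSON schema; lazy tools start disabled."""
    def wrap(fn):
        name = schema["function"]["name"]
        (LAZY if lazy else TOOLS)[name] = {"fn": fn, "schema": schema}
        return fn
    return wrap

def get_schemas():
    """All enabled tool schemas, ready to pass to the model."""
    return [t["schema"] for t in TOOLS.values()]

def dispatch(name: str, args: dict):
    """Call a registered tool; never raise, always return a dict."""
    try:
        return TOOLS[name]["fn"](**args)
    except KeyError:
        return {"error": f"unknown tool: {name}"}
    except Exception as e:
        return {"error": str(e)}
```

Because `dispatch` swallows exceptions into `{"error": ...}`, a misbehaving tool feeds an error message back to the model instead of crashing the loop.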
Slash Command Registry
Every slash command is a decorated Python function in `cmd/`:

```python
@command("note", description="Append a fact to memory", usage="")
def cmd_note(arg, ctx):
    ...
```
- The `@command` decorator registers the function into `COMMANDS = {}` with its name, description, and usage hint.
- Modules are imported in `cmd/__init__.py` so the decorators run on startup.
- `dispatch_command(raw_input, ctx)` parses the command name and argument, looks up the function, and calls it. `ctx` is a dict holding live references to `messages`, `model`, `config`, and `system_prompt` so commands can mutate agent state directly.
- The `SlashCompleter` reads from `COMMANDS` at runtime, so any new command automatically appears in the `/` menu.

Flow note: if a command and a tool perform the same function, the command file should import from the corresponding tool file (example: `/note` uses the `save_memory()` tool).
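The command side follows the same pattern; a sketch of the registration and parsing described above (illustrative, not Dispatch's exact code):

```python
COMMANDS = {}

def command(name, description="", usage="", arg_completer=None):
    """Register a slash command handler with its help metadata."""
    def wrap(fn):
        COMMANDS[name] = {"fn": fn, "description": description,
                          "usage": usage, "arg_completer": arg_completer}
        return fn
    return wrap

def dispatch_command(raw: str, ctx: dict):
    """Split '/name arg...' into name and argument, then call the handler."""
    name, _, arg = raw.lstrip("/").partition(" ")
    entry = COMMANDS.get(name)
    if entry is None:
        return f"unknown command: /{name}"
    return entry["fn"](arg.strip(), ctx)
```

Since every handler receives the shared `ctx` dict, a command like `/clear` can reset `ctx["messages"]` in place and the main loop sees the change immediately.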
Path Anchoring
paths.pyuses__file__to resolve all Dispatch-internal files (config, memory, session) to the project root; regardless of where you invokedispatchfrom.- File operations the agent performs use
os.getcwd()captured at startup as the working directory.
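Those two anchors can be sketched as a `paths.py` in this spirit (the constant names here are assumptions):

```python
import os
from pathlib import Path

# Dispatch-internal files resolve relative to this module's location,
# so config/memory/session always live at the project root...
ROOT = Path(__file__).resolve().parent.parent
CONFIG = ROOT / "config.json"
MEMORY = ROOT / "memory.md"
SESSION = ROOT / "session.json"

# ...while agent file operations use the directory you launched from.
WORKDIR = os.getcwd()
```

The split is what lets you run `dispatch` inside any project directory while memory persists in one place.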
Current Tools
| Tool | File | What it does |
|---|---|---|
| `read_file` | `tools/files.py` | Reads a file, returns content with line numbers |
| `create_file` | `tools/files.py` | Creates a file with an initial skeleton |
| `patch_file` | `tools/files.py` | Replaces, inserts, or deletes content via `old_str`/`new_str` |
| `append_file` | `tools/files.py` | Appends content to the end of an existing file |
| `find_pattern` | `tools/files.py` | Glob search for files matching a pattern |
| `list_dir` | `tools/files.py` | Lists files and dirs at a path |
| `tree` | `tools/files.py` | Prints a directory tree up to a given depth |
| `update_memory` | `tools/memory.py` | Updates a section of the agent's persistent memory markdown file |
| `web_search` | `tools/web.py` | Searches the web for relevant URLs |
| `fetch_url` | `tools/web.py` | Fetches the content of a given URL and parses it as Markdown (jina + trafilatura fallback) |
| `run_shell` | `tools/shell.py` | Runs a shell command with human confirmation, streaming output line by line |
Current Slash Commands
| Command | Usage | What it does |
|---|---|---|
| `/memory` | `/memory` | Print current memory.md |
| `/note` | `/note <text>` | Append a fact to memory directly |
| `/forget` | `/forget <section>` | Clear a memory section |
| `/clear` | `/clear` | Reset messages, keep memory |
| `/reset` | `/reset` | Reset messages and memory |
| `/compact` | `/compact` | Summarize session and replace history |
| `/compact_tools` | `/compact_tools` | Compact tool results into a summary |
| `/tools` | `/tools [enable/disable] <tool>` | List, enable, or disable tools |
| `/model` | `/model [name]` | Show or switch the active Ollama model |
| `/tree` | `/tree <path> <depth>` | Print directory tree |
| `/ls` | `/ls <path>` | List directory contents |
| `/plan` | `/plan <task>` | Generate and execute a step-by-step plan |
| `/help` | `/help` | List all available commands |
| `/exit` | `/exit` | Quit Dispatch |
What's Next
- `/mode` - toggle careful/auto HITL aggressiveness
- `/retry` - resend the last user message
- `/history` - print a condensed message log
- `/plan` - prevent hitting the context window limit, optimize token usage
- Maybe MCP servers and custom skills somewhere down the road
How to build on top of Dispatch
It may seem complicated, but bear with me: customizing it is actually simple.
Add a tool
- Create your tool as a Python function. Return a `dict` even if the function has nothing to return, and wrap the body in a `try`/`except` block that returns an error `dict` on failure.

```python
def read_file(path: str):
    try:
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
        # or, if the function has nothing meaningful to return:
        # return {"read_file": path}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
- Add the `@tool` decorator with an adequate schema:
  - The description: think about which of these the agent needs to know: what the tool does, when to use it, which tools it may use before/after, examples of usage, etc.
  - The parameters the function receives: their type and description, and which ones are required.
  - Whether or not the tool is disabled on startup (`lazy`).

```python
@tool({
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file and return its contents with line numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path to the file"}
            },
            "required": ["path"]
        }
    }
}, lazy=False)
def read_file(path: str):
    try:
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
It looks like a lot, but it's really quite simple.
- If the tool lives in a new file, add the import to `agent/tools/__init__.py`:

```python
from agent.tools import files, memory, your_new_file  # noqa
```
HITL gating
- If you need a human-in-the-loop confirmation, ask before executing the gated operation inside the tool and return early if the user declines:

```python
from rich.prompt import Confirm  # this,

@tool({...})
def read_file(path: str):
    try:
        if not Confirm.ask(f"Read file {path}?"):  # this
            return {"error": "aborted"}            # and this
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
Add a slash command
This works very similarly to tool creation.
- Create your command as a Python function; typically it executes some function and then prints the result (or a confirmation that it completed).

Note: if a command and a tool perform the same function, just import the function from the corresponding tool file (example: `/read_file <path>` uses the `read_file(path)` tool we just defined).

```python
from agent.tools.files import read_file
from rich.console import Console

console = Console()

def cmd_read_file(arg, ctx):
    content = read_file(arg).get("content")
    console.print(content)
```
- Add the `@command` decorator with an adequate description.
  - The description is shown to the user by `/help` and when the auto-completer suggests `/` commands.
  - The usage (optional) is also shown to the user and describes the argument(s) of your function.
  - `arg_completer` (also optional) is a function that returns the options the user can choose as their argument(s).

```python
@command("read", description="Print a file's content", usage="<path>",
         arg_completer=None)
def cmd_read_file(arg, ctx):
    content = read_file(arg).get("content")
    console.print(content)
```
- If the command lives in a new file, add the import to `agent/cmd/__init__.py`:

```python
from agent.cmd import memory, files, your_new_file  # noqa
```
This was maybe a bad example, because this command isn't actually implemented (why print a file in the console when you can just open it?), but you get the gist.
Arg completer
It is, again, a function that returns the options the user can choose as their argument(s). For example, the models they can choose when using `/model`:
```python
import ollama

def get_available_models():
    try:
        return [m.model for m in ollama.list().models]
    except Exception:
        return []

# Oversimplified version for illustrative purposes;
# check agent/cmd/session.py for the actual implementation.
@command("model",
         description="Switch the active Ollama model",
         usage="<model>",
         arg_completer=get_available_models)
def cmd_model(arg, ctx):
    ctx["model"] = arg
    console.print(f"[green]Switched to: {arg}[/green]")
```
Prefer putting arg completers in `arg_completers.py` unless one can be defined as a lambda inside the decorator.
Everything else
All of the other pieces, like the welcome banner in `fancy_banner.py` or the main loop in `agent.py`, can also be modified to build your own version of Dispatch and learn how agentic AI and tools work.
My favorite feature? (personal note)
I really enjoyed implementing /plan.
Because of the natural constraints of running models locally on a computer without that much memory capacity, Mixture of Experts (MoE) models are an attractive option due to their speed and lower memory usage with only some parameters active at inference.
A possible problem when trying to implement a big change all at once is prompt trajectory: the early tokens heavily influence which experts activate, and a poorly structured initial prompt can lock you into a suboptimal expert path for the entire generation.
Hence /plan, which breaks the task into a guided reasoning sequence before any action is taken:
1. Understand & Decompose: Restate the task and break it into discrete subtasks.
2. Sequence & Assess Risks: Order the subtasks and note potential risks.
3. Finalize Plan: Output a structured markdown plan with steps and placeholders for decisions/artifacts.
4. Execute Sequentially: For each step, run it in isolation, allowing tool calls, and update the plan file with progress and artifacts.
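Sketched concretely, the plan file from step 3 might look something like this (a hypothetical layout and task, not Dispatch's exact format):

```markdown
# Plan: add a /history command

- [x] 1. Understand & decompose: new slash command that prints a condensed log
- [x] 2. Sequence & risks: very long histories; truncate each message
- [ ] 3. Implement cmd_history in agent/cmd/session.py
- [ ] 4. Register the import in agent/cmd/__init__.py

## Decisions / artifacts
- Step 2: truncate each message to one line
```

Each executed step updates its checkbox and appends its artifacts, so later steps (and later plans) can read the file instead of carrying the whole history in context.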
I've found it works pretty well with both dense and MoE models.
The plan file exists so the KV cache doesn't fill up too quickly while the sub-agents implementing the task still share a memory (and logs of implemented plans are kept); for MoE models, it also lets each step activate the appropriate experts for its task.
License
MIT License. Use, modify, teach, learn. It's all yours.
Built with ❤️, as always.
Download files
Source Distribution
Built Distribution
File details
Details for the file dispatch_agent-0.1.1.tar.gz.
File metadata
- Download URL: dispatch_agent-0.1.1.tar.gz
- Upload date:
- Size: 19.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d6430828e6899a1aafb237dc1b0fb442b89b5ff1c7051ef4b7254e0bb6281d43
|
|
| MD5 |
8e26223c1f1564d484df894a918d19c3
|
|
| BLAKE2b-256 |
49ec77ed43d5baec63052414db2226646b67533e19837a8469ac32b8a05c922f
|
File details
Details for the file dispatch_agent-0.1.1-py3-none-any.whl.
File metadata
- Download URL: dispatch_agent-0.1.1-py3-none-any.whl
- Upload date:
- Size: 14.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d69ceeddfc878c6394e86b3b3de93904494d6c68a5e374f4d4a6c85365b51eaf
|
|
| MD5 |
f9189cf36d6d2a8b9dc4c47dcb81ccb8
|
|
| BLAKE2b-256 |
60f74bc36f87c86b9a9f5752d88d7d0947bdd5b7410eaca2eb2021da6527dad6
|