SAM — Smart Agentic Model: CLI coding agent for open-source LLMs
Project description
SAM
Smart Agentic Model
An open-source CLI coding agent for your own LLMs.
Installation • Quickstart • Configuration • Plan Mode • Architecture • Contributing
SAM is a terminal-based AI coding agent that connects to any OpenAI-compatible API — vLLM, Ollama, LM Studio, or a remote endpoint. It reads your codebase, reasons about changes, calls tools, edits files, and verifies its own work. Your code stays on your machine.
╭─ You ──────────────────────────────────────────────────────╮
│ ❯ add input validation to the signup endpoint │
╰────────────────────────────────────────────────────────────╯
grep → "def signup" in src/
read_file → src/routes/auth.py (lines 45-80)
read_file → src/schemas.py
edit_file → src/routes/auth.py
+ Added email format validation
+ Added password length check
run_command → pytest tests/test_auth.py
12 passed
Added email format and password length validation to the signup endpoint.
Why SAM?
| SAM | Proprietary agents | |
|---|---|---|
| Model freedom | Any model — Qwen, DeepSeek, Mistral, LLaMA, your fine-tunes | Locked to one provider |
| Privacy | Runs against local inference. Code never leaves your machine | Code sent to third-party APIs |
| Cost | Free with local GPU / self-hosted | Per-token billing |
| API keys | api_key: "not-needed" |
Required |
| Open source | MIT licensed, fully extensible | Closed source |
Features
| Feature | Description |
|---|---|
| ReAct Agent Loop | Think → Act → Observe cycle with automatic tool calling and multi-step reasoning |
| 10+ Built-in Tools | File read/write/edit, shell execution, grep, glob, git status/diff, user interaction |
| 4-Layer Fuzzy Editing | Exact → whitespace-normalized → indentation-flexible → fuzzy matching. Built for open-source models. |
| Plan Mode | Read-only codebase exploration that produces a structured plan for approval before execution |
| Repo Mapping | Tree-sitter symbol extraction with PageRank ranking — the model sees your codebase structure |
| Context Condensation | Automatic history summarization at 75% context usage — long sessions stay coherent |
| Architect / Editor | Two-model workflow: a large model plans, a smaller model executes |
| Sessions | Save and resume conversations across terminal restarts |
| Rich UI | Markdown rendering, syntax highlighting, loading spinners, @file autocomplete |
Installation
From PyPI:
pip install sam-agent
With tree-sitter (enhanced repo mapping):
pip install "sam-agent[tree-sitter]"
From source:
git clone https://github.com/SecFathy/SAM.git
cd SAM
pip install -e ".[dev]"
Quickstart
1. Start an inference server
| vLLM | Ollama | LM Studio |
vllm serve Qwen/Qwen2.5-Coder-32B-Instruct
|
ollama run qwen2.5-coder:32b
|
Start server in LM Studio GUI on |
2. Launch SAM
sam
SAM opens an interactive REPL. Describe what you need — it reads code, makes edits, runs commands, and verifies the result.
3. One-shot mode
sam chat "fix the failing test in tests/test_auth.py"
4. Other commands
sam models # List configured model presets
sam sessions # List saved sessions
sam -m deepseek-coder # Use a specific model preset
sam --api-base http://gpu:8000/v1 # Point to a different server
Configuration
SAM loads config.yaml from (in order):
- Current working directory
- Parent directories (walking upward)
~/.sam/config.yaml(global fallback)
Environment variables with SAM_ prefix override any config value (e.g. SAM_API_BASE, SAM_MODEL).
Example config.yaml
# ── API ─────────────────────────────────────────────
api_base: "http://localhost:8000/v1"
api_key: "not-needed"
# ── Model ───────────────────────────────────────────
model: "qwen-coder"
# ── Agent ───────────────────────────────────────────
max_iterations: 25 # Max tool-call loops per turn
temperature: 0.0 # 0.0 = deterministic
max_tokens: 4096 # Max tokens per LLM response
repo_map_tokens: 2048 # Token budget for repo map
# ── Model presets ───────────────────────────────────
models:
qwen-coder:
model_id: "Qwen/Qwen2.5-Coder-32B-Instruct"
context_window: 131072
description: "Qwen 32B — recommended"
qwen3-coder:
model_id: "Qwen/Qwen3-Coder-480B-A35B-Instruct"
context_window: 262144
description: "Qwen 480B MoE — highest quality"
deepseek-coder:
model_id: "deepseek-ai/DeepSeek-Coder-V2-Instruct"
context_window: 131072
description: "DeepSeek Coder V2"
qwen-coder-7b:
model_id: "Qwen/Qwen2.5-Coder-7B-Instruct"
context_window: 32768
description: "Lightweight — good for editor role"
CLI Reference
Usage: sam [OPTIONS] COMMAND [ARGS]...
SAM — Smart Agentic Model: CLI coding agent for open-source LLMs.
Commands:
chat Send a one-shot message to SAM
models List available model presets
sessions List saved sessions
Options:
-m, --model TEXT Model preset or exact model ID
--api-base TEXT API base URL (e.g. http://localhost:8000/v1)
-s, --session TEXT Resume a saved session by ID
--temperature FLOAT Sampling temperature (default: 0.0)
--max-tokens INTEGER Max tokens per response (default: 4096)
--response-time Print LLM response latency
--help Show this message and exit
Interactive Commands
| Command | Description |
|---|---|
/help |
Show available commands |
/plan |
Toggle plan mode (read-only exploration) |
/clear |
Clear the terminal |
/reset |
Reset conversation history |
/model |
Show current model and API info |
/status |
Show token usage, mode, and session info |
/exit |
Exit SAM |
Keyboard shortcuts: Ctrl+C cancel current turn • Ctrl+D exit • Alt+Enter newline in prompt
Plan Mode
Plan mode lets you review what SAM intends to do before it touches any code.
Toggle it with /plan. SAM switches to read-only tools (grep, glob, read_file, etc.), explores the codebase, and outputs a structured implementation plan.
/plan
╭─ You [PLAN MODE] ──────────────────────────────────────────╮
│ ❯ add rate limiting to the API │
╰─────────────────────────────────────────────────────────────╯
grep → "app.middleware" in src/
read_file → src/app.py
read_file → src/middleware.py
read_file → requirements.txt
### Summary
Add token-bucket rate limiting as ASGI middleware.
### Files to Modify
- src/middleware.py — New `RateLimiter` class
- src/app.py — Register middleware in `create_app()`
- requirements.txt — Add `limits>=3.0`
### Implementation Steps
1. Add `limits` to requirements.txt
2. Create RateLimiter class in src/middleware.py
3. Register middleware in src/app.py:create_app()
4. Add tests in tests/test_rate_limit.py
### Verification
- pytest tests/test_rate_limit.py
- Manual: hit endpoint >10 times/min, expect 429
Approve plan? (y = execute, n = discard, edit = revise)
❯ y
Plan approved — executing with full tool access...
| Input | Action |
|---|---|
y |
Exit plan mode, execute the plan with full tools |
n |
Discard the plan, stay in plan mode |
edit |
Give feedback — SAM revises the plan |
Built-in Tools
| Tool | Description |
|---|---|
read_file |
Read file contents with optional line-range pagination |
write_file |
Create or completely overwrite a file |
edit_file |
Search/replace edits with 4-layer fuzzy matching |
run_command |
Execute shell commands (tests, builds, git, etc.) |
grep |
Search file contents with regex patterns |
glob |
Find files matching glob patterns (**/*.py) |
list_directory |
List directory contents |
git_status |
Show git working tree status |
git_diff |
Show staged and unstaged changes |
ask_user |
Ask the user a question with optional choices |
Supported Models
SAM works with any model served through an OpenAI-compatible API. Tested with:
| Model | Params | Context | Recommendation |
|---|---|---|---|
| Qwen2.5-Coder-32B-Instruct | 32B | 128K | Best all-around for single-GPU setups |
| Qwen3-Coder-480B-A35B | 480B MoE | 256K | Highest quality — needs multi-GPU |
| DeepSeek-Coder-V2 | 236B MoE | 128K | Strong alternative to Qwen |
| Qwen2.5-Coder-7B | 7B | 32K | Fast — good as editor in two-model split |
| CodeLlama-34B | 34B | 16K | Meta's code model |
| Mistral-Large | 123B | 128K | Strong general-purpose |
Add your own models in
config.yamlunder themodels:key.
Architecture
sam/
├── cli.py # Click CLI entry point + interactive REPL
├── config.py # Pydantic settings + YAML config loading
├── context.py # @file mention resolution
│
├── agent/
│ ├── loop.py # Core ReAct agent loop (Think → Act → Observe)
│ ├── history.py # Conversation history with token tracking
│ ├── planner.py # Architect/Editor two-model workflow
│ └── condensation.py # Context condensation at 75% capacity
│
├── tools/
│ ├── base.py # Tool ABC, ToolResult, ToolRegistry
│ ├── file_read.py # Read with pagination
│ ├── file_write.py # Create / overwrite files
│ ├── file_edit.py # 4-layer fuzzy search/replace
│ ├── shell.py # Shell execution with timeout + safety
│ ├── grep_tool.py # Regex content search
│ ├── glob_tool.py # File pattern matching
│ ├── git.py # git status + git diff
│ └── ask_user.py # Interactive user questions
│
├── models/
│ ├── provider.py # OpenAI SDK wrapper (vLLM / Ollama / any)
│ ├── streaming.py # Stream accumulator + tool call parsing
│ ├── registry.py # Model preset registry
│ └── tool_protocol.py # Tool call protocol definitions
│
├── repo/
│ ├── mapper.py # Tree-sitter + PageRank repo mapping
│ ├── tags.py # Symbol extraction from source files
│ ├── graph.py # Dependency graph construction
│ └── languages.py # Language detection + grammar registry
│
├── session/
│ ├── storage.py # Session persistence (JSON)
│ └── manager.py # Session lifecycle management
│
├── ui/
│ ├── console.py # Rich console + output helpers
│ ├── display.py # Markdown + code rendering
│ ├── spinner.py # Loading spinners
│ └── prompt.py # Prompt utilities
│
└── prompts/
├── system.md # Main system prompt template
└── plan_mode.md # Plan mode system prompt template
Contributing
Contributions are welcome. To get started:
git clone https://github.com/SecFathy/SAM.git
cd SAM
pip install -e ".[dev]"
# Run tests
pytest
# Lint
ruff check sam/
Guidelines:
- Follow existing code style and conventions
- Add tests for new tools or agent behavior changes
- Keep PRs focused — one feature or fix per PR
Roadmap
- Multi-file diff preview before applying edits
- MCP (Model Context Protocol) server support
- Plugin system for custom tools
- Web UI dashboard
- VS Code / JetBrains extension
- Streaming output in agent loop
- Auto-detect inference server (vLLM, Ollama, LM Studio)
License
This project is licensed under the MIT License.
Built by @SecFathy
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sam_agent-0.1.0.tar.gz.
File metadata
- Download URL: sam_agent-0.1.0.tar.gz
- Upload date:
- Size: 51.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
13342d197ce5b7fe4c6691ce310ebd24a3dca249dddd23490b2d5a26c80845e0
|
|
| MD5 |
dc74de4e5b47049e599f5cce962b59ce
|
|
| BLAKE2b-256 |
dcc18470a760eaaf97143717f9469fc8744d5f533e97e6c44debe5f27cbc6d65
|
File details
Details for the file sam_agent-0.1.0-py3-none-any.whl.
File metadata
- Download URL: sam_agent-0.1.0-py3-none-any.whl
- Upload date:
- Size: 59.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ef244ca82a6b89573f98292a85038350dcd25e8042b498965aba416bd8f781e5
|
|
| MD5 |
62a7980ee4047e112c661670c133d862
|
|
| BLAKE2b-256 |
471cbc09a02a7ed96c0009714c73ee82e7389d7f49f6768aeb8c146c1ef319d7
|