A local autonomous coding agent CLI powered by Ollama.

🧠 DevAgent

A Lightweight Local Open-Source Miniature of Claude Code CLI

License: MIT • PyPI version • Python 3.11+ • Ollama • PRs Welcome • GitHub stars

A production-grade local coding agent that finds bugs, writes patches, reviews its own code, and validates with tests: all offline, all local, zero API costs.

Quick Start • Architecture • Benchmarks • Roadmap • Contributing


🛡️ Why DevAgent?

DevAgent is built on a "Safety-First" architecture for high-integrity developer infrastructure. Unlike chatbots that guess at code, DevAgent is a local autonomous agent that works in a sandboxed, fully logged environment and validates each fix with tests.

📊 Empirical Validation

We don't hide our limitations. DevAgent v3.2.3 has been stress-tested against real-world, "messy" repositories, and the results are published in a transparent Benchmark Report.

| | Other Agents | DevAgent |
|---|---|---|
| Safety Isolation | ❌ | ✅ Strict Sandbox Mode |
| Recovery | ❌ | ✅ Git-native + Snapshot Rollback |
| Validation | ❌ | ✅ Empirical Stress Tests |
| Transparency | ❌ | ✅ Visible Failure Taxonomy |
| Privacy | ❌ | ✅ 100% Local (Ollama) |
| Costs | 💸 | ✅ Zero API Costs |

Philosophy: Safety > Intelligence. Observability > Reasoning. Retrieval > Huge Context. Reliability > Hype.


✨ Features

🔁 ReAct Loop: Thought → Action → Observation → Fix → Review → Test cycle

🧠 Planner: LLM generates an action plan before coding

🔍 Semantic Search: FAISS + sentence-transformers code retrieval

🔎 Code Search: ripgrep-powered with cross-platform fallback

📝 Self-Review: LLM critiques its own fixes, revises until approved

🩹 Patch Engine: Line-level unified diffs instead of full file rewrites

🧪 Test-Driven: Runs pytest after every fix, retries on failure

🏖️ Sandbox Mode: Agent works in an isolated copy, applies changes only on success

📊 Benchmarks: 5 built-in benchmark suites with automated evaluation

📈 Metrics: Latency, token estimates, retries, and performance tracking

📋 Full Audit Trail: Every step logged to logs/run.json

🔒 100% Offline: Runs on Ollama with small models (2-4 GB)

⚡ Low Resource: Works on RTX 3050 (4 GB VRAM) / 16 GB RAM
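The Patch Engine feature above describes line-level unified diffs rather than full file rewrites. The following is a minimal sketch of that idea using Python's standard `difflib`; the function name `make_patch` and the sample file contents are illustrative assumptions, not DevAgent's actual code.

```python
import difflib


def make_patch(original: str, fixed: str, path: str) -> str:
    """Return a unified diff that turns `original` into `fixed`.

    `make_patch` is a hypothetical helper for illustration only.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Sample input: a function with a missing guard clause, and its fix.
buggy = "def divide(a, b):\n    return a / b\n"
fixed = (
    "def divide(a, b):\n"
    "    if b == 0:\n"
    "        raise ValueError('division by zero')\n"
    "    return a / b\n"
)

patch = make_patch(buggy, fixed, "calculator.py")
print(patch)
```

Only the two added lines appear with a `+` prefix in the diff, which keeps a change small and easy to review compared with rewriting the whole file.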


🚀 Quick Start

Prerequisites

Python 3.11+ and a local Ollama installation.

Install & Setup

# 1. Clone
git clone https://github.com/VedantJadhav701/Developer-Code-Intelligence-Agent.git
cd Developer-Code-Intelligence-Agent

# 2. Install
pip install devagent-cli
# Or, for an editable install from the cloned repo: pip install -e .

# 3. Verify System (CRITICAL)
# This checks your Python environment, Ollama connection, and dependencies
devagent doctor

# 4. Pull the model
ollama pull qwen2.5-coder:3b

# 5. Run!
devagent run --task "Fix the divide-by-zero bug" --root ./demo_project

CLI Subcommands

| Command | Description |
|---|---|
| devagent run | Execute a coding task on a project |
| devagent benchmark | Run the automated benchmark suite |
| devagent doctor | Check system health and dependencies |
| devagent models | List available Ollama models |
| devagent version | Show current version |

🛡️ Reliability & Safety

DevAgent is built for production-grade reliability:

  • Isolated Sandbox: Agent works in sandbox_workspace/, keeping your source clean until success.
  • Auto-Snapshot: Creates a safety restore point before every execution.
  • Instant Rollback: Revert agent changes with devagent rollback.
  • Traceability: Every thought and tool call is logged to logs/run.json.
  • Environment Awareness: Detects and uses your project's Python environment automatically.
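The sandbox behaviour listed above (work on an isolated copy, touch the real project only on success) can be sketched as follows. This is an illustration under stated assumptions, not DevAgent's implementation: `run_in_sandbox` and the `work` callback are hypothetical names, and a temporary directory stands in for `sandbox_workspace/`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_in_sandbox(project: Path, work: Callable[[Path], bool]) -> bool:
    """Run `work` on a copy of `project`; copy results back only on success."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp) / "sandbox_workspace"
        shutil.copytree(project, sandbox)
        ok = work(sandbox)
        if ok:
            # dirs_exist_ok (Python 3.8+) overwrites files in the real project.
            # Note: files deleted inside the sandbox are not deleted here.
            shutil.copytree(sandbox, project, dirs_exist_ok=True)
        return ok


def failing_fix(sandbox: Path) -> bool:
    (sandbox / "calculator.py").write_text("HALF-DONE\n")
    return False  # e.g. the tests failed


def passing_fix(sandbox: Path) -> bool:
    (sandbox / "calculator.py").write_text("FIXED\n")
    return True  # e.g. the tests passed


with tempfile.TemporaryDirectory() as root:
    project = Path(root) / "project"
    project.mkdir()
    target = project / "calculator.py"
    target.write_text("BUGGY\n")

    run_in_sandbox(project, failing_fix)
    after_failure = target.read_text()  # the real file is untouched

    run_in_sandbox(project, passing_fix)
    after_success = target.read_text()  # the fix has been applied
```

A failed attempt leaves the source tree exactly as it was, which is the property the sandbox bullet describes.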

🕹️ Interactive Mode

Run with --interactive (or -i) to review colorized diffs before they are applied to your project.

devagent run --task "Fix bug" --interactive

🏗️ Architecture

graph TD
    CLI[DevAgent CLI] --> Orchestrator[ReAct Orchestrator]
    Orchestrator --> Safety[Safety Manager: Snapshots]
    Orchestrator --> Memory[Working Memory]
    Orchestrator --> Retrieval[Semantic Retrieval FAISS]
    Orchestrator --> Tools[Tool Suite: pytest, ripgrep, git]
    Orchestrator --> Reviewer[Self-Review Loop]
    Reviewer --> Patch[Surgical Patch Engine]
    Patch --> Sandbox[Sandbox Environment]

No API keys. No sign-ups. No cloud.

Optional: Enable Semantic Search

pip install faiss-cpu sentence-transformers

Without these, DevAgent falls back to keyword search, which is still fully functional.
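One way to picture this fallback is sketched below: check whether the optional packages are importable, and otherwise rank code chunks by plain keyword overlap. The function names and the scoring rule are assumptions made for illustration; the project's real fallback may work differently.

```python
import importlib.util
import re


def semantic_search_available() -> bool:
    """True only if both optional packages can be imported."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("faiss", "sentence_transformers")
    )


def keyword_search(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Hypothetical fallback: rank chunks by how many query words they contain."""
    words = set(re.findall(r"\w+", query.lower()))
    scored = []
    for chunk in chunks:
        tokens = set(re.findall(r"\w+", chunk.lower()))
        score = len(words & tokens)
        if score:
            scored.append((score, chunk))
    # Highest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


chunks = [
    "def divide(a, b): return a / b",
    "def add(a, b): return a + b",
    "class Logger: pass",
]
results = keyword_search("fix divide bug", chunks)
print("semantic search available:", semantic_search_available())
print(results)
```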


🎬 Demo

 ____              _                    _
|  _ \  _____   __/ \   __ _  ___ _ __ | |_
| | | |/ _ \ \ / / _ \ / _` |/ _ \ '_ \| __|
| |_| |  __/\ V / ___ \ (_| |  __/ | | | |_
|____/ \___| \_/_/   \_\__, |\___|_| |_|\__|
                       |___/

+==========================================================+
|        DEVELOPER CODE INTELLIGENCE AGENT                 |
|        Model: qwen2.5-coder:3b                          |
|        Sandbox: OFF                                      |
+==========================================================+

  [PLAN] LIKELY_FILES: calculator.py
  1. search_code: divide
  2. read_file: calculator.py
  3. run_tests

  ----------------------------------------
  ITERATION 1/5
  ----------------------------------------
  [TOOL] Executing: search_code(divide)
  >> Found: calculator.py:10:def divide(a, b)
  [REVIEW] #1: APPROVED
  >> Tests: 5 passed ✓

  [OK]  AGENT COMPLETED SUCCESSFULLY

  Status:     success
  Steps used: 1/5
  Patches:    1
  Time:       8.2s

🏗️ Architecture

                    ┌──────────────────────────────┐
                    │       CLI (main.py)          │
                    │  --task --root --model       │
                    │  --sandbox --benchmark       │
                    │  --auto-commit --auto-push   │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │     Planner Layer            │
                    │  Identifies files + strategy │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │  Retrieval Layer (Memory)    │
                    │ FAISS + Sentence-Transformers│
                    │  Chunk → Embed → Top-K       │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │     ReAct Agent Loop         │
                    │                              │
                    │  1. THOUGHT   (LLM)          │
                    │  2. ACTION    (Tool)         │
                    │  3. OBSERVATION              │
                    │  4. FIX       (LLM)          │
                    │  5. REVIEW    (LLM)          │
                    │  6. PATCH     (Diff Engine)  │
                    │  7. TEST      (pytest)       │
                    │                              │
                    │  if FAIL → retry             │
                    │  if PASS → done ✓            │
                    └──┬──────────────┬────────────┘
                       │              │
              ┌────────▼──┐    ┌──────▼──────┐
              │   Tools   │    │   Ollama    │
              │           │    │  (Local)    │
              │ • search  │    │             │
              │ • semantic│    │ qwen2.5-    │
              │ • read    │    │  coder:3b   │
              │ • patch   │    │ phi3:mini   │
              │ • pytest  │    │ mistral:7b  │
              │ • flake8  │    │             │
              │ • git_diff│    └─────────────┘
              │ • sandbox │
              └───────────┘
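The loop in the diagram (propose a fix, review it, test it, retry on failure) can be sketched in a few lines. Everything here is a simplified stand-in: `RunState`, `react_loop`, and the three callbacks are hypothetical names, and plain Python callables take the place of the LLM and pytest.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunState:
    """Hypothetical shared state; the real app/state.py may differ."""
    code: str
    steps_used: int = 0
    status: str = "failed"
    history: list[str] = field(default_factory=list)


def react_loop(
    state: RunState,
    propose_fix: Callable[[str], str],  # stands in for the LLM FIX step
    review: Callable[[str], bool],      # stands in for the LLM REVIEW step
    run_tests: Callable[[str], bool],   # stands in for the pytest TEST step
    max_steps: int = 5,
) -> RunState:
    for step in range(1, max_steps + 1):
        state.steps_used = step
        candidate = propose_fix(state.code)
        if not review(candidate):
            state.history.append(f"step {step}: REVISE")
            continue
        state.history.append(f"step {step}: APPROVED")
        if run_tests(candidate):
            state.code = candidate
            state.status = "success"
            break
        state.history.append(f"step {step}: tests failed, retrying")
    return state


# Stub "LLM": the first attempt changes nothing, the second adds a guard.
attempts = iter([
    "return a / b",
    "if b == 0: raise ValueError\nreturn a / b",
])
result = react_loop(
    RunState(code="return a / b"),
    propose_fix=lambda code: next(attempts),
    review=lambda candidate: True,
    run_tests=lambda candidate: "b == 0" in candidate,
)
print(result.status, result.steps_used, result.history)
```

Capping the loop with `max_steps` mirrors the `--max-steps` flag: the agent stops after a fixed number of iterations even if the tests never pass.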

9-Layer Architecture

| Layer | Module | Purpose |
|---|---|---|
| 1. CLI | main.py | Argument parsing, mode selection, banner |
| 2. Planner | app/planner.py | Task interpretation, file identification |
| 3. Retrieval | app/memory.py | FAISS index, semantic chunking, Top-K search |
| 4. Tools | tools/* | 8 real tools: search, semantic_search, read, write, test, lint, git, sandbox |
| 5. Agent | app/agent.py | ReAct orchestration loop |
| 6. Review | app/reviewer.py | Self-critique with APPROVED/REVISE |
| 7. Validation | tools/test_runner.py | pytest + flake8 execution feedback |
| 8. Logging | utils/logger.py | Structured JSON audit trail |
| 9. Safety | app/sandbox.py | Isolated workspace, path validation |
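As an illustration of the Logging layer's structured JSON audit trail, here is a minimal sketch. The `RunLogger` class and the record fields (`ts`, `kind`, and so on) are assumptions; the actual schema written to logs/run.json is not documented here.

```python
import json
import tempfile
import time
from pathlib import Path


class RunLogger:
    """Hypothetical logger: collects step records, writes one JSON document."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.steps: list[dict] = []

    def log(self, kind: str, **details) -> None:
        # Each record carries a timestamp, a step kind, and free-form details.
        self.steps.append({"ts": time.time(), "kind": kind, **details})

    def save(self) -> None:
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps({"steps": self.steps}, indent=2))


# A temporary directory keeps this demo from writing into a real project.
with tempfile.TemporaryDirectory() as tmp:
    logger = RunLogger(Path(tmp) / "logs" / "run.json")
    logger.log("thought", text="divide() lacks a zero guard")
    logger.log("tool", name="search_code", args="divide")
    logger.save()
    saved = json.loads(logger.path.read_text())

print([step["kind"] for step in saved["steps"]])
```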

📁 Project Structure

Developer-Code-Intelligence-Agent/
├── app/
│   ├── agent.py            # Core ReAct agent engine
│   ├── planner.py          # Task planning layer
│   ├── reviewer.py         # Self-review module
│   ├── llm.py              # Ollama integration
│   ├── memory.py           # FAISS retrieval + working memory
│   ├── patcher.py          # Unified diff patch engine
│   ├── sandbox.py          # Sandbox workspace manager
│   └── state.py            # Shared state dataclass
├── tools/
│   ├── search.py           # Code search (ripgrep + fallbacks)
│   ├── semantic_search.py  # FAISS semantic search
│   ├── file_ops.py         # Safe file read/write
│   ├── test_runner.py      # pytest runner
│   ├── linter.py           # flake8 linter
│   ├── git_tools.py        # Git diff/commit/push
│   └── benchmark_runner.py # Benchmark evaluation
├── utils/
│   ├── logger.py           # Structured JSON logger
│   ├── config.py           # Centralized configuration
│   └── metrics.py          # Performance metrics
├── benchmarks/
│   ├── divide_by_zero/     # Benchmark: zero division guard
│   ├── missing_validation/ # Benchmark: input validation
│   ├── syntax_error/       # Benchmark: syntax fix
│   ├── import_bug/         # Benchmark: wrong import
│   └── edge_case/          # Benchmark: empty list handling
├── demo_project/           # Sample buggy project
├── docs/
│   └── USER_GUIDE.md       # Full usage guide
├── main.py                 # CLI entry point
├── devagent.py             # Global CLI wrapper
├── devagent.bat            # Windows global shortcut
├── requirements.txt
├── CONTRIBUTING.md
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
├── LICENSE
└── README.md
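The tree lists tools/file_ops.py as "safe file read/write", and the layer table mentions path validation for the sandbox. A common way to implement that kind of check is sketched below; `resolve_inside` is a hypothetical helper, not a function taken from the repository.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`, rejecting anything outside it."""
    root = root.resolve()
    target = (root / requested).resolve()
    # Path.is_relative_to exists since Python 3.9.
    if not target.is_relative_to(root):
        raise PermissionError(f"{requested!r} escapes {root}")
    return target


root = Path("sandbox_workspace")

# A path inside the workspace resolves normally (it need not exist yet).
inside = resolve_inside(root, "app/agent.py")

# A path that climbs out of the workspace is rejected.
try:
    resolve_inside(root, "../outside.txt")
    escaped = True
except PermissionError:
    escaped = False

print(inside.name, escaped)
```

Resolving before comparing matters: a string check on the raw path would miss `..` segments and symlinks that point outside the workspace.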

💻 CLI Reference

python main.py --task "TASK" --root ./project [OPTIONS]
| Flag | Default | Description |
|---|---|---|
| --task, -t | (required) | The coding task for the agent |
| --root, -r | . | Project root directory |
| --model | qwen2.5-coder:3b | Any Ollama model |
| --max-steps, -m | 5 | Max ReAct iterations |
| --benchmark | off | Run benchmark suite |
| --sandbox | off | Run in isolated sandbox |
| --auto-commit | off | Git commit on success |
| --auto-push | off | Git push after commit |
| --verbose, -v | off | Verbose output |
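For readers who want to see how a flag table like this maps onto code, here is a sketch using the standard `argparse` module. The flags and defaults are taken from the table; the parser structure is an assumption, as is the rule that `--task` may be omitted when `--benchmark` is given (inferred from the `python main.py --benchmark` example).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--task", "-t", help="The coding task for the agent")
    parser.add_argument("--root", "-r", default=".", help="Project root directory")
    parser.add_argument("--model", default="qwen2.5-coder:3b", help="Any Ollama model")
    parser.add_argument("--max-steps", "-m", type=int, default=5,
                        help="Max ReAct iterations")
    # Boolean switches that default to off.
    for flag, text in [
        ("--benchmark", "Run benchmark suite"),
        ("--sandbox", "Run in isolated sandbox"),
        ("--auto-commit", "Git commit on success"),
        ("--auto-push", "Git push after commit"),
    ]:
        parser.add_argument(flag, action="store_true", help=text)
    parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --task is only required when not running --benchmark.
    if not args.benchmark and not args.task:
        parser.error("--task is required unless --benchmark is given")
    return args


args = parse_args(["-t", "Fix bug", "-r", "./project", "--sandbox", "--max-steps", "10"])
print(args)
```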

Examples

# Fix a specific bug
python main.py -t "Fix the TypeError in user_service.py" -r ./backend

# Run in sandbox mode (safe: doesn't touch real files until success)
python main.py -t "Fix divide-by-zero bug" -r ./project --sandbox

# Auto-commit changes on success
python main.py -t "Add input validation" -r ./api --auto-commit

# Use a stronger model
python main.py -t "Refactor auth middleware" -r ./server --model mistral:7b

# Run benchmarks
python main.py --benchmark

# More retries for complex tasks
python main.py -t "Make all tests pass" -r ./project --max-steps 10

📖 Full User Guide →


📊 Benchmarks

DevAgent includes 5 built-in benchmarks to evaluate agent performance:

| Benchmark | Bug Type | Difficulty |
|---|---|---|
| divide_by_zero | Missing guard clause | Easy |
| missing_validation | No input validation | Medium |
| syntax_error | Broken syntax | Medium |
| import_bug | Wrong module name | Easy |
| edge_case | Empty list crash | Medium |

Run benchmarks:

python main.py --benchmark
python main.py --benchmark --model phi3:mini
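To make the "missing guard clause" bug type concrete, the sketch below shows what a divide_by_zero style case and its pass/fail check might look like. The function bodies and the choice of `ValueError` as the expected behaviour are assumptions for illustration; the real benchmark files may define success differently.

```python
from typing import Callable


def divide_buggy(a: float, b: float) -> float:
    return a / b  # raises ZeroDivisionError when b == 0


def divide_fixed(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b


def passes_benchmark(divide: Callable[[float, float], float]) -> bool:
    """Hypothetical check: normal division works, zero raises ValueError."""
    if divide(10, 2) != 5:
        return False
    try:
        divide(1, 0)
    except ValueError:
        return True
    except ZeroDivisionError:
        return False
    return False  # no exception at all also counts as a failure


before = passes_benchmark(divide_buggy)
after = passes_benchmark(divide_fixed)
print(f"before fix: {before}, after fix: {after}")
```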

🔧 Supported Models

| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| qwen2.5-coder:3b | 1.9 GB | ⚡ Fast | ★★★★ | Default: best for code |
| qwen2.5:3b | 1.9 GB | ⚡ Fast | ★★★☆ | General fallback |
| phi3:mini | 2.2 GB | ⚡ Fast | ★★★☆ | Good reasoning |
| qwen3:4b | 2.5 GB | ⚡ Fast | ★★★★ | Better understanding |
| gemma2:2b | 1.6 GB | ⚡⚡ | ★★☆☆ | Ultra-low resource |
| mistral:7b | 4.4 GB | 🐢 | ★★★★★ | Best quality (8 GB+ RAM) |

🗺️ Roadmap

✅ Completed (v2.0)

  • Core ReAct agent loop
  • Self-review module
  • Tool system (8 tools)
  • Planner layer
  • Semantic retrieval (FAISS)
  • Patch engine (unified diffs)
  • Sandbox mode
  • Benchmark system (5 suites)
  • Metrics + structured logging
  • Git integration
  • CLI with all flags

🔜 Coming Next

  • Multi-file support: Agent works across multiple files simultaneously
  • Language support: JavaScript, TypeScript, Go, Rust
  • Plugin system: Custom tools via YAML/Python
  • Watch mode: Auto-fix on test failure (--watch)
  • VS Code extension: Run agent from your editor
  • Conversation memory: Learn from past runs
  • Multi-agent mode: Planner + Coder + Reviewer + Evaluator agents

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for details.

git checkout -b feature/your-feature
# ... make changes ...
python -m pytest demo_project/ -v
git commit -m "feat: your feature"
git push origin feature/your-feature

Good first issues are tagged and waiting: Browse good first issues →


📜 License

MIT: use it however you want. See LICENSE.


⭐ Star History

If DevAgent helps you, give it a star! It helps others discover the project.


Built with 🧠 by Vedant Jadhav

A lightweight local open-source miniature of Claude Code CLI.
