A local autonomous coding agent CLI powered by Ollama.

🧠 DevAgent

A Lightweight Local Open-Source Miniature of Claude Code CLI

License: MIT • PyPI version • Python 3.11+ • Ollama • PRs Welcome • GitHub stars

A production-grade local coding agent that finds bugs, writes patches, reviews its own code, and validates with tests: all offline, all local, zero API costs.

Quick Start • Architecture • Benchmarks • Roadmap • Contributing


๐Ÿ›ก๏ธ Why DevAgent?

DevAgent is built on a "Safety-First" architecture for high-integrity developer infrastructure. Unlike chatbots that guess at code, DevAgent is a local autonomous agent that works in an isolated sandbox, snapshots the project before each run, and logs every step it takes.

📊 Empirical Validation

We don't hide our limitations. DevAgent v3.2.3 has been stress-tested against real-world, "messy" repositories, and the results are published in a transparent Benchmark Report.

| | Other Agents | DevAgent |
|---|---|---|
| Safety Isolation | ❌ | ✅ Strict Sandbox Mode |
| Recovery | ❌ | ✅ Git-native + Snapshot Rollback |
| Validation | ❌ | ✅ Empirical Stress Tests |
| Transparency | ❌ | ✅ Visible Failure Taxonomy |
| Privacy | ❌ | ✅ 100% Local (Ollama) |
| Costs | 💸 | ✅ Zero API Costs |

Philosophy: Safety > Intelligence. Observability > Reasoning. Retrieval > Huge Context. Reliability > Hype.


✨ Features

🔁 ReAct Loop: Thought → Action → Observation → Fix → Review → Test cycle

🧠 Planner: LLM generates an action plan before coding

🔍 Semantic Search: FAISS + sentence-transformers code retrieval

🔎 Code Search: ripgrep-powered with cross-platform fallback

📝 Self-Review: LLM critiques its own fixes, revises until approved

🩹 Patch Engine: Line-level unified diffs instead of full file rewrites

🧪 Test-Driven: Runs pytest after every fix, retries on failure

🏖️ Sandbox Mode: Agent works in an isolated copy, applies changes only on success

📊 Benchmarks: 5 built-in benchmark suites with automated evaluation

📈 Metrics: Latency, token estimates, retries, and performance tracking

📋 Full Audit Trail: Every step logged to logs/run.json

🔒 100% Offline: Runs on Ollama with small models (2-4 GB)

⚡ Low Resource: Works on RTX 3050 (4 GB VRAM) / 16 GB RAM
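
To make the Patch Engine feature concrete, here is a minimal sketch of producing a line-level unified diff with Python's standard `difflib` module. It is not DevAgent's actual `patcher.py`; the helper name `make_patch` and the sample `calculator.py` contents are assumptions chosen for illustration.

```python
import difflib


def make_patch(original: str, fixed: str, path: str) -> str:
    """Return a unified diff that turns `original` into `fixed`.

    `make_patch` is a hypothetical helper, not part of DevAgent's API.
    """
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Illustrative file contents (assumed, not taken from the demo project).
before = "def divide(a, b):\n    return a / b\n"
after = (
    "def divide(a, b):\n"
    "    if b == 0:\n"
    "        raise ValueError('division by zero')\n"
    "    return a / b\n"
)

patch = make_patch(before, after, "calculator.py")
print(patch)
```

A diff like this records only the two added guard lines plus context, which is why it is cheaper to review and easier to roll back than a full file rewrite.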


🚀 Quick Start

Prerequisites

Install & Setup

# 1. Clone
git clone https://github.com/VedantJadhav701/Developer-Code-Intelligence-Agent.git
cd Developer-Code-Intelligence-Agent

# 2. Install
pip install devagent-cli
# Or, from the cloned repository: pip install -e .

# 3. Verify System (CRITICAL)
# This checks your Python environment, Ollama connection, and dependencies
devagent doctor

# 4. Pull the model
ollama pull qwen2.5-coder:3b

# 5. Run!
devagent run --task "Fix the divide-by-zero bug" --root ./demo_project

CLI Subcommands

| Command | Description |
|---|---|
| devagent run | Execute a coding task on a project |
| devagent benchmark | Run the automated benchmark suite |
| devagent doctor | Check system health and dependencies |
| devagent models | List available Ollama models |
| devagent version | Show current version |

๐Ÿ›ก๏ธ Reliability & Safety

DevAgent is built for production-grade reliability:

  • Isolated Sandbox: Agent works in sandbox_workspace/, keeping your source clean until success.
  • Auto-Snapshot: Creates a safety restore point before every execution.
  • Instant Rollback: Revert agent changes with devagent rollback.
  • Traceability: Every thought and tool call is logged to logs/run.json.
  • Environment Awareness: Detects and uses your project's Python environment automatically.
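
The sandbox idea in the list above (work on a copy, touch the real project only on success) can be sketched in a few lines. This is an illustration under stated assumptions, not DevAgent's `sandbox.py`: `run_in_sandbox` is a hypothetical helper, the copy lives in a temporary directory, and the `task` callable stands in for the agent run plus its tests.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_in_sandbox(project_root: Path, task: Callable[[Path], bool]) -> bool:
    """Run `task` on a copy of the project; copy files back only on success."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "sandbox_workspace"
        shutil.copytree(project_root, workspace)
        if not task(workspace):
            return False  # the real project was never touched
        # Promote the sandbox contents back over the real project.
        shutil.copytree(workspace, project_root, dirs_exist_ok=True)
        return True


# Small demonstration on a throwaway project directory.
with tempfile.TemporaryDirectory() as demo_dir:
    project = Path(demo_dir) / "project"
    project.mkdir()
    target = project / "calculator.py"
    target.write_text("buggy\n")

    def failing_fix(ws: Path) -> bool:
        (ws / "calculator.py").write_text("still broken\n")
        return False  # pretend the tests failed

    def passing_fix(ws: Path) -> bool:
        (ws / "calculator.py").write_text("fixed\n")
        return True  # pretend the tests passed

    failed = run_in_sandbox(project, failing_fix)
    after_failure = target.read_text()
    succeeded = run_in_sandbox(project, passing_fix)
    after_success = target.read_text()
```

Note that this sketch only overwrites and adds files when promoting; a file deleted inside the sandbox would still exist in the real project afterwards.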

๐Ÿ•น๏ธ Interactive Mode

Run with --interactive (or -i) to review colorized diffs before they are applied to your project.

devagent run --task "Fix bug" --interactive

๐Ÿ—๏ธ Architecture

graph TD
    CLI[DevAgent CLI] --> Orchestrator[ReAct Orchestrator]
    Orchestrator --> Safety[Safety Manager: Snapshots]
    Orchestrator --> Memory[Working Memory]
    Orchestrator --> Retrieval[Semantic Retrieval FAISS]
    Orchestrator --> Tools[Tool Suite: pytest, ripgrep, git]
    Orchestrator --> Reviewer[Self-Review Loop]
    Reviewer --> Patch[Surgical Patch Engine]
    Patch --> Sandbox[Sandbox Environment]

No API keys. No sign-ups. No cloud.

Optional: Enable Semantic Search

pip install faiss-cpu sentence-transformers

Without these, DevAgent falls back to keyword search, which is still fully functional.
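
As an illustration of the fallback described above, the sketch below detects whether the optional packages are installed and, if not, ranks code chunks by simple keyword overlap. The function names, the chunk dictionary, and the scoring rule are assumptions for this example; DevAgent's real fallback may work differently.

```python
import importlib.util
import re
from collections import Counter

# Detect the optional dependencies without importing them.
HAS_SEMANTIC = all(
    importlib.util.find_spec(name) is not None
    for name in ("faiss", "sentence_transformers")
)


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_search(query: str, chunks: dict[str, str], top_k: int = 3) -> list[str]:
    """Rank chunk names by how often the query terms occur in each chunk."""
    terms = set(tokenize(query))
    scores: dict[str, int] = {}
    for name, text in chunks.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score:
            scores[name] = score
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda n: (-scores[n], n))[:top_k]


# Hypothetical chunks, keyed by "file:function".
chunks = {
    "calculator.py:divide": "def divide(a, b):\n    return a / b",
    "calculator.py:add": "def add(a, b):\n    return a + b",
    "utils.py:log": "def log(msg): print(msg)",
}
hits = keyword_search("divide bug", chunks)
print("semantic search available:", HAS_SEMANTIC)
print(hits)
```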


🎬 Demo

 ____              _                    _
|  _ \  _____   __/ \   __ _  ___ _ __ | |_
| | | |/ _ \ \ / / _ \ / _` |/ _ \ '_ \| __|
| |_| |  __/\ V / ___ \ (_| |  __/ | | | |_
|____/ \___| \_/_/   \_\__, |\___|_| |_|\__|
                       |___/

+==========================================================+
|        DEVELOPER CODE INTELLIGENCE AGENT                 |
|        Model: qwen2.5-coder:3b                          |
|        Sandbox: OFF                                      |
+==========================================================+

  [PLAN] LIKELY_FILES: calculator.py
  1. search_code: divide
  2. read_file: calculator.py
  3. run_tests

  ----------------------------------------
  ITERATION 1/5
  ----------------------------------------
  [TOOL] Executing: search_code(divide)
  >> Found: calculator.py:10:def divide(a, b)
  [REVIEW] #1: APPROVED
  >> Tests: 5 passed ✓

  [OK]  AGENT COMPLETED SUCCESSFULLY

  Status:     success
  Steps used: 1/5
  Patches:    1
  Time:       8.2s

๐Ÿ—๏ธ Architecture

                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚       CLI (main.py)          โ”‚
                    โ”‚  --task --root --model       โ”‚
                    โ”‚  --sandbox --benchmark       โ”‚
                    โ”‚  --auto-commit --auto-push   โ”‚
                    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚
                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚     Planner Layer            โ”‚
                    โ”‚  Identifies files + strategy โ”‚
                    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚
                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚  Retrieval Layer (Memory)    โ”‚
                    โ”‚  FAISS + Sentence-Transformersโ”‚
                    โ”‚  Chunk โ†’ Embed โ†’ Top-K       โ”‚
                    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ”‚
                    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                    โ”‚     ReAct Agent Loop         โ”‚
                    โ”‚                              โ”‚
                    โ”‚  1. THOUGHT   (LLM)          โ”‚
                    โ”‚  2. ACTION    (Tool)          โ”‚
                    โ”‚  3. OBSERVATION               โ”‚
                    โ”‚  4. FIX       (LLM)          โ”‚
                    โ”‚  5. REVIEW    (LLM)          โ”‚
                    โ”‚  6. PATCH     (Diff Engine)   โ”‚
                    โ”‚  7. TEST      (pytest)        โ”‚
                    โ”‚                              โ”‚
                    โ”‚  if FAIL โ†’ retry              โ”‚
                    โ”‚  if PASS โ†’ done โœ“            โ”‚
                    โ””โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                       โ”‚              โ”‚
              โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”    โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”
              โ”‚   Tools   โ”‚    โ”‚   Ollama    โ”‚
              โ”‚           โ”‚    โ”‚  (Local)    โ”‚
              โ”‚ โ€ข search  โ”‚    โ”‚             โ”‚
              โ”‚ โ€ข semanticโ”‚    โ”‚ qwen2.5-    โ”‚
              โ”‚ โ€ข read    โ”‚    โ”‚  coder:3b   โ”‚
              โ”‚ โ€ข patch   โ”‚    โ”‚ phi3:mini   โ”‚
              โ”‚ โ€ข pytest  โ”‚    โ”‚ mistral:7b  โ”‚
              โ”‚ โ€ข flake8  โ”‚    โ”‚             โ”‚
              โ”‚ โ€ข git_diffโ”‚    โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
              โ”‚ โ€ข sandbox โ”‚
              โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
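
The retry behaviour in the ReAct box above can be sketched as a small control loop. This is a simplified illustration, not the code in `app/agent.py`: it covers only the fix, review, patch, and test steps, and the three callables (`propose_fix`, `review`, `apply_and_test`) are hypothetical stand-ins for the LLM, the reviewer, and the patch-plus-pytest step.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunState:
    status: str = "running"
    steps_used: int = 0
    patches: int = 0
    history: list[str] = field(default_factory=list)


def react_loop(
    propose_fix: Callable[[str, list[str]], str],  # stands in for the LLM fix call
    review: Callable[[str], bool],                 # True means APPROVED
    apply_and_test: Callable[[str], bool],         # applies the patch, runs the tests
    task: str,
    max_steps: int = 5,
) -> RunState:
    """Propose, review, patch, and test until the tests pass or steps run out."""
    state = RunState()
    for step in range(1, max_steps + 1):
        state.steps_used = step
        fix = propose_fix(task, state.history)
        if not review(fix):
            state.history.append(f"step {step}: reviewer asked for a revision")
            continue
        state.patches += 1
        if apply_and_test(fix):
            state.status = "success"
            return state
        state.history.append(f"step {step}: tests failed")
    state.status = "failed"
    return state


# Stubbed run: the first proposed fix fails the tests, the second passes.
attempts = iter(["return a / b", "guard b == 0"])
result = react_loop(
    propose_fix=lambda task, history: next(attempts),
    review=lambda fix: True,
    apply_and_test=lambda fix: "guard" in fix,
    task="Fix the divide-by-zero bug",
)
print(result.status, result.steps_used, result.patches)
```

Feeding the failure history back into `propose_fix` is the design choice that lets a later attempt differ from an earlier one instead of repeating it.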

9-Layer Architecture

| Layer | Module | Purpose |
|---|---|---|
| 1. CLI | main.py | Argument parsing, mode selection, banner |
| 2. Planner | app/planner.py | Task interpretation, file identification |
| 3. Retrieval | app/memory.py | FAISS index, semantic chunking, Top-K search |
| 4. Tools | tools/* | 8 real tools: search, semantic_search, read, write, test, lint, git, sandbox |
| 5. Agent | app/agent.py | ReAct orchestration loop |
| 6. Review | app/reviewer.py | Self-critique with APPROVED/REVISE |
| 7. Validation | tools/test_runner.py | pytest + flake8 execution feedback |
| 8. Logging | utils/logger.py | Structured JSON audit trail |
| 9. Safety | app/sandbox.py | Isolated workspace, path validation |
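
The "path validation" named for the Safety layer is commonly done by resolving a requested path and checking that it stays under the workspace root. The sketch below shows that general technique; `is_inside` is a hypothetical helper and is not taken from `app/sandbox.py`.

```python
from pathlib import Path


def is_inside(root: Path, candidate: str) -> bool:
    """Return True if `candidate` resolves to a location under `root`."""
    root = root.resolve()
    # Joining with an absolute candidate discards `root`, so absolute
    # paths outside the workspace are rejected by the check below.
    target = (root / candidate).resolve()
    return target.is_relative_to(root)  # Path.is_relative_to needs Python 3.9+


workspace = Path.cwd() / "sandbox_workspace"
inside = is_inside(workspace, "app/agent.py")
escaped = is_inside(workspace, "../secrets.txt")
absolute = is_inside(workspace, "/etc/passwd")
print(inside, escaped, absolute)
```

Resolving before comparing matters because a plain string prefix check would accept paths such as `app/../../secrets.txt`.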

๐Ÿ“ Project Structure

Developer-Code-Intelligence-Agent/
โ”œโ”€โ”€ app/
โ”‚   โ”œโ”€โ”€ agent.py            # Core ReAct agent engine
โ”‚   โ”œโ”€โ”€ planner.py          # Task planning layer
โ”‚   โ”œโ”€โ”€ reviewer.py         # Self-review module
โ”‚   โ”œโ”€โ”€ llm.py              # Ollama integration
โ”‚   โ”œโ”€โ”€ memory.py           # FAISS retrieval + working memory
โ”‚   โ”œโ”€โ”€ patcher.py          # Unified diff patch engine
โ”‚   โ”œโ”€โ”€ sandbox.py          # Sandbox workspace manager
โ”‚   โ””โ”€โ”€ state.py            # Shared state dataclass
โ”œโ”€โ”€ tools/
โ”‚   โ”œโ”€โ”€ search.py           # Code search (ripgrep + fallbacks)
โ”‚   โ”œโ”€โ”€ semantic_search.py  # FAISS semantic search
โ”‚   โ”œโ”€โ”€ file_ops.py         # Safe file read/write
โ”‚   โ”œโ”€โ”€ test_runner.py      # pytest runner
โ”‚   โ”œโ”€โ”€ linter.py           # flake8 linter
โ”‚   โ”œโ”€โ”€ git_tools.py        # Git diff/commit/push
โ”‚   โ””โ”€โ”€ benchmark_runner.py # Benchmark evaluation
โ”œโ”€โ”€ utils/
โ”‚   โ”œโ”€โ”€ logger.py           # Structured JSON logger
โ”‚   โ”œโ”€โ”€ config.py           # Centralized configuration
โ”‚   โ””โ”€โ”€ metrics.py          # Performance metrics
โ”œโ”€โ”€ benchmarks/
โ”‚   โ”œโ”€โ”€ divide_by_zero/     # Benchmark: zero division guard
โ”‚   โ”œโ”€โ”€ missing_validation/ # Benchmark: input validation
โ”‚   โ”œโ”€โ”€ syntax_error/       # Benchmark: syntax fix
โ”‚   โ”œโ”€โ”€ import_bug/         # Benchmark: wrong import
โ”‚   โ””โ”€โ”€ edge_case/          # Benchmark: empty list handling
โ”œโ”€โ”€ demo_project/           # Sample buggy project
โ”œโ”€โ”€ docs/
โ”‚   โ””โ”€โ”€ USER_GUIDE.md       # Full usage guide
โ”œโ”€โ”€ main.py                 # CLI entry point
โ”œโ”€โ”€ devagent.py             # Global CLI wrapper
โ”œโ”€โ”€ devagent.bat            # Windows global shortcut
โ”œโ”€โ”€ requirements.txt
โ”œโ”€โ”€ CONTRIBUTING.md
โ”œโ”€โ”€ CHANGELOG.md
โ”œโ”€โ”€ CODE_OF_CONDUCT.md
โ”œโ”€โ”€ SECURITY.md
โ”œโ”€โ”€ LICENSE
โ””โ”€โ”€ README.md

💻 CLI Reference

python main.py --task "TASK" --root ./project [OPTIONS]
| Flag | Default | Description |
|---|---|---|
| --task, -t | (required) | The coding task for the agent |
| --root, -r | . | Project root directory |
| --model | qwen2.5-coder:3b | Any Ollama model |
| --max-steps, -m | 5 | Max ReAct iterations |
| --benchmark | off | Run benchmark suite |
| --sandbox | off | Run in isolated sandbox |
| --auto-commit | off | Git commit on success |
| --auto-push | off | Git push after commit |
| --verbose, -v | off | Verbose output |
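
For readers who want to see how a flag table like this maps onto code, here is a sketch using the standard `argparse` module. It mirrors the flags and defaults listed above but is not the project's `main.py`; in particular, treating `--task` as optional when `--benchmark` is given is an assumption inferred from the `python main.py --benchmark` usage.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flag table (a sketch, not main.py itself)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--task", "-t", help="The coding task for the agent")
    parser.add_argument("--root", "-r", default=".", help="Project root directory")
    parser.add_argument("--model", default="qwen2.5-coder:3b", help="Any Ollama model")
    parser.add_argument("--max-steps", "-m", type=int, default=5, help="Max ReAct iterations")
    parser.add_argument("--benchmark", action="store_true", help="Run benchmark suite")
    parser.add_argument("--sandbox", action="store_true", help="Run in isolated sandbox")
    parser.add_argument("--auto-commit", action="store_true", help="Git commit on success")
    parser.add_argument("--auto-push", action="store_true", help="Git push after commit")
    parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: --task is only required when not running the benchmark suite.
    if not args.benchmark and not args.task:
        parser.error("--task is required unless --benchmark is given")
    return args


fix = parse_args(["-t", "Fix the TypeError in user_service.py", "-r", "./backend"])
bench = parse_args(["--benchmark", "--model", "phi3:mini"])
print(fix.root, fix.max_steps, bench.model)
```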

Examples

# Fix a specific bug
python main.py -t "Fix the TypeError in user_service.py" -r ./backend

# Run in sandbox mode (safe: doesn't touch real files until success)
python main.py -t "Fix divide-by-zero bug" -r ./project --sandbox

# Auto-commit changes on success
python main.py -t "Add input validation" -r ./api --auto-commit

# Use a stronger model
python main.py -t "Refactor auth middleware" -r ./server --model mistral:7b

# Run benchmarks
python main.py --benchmark

# More retries for complex tasks
python main.py -t "Make all tests pass" -r ./project --max-steps 10

📖 Full User Guide →


📊 Benchmarks

DevAgent includes 5 built-in benchmarks to evaluate agent performance:

| Benchmark | Bug Type | Difficulty |
|---|---|---|
| divide_by_zero | Missing guard clause | Easy |
| missing_validation | No input validation | Medium |
| syntax_error | Broken syntax | Medium |
| import_bug | Wrong module name | Easy |
| edge_case | Empty list crash | Medium |

Run benchmarks:

python main.py --benchmark
python main.py --benchmark --model phi3:mini
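
A benchmark run of this kind boils down to executing each suite and aggregating pass/fail outcomes. The sketch below shows one way to do that aggregation; it is not `tools/benchmark_runner.py`, the `run_suite` callable is a hypothetical stand-in for running the agent on a suite, and the stubbed outcome in the demo is made up purely to exercise the code, not a measured result.

```python
from typing import Callable, Sequence

# Suite names taken from the benchmark table above.
SUITES = ("divide_by_zero", "missing_validation", "syntax_error", "import_bug", "edge_case")


def run_benchmarks(run_suite: Callable[[str], bool], suites: Sequence[str] = SUITES) -> dict:
    """Run every suite through `run_suite` and summarise the outcomes."""
    results = {name: bool(run_suite(name)) for name in suites}
    passed = sum(results.values())
    total = len(results)
    return {
        "results": results,
        "passed": passed,
        "total": total,
        "pass_rate": passed / total if total else 0.0,
    }


# Stub runner with an invented outcome, used only to demonstrate the summary.
summary = run_benchmarks(lambda name: name != "syntax_error")
print(f"{summary['passed']}/{summary['total']} suites passed")
```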

🔧 Supported Models

| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| qwen2.5-coder:3b | 1.9 GB | ⚡ Fast | ★★★★ | Default, best for code |
| qwen2.5:3b | 1.9 GB | ⚡ Fast | ★★★☆ | General fallback |
| phi3:mini | 2.2 GB | ⚡ Fast | ★★★☆ | Good reasoning |
| qwen3:4b | 2.5 GB | ⚡ Fast | ★★★★ | Better understanding |
| gemma2:2b | 1.6 GB | ⚡⚡ | ★★☆☆ | Ultra-low resource |
| mistral:7b | 4.4 GB | 🐢 | ★★★★★ | Best quality (8GB+ RAM) |
๐Ÿ—บ๏ธ Roadmap

โœ… Completed (v2.0)

  • Core ReAct agent loop
  • Self-review module
  • Tool system (9 tools)
  • Planner layer
  • Semantic retrieval (FAISS)
  • Patch engine (unified diffs)
  • Sandbox mode
  • Benchmark system (5 suites)
  • Metrics + structured logging
  • Git integration
  • CLI with all flags

🔜 Coming Next

  • Multi-file support: Agent works across multiple files simultaneously
  • Language support: JavaScript, TypeScript, Go, Rust
  • Plugin system: Custom tools via YAML/Python
  • Watch mode: Auto-fix on test failure (--watch)
  • VS Code extension: Run agent from your editor
  • Conversation memory: Learn from past runs
  • Multi-agent mode: Planner + Coder + Reviewer + Evaluator agents

๐Ÿค Contributing

We welcome contributions! See CONTRIBUTING.md for details.

git checkout -b feature/your-feature
# ... make changes ...
python -m pytest demo_project/ -v
git commit -m "feat: your feature"
git push origin feature/your-feature

Good first issues are tagged and waiting: Browse good first issues →


📜 License

MIT: use it however you want. See LICENSE.


โญ Star History

If DevAgent helps you, give it a star! It helps others discover the project.


Built with 🧠 by Vedant Jadhav

Download files

Source Distribution

devagent_cli-3.3.1.tar.gz (45.3 kB)

Uploaded Source

Built Distribution

devagent_cli-3.3.1-py3-none-any.whl (49.2 kB)

Uploaded Python 3

File details

Details for the file devagent_cli-3.3.1.tar.gz.

File metadata

  • Download URL: devagent_cli-3.3.1.tar.gz
  • Upload date:
  • Size: 45.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for devagent_cli-3.3.1.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8de46c4203554c99ad217aefe3f74d1cb067e55f5833c345eab37f5827c8cd0f |
| MD5 | 5ab1974ebc3b27bdce5580d60576beee |
| BLAKE2b-256 | 8133aeda10d81473e9ab7a3622de5fb900755f4cb8ac315031dbd263313aeb33 |

Provenance

The following attestation bundles were made for devagent_cli-3.3.1.tar.gz:

Publisher: publish.yml on VedantJadhav701/Developer-Code-Intelligence-Agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file devagent_cli-3.3.1-py3-none-any.whl.

File metadata

  • Download URL: devagent_cli-3.3.1-py3-none-any.whl
  • Upload date:
  • Size: 49.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for devagent_cli-3.3.1-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3eb13b61f1c33f87cd99791c3ce66236b28b3ff61a0d1e77128c9330c2ff0912 |
| MD5 | 4ba789c36e84f4c37bc0b46ed6e755b5 |
| BLAKE2b-256 | 7ddf12f42630ec1926451f3b7bc02d5b505b14f02118f16ff66fc83cd674826a |

Provenance

The following attestation bundles were made for devagent_cli-3.3.1-py3-none-any.whl:

Publisher: publish.yml on VedantJadhav701/Developer-Code-Intelligence-Agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
