
Professional Local autonomous coding agent powered by Ollama

🧠 DevAgent

A Lightweight Local Open-Source Miniature of Claude Code CLI

License: MIT Python 3.11+ Ollama PRs Welcome GitHub stars

A production-grade local coding agent that finds bugs, writes patches, reviews its own code, and validates with tests: all offline, all local, zero API costs.

Quick Start • Architecture • Benchmarks • Roadmap • Contributing


🤔 Why DevAgent?

Most AI coding tools are chatbots: they suggest code, you copy-paste, you pray.

DevAgent is a real agent with a retrieval-first, tool-grounded architecture:

| Capability | Chatbot | DevAgent |
|---|---|---|
| Searches your codebase | ❌ | ✅ ripgrep + semantic search |
| Retrieves relevant code | ❌ | ✅ FAISS embeddings |
| Plans before coding | ❌ | ✅ Planner layer |
| Generates patches | ❌ | ✅ Unified diffs |
| Reviews its own output | ❌ | ✅ Self-critique loop |
| Runs your tests | ❌ | ✅ pytest integration |
| Retries on failure | ❌ | ✅ Up to N iterations |
| Works in sandbox | ❌ | ✅ Isolated workspace |
| Works offline | ❌ | ✅ 100% local via Ollama |
| Costs money | 💸 | ✅ Free forever |

Philosophy: Execution > Reasoning. Tools > Hallucination. Retrieval > Huge Context. Reliability > Intelligence.


✨ Features

🔁 ReAct Loop: Thought → Action → Observation → Fix → Review → Test cycle

🧠 Planner: LLM generates an action plan before coding

🔍 Semantic Search: FAISS + sentence-transformers code retrieval

🔎 Code Search: ripgrep-powered with cross-platform fallback

📝 Self-Review: LLM critiques its own fixes, revises until approved

🩹 Patch Engine: Line-level unified diffs instead of full file rewrites

🧪 Test-Driven: Runs pytest after every fix, retries on failure

🏖️ Sandbox Mode: Agent works in an isolated copy, applies changes only on success

📊 Benchmarks: 5 built-in benchmark suites with automated evaluation

📈 Metrics: Latency, token estimates, retries, and performance tracking

📋 Full Audit Trail: Every step logged to logs/run.json

🔒 100% Offline: Runs on Ollama with small models (2-4 GB)

⚡ Low Resource: Works on RTX 3050 (4 GB VRAM) / 16 GB RAM
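
To make the Patch Engine idea concrete, here is a minimal sketch of producing a line-level unified diff with Python's standard `difflib`. The helper name `make_patch` and the sample `divide` function are illustrative assumptions; DevAgent's own `app/patcher.py` may work differently.

```python
import difflib


def make_patch(original: str, fixed: str, path: str) -> str:
    """Return a unified diff that turns `original` into `fixed` (hypothetical helper)."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Illustrative before/after contents, not taken from the real demo project.
before = "def divide(a, b):\n    return a / b\n"
after = (
    "def divide(a, b):\n"
    "    if b == 0:\n"
    "        raise ValueError('division by zero')\n"
    "    return a / b\n"
)

patch = make_patch(before, after, "calculator.py")
print(patch)
```

A diff like this touches only the two added lines, which is the practical advantage over rewriting the whole file: the change is small enough to review by eye and easy to reject.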


🚀 Quick Start

Prerequisites

Install & Setup

# 1. Clone
git clone https://github.com/VedantJadhav701/Developer-Code-Intelligence-Agent.git
cd Developer-Code-Intelligence-Agent

# 2. Install
pip install devagent-cli
# Or, from the cloned repository: pip install -e .

# 3. Verify System (CRITICAL)
# This checks your Python environment, Ollama connection, and dependencies
devagent doctor

# 4. Pull the model
ollama pull qwen2.5-coder:3b

# 5. Run!
devagent run --task "Fix the divide-by-zero bug" --root ./demo_project

CLI Subcommands

| Command | Description |
|---|---|
| devagent run | Execute a coding task on a project |
| devagent benchmark | Run the automated benchmark suite |
| devagent doctor | Check system health and dependencies |
| devagent models | List available Ollama models |
| devagent version | Show current version |

✨ New: Trust & Safety

🛡️ Reliability Hardening (v3.2.1+)

Version 3.2.1 adds three safeguards aimed at reliable runs in complex projects:

  • Path Anchoring: Automatically corrects "root hallucinations." If the agent targets a file in a subdirectory but assumes it's at the root, the system auto-anchors it to the correct project location.
  • Forensic Test Detection: Built-in intelligence to "see through" environment noise. It detects successful test runs even if unrelated parts of the repository have collection errors.
  • Confidence Scoring: Every fix is graded (0-100%) based on test results, surgical precision, and self-review quality.

🕹️ Interactive Mode

Run with --interactive (or -i) to review diffs before they are applied to your project.

devagent run --task "Fix bug" --interactive
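
A review-before-apply step like the one `--interactive` describes could look roughly like the following. This is a sketch under assumptions: the function `confirm_patch`, the prompt text, and the accepted answers are all hypothetical, and the real CLI may present diffs differently.

```python
from typing import Callable


def confirm_patch(diff_text: str, ask: Callable[[str], str] = input) -> bool:
    """Print a diff and return True only when the answer is an explicit yes (hypothetical)."""
    print(diff_text)
    answer = ask("Apply this patch? [y/N] ").strip().lower()
    return answer in {"y", "yes"}


sample_diff = (
    "--- a/calculator.py\n"
    "+++ b/calculator.py\n"
    "@@ -1,2 +1,4 @@\n"
    " def divide(a, b):\n"
    "+    if b == 0:\n"
    "+        raise ValueError('division by zero')\n"
    "     return a / b\n"
)

# A scripted answer stands in for keyboard input so the example runs unattended.
accepted = confirm_patch(sample_diff, ask=lambda prompt: "y")
print("applied" if accepted else "skipped")
```

Defaulting to "no" on an empty answer keeps an accidental Enter key press from modifying the project.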

🏗️ Architecture

graph TD
    CLI[DevAgent CLI] --> Orchestrator[ReAct Orchestrator]
    Orchestrator --> Memory[Working Memory]
    Orchestrator --> Retrieval[Semantic Retrieval FAISS]
    Orchestrator --> Tools[Tool Suite: pytest, ripgrep, git]
    Orchestrator --> Reviewer[Self-Review Loop]
    Reviewer --> Patch[Surgical Patch Engine]
    Patch --> Sandbox[Sandbox Environment]

No API keys. No sign-ups. No cloud.

Optional: Enable Semantic Search

pip install faiss-cpu sentence-transformers

Without these, DevAgent falls back to keyword search and remains fully functional.
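
The fallback described above can be pictured as a check for the optional packages followed by a simple keyword ranking. The sketch below uses only the standard library; `semantic_search_available`, `keyword_search`, and the word-overlap scoring are assumptions for illustration and not the project's actual retrieval code.

```python
import importlib.util
import re


def semantic_search_available() -> bool:
    """True when both optional packages are importable."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("faiss", "sentence_transformers")
    )


def keyword_search(chunks: dict[str, str], query: str, top_k: int = 3) -> list[str]:
    """Rank chunks by how many distinct query words they contain (hypothetical scoring)."""
    words = set(re.findall(r"\w+", query.lower()))
    scored = []
    for name, text in chunks.items():
        tokens = set(re.findall(r"\w+", text.lower()))
        score = len(words & tokens)
        if score:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]


# Illustrative chunks keyed by "file::function".
chunks = {
    "calculator.py::divide": "def divide(a, b): return a / b",
    "calculator.py::add": "def add(a, b): return a + b",
}

mode = "semantic" if semantic_search_available() else "keyword"
hits = keyword_search(chunks, "divide by zero")
print(mode, hits)
```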


🎬 Demo

 ____              _                    _
|  _ \  _____   __/ \   __ _  ___ _ __ | |_
| | | |/ _ \ \ / / _ \ / _` |/ _ \ '_ \| __|
| |_| |  __/\ V / ___ \ (_| |  __/ | | | |_
|____/ \___| \_/_/   \_\__, |\___|_| |_|\__|
                       |___/

+==========================================================+
|        DEVELOPER CODE INTELLIGENCE AGENT                 |
|        Model: qwen2.5-coder:3b                          |
|        Sandbox: OFF                                      |
+==========================================================+

  [PLAN] LIKELY_FILES: calculator.py
  1. search_code: divide
  2. read_file: calculator.py
  3. run_tests

  ----------------------------------------
  ITERATION 1/5
  ----------------------------------------
  [TOOL] Executing: search_code(divide)
  >> Found: calculator.py:10:def divide(a, b)
  [REVIEW] #1: APPROVED
  >> Tests: 5 passed ✓

  [OK]  AGENT COMPLETED SUCCESSFULLY

  Status:     success
  Steps used: 1/5
  Patches:    1
  Time:       8.2s

🏗️ Architecture

                    ┌──────────────────────────────┐
                    │       CLI (main.py)          │
                    │  --task --root --model       │
                    │  --sandbox --benchmark       │
                    │  --auto-commit --auto-push   │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │     Planner Layer            │
                    │  Identifies files + strategy │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │  Retrieval Layer (Memory)    │
                    │ FAISS + Sentence-Transformers│
                    │  Chunk → Embed → Top-K       │
                    └──────────┬───────────────────┘
                               │
                    ┌──────────▼───────────────────┐
                    │     ReAct Agent Loop         │
                    │                              │
                    │  1. THOUGHT   (LLM)          │
                    │  2. ACTION    (Tool)         │
                    │  3. OBSERVATION              │
                    │  4. FIX       (LLM)          │
                    │  5. REVIEW    (LLM)          │
                    │  6. PATCH     (Diff Engine)  │
                    │  7. TEST      (pytest)       │
                    │                              │
                    │  if FAIL → retry             │
                    │  if PASS → done ✓            │
                    └──┬──────────────┬────────────┘
                       │              │
              ┌────────▼──┐    ┌──────▼──────┐
              │   Tools   │    │   Ollama    │
              │           │    │  (Local)    │
              │ • search  │    │             │
              │ • semantic│    │ qwen2.5-    │
              │ • read    │    │  coder:3b   │
              │ • patch   │    │ phi3:mini   │
              │ • pytest  │    │ mistral:7b  │
              │ • flake8  │    │             │
              │ • git_diff│    └─────────────┘
              │ • sandbox │
              └───────────┘

9-Layer Architecture

| Layer | Module | Purpose |
|---|---|---|
| 1. CLI | main.py | Argument parsing, mode selection, banner |
| 2. Planner | app/planner.py | Task interpretation, file identification |
| 3. Retrieval | app/memory.py | FAISS index, semantic chunking, Top-K search |
| 4. Tools | tools/* | 8 real tools: search, semantic_search, read, write, test, lint, git, sandbox |
| 5. Agent | app/agent.py | ReAct orchestration loop |
| 6. Review | app/reviewer.py | Self-critique with APPROVED/REVISE |
| 7. Validation | tools/test_runner.py | pytest + flake8 execution feedback |
| 8. Logging | utils/logger.py | Structured JSON audit trail |
| 9. Safety | app/sandbox.py | Isolated workspace, path validation |
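
The way the Agent, Review, and Validation layers interact can be sketched as a bounded retry loop. The code below is a simplified illustration, not the contents of `app/agent.py`: the names `run_agent` and `RunResult` are hypothetical, and the LLM, reviewer, and test runner are replaced by plain callables so the example runs without Ollama.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunResult:
    status: str
    steps_used: int
    history: list[str] = field(default_factory=list)


def run_agent(
    propose_fix: Callable[[list[str]], str],
    review: Callable[[str], bool],
    run_tests: Callable[[str], bool],
    max_steps: int = 5,
) -> RunResult:
    """Propose, review, and test a fix, retrying until tests pass or steps run out."""
    history: list[str] = []
    for step in range(1, max_steps + 1):
        fix = propose_fix(history)  # THOUGHT / ACTION / FIX, informed by earlier failures
        if not review(fix):  # REVIEW: reviewer answered REVISE
            history.append(f"step {step}: REVISE")
            continue
        if run_tests(fix):  # PATCH + TEST
            history.append(f"step {step}: tests passed")
            return RunResult("success", step, history)
        history.append(f"step {step}: tests failed")
    return RunResult("failed", max_steps, history)


# Stub "LLM": the first attempt omits the guard, the retry adds it.
def propose_fix(history: list[str]) -> str:
    return "return a / b" if not history else "if b == 0: raise ValueError"


result = run_agent(
    propose_fix,
    review=lambda fix: True,
    run_tests=lambda fix: "b == 0" in fix,
)
print(result.status, result.steps_used)  # success 2
```

Passing the history back into `propose_fix` is what lets a later iteration differ from an earlier one; without that feedback the loop would simply repeat the same failing fix.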

📁 Project Structure

Developer-Code-Intelligence-Agent/
├── app/
│   ├── agent.py            # Core ReAct agent engine
│   ├── planner.py          # Task planning layer
│   ├── reviewer.py         # Self-review module
│   ├── llm.py              # Ollama integration
│   ├── memory.py           # FAISS retrieval + working memory
│   ├── patcher.py          # Unified diff patch engine
│   ├── sandbox.py          # Sandbox workspace manager
│   └── state.py            # Shared state dataclass
├── tools/
│   ├── search.py           # Code search (ripgrep + fallbacks)
│   ├── semantic_search.py  # FAISS semantic search
│   ├── file_ops.py         # Safe file read/write
│   ├── test_runner.py      # pytest runner
│   ├── linter.py           # flake8 linter
│   ├── git_tools.py        # Git diff/commit/push
│   └── benchmark_runner.py # Benchmark evaluation
├── utils/
│   ├── logger.py           # Structured JSON logger
│   ├── config.py           # Centralized configuration
│   └── metrics.py          # Performance metrics
├── benchmarks/
│   ├── divide_by_zero/     # Benchmark: zero division guard
│   ├── missing_validation/ # Benchmark: input validation
│   ├── syntax_error/       # Benchmark: syntax fix
│   ├── import_bug/         # Benchmark: wrong import
│   └── edge_case/          # Benchmark: empty list handling
├── demo_project/           # Sample buggy project
├── docs/
│   └── USER_GUIDE.md       # Full usage guide
├── main.py                 # CLI entry point
├── devagent.py             # Global CLI wrapper
├── devagent.bat            # Windows global shortcut
├── requirements.txt
├── CONTRIBUTING.md
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
├── LICENSE
└── README.md

💻 CLI Reference

python main.py --task "TASK" --root ./project [OPTIONS]

| Flag | Default | Description |
|---|---|---|
| --task, -t | (required) | The coding task for the agent |
| --root, -r | . | Project root directory |
| --model | qwen2.5-coder:3b | Any Ollama model |
| --max-steps, -m | 5 | Max ReAct iterations |
| --benchmark | off | Run benchmark suite |
| --sandbox | off | Run in isolated sandbox |
| --auto-commit | off | Git commit on success |
| --auto-push | off | Git push after commit |
| --verbose, -v | off | Verbose output |
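
For readers who want to see how a flag table like this maps onto code, here is a minimal `argparse` sketch that mirrors the flags and defaults listed above. It assumes the CLI is built with `argparse` and that `--task` is only required when `--benchmark` is not given (the benchmark examples omit it); neither assumption is confirmed by the project, and `build_parser` and `parse` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags from the table above (illustrative, not the real main.py)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--task", "-t", help="The coding task for the agent")
    parser.add_argument("--root", "-r", default=".", help="Project root directory")
    parser.add_argument("--model", default="qwen2.5-coder:3b", help="Any Ollama model")
    parser.add_argument("--max-steps", "-m", type=int, default=5, help="Max ReAct iterations")
    parser.add_argument("--benchmark", action="store_true", help="Run benchmark suite")
    parser.add_argument("--sandbox", action="store_true", help="Run in isolated sandbox")
    parser.add_argument("--auto-commit", action="store_true", help="Git commit on success")
    parser.add_argument("--auto-push", action="store_true", help="Git push after commit")
    parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: a task is mandatory unless the benchmark suite is requested.
    if not args.benchmark and not args.task:
        parser.error("--task is required unless --benchmark is given")
    return args


args = parse(["-t", "Fix divide-by-zero bug", "-r", "./project", "--sandbox"])
print(args.task, args.root, args.model, args.max_steps, args.sandbox)
```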

Examples

# Fix a specific bug
python main.py -t "Fix the TypeError in user_service.py" -r ./backend

# Run in sandbox mode (safe: doesn't touch real files until success)
python main.py -t "Fix divide-by-zero bug" -r ./project --sandbox

# Auto-commit changes on success
python main.py -t "Add input validation" -r ./api --auto-commit

# Use a stronger model
python main.py -t "Refactor auth middleware" -r ./server --model mistral:7b

# Run benchmarks
python main.py --benchmark

# More retries for complex tasks
python main.py -t "Make all tests pass" -r ./project --max-steps 10
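
The `--sandbox` behavior in the examples above (work on a copy, touch the real files only on success) can be sketched with the standard library. This is an illustration under assumptions: `run_in_sandbox` is a hypothetical helper, the "task" is any callable that returns True on success, and the real `app/sandbox.py` may copy or validate files differently.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_in_sandbox(project: Path, task: Callable[[Path], bool]) -> bool:
    """Run `task` on a temporary copy of `project`; copy results back only on success."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace"
        shutil.copytree(project, workspace)
        succeeded = task(workspace)
        if succeeded:
            # Overwrite the real project with the sandbox contents.
            shutil.copytree(workspace, project, dirs_exist_ok=True)
        return succeeded


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    project.mkdir()
    target = project / "calculator.py"
    target.write_text("buggy\n")

    def failing_task(ws: Path) -> bool:
        (ws / "calculator.py").write_text("half-finished\n")
        return False  # e.g. tests still failing

    def passing_task(ws: Path) -> bool:
        (ws / "calculator.py").write_text("fixed\n")
        return True  # e.g. tests pass

    run_in_sandbox(project, failing_task)
    after_failure = target.read_text()
    run_in_sandbox(project, passing_task)
    after_success = target.read_text()

print(repr(after_failure), repr(after_success))
```

One limitation of this simple copy-back approach: files deleted inside the sandbox are not deleted from the real project, so a fuller implementation would need to handle removals explicitly.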

📖 Full user guide: docs/USER_GUIDE.md


📊 Benchmarks

DevAgent includes 5 built-in benchmarks to evaluate agent performance:

| Benchmark | Bug Type | Difficulty |
|---|---|---|
| divide_by_zero | Missing guard clause | Easy |
| missing_validation | No input validation | Medium |
| syntax_error | Broken syntax | Medium |
| import_bug | Wrong module name | Easy |
| edge_case | Empty list crash | Medium |

Run benchmarks:

python main.py --benchmark
python main.py --benchmark --model phi3:mini

🔧 Supported Models

| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| qwen2.5-coder:3b | 1.9 GB | ⚡ Fast | ★★★★ | Default, best for code |
| qwen2.5:3b | 1.9 GB | ⚡ Fast | ★★★☆ | General fallback |
| phi3:mini | 2.2 GB | ⚡ Fast | ★★★☆ | Good reasoning |
| qwen3:4b | 2.5 GB | ⚡ Fast | ★★★★ | Better understanding |
| gemma2:2b | 1.6 GB | ⚡⚡ | ★★☆☆ | Ultra-low resource |
| mistral:7b | 4.4 GB | 🐢 | ★★★★★ | Best quality (8 GB+ RAM) |

🗺️ Roadmap

✅ Completed (v2.0)

  • Core ReAct agent loop
  • Self-review module
  • Tool system (9 tools)
  • Planner layer
  • Semantic retrieval (FAISS)
  • Patch engine (unified diffs)
  • Sandbox mode
  • Benchmark system (5 suites)
  • Metrics + structured logging
  • Git integration
  • CLI with all flags

🔜 Coming Next

  • Multi-file support: Agent works across multiple files simultaneously
  • Language support: JavaScript, TypeScript, Go, Rust
  • Plugin system: Custom tools via YAML/Python
  • Watch mode: Auto-fix on test failure (--watch)
  • VS Code extension: Run agent from your editor
  • Conversation memory: Learn from past runs
  • Multi-agent mode: Planner + Coder + Reviewer + Evaluator agents

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for details.

git checkout -b feature/your-feature
# ... make changes ...
python -m pytest demo_project/ -v
git commit -m "feat: your feature"
git push origin feature/your-feature

Good first issues are tagged and waiting in the GitHub issue tracker.


📜 License

MIT: use it however you want. See LICENSE.


⭐ Star History

If DevAgent helps you, give it a star! It helps others discover the project.


Built with 🧠 by Vedant Jadhav

A lightweight local open-source miniature of Claude Code CLI.
