Auzek — an autonomous coding agent that plans, executes, self-verifies and self-heals across multiple LLM providers.
Auzek
An autonomous coding agent by Azaan (Auzek).
Auzek is an autonomous coding agent that understands the repo, plans before it codes, executes one step at a time, verifies its own work, and self-heals on failure before moving on. It runs on any major LLM provider — bring your own API key (Anthropic, OpenAI, Groq, Google, Mistral, DeepSeek, or local Ollama).
It is built on LangGraph (orchestration) and LiteLLM (provider gateway).
pip install auzek
auzek run "add input validation to the /signup endpoint" --provider groq
Why it's different from a "blind" coding bot
| Naive agent | This agent |
|---|---|
| Starts editing immediately | Onboards to the repo first (stack, tests, layout, git history) |
| Holds the plan in context | Writes the plan to disk (.agent/plan.md) — survives crashes |
| "Looks done" after writing | Marks a step done only after running its verification |
| Retries forever | Hard stop after N recovery attempts, then escalates |
| One giant change | Atomic steps, optionally micro-committed |
| "Done" = code written | "Done" = full test/lint/typecheck pass + diff reviewed vs. the task |
The lifecycle (a LangGraph state machine)
context → planning → [human approval] → execution ⇄ recovery → verification → report
- Context – lists/reads files, searches code, reads git history → a briefing.
- Planning – emits a structured, ordered, atomic plan (submit_plan tool).
- Approval – optional human gate (pause/approve the plan).
- Execution – implements one step, then runs its verification.
- Recovery – on failure, widens investigation and retries (capped).
- Verification – runs the full suite, reviews the whole diff vs. the task.
- Report – writes an honest .agent/report.md.
State and plan live in .agent/ so a run is inspectable and resumable.
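The execution ⇄ recovery loop and its hard stop can be sketched in plain Python. This is a minimal illustration, not the LangGraph graph the package actually builds: `RunState`, `run_lifecycle`, and the `step_passes` callback are hypothetical names, and the approval phase (optional in the real agent) is always logged here for simplicity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RunState:
    """Hypothetical run state; the real schema lives in state.py."""
    steps: List[str]
    max_recovery_attempts: int = 3
    current: int = 0          # index of the step being worked on
    attempts: int = 0         # recovery attempts used for the current step
    log: List[str] = field(default_factory=list)
    escalated: bool = False


def run_lifecycle(state: RunState, step_passes: Callable[[str, int], bool]) -> RunState:
    """Walk execution <-> recovery with a cap, then verification and report.

    step_passes(step, attempt) stands in for "implement the step, then run
    its verification"; it returns True when that check passes.
    """
    state.log += ["context", "planning", "approval"]
    while state.current < len(state.steps):
        step = state.steps[state.current]
        state.log.append(f"execution:{step}")
        if step_passes(step, state.attempts):
            # A step is marked done only after its verification passed.
            state.current += 1
            state.attempts = 0
            continue
        if state.attempts >= state.max_recovery_attempts:
            # Hard stop: escalate instead of retrying forever.
            state.escalated = True
            break
        state.attempts += 1
        state.log.append(f"recovery:{step}:attempt{state.attempts}")
    if not state.escalated:
        state.log.append("verification")
    state.log.append("report")
    return state
```

With `max_recovery_attempts=3`, a step that never passes is executed four times (one initial try plus three recoveries) before the run escalates and goes straight to the report.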
Install
# from PyPI
pip install auzek
# or with pipx so the `auzek` command is globally available, isolated
pipx install auzek
From source (for development):
cd Autonomous_Agent
python -m venv .venv && . .venv/Scripts/activate # Windows
# or: source .venv/bin/activate # macOS/Linux
pip install -e .
Configure keys
cp .env.example .env
# fill in the provider(s) you use, e.g. GROQ_API_KEY=...
Check what's wired up:
auzek providers
Run
# operate on the current repo
auzek run "Add input validation to the /signup endpoint and a test for it"
# pick a provider/model explicitly (Groq example)
auzek run "Refactor utils.py to remove the duplicated date parsing" \
--provider groq --model llama-3.3-70b-versatile
# point at another repo, auto-approve the plan, micro-commit each step
auzek run "Fix the failing login test" \
--workspace ../my-project --yes --auto-commit
Useful flags: --provider, --model, --api-key, --workspace, --yes
(auto-approve), --no-approval, --max-steps, --auto-commit, --temperature.
Inspect the plan any time:
auzek plan-show --workspace ../my-project
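To show why a plan on disk makes a run resumable, here is a rough sketch of a plan store. Only the `.agent/` directory and `plan.md` come from this README; the `plan.json` file name, the step fields (`title`, `verify`, `done`), and both function names are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def save_plan(workspace: Path, steps: List[Dict]) -> None:
    """Persist the plan as JSON (for the agent) and markdown (for humans)."""
    agent_dir = workspace / ".agent"
    agent_dir.mkdir(parents=True, exist_ok=True)
    # plan.json and the step schema are assumptions made for this sketch.
    (agent_dir / "plan.json").write_text(json.dumps(steps, indent=2), encoding="utf-8")
    lines = ["# Plan", ""]
    for number, step in enumerate(steps, start=1):
        box = "x" if step["done"] else " "
        lines.append(f"{number}. [{box}] {step['title']} (verify: {step['verify']})")
    (agent_dir / "plan.md").write_text("\n".join(lines) + "\n", encoding="utf-8")


def next_pending(workspace: Path) -> Optional[Dict]:
    """Reload the plan from disk and return the first step not yet done."""
    raw = (workspace / ".agent" / "plan.json").read_text(encoding="utf-8")
    return next((step for step in json.loads(raw) if not step["done"]), None)
```

Because nothing depends on in-memory state, a process that crashed mid-run could call something like `next_pending` on restart and continue from the first unfinished step.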
Configuration (config.yaml)
Verification commands auto-detect when blank; set them to be explicit:
provider: anthropic
model: claude-sonnet-4-6
max_recovery_attempts: 3
max_steps: 40
auto_commit: false
require_plan_approval: true
test_command: "pytest -q"
lint_command: "ruff check ."
typecheck_command: "mypy ."
Resolution order: CLI flags > env vars (AGENT_*) > config.yaml > defaults.
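The resolution order can be illustrated with a small merge function. This is a sketch under stated assumptions, not the code in config.py: the `DEFAULTS` values simply mirror the sample config above, the `AGENT_<KEY>` naming (for example `AGENT_MAX_STEPS`) is a guess at how the `AGENT_*` variables map to keys, and `resolve_config` is a hypothetical name.

```python
from typing import Any, Dict, Mapping

# Assumed defaults; they mirror the sample config.yaml, not the package source.
DEFAULTS: Dict[str, Any] = {
    "provider": "anthropic",
    "max_recovery_attempts": 3,
    "max_steps": 40,
    "auto_commit": False,
}


def _coerce(raw: str, like: Any) -> Any:
    """Convert an env string to the type of the matching default."""
    if isinstance(like, bool):  # check bool first: bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    if isinstance(like, int):
        return int(raw)
    return raw


def resolve_config(
    cli: Mapping[str, Any],
    env: Mapping[str, str],
    file_cfg: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge layers so that CLI flags > env vars > config.yaml > defaults."""
    merged = dict(DEFAULTS)
    merged.update(file_cfg)  # config.yaml overrides defaults
    for key in DEFAULTS:
        raw = env.get(f"AGENT_{key.upper()}")
        if raw is not None:  # env vars override config.yaml
            merged[key] = _coerce(raw, DEFAULTS[key])
    # CLI flags win; None means "flag not passed".
    merged.update({k: v for k, v in cli.items() if v is not None})
    return merged
```

Applying the layers from lowest to highest priority keeps the precedence rule in one place: whichever layer writes last wins.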
Project layout
src/auzek/
cli.py # Typer CLI, approval gate, output
config.py # layered config
llm.py # multi-provider gateway (LiteLLM) + key handling
runtime.py # shared deps + the core tool-calling loop
state.py # LangGraph state schema
graph.py # the state machine (nodes + conditional edges)
prompts.py # per-phase system prompts
memory/plan_store.py # the durable plan (json + markdown)
tools/ # read/write/edit, list, search, shell, git
nodes/ # context, planning, approval, execution, recovery,
# verification, report
Adding a provider
Add one line to PROVIDERS in llm.py:
"xai": ProviderSpec("xai", "XAI_API_KEY", "grok-2-latest"),
LiteLLM handles the wire format; nothing else changes.
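For orientation, a registry of this shape might look roughly like the sketch below. The field names on `ProviderSpec` are inferred from the positional call above, the Groq default model is a placeholder borrowed from the Run example, and `resolve_model` is a hypothetical helper; the `provider/model` string follows LiteLLM's usual naming convention.

```python
import os
from typing import Dict, Mapping, NamedTuple, Optional, Tuple


class ProviderSpec(NamedTuple):
    # Field names are guesses based on the positional call shown above.
    name: str
    api_key_env: str
    default_model: str


PROVIDERS: Dict[str, ProviderSpec] = {
    # The Groq default here is a placeholder, not the package's actual default.
    "groq": ProviderSpec("groq", "GROQ_API_KEY", "llama-3.3-70b-versatile"),
    "xai": ProviderSpec("xai", "XAI_API_KEY", "grok-2-latest"),
}


def resolve_model(
    provider: str,
    model: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (LiteLLM-style model string, API key) for a provider."""
    spec = PROVIDERS[provider]
    api_key = env.get(spec.api_key_env)
    if not api_key:
        raise RuntimeError(f"Set {spec.api_key_env} to use provider '{provider}'")
    return f"{spec.name}/{model or spec.default_model}", api_key
```

Keeping the key's environment variable name in the spec means a missing key can be reported before any request is sent.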
Safety
- All file access is sandboxed to the workspace; deny_globs blocks .env, .git, node_modules, etc.
- The shell tool has a destructive-command guardrail and output/time limits, but it is not a security boundary. For untrusted tasks, run in a container or VM.
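The workspace sandbox described above can be approximated with `pathlib` and `fnmatch`. This is a minimal sketch, not the package's implementation: `resolve_in_workspace` is a hypothetical name, and `DENY_GLOBS` holds only the three patterns named in this README (the real list may be longer).

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterable

# Only the names listed in this README; the real deny list may differ.
DENY_GLOBS = (".env", ".git", "node_modules")


def resolve_in_workspace(
    workspace: Path,
    requested: str,
    deny_globs: Iterable[str] = DENY_GLOBS,
) -> Path:
    """Resolve a requested path, refusing escapes and denied names."""
    patterns = tuple(deny_globs)
    root = workspace.resolve()
    target = (root / requested).resolve()
    # Reject anything outside the workspace once ".." and symlinks are resolved.
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} is outside the workspace")
    # Reject paths where any component matches a denied pattern.
    for part in target.relative_to(root).parts:
        if any(fnmatch(part, pattern) for pattern in patterns):
            raise PermissionError(f"{requested!r} is blocked by deny_globs")
    return target
```

Resolving before checking matters: a path such as `src/../../secrets` looks harmless as a string but points outside the workspace once normalized. A check like this only limits the file tools; it does nothing about what a shell command can reach.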
A note on SWE-bench / "beating" other models
This is a strong, production-shaped harness. On agentic benchmarks the
score is dominated by (a) the underlying model and (b) harness discipline —
plan/verify/self-heal loops, tight diffs, real test execution — all of which
this implements. To actually measure it, wire auzek run to the SWE-bench
task format (clone repo at the given commit, feed the issue as the task, export
the resulting git diff as the prediction patch) and run the official
evaluation. Treat any ranking as something you measure, not assume.
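A thin adapter for that wiring might look like the sketch below. It only builds the commands and the prediction record; nothing is executed. The instance fields (`instance_id`, `repo`, `base_commit`, `problem_statement`) and the prediction keys (`instance_id`, `model_name_or_path`, `model_patch`) follow the SWE-bench format as commonly documented, so check them against the official harness before relying on this. `plan_commands` and `to_prediction` are hypothetical names.

```python
import json
from typing import Dict, List


def plan_commands(instance: Dict[str, str], workdir: str) -> List[List[str]]:
    """Commands to check out the task commit, run the agent, and export the diff."""
    return [
        ["git", "clone", f"https://github.com/{instance['repo']}.git", workdir],
        ["git", "-C", workdir, "checkout", instance["base_commit"]],
        # --workspace and --yes are the flags documented in the Run section.
        ["auzek", "run", instance["problem_statement"], "--workspace", workdir, "--yes"],
        # Diff against the base commit. Untracked new files are not included
        # unless they are staged first (for example with `git add -A`).
        ["git", "-C", workdir, "diff", instance["base_commit"]],
    ]


def to_prediction(instance: Dict[str, str], patch: str, model_name: str = "auzek") -> str:
    """Serialize one JSONL line in the assumed SWE-bench prediction format."""
    return json.dumps(
        {
            "instance_id": instance["instance_id"],
            "model_name_or_path": model_name,
            "model_patch": patch,
        }
    )
```

Each command list is suitable for `subprocess.run(cmd, check=True, capture_output=True, text=True)`; the stdout of the last command would be the patch passed to `to_prediction`.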
License
MIT
File details
Details for the file auzek-0.1.1.tar.gz.
File metadata
- Download URL: auzek-0.1.1.tar.gz
- Upload date:
- Size: 32.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10fddcb2b134eef5a4660baf56be8dbdaeadbcc4c4b617e021efe1b958b4c31d |
| MD5 | befba8c1f8307e3d0fd149f2d8d71466 |
| BLAKE2b-256 | d8b50ccfc3bbd69bb1a8dfd6801ce2a86044a80f7431e834cc8d4a8e60abc853 |
File details
Details for the file auzek-0.1.1-py3-none-any.whl.
File metadata
- Download URL: auzek-0.1.1-py3-none-any.whl
- Upload date:
- Size: 37.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 49c6c33c13250a695d61f2d02c4de5f1dd711dd2ce7ed405afe3043115c2f513 |
| MD5 | d60e6846f17b9a9bd9e41d2a37dfc9f7 |
| BLAKE2b-256 | 5cfe242ab374ab3cfa64fa45a75d558553aacf1da8ab3277b5c82a2dafbc12ac |