Skip to main content

A minimal, transparent AI coding agent — multi-provider (Claude, GPT, Groq, Gemini, GLM, Ollama), sandboxed, with a planner, sub-agents, skills, data analysis, and slide generation.

Project description

🤖 pi-agent

Open-source AI coding agent with persistent memory, self-review, and autonomous tool-use —
any model (Claude · GPT · free tiers · local Ollama), skills, sub-agents, data → slides.
Minimal and transparent: you see every tool call, and the whole core fits in your head.

CI PyPI Downloads Python License: MIT Open in Streamlit

🚀 Try it live  ·  Install  ·  📖 Usage guide  ·  Skills  ·  Architecture  ·  Roadmap


pi-agent web demo — free Groq default, BYO key

pi lets an LLM read, edit, and run code in your working directory through a tool-use loop — and shows you everything it does. It speaks to Claude, GPT, free models (Groq · OpenRouter · Gemini), and local Ollama through one provider-neutral core. Inspired by the Pi philosophy: lean, hackable, no bloat.

Built as a learning + portfolio project. The core loop is ~150 lines; the transcript is provider-neutral, so adding a tool or a model is trivial.

🔁 How it works

flowchart TD
    U(["🧑 You — prompt"]) --> AG["🤖 Agent loop"]
    AG -->|"ask (+ tools + skills)"| LLM{{"LLM"}}
    LLM -->|"tool calls"| T["🔧 Run tools"]
    T -->|"results"| AG
    LLM -->|"no more tools"| OUT(["✅ Answer + live plan + token cost"])
    AG -. "retry transient errors (≤5×)" .-> LLM

The model plans, calls tools, observes the results, and repeats until done — streaming text, a live to-do checklist, and the running token cost as it goes.

✨ Features

🧠 Multi-provider Claude · GPT · Groq · OpenRouter · Gemini · EURI · GLM (free) · Ollama (local, no key) — switch mid-chat with /model; bare pi auto-detects from your env keys
🧬 Persistent memory the agent saves project facts to .pi/memory.md (remember tool) and recalls them next session — day 5 continues day 1
🔍 Self-review --reflect: after answering, one bounded pass that re-checks the work and fixes real problems
🎯 Skill routing only the most relevant skills are inlined per prompt — leaner prompts, better adherence, cheaper free tiers
📋 Planner + live todos declares a plan via update_plan; the web app renders a live ⬜→⏳→✅ checklist
🤝 Sub-agents delegate a focused subtask to a sequential sub-agent (no recursion) for big jobs
📦 Project ZIP upload drop a zipped repo (zip-slip-safe) → "explain this project" (purpose, flow, components)
📊 Data analysis analyze_data profiles a CSV/Excel like a data scientist (stats, missing %, correlations)
📑 Slide generation make_slides builds a downloadable .pptx from an outline
🔁 Resilient transient errors (429/5xx/timeout) auto-retry ≤5× w/ jittered backoff; bad key/request fail fast; long sessions trim history to fit the context window
🌊 Streaming + cost token-by-token streaming on every provider (Anthropic + all OpenAI-compatible), per-turn token counts, estimated session cost (/cost)
🔧 git + web read-only git inspection and an SSRF-guarded web_fetch, locally
📜 Skills SKILL.md files inlined into the prompt — 12 bundled, add your own with zero code
🔒 Sandboxed & safe paths confined to the workspace; public web demo runs no raw shell

Tools: update_plan · delegate · remember (persistent memory, local) · read_file · write_file · edit_file · apply_patch (atomic multi-file) · list_dir · grep · git (read-only, local) · web_fetch (SSRF-guarded, local) · run_command (restricted, public-safe) · run_bash (full shell, local only) · analyze_data · make_slides.

🚀 Install

pipx install pi-coding-agent        # recommended for the CLI
uv tool install pi-coding-agent     # or with uv
pip install "pi-coding-agent[data]" # or plain pip (+ data analysis & slides)

(For hacking on it: git clone https://github.com/Ashutosh0428/pi-agent && cd pi-agent && pip install -e ".[data,dev]" — see CONTRIBUTING.md.)

No key? Just run pi. It auto-detects whichever provider key you've set — and with none at all it shows a quick-setup panel with three free paths (Groq, Gemini, Ollama) instead of an error.

Pick a provider and set its key (env var, or cp .env.example .env):

Provider Cost Setup
Anthropic paid export ANTHROPIC_API_KEY=sk-ant-...
OpenAI paid export OPENAI_API_KEY=sk-...
Groq 🆓 free export GROQ_API_KEY=... · get a key
OpenRouter 🆓 free export OPENROUTER_API_KEY=... · get a key
Gemini 🆓 free + paid export GEMINI_API_KEY=... · get a key
EURI 🆓 free export EURI_API_KEY=... · get a key · 40+ models (OpenAI-compatible)
GLM (Z.ai) 🆓 free + paid export ZAI_API_KEY=... · get a key · glm-4.5-flash free, glm-5.1 paid
Ollama 🆓 local, no key install Ollama → ollama pull llama3.1 (runs at localhost:11434)

Any model works — the web app has a per-provider model dropdown (+ a custom field), and --model takes any id the provider offers, free or paid: gemini-3.5-flash (free) / gemini-3.1-pro (paid), gpt-4o-mini / gpt-4o, claude-sonnet-4-6 / claude-opus-4-8, llama-3.3-70b-versatile, etc.

pi                                                          # REPL — provider auto-detected from your env keys
pi "explain this repo"                                      # one-shot, same auto-detection
pi --provider groq --model llama-3.3-70b-versatile "explain this repo"
pi --provider gemini --model gemini-3.5-flash "summarise what this project does"  # free
pi --provider gemini --model gemini-3.1-pro  "deep-review this module"             # paid (student Pro)
pi --provider ollama --model qwen2.5-coder:7b "write a string-reverse fn and a test"
pi --skills-dir ./skills "review src/pi_agent/llm.py"
pi --reflect "refactor utils.py, keep behavior identical"   # + one self-review pass
pi "remember that this repo uses pytest fixtures, never mocks"  # persists to .pi/memory.md
pi --no-shell                                               # safe mode (disable run_bash)

REPL commands: /help · /tools · /model <id> · /think · /cost · /reset · /exit. Flags: --provider · --model · --dir · --yes · --no-shell · --no-stream · --think · --skills-dir · --version.

🐳 Docker

docker build -t pi-agent .
docker run -it --rm -e GROQ_API_KEY -v "$PWD":/work pi-agent "explain this repo"

Keys never touch the repo — read from the environment only, never stored or logged; .env is gitignored.

🖥️ Why Ollama instead of a cloud AI tool (Copilot / ChatGPT / Cursor)?

  • 🔒 Private — your code never leaves your machine. Ideal for proprietary/regulated code you can't paste into a cloud tool.
  • 💸 Free, no limits — no key, no per-token bill, no rate limits, no subscription.
  • 📴 Offline — works on a plane or air-gapped network.
  • ⚖️ Honest trade-off — local models are smaller/slower than frontier Claude/GPT; great for everyday review/refactor, switch to a cloud model (one flag) for the hardest reasoning.

🌐 Web demo

A public-safe slice of pi (live) — or run it yourself:

pip install -r requirements.txt
streamlit run streamlit_app.py      # http://localhost:8501
  • Free by default — opens on Groq (🆓 key, no card); one-click starter prompts on an empty chat.
  • Live streaming — the answer renders token by token while tool steps stay visible.
  • Bring your own key — used only for the session; never stored, logged, or committed.
  • No raw shell — visitors get run_command (read-only allowlist, no network, sandboxed) instead of run_bash.
  • Upload a file, a project .zip, or a CSV — then review, explain the project, or analyze the data and make a deck.
  • Sandboxed — file tools + ZIP extraction confined to a fresh per-session temp dir (zip-slip-guarded).

Locally the web app can also reach Ollama; the hosted demo can't (no localhost Ollama on the cloud server).

🧩 Architecture

flowchart LR
    CLI["💻 CLI / REPL"] --> AG
    WEB["🌐 Streamlit<br/>(BYO key)"] --> AG
    AG["🤖 Agent<br/>provider/UI-agnostic<br/>neutral transcript + retry"]
    AG --> P["🧠 Providers<br/>Claude · GPT · Groq · OpenRouter<br/>Gemini · EURI · GLM · Ollama"]
    AG --> T["🔧 Tools<br/>plan · fs · grep · delegate · git · web_fetch<br/>run_command · run_bash<br/>analyze_data · make_slides"]
    AG --> S["📜 Skills<br/>SKILL.md inlined"]
    T --> SB["🔒 Sandbox<br/>path-confined workspace"]
src/pi_agent/
  config.py        # AgentConfig + system prompt
  sandbox.py       # path-safety boundary (the security choke-point)
  llm.py           # provider registry + neutral transcript ↔ each wire format; usage/cost
  agent.py         # the tool-use loop: ReAct, retry, delegate (provider/UI-agnostic)
  skills.py        # load SKILL.md files, inline them into the system prompt
  upload.py        # zip-slip-safe project extraction
  repl.py          # terminal front-end (rich): streaming, /model, /cost, /think
  cli.py           # `pi` entry point
  tools/
    base.py registry.py        # Tool spec + dispatch
    planning.py                # update_plan (live todos)
    filesystem.py search.py    # read/write/edit/list + grep
    _subprocess.py             # shared path-guard + confined runner
    shell.py safe_exec.py      # run_bash (local) + run_command (public-safe)
    vcs.py web.py              # git (read-only) + web_fetch (SSRF-guarded)
    subagent.py                # delegate
    datasci.py                 # analyze_data + make_slides
streamlit_app.py   # public web demo (BYO key, no shell, temp sandbox)
skills/            # 12 SKILL.md skills

The agent keeps its transcript provider-neutral (user / assistant / tool); each provider translates it to its own wire format (Anthropic content blocks vs OpenAI tool_calls). That single seam is what lets one conversation move between Claude, GPT, Groq, OpenRouter, and Ollama — even mid-chat.

📜 Skills

A skill is a SKILL.md describing how to do one task well; pi inlines a skill index + contents into the system prompt, so the model applies them without spending a tool call to read them.

skills/<name>/SKILL.md   # frontmatter (name, description, trigger) + When / How / Avoid / Done-well

Bundled (18): planning · orchestrate · write-tests · code-review · security-review · performance-review · refactor · debug · fix-ci · explain-code · explain-project · architecture · write-docs · write-readme · commit-message · api-design · data-analysis · make-deck. Add your own by dropping a new folder — no code changes.

Auto-routing: pi scores skills against your prompt and inlines only the top 3 in full (the index of all 18 stays visible) — leaner prompts, cheaper free tiers. --skills-top-k 0 restores inline-everything.

🛠️ Extending it (the whole point)

Add a tool — write a handler + a Tool, register it:

Tool(
    name="word_count",
    description="Count words in a file.",
    input_schema={"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]},
    handler=lambda args, sb: str(len(sb.resolve(args["path"]).read_text().split())),
)

Add a provider — implement LLMProvider.complete(...) (translate the neutral transcript, call the API, return an AssistantResponse). The agent loop is unchanged.

✅ Testing

pytest          # 136 tests — scripted fake provider, no API key, no network

Covers the sandbox boundary, every tool (including the run_command RCE guard, the read-only git tool, and the web_fetch SSRF guard), the agent loop (tool execution, max-iteration guard, confirmation, events, usage, retry, history trimming, delegation), the provider translators, streaming chunk-assembly, and zip-slip safety.

🗺️ Roadmap

Next up: MCP server support, a local knowledge base (pi ingest docs/pi ask), browser automation, deep-research mode, parallel sub-agents, and pi benchmark — the full plan with phases lives in ROADMAP.md. Release history: CHANGELOG.md.


Built by Ashutosh Sharma.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pi_coding_agent-0.5.0.tar.gz (60.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pi_coding_agent-0.5.0-py3-none-any.whl (51.6 kB view details)

Uploaded Python 3

File details

Details for the file pi_coding_agent-0.5.0.tar.gz.

File metadata

  • Download URL: pi_coding_agent-0.5.0.tar.gz
  • Upload date:
  • Size: 60.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pi_coding_agent-0.5.0.tar.gz
Algorithm Hash digest
SHA256 87b4b26076facf63e410a50482cf4218c2713933ead30e6bd1333e05076c1239
MD5 e0152768d901638ef8fcf6748bc1ae1b
BLAKE2b-256 1079f01096bc93799bffb7decdeedc632a65e35fb41282be61afb9afc59300b7

See more details on using hashes here.

Provenance

The following attestation bundles were made for pi_coding_agent-0.5.0.tar.gz:

Publisher: publish.yml on Ashutosh0428/pi-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pi_coding_agent-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: pi_coding_agent-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 51.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pi_coding_agent-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a005ea6d8db50068117763ae2bb2c0e86a95c624a8eae90123cdf7ff673c8daf
MD5 c06c68c12ab277a56bd548622fdcfd54
BLAKE2b-256 e70b3be959279812062ddf518f3f4f191f30637a9ed7efc6a28c4c4ca9ad3d21

See more details on using hashes here.

Provenance

The following attestation bundles were made for pi_coding_agent-0.5.0-py3-none-any.whl:

Publisher: publish.yml on Ashutosh0428/pi-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page