Skip to main content

A minimal, transparent AI coding agent — multi-provider (Claude, GPT, Groq, Gemini, GLM, Ollama), sandboxed, with a planner, sub-agents, skills, data analysis, and slide generation.

Project description

🤖 pi-agent

Open-source AI coding agent with persistent memory, self-review, and autonomous tool-use —
any model (Claude · GPT · free tiers · local Ollama), skills, sub-agents, data → slides.
Minimal and transparent: you see every tool call, and the whole core fits in your head.

CI PyPI Downloads Python License: MIT Open in Streamlit

🚀 Try it live  ·  Install  ·  📖 Usage guide  ·  Skills  ·  Architecture  ·  Roadmap


pi-agent web demo — free Groq default, BYO key

pi lets an LLM read, edit, and run code in your working directory through a tool-use loop — and shows you everything it does. It speaks to Claude, GPT, free models (Groq · OpenRouter · Gemini), and local Ollama through one provider-neutral core. Inspired by the Pi philosophy: lean, hackable, no bloat.

Built as a learning + portfolio project. The core loop is ~150 lines; the transcript is provider-neutral, so adding a tool or a model is trivial.

🔁 How it works

flowchart TD
    U(["🧑 You — prompt"]) --> AG["🤖 Agent loop"]
    AG -->|"ask (+ tools + skills)"| LLM{{"LLM"}}
    LLM -->|"tool calls"| T["🔧 Run tools"]
    T -->|"results"| AG
    LLM -->|"no more tools"| OUT(["✅ Answer + live plan + token cost"])
    AG -. "retry transient errors (≤5×)" .-> LLM

The model plans, calls tools, observes the results, and repeats until done — streaming text, a live to-do checklist, and the running token cost as it goes.

✨ Features

🧠 Multi-provider Claude · GPT · Groq · OpenRouter · Gemini · EURI · GLM (free) · Ollama (local, no key) — switch mid-chat with /model; bare pi auto-detects from your env keys
🔌 MCP support connect any MCP server (GitHub, Postgres, Slack, …) via the standard mcpServers config — its tools appear as mcp__server__tool. Zero new dependencies
📚 Local knowledge base pi ingest docs/ → BM25 over sqlite → pi ask "how does auth work?" with citations. Fully offline
🛡️ Guardrails (on by default) blocks secret exfiltration, confirms destructive shell commands even under --yes, redacts secrets, and spotlights untrusted tool output against prompt injection
🧬 Persistent memory the agent saves project facts to .pi/memory.md (remember tool) and recalls them next session — day 5 continues day 1
🔍 Self-review --reflect: after answering, one bounded pass that re-checks the work and fixes real problems
🎯 Skill routing only the most relevant skills are inlined per prompt — leaner prompts, better adherence, cheaper free tiers
📋 Planner + live todos declares a plan via update_plan; the web app renders a live ⬜→⏳→✅ checklist
🤝 Sub-agents delegate a focused subtask to a sequential sub-agent (no recursion) for big jobs
📦 Project ZIP upload drop a zipped repo (zip-slip-safe) → "explain this project" (purpose, flow, components)
📊 Data analysis analyze_data profiles a CSV/Excel like a data scientist (stats, missing %, correlations)
📑 Slide generation make_slides builds a downloadable .pptx from an outline
🔁 Resilient transient errors (429/5xx/timeout) auto-retry ≤5× w/ jittered backoff; bad key/request fail fast; long sessions trim history to fit the context window
🌊 Streaming + cost token-by-token streaming on every provider (Anthropic + all OpenAI-compatible), per-turn token counts, estimated session cost (/cost)
🔧 git + web read-only git inspection and an SSRF-guarded web_fetch, locally
📜 Skills SKILL.md files inlined into the prompt — 12 bundled, add your own with zero code
🔒 Sandboxed & safe paths confined to the workspace; public web demo runs no raw shell

Tools: update_plan · delegate · remember (persistent memory, local) · read_file · write_file · edit_file · apply_patch (atomic multi-file) · list_dir · grep · git (read-only, local) · web_fetch (SSRF-guarded, local) · run_command (restricted, public-safe) · run_bash (full shell, local only) · analyze_data · make_slides.

🚀 Install

pipx install pi-coding-agent        # recommended for the CLI
uv tool install pi-coding-agent     # or with uv
pip install "pi-coding-agent[data]" # or plain pip (+ data analysis & slides)

(For hacking on it: git clone https://github.com/Ashutosh0428/pi-agent && cd pi-agent && pip install -e ".[data,dev]" — see CONTRIBUTING.md.)

No key? Just run pi. It auto-detects whichever provider key you've set — and with none at all it shows a quick-setup panel with three free paths (Groq, Gemini, Ollama) instead of an error.

Pick a provider and set its key (env var, or cp .env.example .env):

Provider Cost Setup
Anthropic paid export ANTHROPIC_API_KEY=sk-ant-...
OpenAI paid export OPENAI_API_KEY=sk-...
Groq 🆓 free export GROQ_API_KEY=... · get a key
OpenRouter 🆓 free export OPENROUTER_API_KEY=... · get a key
Gemini 🆓 free + paid export GEMINI_API_KEY=... · get a key
EURI 🆓 free export EURI_API_KEY=... · get a key · 40+ models (OpenAI-compatible)
GLM (Z.ai) 🆓 free + paid export ZAI_API_KEY=... · get a key · glm-4.5-flash free, glm-5.1 paid
Ollama 🆓 local, no key install Ollama → ollama pull llama3.1 (runs at localhost:11434)

Any model works — the web app has a per-provider model dropdown (+ a custom field), and --model takes any id the provider offers, free or paid: gemini-3.5-flash (free) / gemini-3.1-pro (paid), gpt-4o-mini / gpt-4o, claude-sonnet-4-6 / claude-opus-4-8, llama-3.3-70b-versatile, etc.

pi                                                          # REPL — provider auto-detected from your env keys
pi "explain this repo"                                      # one-shot, same auto-detection
pi --provider groq --model llama-3.3-70b-versatile "explain this repo"
pi --provider gemini --model gemini-3.5-flash "summarise what this project does"  # free
pi --provider gemini --model gemini-3.1-pro  "deep-review this module"             # paid (student Pro)
pi --provider ollama --model qwen2.5-coder:7b "write a string-reverse fn and a test"
pi --skills-dir ./skills "review src/pi_agent/llm.py"
pi --reflect "refactor utils.py, keep behavior identical"   # + one self-review pass
pi "remember that this repo uses pytest fixtures, never mocks"  # persists to .pi/memory.md
pi --no-shell                                               # safe mode (disable run_bash)

REPL commands: /help · /tools · /mcp · /model <id> · /think · /cost · /reset · /exit.

🔌 Connect MCP servers + your own docs

# Knowledge base — ingest your docs, then ask grounded questions (offline)
pi ingest ./docs
pi ask "how does authentication work?"        # answers with [source] citations

# MCP — drop a standard mcpServers config at .pi/mcp.json, then any server's
# tools (GitHub, Postgres, Slack, filesystem…) appear as mcp__server__tool
cat .pi/mcp.json
# {"mcpServers": {"github": {"command": "npx",
#   "args": ["-y", "@modelcontextprotocol/server-github"],
#   "env": {"GITHUB_TOKEN": "ghp_…"}}}}
pi   # /mcp lists the connected tools

Flags: --provider · --model · --dir · --yes · --no-shell · --no-stream · --think · --skills-dir · --version.

🐳 Docker

docker build -t pi-agent .
docker run -it --rm -e GROQ_API_KEY -v "$PWD":/work pi-agent "explain this repo"

Keys never touch the repo — read from the environment only, never stored or logged; .env is gitignored.

🖥️ Why Ollama instead of a cloud AI tool (Copilot / ChatGPT / Cursor)?

  • 🔒 Private — your code never leaves your machine. Ideal for proprietary/regulated code you can't paste into a cloud tool.
  • 💸 Free, no limits — no key, no per-token bill, no rate limits, no subscription.
  • 📴 Offline — works on a plane or air-gapped network.
  • ⚖️ Honest trade-off — local models are smaller/slower than frontier Claude/GPT; great for everyday review/refactor, switch to a cloud model (one flag) for the hardest reasoning.

🌐 Web demo

A public-safe slice of pi (live) — or run it yourself:

pip install -r requirements.txt
streamlit run streamlit_app.py      # http://localhost:8501
  • Free by default — opens on Groq (🆓 key, no card); one-click starter prompts on an empty chat.
  • Live streaming — the answer renders token by token while tool steps stay visible.
  • Bring your own key — used only for the session; never stored, logged, or committed.
  • No raw shell — visitors get run_command (read-only allowlist, no network, sandboxed) instead of run_bash.
  • Upload a file, a project .zip, or a CSV — then review, explain the project, or analyze the data and make a deck.
  • Sandboxed — file tools + ZIP extraction confined to a fresh per-session temp dir (zip-slip-guarded).

Locally the web app can also reach Ollama; the hosted demo can't (no localhost Ollama on the cloud server).

🧩 Architecture

flowchart LR
    CLI["💻 CLI / REPL"] --> AG
    WEB["🌐 Streamlit<br/>(BYO key)"] --> AG
    AG["🤖 Agent<br/>provider/UI-agnostic<br/>neutral transcript + retry"]
    AG --> P["🧠 Providers<br/>Claude · GPT · Groq · OpenRouter<br/>Gemini · EURI · GLM · Ollama"]
    AG --> T["🔧 Tools<br/>plan · fs · grep · delegate · git · web_fetch<br/>run_command · run_bash<br/>analyze_data · make_slides"]
    AG --> S["📜 Skills<br/>SKILL.md inlined"]
    T --> SB["🔒 Sandbox<br/>path-confined workspace"]
src/pi_agent/
  config.py        # AgentConfig + system prompt
  sandbox.py       # path-safety boundary (the security choke-point)
  llm.py           # provider registry + neutral transcript ↔ each wire format; usage/cost
  agent.py         # the tool-use loop: ReAct, retry, delegate (provider/UI-agnostic)
  skills.py        # load SKILL.md files, inline them into the system prompt
  upload.py        # zip-slip-safe project extraction
  repl.py          # terminal front-end (rich): streaming, /model, /cost, /think
  cli.py           # `pi` entry point
  tools/
    base.py registry.py        # Tool spec + dispatch
    planning.py                # update_plan (live todos)
    filesystem.py search.py    # read/write/edit/list + grep
    _subprocess.py             # shared path-guard + confined runner
    shell.py safe_exec.py      # run_bash (local) + run_command (public-safe)
    vcs.py web.py              # git (read-only) + web_fetch (SSRF-guarded)
    subagent.py                # delegate
    datasci.py                 # analyze_data + make_slides
streamlit_app.py   # public web demo (BYO key, no shell, temp sandbox)
skills/            # 12 SKILL.md skills

The agent keeps its transcript provider-neutral (user / assistant / tool); each provider translates it to its own wire format (Anthropic content blocks vs OpenAI tool_calls). That single seam is what lets one conversation move between Claude, GPT, Groq, OpenRouter, and Ollama — even mid-chat.

📜 Skills

A skill is a SKILL.md describing how to do one task well; pi inlines a skill index + contents into the system prompt, so the model applies them without spending a tool call to read them.

skills/<name>/SKILL.md   # frontmatter (name, description, trigger) + When / How / Avoid / Done-well

Bundled (18): planning · orchestrate · write-tests · code-review · security-review · performance-review · refactor · debug · fix-ci · explain-code · explain-project · architecture · write-docs · write-readme · commit-message · api-design · data-analysis · make-deck. Add your own by dropping a new folder — no code changes.

Auto-routing: pi scores skills against your prompt and inlines only the top 3 in full (the index of all 18 stays visible) — leaner prompts, cheaper free tiers. --skills-top-k 0 restores inline-everything.

🛠️ Extending it (the whole point)

Add a tool — write a handler + a Tool, register it:

Tool(
    name="word_count",
    description="Count words in a file.",
    input_schema={"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]},
    handler=lambda args, sb: str(len(sb.resolve(args["path"]).read_text().split())),
)

Add a provider — implement LLMProvider.complete(...) (translate the neutral transcript, call the API, return an AssistantResponse). The agent loop is unchanged.

✅ Testing

pytest          # 174 tests — scripted fake provider/MCP server, no API key, no network

Covers the sandbox boundary, every tool (including the run_command RCE guard, the read-only git tool, and the web_fetch SSRF guard), the agent loop (tool execution, max-iteration guard, confirmation, events, usage, retry, history trimming, delegation), the provider translators, streaming chunk-assembly, and zip-slip safety.

🗺️ Roadmap

Shipped in 0.6: MCP server support · local knowledge base · guardrails. Next up: browser automation, deep-research mode, parallel sub-agents, and pi benchmark — the full plan with phases lives in ROADMAP.md. Release history: CHANGELOG.md.


Built by Ashutosh Sharma.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pi_coding_agent-0.6.0.tar.gz (74.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pi_coding_agent-0.6.0-py3-none-any.whl (62.8 kB view details)

Uploaded Python 3

File details

Details for the file pi_coding_agent-0.6.0.tar.gz.

File metadata

  • Download URL: pi_coding_agent-0.6.0.tar.gz
  • Upload date:
  • Size: 74.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pi_coding_agent-0.6.0.tar.gz
Algorithm Hash digest
SHA256 7c81b98118cb9c8b6f749a52d6ad76cb5b17b4e28644e7a0d760547a1797d0a2
MD5 78c2c4377b66ac573aa34dbc06986ecc
BLAKE2b-256 4c289c31ebda3571ebdb64264a167b48eeba029713861bac0ea17fc4544be836

See more details on using hashes here.

Provenance

The following attestation bundles were made for pi_coding_agent-0.6.0.tar.gz:

Publisher: publish.yml on Ashutosh0428/pi-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pi_coding_agent-0.6.0-py3-none-any.whl.

File metadata

  • Download URL: pi_coding_agent-0.6.0-py3-none-any.whl
  • Upload date:
  • Size: 62.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pi_coding_agent-0.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0ced4a1f101d3b409849ccf5b4be69fca0569c0ba22c6b0cf5be11e1d75d487f
MD5 d4e73f32876dc20ce240ef5fbcaeac54
BLAKE2b-256 36c307ff95fdb836ec96f573c2dd0af0aad2fe6113f34409ea466ed163095c41

See more details on using hashes here.

Provenance

The following attestation bundles were made for pi_coding_agent-0.6.0-py3-none-any.whl:

Publisher: publish.yml on Ashutosh0428/pi-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page