A code-native agentic framework for building robust AI agents.
Embeddable Autonomous Agent Runtime for Software Engineering
Embeddable Autonomous Agent (EAA) • Code-as-Interface Runtime • Terminal Multiplexer • MIT Licensed
pip install codepilot-ai
What CodePilot Is
CodePilot is a Python library for embedding autonomous software-engineering agents into your own products: CLIs, FastAPI services, hosted code-server workspaces, internal developer tools, CI repair systems, and local automation.
It is intentionally not a hosted chatbot UI. The package gives applications a runtime: model inference, tool execution, file editing, terminal control, persistence, hooks, and completion semantics. You bring the product surface, auth model, sandbox, database, and deployment strategy.
Version: 0.9.2
Full user documentation lives at: https://Jahanzeb-git.github.io/codepilot/
Quick Start
Create an agent.yaml:
agent:
  name: CodePilot
  role: Autonomous software engineering agent.
model:
  provider: anthropic
  name: claude-sonnet-4-5
  api_key_env: ANTHROPIC_API_KEY
runtime:
  work_dir: ./workspace
  max_steps: 20
  unsafe_mode: false
tools:
  - name: read_file
    enabled: true
  - name: write_file
    enabled: true
  - name: execute
    enabled: true
    config:
      require_permission: true
  - name: read_output
    enabled: true
  - name: send_input
    enabled: true
  - name: terminate_terminal
    enabled: true
  - name: find
    enabled: true
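The api_key_env field holds the name of an environment variable, not the secret itself. A minimal sketch of that indirection follows; resolve_api_key is a hypothetical helper written for illustration, and how CodePilot actually reads the variable is an assumption here.

```python
import os


def resolve_api_key(model_cfg: dict, environ=os.environ) -> str:
    """Look up the key named by model_cfg["api_key_env"] (hypothetical helper)."""
    var_name = model_cfg["api_key_env"]
    key = environ.get(var_name, "")
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running the agent.")
    return key


# Mirrors the model section of the agent.yaml above.
model_cfg = {
    "provider": "anthropic",
    "name": "claude-sonnet-4-5",
    "api_key_env": "ANTHROPIC_API_KEY",
}

# A fake environment is passed in so the sketch runs without a real secret.
key = resolve_api_key(model_cfg, environ={"ANTHROPIC_API_KEY": "example-key"})
print(len(key) > 0)
```

Failing early with the variable name in the message tends to be easier to debug than an authentication error from the provider several steps later.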
Run the agent:
from codepilot import Runtime, on_stream, on_finish
runtime = Runtime("agent.yaml", stream=True)
@on_stream(runtime)
def stream(text: str, **_):
    print(text, end="", flush=True)
@on_finish(runtime)
def finish(summary: str, **_):
    print(f"\nDone: {summary}\n")
summary = runtime.run("Inspect the project and fix the failing tests.")
print(summary)
Async applications should use AsyncRuntime:
from codepilot import AsyncRuntime
runtime = AsyncRuntime("agent.yaml", session="db", db=async_engine, stream=True)
summary = await runtime.run("Refactor the repository layer to use async SQLAlchemy.")
Architecture
CodePilot is designed as a library-first runtime that can be embedded under many product surfaces.
flowchart TD
A[Your app: CLI, FastAPI, code-server extension, desktop app] --> B[CodePilot Runtime]
B --> C[LLM Provider]
B --> D[Tool Registry]
D --> E[Filesystem Tools]
D --> F[Terminal Tools]
D --> G[Search and Context Tools]
B --> H[Session Backend]
H --> I[Memory]
H --> J[File JSON]
H --> K[SQLAlchemy Database]
F --> L[PTY / ConPTY]
L --> M[Unix Socket Multiplexer on POSIX]
For hosted web IDE deployments, the intended shape is a small control plane plus disposable per-user runtime machines:
flowchart LR
Browser --> FlyProxy[Fly Proxy]
FlyProxy --> CodeServer[code-server :8080]
CodeServer --> Extension[Custom code-server extension]
Extension --> RuntimeSock[/run/codepilot/runtime.sock]
RuntimeSock --> Daemon[CodePilot runtime daemon]
Daemon --> TerminalSock[/tmp/codepilot_main.sock]
Daemon --> Postgres[(Postgres / Neon)]
Daemon --> ObjectStore[(Backblaze B2 snapshots)]
Daemon --> Workspace[Workspace files]
Why Code-as-Interface
Most agent frameworks force the model to express actions as JSON function calls. CodePilot instead asks the model to write Python inside a fenced codepilot control block:
I will inspect the failing test first.
```codepilot
read_file("tests/test_api.py")
execute("main", "pytest tests/test_api.py -q", timeout=30)
```
The runtime executes only the codepilot block. Ordinary python fenced blocks remain display text and are never executed.
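The separation between executable and display blocks can be sketched with a small parser. This is an illustration only, not CodePilot's parser: split_blocks and BLOCK_RE are hypothetical names, and the sketch assumes fences start at the beginning of a line.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block

# Opening fence plus info string, a lazy body, then a closing fence on its own line.
BLOCK_RE = re.compile(
    rf"^{FENCE}(?P<info>[^\n]*)\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def split_blocks(reply: str):
    """Return (control_blocks, display_blocks) found in a model reply."""
    control, display = [], []
    for match in BLOCK_RE.finditer(reply):
        info = match.group("info").strip()
        lang = info.split()[0] if info else ""
        # Only blocks tagged codepilot are treated as executable control code.
        target = control if lang == "codepilot" else display
        target.append(match.group("body"))
    return control, display


reply = "\n".join([
    "I will inspect the failing test first.",
    FENCE + "codepilot",
    'read_file("tests/test_api.py")',
    FENCE,
    "For reference:",
    FENCE + "python",
    'print("never executed")',
    FENCE,
])

control, display = split_blocks(reply)
print(control)
print(display)
```

Keying on the fence tag means the model can still show illustrative Python to the user without any risk of it being run.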
This design is useful because software work is naturally procedural:
- Agents often need several tool calls in a deliberate order.
- Tool results need to feed control flow inside the same step.
- File writes need structured side-loaded payloads, not fragile escaped strings.
- Developers need observable execution results, not opaque function-call envelopes.
The model still operates under a strict protocol:
- codepilot block: executable control code.
- Payload blocks: file content consumed by write_file().
- completion block: explicit task-finished signal.
This aligns with research showing that LLM agents benefit from interleaving reasoning and environment actions, as in ReAct, and from well-designed agent-computer interfaces for software engineering tasks.
How File Editing Works
write_file() never accepts file content as an inline string. Content comes from the next payload block, in order. This avoids escaping failures, malformed JSON arguments, and partial string corruption.
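The ordering rule can be sketched as a queue: each write consumes the next payload or payloads, and a mismatch in either direction is an error. The function name and the (path, count) call shape are assumptions made for this illustration, not CodePilot's internal representation.

```python
from collections import deque


def pair_writes_with_payloads(write_calls, payloads):
    """Pair each write with the payload blocks that follow it, in order.

    write_calls: list of (path, number_of_payloads_needed)
    payloads:    payload block bodies in the order they appeared
    """
    queue = deque(payloads)
    paired = []
    for path, count in write_calls:
        if len(queue) < count:
            raise ValueError(f"write to {path!r} needs {count} payload block(s), {len(queue)} left")
        paired.append((path, [queue.popleft() for _ in range(count)]))
    if queue:
        raise ValueError(f"{len(queue)} payload block(s) were never consumed")
    return paired


pairs = pair_writes_with_payloads(
    [("config.py", 1), ("routes/profile.py", 2)],
    ["TIMEOUT = 60\n", "# lines 42-48\n", "# line 55\n"],
)
print(pairs)
```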
Single file creation:
```codepilot
write_file("config.py", mode="w")
```
```python filename=config.py
TIMEOUT = 30
RETRIES = 3
```
Line-based edit:
```codepilot
read_file("config.py")
```
After observing exact line numbers:
```codepilot
write_file("config.py", mode="edit", start_line=1, end_line=1)
```
```python filename=config.py
TIMEOUT = 60
```
Multiple non-contiguous edits in one file:
```codepilot
write_file("routes/profile.py", mode="multi_edit", edits=[(42, 48), (55, 55)])
```
```python filename=routes/profile.py
# replacement for lines 42-48
```
```python filename=routes/profile.py
# replacement for line 55
```
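One common way to apply several line ranges to the same file without the numbers shifting is to validate every range against the original numbering first, then apply the replacements from the bottom of the file upward. The sketch below shows that technique under those assumptions; apply_edits is a hypothetical helper, not CodePilot's implementation.

```python
def apply_edits(lines, edits):
    """Replace 1-indexed, inclusive line ranges with new lines.

    lines: list of strings (the file as read)
    edits: list of ((start, end), replacement_lines)
    """
    ordered = sorted(edits, key=lambda edit: edit[0][0])

    # Validate everything before touching the content.
    previous_end = 0
    for (start, end), _ in ordered:
        if not (1 <= start <= end <= len(lines)):
            raise ValueError(f"range {start}-{end} is outside the file (1-{len(lines)})")
        if start <= previous_end:
            raise ValueError(f"range {start}-{end} overlaps an earlier edit")
        previous_end = end

    result = list(lines)
    # Bottom-up order keeps earlier ranges at their original line numbers.
    for (start, end), replacement in reversed(ordered):
        result[start - 1:end] = replacement
    return result


original = ["a", "b", "c", "d", "e"]
edited = apply_edits(original, [((2, 3), ["X"]), ((5, 5), ["Y", "Z"])])
print(edited)
```

Because validation happens before any mutation, a bad range leaves the original content untouched.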
Safety properties:
- Paths are constrained to runtime.work_dir unless unsafe_mode: true.
- Edits are line-numbered and validated before mutation.
- Multiple edits to the same file are constrained to prevent line drift.
- Tool results are appended back into the conversation as ground truth.
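The path constraint in the first property can be illustrated with pathlib. This is a sketch of the general technique (resolve, then check containment), assuming Python 3.9+ for Path.is_relative_to; resolve_in_workdir is a hypothetical name and the real check may differ.

```python
from pathlib import Path


def resolve_in_workdir(work_dir: str, user_path: str, unsafe_mode: bool = False) -> Path:
    """Resolve user_path under work_dir and reject anything that escapes it."""
    root = Path(work_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / user_path).resolve()
    if not unsafe_mode and not candidate.is_relative_to(root):
        raise PermissionError(f"{user_path!r} escapes {root}")
    return candidate


inside = resolve_in_workdir("workspace", "src/app.py")
print(inside.name)

try:
    resolve_in_workdir("workspace", "../outside.txt")
except PermissionError as exc:
    print("blocked:", exc)
```

Resolving before comparing matters: a plain string prefix check would accept paths such as workspace/../secrets.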
How Terminal Tools Work
CodePilot starts a default terminal session named main when the runtime is created. The session persists across run() calls.
execute("main", "pytest tests/ -v", timeout=30)
Long-running commands return with status: running instead of hanging the agent:
execute("server", "uvicorn app.main:app --port 8000", timeout=4, new_terminal=True)
read_output("server", timeout=10)
execute("main", "pytest tests/test_api.py -v", timeout=30)
send_input("server", "\x03", timeout=5)
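The "return running instead of hanging" behavior can be approximated with the standard library. The sketch below uses subprocess rather than a PTY, so it only illustrates the timeout contract; run_with_deadline and its return shape are assumptions made for this example.

```python
import subprocess
import sys


def run_with_deadline(argv, timeout):
    """Start argv and wait up to `timeout` seconds.

    Returns (status, process). status is "exited" or "running"; a running
    process is left alive so the caller can poll or stop it later.
    """
    # Output capture is omitted for brevity; a real terminal tool would read it.
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return "running", proc
    return "exited", proc


# A stand-in for a server: it would sleep far longer than the deadline.
status, server = run_with_deadline(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.2
)
print(status)

# The caller decides when the long-running process should stop.
server.terminate()
server.wait()
```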
Terminal architecture:
- Linux/macOS use pexpect and a PTY.
- Windows 10 1809+ uses ConPTY through pywinpty.
- POSIX terminal sessions are exposed through a Unix socket multiplexer.
- Multiple clients can attach to the same terminal stream, enabling a code-server extension or xterm.js bridge to share the shell with the agent.
flowchart TD
Bash[bash process] <--> PTY[PTY master]
PTY <--> Mux[MuxServer]
Mux <--> AgentClient[CodePilot terminal tool client]
Mux <--> UIClient[code-server / xterm.js client]
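The fan-out idea in the diagram can be sketched with connected socket pairs: one writer, many readers, each receiving the same bytes. MuxSketch is a hypothetical class for illustration; it does not reproduce MuxServer's protocol, and input from clients back to the shell is left out.

```python
import socket


class MuxSketch:
    """Fan one output stream out to every attached client socket."""

    def __init__(self):
        self._clients = []

    def attach(self) -> socket.socket:
        """Create a connected pair; keep one end, hand the other to the client."""
        server_end, client_end = socket.socketpair()
        self._clients.append(server_end)
        return client_end

    def broadcast(self, data: bytes) -> None:
        """Send the same bytes to every client, dropping any that have gone away."""
        alive = []
        for conn in self._clients:
            try:
                conn.sendall(data)
                alive.append(conn)
            except OSError:
                conn.close()
        self._clients = alive

    def close(self) -> None:
        for conn in self._clients:
            conn.close()
        self._clients = []


mux = MuxSketch()
agent_view = mux.attach()  # stands in for the agent's terminal tool client
ui_view = mux.attach()     # stands in for an xterm.js bridge

mux.broadcast(b"$ pytest -q\n")
agent_bytes = agent_view.recv(1024)
ui_bytes = ui_view.recv(1024)
print(agent_bytes == ui_bytes)

agent_view.close()
ui_view.close()
mux.close()
```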
Persistence Model
Session backends are selected at runtime construction:
Runtime("agent.yaml") # memory
Runtime("agent.yaml", session="file", session_id="demo") # JSON file
Runtime("agent.yaml", session="db", db_url="sqlite:///./codepilot.db")
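To make the file option concrete, here is a minimal sketch of what a JSON-file session backend might look like. The class name, the one-file-per-session layout, and the message shape are all assumptions for illustration; CodePilot's actual storage format is not shown in this README.

```python
import json
import tempfile
from pathlib import Path


class JsonSessionSketch:
    """Store a conversation as a JSON list in <directory>/<session_id>.json."""

    def __init__(self, directory, session_id):
        self.path = Path(directory) / f"{session_id}.json"

    def load(self):
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))

    def append(self, role, content):
        messages = self.load()
        messages.append({"role": role, "content": content})
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary file, then swap it in, so a crash cannot truncate the session.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(messages, indent=2), encoding="utf-8")
        tmp.replace(self.path)
        return messages


with tempfile.TemporaryDirectory() as directory:
    session = JsonSessionSketch(directory, "demo")
    session.append("user", "Fix the failing tests.")
    # A second object with the same session_id sees the persisted history.
    restored = JsonSessionSketch(directory, "demo").load()

print(restored)
```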
For async web apps, pass the engine your application owns:
from sqlalchemy.ext.asyncio import create_async_engine
from codepilot import AsyncRuntime
engine = create_async_engine(
    DATABASE_URL,
    pool_size=5,
    max_overflow=10,
    pool_pre_ping=True,
)
runtime = AsyncRuntime("agent.yaml", session="db", db=engine)
Important deployment rule:
A SQLAlchemy engine is a local process object, not the database. Different processes or MicroVMs should create their own engine or receive their own engine from the application process, even if all engines point to the same Postgres database.
Observability and Product Integration
Hooks are the UI and orchestration contract:
from codepilot import EventType
runtime.hooks.register(
    EventType.STREAM,
    lambda text, **_: send_to_ui({"type": "stream", "text": text}),
)
runtime.hooks.register(
    EventType.TOOL_CALL,
    lambda tool, args, label="", **_: send_to_ui({
        "type": "tool_call",
        "tool": tool,
        "label": label,
        "args": args,
    }),
)
runtime.hooks.register(
    EventType.TOOL_RESULT,
    lambda tool, result, **_: send_to_ui({
        "type": "tool_result",
        "tool": tool,
        "result": result,
    }),
)
This allows applications to stream progress, render tool timelines, request approvals, inject mid-task messages, and persist final summaries without coupling the UI to runtime internals.
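The handlers above all end with **_, which lets the emitter add new keyword arguments later without breaking existing callbacks. A minimal registry that behaves this way might look like the sketch below; EventSketch and HookRegistrySketch are hypothetical stand-ins, not the library's classes.

```python
from collections import defaultdict
from enum import Enum


class EventSketch(Enum):
    STREAM = "stream"
    TOOL_CALL = "tool_call"


class HookRegistrySketch:
    """Map event types to handlers and call them with keyword payloads."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event, handler):
        self._handlers[event].append(handler)
        return handler

    def emit(self, event, **payload):
        # Handlers run in registration order and receive the full payload.
        for handler in self._handlers[event]:
            handler(**payload)


hooks = HookRegistrySketch()
seen = []

# The trailing **_ absorbs fields this handler does not care about.
hooks.register(EventSketch.STREAM, lambda text, **_: seen.append(text))

hooks.emit(EventSketch.STREAM, text="Reading tests...", step=1)
hooks.emit(EventSketch.TOOL_CALL, tool="read_file", args={"path": "tests/test_api.py"})
print(seen)
```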
Security Model
CodePilot gives agents real software-engineering capabilities. The runtime is not a security sandbox by itself.
Recommended production posture:
- Run untrusted workspaces inside containers, MicroVMs, or OS sandboxes.
- Use unsafe_mode: false by default.
- Gate shell execution with require_permission: true.
- Use short-lived machine/session tokens in hosted workspaces.
- Keep user auth, runtime auth, and database credentials separate.
- Prefer disposable machines plus Postgres/object-storage persistence for hosted demos.
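Permission gating can be pictured as a callback that sees the exact command before anything runs. The sketch below is a generic illustration using subprocess; gated_execute, the approve callback, and the result dictionary are assumptions, not the behavior behind require_permission.

```python
import shlex
import subprocess
import sys


def gated_execute(argv, approve, timeout=30):
    """Run argv only if approve(command_string) returns True."""
    command = shlex.join(argv)
    if not approve(command):
        # Nothing is started when the approver says no.
        return {"status": "denied", "command": command}
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return {
        "status": "completed",
        "command": command,
        "exit_code": done.returncode,
        "output": done.stdout,
    }


safe_command = [sys.executable, "-c", "print('ok')"]

denied = gated_execute(safe_command, approve=lambda command: False)
allowed = gated_execute(safe_command, approve=lambda command: True)
print(denied["status"], allowed["status"])
```

In a product, the approve callback would typically surface the command in the UI and wait for the user's decision. A gate like this complements a sandbox and does not replace one.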
Research Grounding
CodePilot’s design is influenced by agent and tool-use research:
- ReAct: Synergizing Reasoning and Acting in Language Models motivates interleaving reasoning traces with environment actions.
- Toolformer: Language Models Can Teach Themselves to Use Tools studies when models should call tools, what arguments to pass, and how to incorporate results.
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering argues that software agents benefit from purpose-built interfaces for navigating repositories, editing files, and running programs.
- Voyager: An Open-Ended Embodied Agent with Large Language Models demonstrates the value of agents that accumulate skills while acting in an external environment.
CodePilot translates those ideas into a small Python library focused on practical software work: executable control blocks, payload-backed file edits, persistent terminals, observable hooks, and pluggable session storage.
Documentation
The README is intentionally architectural. Use the documentation site for library usage:
- Installation and AgentFile configuration
- Runtime and streaming behavior
- File, terminal, search, and context tools
- Session persistence
- Hooks and permission gating
- FastAPI and hosted workspace patterns
- API reference
Docs: https://Jahanzeb-git.github.io/codepilot/
License
MIT License. See LICENSE.