piai
Python port of @mariozechner/pi-ai — use your ChatGPT Plus/Pro subscription to access GPT models from Python, without paying per-token API rates.
Authenticates via OAuth using your existing ChatGPT account, then streams completions from ChatGPT's internal backend. No OpenAI API key needed.
How it works
The library logs in to ChatGPT using the same OAuth flow the official web app uses. It stores your tokens in a local auth.json and automatically refreshes the access token before each request. Your Plus/Pro subscription grants access — no separate API billing.
Requirements
- Python 3.12+
- uv (recommended) or pip
- A ChatGPT Plus or Pro subscription
Installation
From source
```bash
git clone https://github.com/Xplo8E/piai
cd piai
uv sync
```
As a dependency in your project
```bash
uv add pi-ai-py
```
Or with pip:
```bash
pip install pi-ai-py
```
Setup: Login
Run this once to authenticate; it opens a browser window where you log in with your ChatGPT account.
```bash
uv run piai login
# or after installing as a package:
piai login
```
Credentials are saved to auth.json in your current working directory. Keep this file private — add it to .gitignore.
CLI usage
```bash
# Quick one-shot prompt
piai run "Explain async/await in Python"

# Specify a model
piai run "What is 2+2?" --model gpt-5.1

# With a system prompt
piai run "Summarize this" --system "You are a concise assistant"

# Check login status
piai status

# List available OAuth providers
piai list

# Log out
piai logout
```
Python API
stream(model_id, context, options?) → AsyncGenerator[StreamEvent]
Streams the model response as typed events. Handles auth and token refresh automatically.
```python
import asyncio

from piai import stream
from piai.types import Context, UserMessage, TextDeltaEvent, DoneEvent

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant.",
        messages=[UserMessage(content="What is the capital of France?")],
    )
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
        elif isinstance(event, DoneEvent):
            print()  # newline at end
            print(f"Tokens used: {event.message.usage['input']} in, {event.message.usage['output']} out")

asyncio.run(main())
```
complete(model_id, context, options?) → AssistantMessage
Collects the full response and returns an AssistantMessage.
```python
import asyncio

from piai import complete
from piai.types import Context, UserMessage, TextContent

async def main():
    ctx = Context(messages=[UserMessage(content="Write a haiku about Python.")])
    msg = await complete("gpt-5.1-codex-mini", ctx)
    for block in msg.content:
        if isinstance(block, TextContent):
            print(block.text)
    print(f"Stop reason: {msg.stop_reason}")
    print(f"Usage: {msg.usage}")

asyncio.run(main())
```
complete_text(model_id, context, options?) → str
Simplest interface — returns the full response text as a string.
```python
import asyncio

from piai import complete_text
from piai.types import Context, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="What is 2 + 2?")])
    text = await complete_text("gpt-5.1-codex-mini", ctx)
    print(text)

asyncio.run(main())
```
Multi-turn conversations
Append messages to context.messages to continue a conversation:
```python
import asyncio

from piai import complete
from piai.types import Context, UserMessage, TextContent

async def main():
    ctx = Context(system_prompt="You are a helpful assistant.")

    ctx.messages.append(UserMessage(content="My name is Vinay."))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)  # add assistant reply to history

    ctx.messages.append(UserMessage(content="What's my name?"))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)

    for block in response.content:
        if isinstance(block, TextContent):
            print(block.text)

asyncio.run(main())
```
Tool calling (function calling)
Define tools with a JSON Schema parameters dict:
```python
import asyncio
import json

from piai import stream
from piai.types import (
    Context, UserMessage, ToolResultMessage, Tool,
    ToolCallStartEvent, ToolCallEndEvent, TextDeltaEvent, DoneEvent,
)

def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny, 22°C."

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant with access to weather data.",
        messages=[UserMessage(content="What's the weather in London?")],
        tools=[
            Tool(
                name="get_weather",
                description="Get current weather for a city.",
                parameters={
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            )
        ],
    )

    # First turn — model calls the tool
    tool_calls = []
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, ToolCallEndEvent):
            tool_calls.append(event.tool_call)
        elif isinstance(event, DoneEvent):
            ctx.messages.append(event.message)  # add assistant message to history

    # Execute tools and feed results back
    for tc in tool_calls:
        result = get_weather(**tc.input)
        ctx.messages.append(ToolResultMessage(tool_call_id=tc.id, content=result))

    # Second turn — model produces final answer
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
    print()

asyncio.run(main())
```
Stream events reference
All events yielded by stream():
| Event | Fields | Description |
|---|---|---|
| TextStartEvent | — | Model started producing text |
| TextDeltaEvent | text: str | Incremental text chunk |
| TextEndEvent | text: str | Full accumulated text for this block |
| ThinkingDeltaEvent | thinking: str | Incremental reasoning chunk (reasoning models) |
| ToolCallStartEvent | tool_call: ToolCall | Model started a tool call |
| ToolCallDeltaEvent | id: str, json_delta: str | Partial tool call arguments |
| ToolCallEndEvent | tool_call: ToolCall | Complete tool call with parsed input |
| DoneEvent | reason: str, message: AssistantMessage | Stream complete |
| ErrorEvent | reason: str, error: AssistantMessage | Stream failed |
DoneEvent.reason values: "stop", "length", "tool_use", "error", "aborted"
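As a rough sketch of how these events compose in one loop (the prompt and model choice are just placeholders), the handler below prints reasoning deltas, answer text, and the final or error status using only the fields listed above:

```python
import asyncio

from piai import stream
from piai.types import (
    Context, UserMessage,
    TextDeltaEvent, ThinkingDeltaEvent, DoneEvent, ErrorEvent,
)

async def main():
    ctx = Context(messages=[UserMessage(content="Explain the GIL in one paragraph.")])
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, ThinkingDeltaEvent):
            print(event.thinking, end="", flush=True)  # reasoning stream (reasoning models only)
        elif isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)      # answer text
        elif isinstance(event, ErrorEvent):
            print(f"\nstream failed: {event.reason}")
        elif isinstance(event, DoneEvent):
            print(f"\n[finished: {event.reason}]")      # "stop", "length", "tool_use", ...

asyncio.run(main())
```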
Options
Pass an options dict to stream() or complete():
```python
options = {
    "session_id": "my-session",     # enables prompt caching across calls
    "reasoning_effort": "high",     # for reasoning models (gpt-5.x): low/medium/high
    "reasoning_summary": "auto",    # auto/concise/detailed/off
    "text_verbosity": "medium",     # low/medium/high
}
```
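For example (a minimal sketch; the prompt and session ID are arbitrary), the dict can be passed as the optional third argument shown in the signatures above:

```python
import asyncio

from piai import complete_text
from piai.types import Context, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="Draft a commit message for a bug fix.")])
    # Options go in as the optional third argument: session_id lets related calls
    # share the prompt cache, reasoning_effort applies to gpt-5.x reasoning models.
    text = await complete_text(
        "gpt-5.1-codex-mini",
        ctx,
        {"session_id": "my-session", "reasoning_effort": "low"},
    )
    print(text)

asyncio.run(main())
```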
Note: temperature is not supported by the ChatGPT backend — the API will return an error if you pass it.
Supported models
Any model your ChatGPT Plus/Pro subscription can access. Common ones:
- gpt-5.1-codex-mini — fast, default
- gpt-5.1 — more capable
- gpt-5.1-codex-max
- gpt-5.2, gpt-5.2-codex
- gpt-5.3-codex, gpt-5.3-codex-spark
- gpt-5.4
The model ID is passed directly to the backend — use whatever ChatGPT shows in its model picker.
auth.json format
Credentials are stored as JSON, compatible with the original JS pi-ai SDK:
```json
{
  "openai-codex": {
    "refresh": "<refresh_token>",
    "access": "<access_token>",
    "expires": 1234567890000,
    "accountId": "<account_id>"
  }
}
```
If you've already logged in using the JS CLI (npx @mariozechner/pi-ai login openai-codex), the same auth.json works with piai without re-logging in.
Never commit auth.json to version control.
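If you want to sanity-check the stored credentials, a standard-library sketch like the one below works; it assumes the expires field is a millisecond epoch timestamp, as the example value above suggests:

```python
import json
import time
from pathlib import Path

creds = json.loads(Path("auth.json").read_text())["openai-codex"]
remaining_ms = creds["expires"] - time.time() * 1000  # assumes milliseconds since the epoch
if remaining_ms > 0:
    print(f"access token valid for another {remaining_ms / 60_000:.1f} minutes")
else:
    print("access token expired; piai will refresh it on the next request")
```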
MCP tool servers
piai has a native MCP (Model Context Protocol) client. Pass any MCP server — radare2, IDA Pro, filesystem, web search, or any custom server — and the agent auto-discovers tools and runs the agentic loop for you.
```python
import asyncio

from piai import agent
from piai.mcp import MCPServer
from piai.types import Context, UserMessage, TextDeltaEvent

async def main():
    ctx = Context(
        system_prompt="You are an expert reverse engineer.",
        messages=[UserMessage(content="Analyze /lib/target.so and report all JNI functions.")],
    )
    result = await agent(
        model_id="gpt-5.1-codex-mini",
        context=ctx,
        mcp_servers=[
            MCPServer.stdio("r2pm -r r2mcp"),              # radare2
            MCPServer.stdio("ida-mcp", name="ida"),        # IDA Pro headless
            MCPServer.http("http://127.0.0.1:13337/mcp"),  # IDA Pro HTTP server
        ],
        options={"reasoning_effort": "medium"},
        max_turns=30,
        on_event=lambda e: print(e.text, end="", flush=True) if isinstance(e, TextDeltaEvent) else None,
    )

asyncio.run(main())
```
Transport types:
MCPServer.stdio("command --args")— spawns a local subprocessMCPServer.http("http://host/mcp")— Streamable HTTP (modern)MCPServer.sse("http://host/sse")— legacy SSE transport
Auth shorthand:
MCPServer.http("https://api.example.com/mcp", bearer_token="my-token")
MCPServer.stdio("my-server", env_extra={"API_KEY": "secret"})
Load from a TOML config file:
Create ~/.piai/config.toml (or any path you prefer):
```toml
[mcp_servers.r2]
command = "r2pm"
args = ["-r", "r2mcp"]

[mcp_servers.ida]
command = "ida-mcp"

[mcp_servers.ida-http]
url = "http://127.0.0.1:13337/mcp"

[mcp_servers.remote]
url = "https://api.example.com/mcp"
bearer_token = "my-token"

[mcp_servers.with-env]
command = "my-server"

[mcp_servers.with-env.env_extra]
API_KEY = "secret"
```
Then load in one line:
```python
from piai.mcp import MCPServer

servers = MCPServer.from_toml("~/.piai/config.toml")
result = await agent(model_id="gpt-5.1-codex-mini", context=ctx, mcp_servers=servers)
```
agent() options:
```python
result = await agent(
    model_id="gpt-5.1-codex-mini",
    context=ctx,
    mcp_servers=[...],
    options={"reasoning_effort": "medium"},
    max_turns=20,                   # safety limit on agentic iterations
    on_event=my_callback,           # sync or async callback for every StreamEvent
    require_all_servers=False,      # True = raise if any server fails to connect
    connect_timeout=60.0,           # per-server connection timeout in seconds
    tool_result_max_chars=32_000,   # max chars per tool result (prevents context explosion)
)
```
Pre-defined tools + MCP: If you pass both context.tools and mcp_servers, they are merged: MCP tools take priority on name conflicts, and your remaining pre-defined tools are appended after de-duplication.
See docs/mcp.md for the full MCP reference.
LangChain integration
PiAIChatModel is a drop-in LangChain BaseChatModel backed by piai. Use it anywhere LangChain accepts a chat model — chains, agents, tools.
```python
from piai.langchain import PiAIChatModel
from langchain_core.messages import HumanMessage

llm = PiAIChatModel(model_name="gpt-5.1-codex-mini")

# Invoke
result = llm.invoke([HumanMessage(content="What is 2+2?")])
print(result.content)

# Stream
async for chunk in llm.astream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)

# With tools (works with any LangChain agent or tool framework)
llm_with_tools = llm.bind_tools([my_tool])
result = llm_with_tools.invoke([HumanMessage(content="Use the tool")])
```
pip install "pi-ai-py[langgraph]"
LangGraph integration
piai integrates with LangGraph for building multi-agent workflows. Two components are provided:
MCP → LangChain tool bridge
Convert MCP servers into LangChain BaseTool instances so LangGraph agents can use them directly:
```python
from piai.mcp import to_langchain_tools, MCPServer, MCPHubToolset
from langchain_core.messages import HumanMessage
from langgraph.prebuilt import create_react_agent
from piai.langchain import PiAIChatModel

servers = [MCPServer.stdio("npx -y @modelcontextprotocol/server-filesystem /tmp")]
llm = PiAIChatModel(model_name="gpt-5.1-codex-mini")

async with MCPHubToolset(servers) as tools:
    agent = create_react_agent(llm, tools)
    result = await agent.ainvoke({"messages": [HumanMessage(content="List files in /tmp")]})
    print(result["messages"][-1].content)
```
SubAgentTool — piai agent as a LangGraph tool
Wrap a full piai agent() (with its own model + MCP servers) as a single BaseTool. Use as a sub-agent inside a LangGraph Supervisor:
```python
from piai.langchain import SubAgentTool
from piai.mcp import MCPServer

file_agent = SubAgentTool(
    name="file_agent",
    description="Reads, writes, and analyses files using the filesystem MCP server",
    model_id="gpt-5.1-codex-mini",
    system_prompt="You are a file management specialist.",
    mcp_servers=[MCPServer.stdio("npx -y @modelcontextprotocol/server-filesystem /tmp")],
)
```
LangGraph Supervisor example
```python
from piai.langchain import PiAIChatModel, SubAgentTool
from piai.mcp import MCPServer
from langgraph_supervisor import create_supervisor
from langchain_core.messages import HumanMessage

supervisor_llm = PiAIChatModel(model_name="gpt-5.1-codex-mini")

file_agent = SubAgentTool(name="file_agent", description="...", mcp_servers=[...])
code_agent = SubAgentTool(name="code_agent", description="...", mcp_servers=[...])

workflow = create_supervisor(
    agents=[file_agent, code_agent],
    model=supervisor_llm,
    prompt="You are a supervisor. Delegate tasks to the appropriate specialist.",
).compile()

result = await workflow.ainvoke({"messages": [HumanMessage(content="Analyse the code in /tmp/app.py")]})
```
See examples/langgraph_supervisor_agent.py for a full runnable example.
Project structure
```
src/piai/
├── __init__.py              # Public API: stream, complete, complete_text, agent, MCPServer
├── types.py                 # Context, messages, stream events
├── stream.py                # Entry points with auth handling
├── agent.py                 # Autonomous agentic loop with MCP support
├── cli.py                   # CLI commands
├── mcp/
│   ├── server.py            # MCPServer config (stdio/http/sse + from_toml)
│   ├── client.py            # MCPClient — persistent session per server
│   ├── hub.py               # MCPHub — multi-server manager
│   └── langchain_tools.py   # MCP → LangChain tool bridge (to_langchain_tools, MCPHubToolset)
├── langchain/
│   ├── chat_model.py        # PiAIChatModel — LangChain BaseChatModel adapter
│   └── sub_agent_tool.py    # SubAgentTool — piai agent as LangChain BaseTool
├── oauth/
│   ├── pkce.py              # PKCE verifier/challenge (RFC 7636); see sketch below
│   ├── types.py             # OAuthCredentials, OAuthProviderInterface
│   ├── storage.py           # auth.json read/write
│   ├── openai_codex.py      # ChatGPT Plus OAuth login + refresh
│   └── __init__.py          # Provider registry + get_oauth_api_key()
└── providers/
    ├── message_transform.py # Context → OpenAI Responses API format
    └── openai_codex.py      # SSE streaming to chatgpt.com/backend-api
```
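The pkce.py module listed above implements the PKCE verifier/challenge pair from RFC 7636. For orientation only (a standard-library sketch, not piai's actual code), S256 challenge derivation looks roughly like this:

```python
import base64
import hashlib
import secrets

def pkce_pair() -> tuple[str, str]:
    # Verifier: 43-128 URL-safe characters (RFC 7636 §4.1); 32 random bytes -> 43 chars
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # Challenge (S256 method): BASE64URL(SHA256(verifier)) without padding (RFC 7636 §4.2)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```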
Running tests
```bash
uv run pytest tests/ -v
```