piai

Python port of @mariozechner/pi-ai — a unified LLM streaming API with OAuth authentication. Use your ChatGPT Plus/Pro subscription to access GPT models from Python, without paying per-token API rates.

Authenticates via OAuth using your existing ChatGPT account, then streams completions from ChatGPT's internal backend. No OpenAI API key needed.


How it works

The library logs in to ChatGPT using the same OAuth flow the official web app uses. It stores a refresh token in auth.json locally and auto-refreshes it before each request. Your Plus/Pro subscription grants access — no separate API billing.


Requirements

  • Python 3.12+
  • uv (recommended) or pip
  • A ChatGPT Plus or Pro subscription

Installation

From source

git clone https://github.com/Xplo8E/piai
cd piai
uv sync

As a dependency in your project

uv add pi-ai-py

Or with pip:

pip install pi-ai-py

Setup: Login

Run once to authenticate. Opens a browser for you to log in with your ChatGPT account.

uv run piai login
# or after installing as a package:
piai login

Credentials are saved to auth.json in your current working directory. Keep this file private — add it to .gitignore.
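For example, from your project root:

```shell
# keep credentials out of version control
echo "auth.json" >> .gitignore
```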


CLI usage

# Quick one-shot prompt
piai run "Explain async/await in Python"

# Specify a model
piai run "What is 2+2?" --model gpt-5.1

# With a system prompt
piai run "Summarize this" --system "You are a concise assistant"

# Check login status
piai status

# List available OAuth providers
piai list

# Log out
piai logout

Python API

stream(model_id, context, options?) → AsyncGenerator[StreamEvent]

Streams the model response as typed events. Handles auth and token refresh automatically.

import asyncio
from piai import stream
from piai.types import Context, UserMessage, TextDeltaEvent, DoneEvent

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant.",
        messages=[UserMessage(content="What is the capital of France?")]
    )

    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
        elif isinstance(event, DoneEvent):
            print()  # newline at end
            print(f"Tokens used: {event.message.usage['input']} in, {event.message.usage['output']} out")

asyncio.run(main())

complete(model_id, context, options?) → AssistantMessage

Collects the full response and returns an AssistantMessage.

import asyncio
from piai import complete
from piai.types import Context, TextContent, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="Write a haiku about Python.")])
    msg = await complete("gpt-5.1-codex-mini", ctx)

    for block in msg.content:
        if isinstance(block, TextContent):
            print(block.text)

    print(f"Stop reason: {msg.stop_reason}")
    print(f"Usage: {msg.usage}")

asyncio.run(main())

complete_text(model_id, context, options?) → str

Simplest interface — returns the full response text as a string.

import asyncio
from piai import complete_text
from piai.types import Context, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="What is 2 + 2?")])
    text = await complete_text("gpt-5.1-codex-mini", ctx)
    print(text)

asyncio.run(main())

Multi-turn conversations

Append messages to context.messages to continue a conversation:

import asyncio
from piai import complete
from piai.types import Context, TextContent, UserMessage

async def main():
    ctx = Context(system_prompt="You are a helpful assistant.")

    ctx.messages.append(UserMessage(content="My name is Vinay."))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)  # add assistant reply to history

    ctx.messages.append(UserMessage(content="What's my name?"))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)

    for block in response.content:
        if isinstance(block, TextContent):
            print(block.text)

asyncio.run(main())

Tool calling (function calling)

Define tools with a JSON Schema parameters dict:

import asyncio
from piai import stream
from piai.types import (
    Context, UserMessage, ToolResultMessage, Tool,
    ToolCallEndEvent, TextDeltaEvent, DoneEvent,
)

def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny, 22°C."

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant with access to weather data.",
        messages=[UserMessage(content="What's the weather in London?")],
        tools=[
            Tool(
                name="get_weather",
                description="Get current weather for a city.",
                parameters={
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            )
        ],
    )

    # First turn — model calls the tool
    tool_calls = []
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, ToolCallEndEvent):
            tool_calls.append(event.tool_call)
        elif isinstance(event, DoneEvent):
            ctx.messages.append(event.message)  # add assistant message to history

    # Execute tools and feed results back
    for tc in tool_calls:
        result = get_weather(**tc.input)
        ctx.messages.append(ToolResultMessage(tool_call_id=tc.id, content=result))

    # Second turn — model produces final answer
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
    print()

asyncio.run(main())

Stream events reference

All events yielded by stream():

| Event | Fields | Description |
|---|---|---|
| TextStartEvent | — | Model started producing text |
| TextDeltaEvent | text: str | Incremental text chunk |
| TextEndEvent | text: str | Full accumulated text for this block |
| ThinkingDeltaEvent | thinking: str | Incremental reasoning chunk (reasoning models) |
| ToolCallStartEvent | tool_call: ToolCall | Model started a tool call |
| ToolCallDeltaEvent | id: str, json_delta: str | Partial tool call arguments |
| ToolCallEndEvent | tool_call: ToolCall | Complete tool call with parsed input |
| DoneEvent | reason: str, message: AssistantMessage | Stream complete |
| ErrorEvent | reason: str, error: AssistantMessage | Stream failed |

DoneEvent.reason values: "stop", "length", "tool_use", "error", "aborted"
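A small illustrative helper (not part of piai) mapping these reason strings to human-readable explanations:

```python
# Illustrative only: the reason strings come from the list above;
# the descriptions are paraphrases, not piai output.
def describe_stop(reason: str) -> str:
    descriptions = {
        "stop": "model finished normally",
        "length": "hit the output length limit",
        "tool_use": "model is requesting tool calls",
        "error": "stream failed",
        "aborted": "stream was cancelled",
    }
    return descriptions.get(reason, f"unknown reason: {reason}")
```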


Options

Pass an options dict to stream() or complete():

options = {
    "session_id": "my-session",        # enables prompt caching across calls
    "reasoning_effort": "high",        # for reasoning models (gpt-5.x): low/medium/high
    "reasoning_summary": "auto",       # auto/concise/detailed/off
    "text_verbosity": "medium",        # low/medium/high
    "temperature": 0.7,
}
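If you build options dicts dynamically, a small sanity check before the call can catch typos early. This sketch is not part of piai; it only encodes the accepted values shown above, and the 0.0–2.0 temperature range is an assumption borrowed from the usual OpenAI convention:

```python
# Allowed values as documented in the options example above.
ALLOWED = {
    "reasoning_effort": {"low", "medium", "high"},
    "reasoning_summary": {"auto", "concise", "detailed", "off"},
    "text_verbosity": {"low", "medium", "high"},
}

def check_options(options: dict) -> list[str]:
    """Return a list of problems found in an options dict (empty = OK)."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in options and options[key] not in allowed:
            problems.append(f"{key}: {options[key]!r} not in {sorted(allowed)}")
    # Assumed range; adjust if the backend accepts something different.
    if "temperature" in options and not 0.0 <= options["temperature"] <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    return problems
```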

Supported models

Any model your ChatGPT Plus/Pro subscription can access. Common ones:

  • gpt-5.1-codex-mini — fast, default
  • gpt-5.1 — more capable
  • gpt-5.1-codex-max
  • gpt-5.2, gpt-5.2-codex
  • gpt-5.3-codex, gpt-5.3-codex-spark
  • gpt-5.4

The model ID is passed directly to the backend — use whatever ChatGPT shows in its model picker.


auth.json format

Credentials are stored as JSON, compatible with the original JS pi-ai SDK:

{
  "openai-codex": {
    "refresh": "<refresh_token>",
    "access": "<access_token>",
    "expires": 1234567890000,
    "accountId": "<account_id>"
  }
}

If you've already logged in with the JS CLI (npx @mariozechner/pi-ai login openai-codex), the same auth.json works with piai — no need to log in again.

Never commit auth.json to version control.
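Because the format is plain JSON, you can inspect it yourself, e.g. to check whether the cached access token is still valid. This sketch assumes only the layout shown above (`expires` is a Unix timestamp in milliseconds, matching the JS SDK); piai handles refresh for you automatically:

```python
import json
import time
from pathlib import Path

def access_token_expired(path: str = "auth.json", skew_s: int = 60) -> bool:
    """Return True if the stored access token is missing or expires
    within skew_s seconds. Assumes the auth.json layout shown above."""
    creds = json.loads(Path(path).read_text()).get("openai-codex", {})
    expires_ms = creds.get("expires", 0)
    return expires_ms / 1000 <= time.time() + skew_s
```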


MCP tool servers

piai has a native MCP (Model Context Protocol) client. Pass any MCP server — radare2, IDA Pro, filesystem, web search, or any custom server — and the agent auto-discovers tools and runs the agentic loop for you.

import asyncio
from piai import agent
from piai.mcp import MCPServer
from piai.types import Context, UserMessage, TextDeltaEvent

async def main():
    ctx = Context(
        system_prompt="You are an expert reverse engineer.",
        messages=[UserMessage(content="Analyze /lib/target.so and report all JNI functions.")],
    )

    result = await agent(
        model_id="gpt-5.1-codex-mini",
        context=ctx,
        mcp_servers=[
            MCPServer.stdio("r2pm -r r2mcp"),             # radare2
            MCPServer.stdio("ida-mcp", name="ida"),        # IDA Pro headless
            MCPServer.http("http://127.0.0.1:13337/mcp"),  # IDA Pro HTTP server
        ],
        options={"reasoning_effort": "medium"},
        max_turns=30,
        on_event=lambda e: print(e.text, end="", flush=True) if isinstance(e, TextDeltaEvent) else None,
    )

asyncio.run(main())

Transport types:

  • MCPServer.stdio("command --args") — spawns a local subprocess
  • MCPServer.http("http://host/mcp") — Streamable HTTP (modern)
  • MCPServer.sse("http://host/sse") — legacy SSE transport

Auth shorthand:

MCPServer.http("https://api.example.com/mcp", bearer_token="my-token")
MCPServer.stdio("my-server", env_extra={"API_KEY": "secret"})

Load from a TOML config file:

Create ~/.piai/config.toml (or any path you prefer):

[mcp_servers.r2]
command = "r2pm"
args = ["-r", "r2mcp"]

[mcp_servers.ida]
command = "ida-mcp"

[mcp_servers.ida-http]
url = "http://127.0.0.1:13337/mcp"

[mcp_servers.remote]
url = "https://api.example.com/mcp"
bearer_token = "my-token"

[mcp_servers.with-env]
command = "my-server"

[mcp_servers.with-env.env_extra]
API_KEY = "secret"

Then load in one line:

from piai.mcp import MCPServer

servers = MCPServer.from_toml("~/.piai/config.toml")
result = await agent(model_id="gpt-5.1-codex-mini", context=ctx, mcp_servers=servers)

agent() options:

result = await agent(
    model_id="gpt-5.1-codex-mini",
    context=ctx,
    mcp_servers=[...],
    options={"reasoning_effort": "medium"},
    max_turns=20,                    # safety limit on agentic iterations
    on_event=my_callback,            # sync or async callback for every StreamEvent
    require_all_servers=False,       # True = raise if any server fails to connect
    connect_timeout=60.0,            # per-server connection timeout in seconds
    tool_result_max_chars=32_000,    # max chars per tool result (prevents context explosion)
)

Pre-defined tools + MCP: If you pass both context.tools and mcp_servers, they are merged. MCP tools take priority on name conflicts; pre-defined tools that don't collide are appended.
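The merge rule described above can be sketched as follows (assumed behavior for illustration, not piai's actual implementation; tools are represented here as plain dicts with a "name" key):

```python
def merge_tools(mcp_tools: list[dict], predefined: list[dict]) -> list[dict]:
    """MCP tools win on name conflicts; non-colliding
    pre-defined tools are appended afterwards."""
    merged = list(mcp_tools)
    seen = {t["name"] for t in mcp_tools}
    for tool in predefined:
        if tool["name"] not in seen:
            merged.append(tool)
            seen.add(tool["name"])
    return merged
```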

See docs/mcp.md for the full MCP reference.


LangChain integration

PiAIChatModel is a drop-in LangChain BaseChatModel backed by piai. Use it anywhere LangChain accepts a chat model — chains, agents, tools.

from piai.langchain import PiAIChatModel
from langchain_core.messages import HumanMessage

llm = PiAIChatModel(model_name="gpt-5.1-codex-mini")

# Invoke
result = llm.invoke([HumanMessage(content="What is 2+2?")])
print(result.content)

# Stream
async for chunk in llm.astream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)

# With tools (works with any LangChain agent or tool framework)
llm_with_tools = llm.bind_tools([my_tool])
result = llm_with_tools.invoke([HumanMessage(content="Use the tool")])

Project structure

src/piai/
├── __init__.py              # Public API: stream, complete, complete_text, agent, MCPServer
├── types.py                 # Context, messages, stream events
├── stream.py                # Entry points with auth handling
├── agent.py                 # Autonomous agentic loop with MCP support
├── cli.py                   # CLI commands
├── mcp/
│   ├── server.py            # MCPServer config (stdio/http/sse + from_toml)
│   ├── client.py            # MCPClient — persistent session per server
│   └── hub.py               # MCPHub — multi-server manager
├── langchain/
│   └── chat_model.py        # PiAIChatModel — LangChain BaseChatModel adapter
├── oauth/
│   ├── pkce.py              # PKCE verifier/challenge (RFC 7636)
│   ├── types.py             # OAuthCredentials, OAuthProviderInterface
│   ├── storage.py           # auth.json read/write
│   ├── openai_codex.py      # ChatGPT Plus OAuth login + refresh
│   └── __init__.py          # Provider registry + get_oauth_api_key()
└── providers/
    ├── message_transform.py # Context → OpenAI Responses API format
    └── openai_codex.py      # SSE streaming to chatgpt.com/backend-api

Running tests

uv run pytest tests/ -v
