piai

Python port of @mariozechner/pi-ai — a unified LLM streaming API with OAuth authentication. Use your ChatGPT Plus/Pro subscription to access GPT models from Python, without paying per-token API rates.

Authenticates via OAuth using your existing ChatGPT account, then streams completions from ChatGPT's internal backend. No OpenAI API key needed.


How it works

The library logs in to ChatGPT using the same OAuth flow the official web app uses. It stores a refresh token in auth.json locally and auto-refreshes it before each request. Your Plus/Pro subscription grants access — no separate API billing.


Requirements

  • Python 3.12+
  • uv (recommended) or pip
  • A ChatGPT Plus or Pro subscription

Installation

From source

git clone https://github.com/Xplo8E/piai
cd piai
uv sync

As a dependency in your project

uv add pi-ai-py

Or with pip:

pip install pi-ai-py

Setup: Login

Run once to authenticate. Opens a browser for you to log in with your ChatGPT account.

uv run piai login
# or after installing as a package:
piai login

Credentials are saved to auth.json in your current working directory. Keep this file private — add it to .gitignore.
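For example, from your project root:

```shell
# keep credentials out of version control
echo "auth.json" >> .gitignore
```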


CLI usage

# Quick one-shot prompt
piai run "Explain async/await in Python"

# Specify a model
piai run "What is 2+2?" --model gpt-5.1

# With a system prompt
piai run "Summarize this" --system "You are a concise assistant"

# Check login status
piai status

# List available OAuth providers
piai list

# Log out
piai logout

Python API

stream(model_id, context, options?) → AsyncGenerator[StreamEvent]

Streams the model response as typed events. Handles auth and token refresh automatically.

import asyncio
from piai import stream
from piai.types import Context, UserMessage, TextDeltaEvent, DoneEvent

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant.",
        messages=[UserMessage(content="What is the capital of France?")]
    )

    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
        elif isinstance(event, DoneEvent):
            print()  # newline at end
            print(f"Tokens used: {event.message.usage['input']} in, {event.message.usage['output']} out")

asyncio.run(main())

complete(model_id, context, options?) → AssistantMessage

Collects the full response and returns an AssistantMessage.

import asyncio
from piai import complete
from piai.types import Context, TextContent, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="Write a haiku about Python.")])
    msg = await complete("gpt-5.1-codex-mini", ctx)

    for block in msg.content:
        if isinstance(block, TextContent):
            print(block.text)

    print(f"Stop reason: {msg.stop_reason}")
    print(f"Usage: {msg.usage}")

asyncio.run(main())

complete_text(model_id, context, options?) → str

Simplest interface — returns the full response text as a string.

import asyncio
from piai import complete_text
from piai.types import Context, UserMessage

async def main():
    ctx = Context(messages=[UserMessage(content="What is 2 + 2?")])
    text = await complete_text("gpt-5.1-codex-mini", ctx)
    print(text)

asyncio.run(main())

Multi-turn conversations

Append messages to context.messages to continue a conversation:

import asyncio
from piai import complete
from piai.types import Context, TextContent, UserMessage

async def main():
    ctx = Context(system_prompt="You are a helpful assistant.")

    ctx.messages.append(UserMessage(content="My name is Vinay."))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)  # add assistant reply to history

    ctx.messages.append(UserMessage(content="What's my name?"))
    response = await complete("gpt-5.1-codex-mini", ctx)
    ctx.messages.append(response)

    for block in response.content:
        if isinstance(block, TextContent):
            print(block.text)

asyncio.run(main())

Tool calling (function calling)

Define tools with a JSON Schema parameters dict:

import asyncio
from piai import stream
from piai.types import (
    Context, UserMessage, ToolResultMessage, Tool,
    ToolCallEndEvent, TextDeltaEvent, DoneEvent,
)

def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny, 22°C."

async def main():
    ctx = Context(
        system_prompt="You are a helpful assistant with access to weather data.",
        messages=[UserMessage(content="What's the weather in London?")],
        tools=[
            Tool(
                name="get_weather",
                description="Get current weather for a city.",
                parameters={
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            )
        ],
    )

    # First turn — model calls the tool
    tool_calls = []
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, ToolCallEndEvent):
            tool_calls.append(event.tool_call)
        elif isinstance(event, DoneEvent):
            ctx.messages.append(event.message)  # add assistant message to history

    # Execute tools and feed results back
    for tc in tool_calls:
        result = get_weather(**tc.input)
        ctx.messages.append(ToolResultMessage(tool_call_id=tc.id, content=result))

    # Second turn — model produces final answer
    async for event in stream("gpt-5.1-codex-mini", ctx):
        if isinstance(event, TextDeltaEvent):
            print(event.text, end="", flush=True)
    print()

asyncio.run(main())

Stream events reference

All events yielded by stream():

| Event | Fields | Description |
|---|---|---|
| TextStartEvent | — | Model started producing text |
| TextDeltaEvent | text: str | Incremental text chunk |
| TextEndEvent | text: str | Full accumulated text for this block |
| ThinkingDeltaEvent | thinking: str | Incremental reasoning chunk (reasoning models) |
| ToolCallStartEvent | tool_call: ToolCall | Model started a tool call |
| ToolCallDeltaEvent | id: str, json_delta: str | Partial tool call arguments |
| ToolCallEndEvent | tool_call: ToolCall | Complete tool call with parsed input |
| DoneEvent | reason: str, message: AssistantMessage | Stream complete |
| ErrorEvent | reason: str, error: AssistantMessage | Stream failed |

DoneEvent.reason values: "stop", "length", "tool_use", "error", "aborted"
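A small illustrative helper (not part of piai) mapping these reason strings to human-readable explanations:

```python
# Illustrative only: the reason strings come from the list above;
# the descriptions are paraphrases, not piai output.
def describe_stop(reason: str) -> str:
    descriptions = {
        "stop": "model finished normally",
        "length": "hit the output length limit",
        "tool_use": "model is requesting tool calls",
        "error": "stream failed",
        "aborted": "stream was cancelled",
    }
    return descriptions.get(reason, f"unknown reason: {reason}")
```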


Options

Pass an options dict to stream() or complete():

options = {
    "session_id": "my-session",        # enables prompt caching across calls
    "reasoning_effort": "high",        # for reasoning models (gpt-5.x): low/medium/high
    "reasoning_summary": "auto",       # auto/concise/detailed/off
    "text_verbosity": "medium",        # low/medium/high
    "temperature": 0.7,
}
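If you build options dicts dynamically, a small sanity check before the call can catch typos early. This sketch is not part of piai; it only encodes the accepted values shown above, and the 0.0–2.0 temperature range is an assumption borrowed from the usual OpenAI convention:

```python
# Allowed values as documented in the options example above.
ALLOWED = {
    "reasoning_effort": {"low", "medium", "high"},
    "reasoning_summary": {"auto", "concise", "detailed", "off"},
    "text_verbosity": {"low", "medium", "high"},
}

def check_options(options: dict) -> list[str]:
    """Return a list of problems found in an options dict (empty = OK)."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in options and options[key] not in allowed:
            problems.append(f"{key}: {options[key]!r} not in {sorted(allowed)}")
    # Assumed range; adjust if the backend accepts something different.
    if "temperature" in options and not 0.0 <= options["temperature"] <= 2.0:
        problems.append("temperature must be between 0.0 and 2.0")
    return problems
```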

Supported models

Any model your ChatGPT Plus/Pro subscription can access. Common ones:

  • gpt-5.1-codex-mini — fast, default
  • gpt-5.1 — more capable
  • gpt-5.1-codex-max
  • gpt-5.2, gpt-5.2-codex
  • gpt-5.3-codex, gpt-5.3-codex-spark
  • gpt-5.4

The model ID is passed directly to the backend — use whatever ChatGPT shows in its model picker.


auth.json format

Credentials are stored as JSON, compatible with the original JS pi-ai SDK:

{
  "openai-codex": {
    "refresh": "<refresh_token>",
    "access": "<access_token>",
    "expires": 1234567890000,
    "accountId": "<account_id>"
  }
}

If you've already logged in with the JS CLI (npx @mariozechner/pi-ai login openai-codex), the same auth.json works with piai — no need to log in again.

Never commit auth.json to version control.
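Because the format is plain JSON, you can inspect it yourself, e.g. to check whether the cached access token is still valid. This sketch assumes only the layout shown above (`expires` is a Unix timestamp in milliseconds, matching the JS SDK); piai handles refresh for you automatically:

```python
import json
import time
from pathlib import Path

def access_token_expired(path: str = "auth.json", skew_s: int = 60) -> bool:
    """Return True if the stored access token is missing or expires
    within skew_s seconds. Assumes the auth.json layout shown above."""
    creds = json.loads(Path(path).read_text()).get("openai-codex", {})
    expires_ms = creds.get("expires", 0)
    return expires_ms / 1000 <= time.time() + skew_s
```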


MCP tool servers

piai has a native MCP (Model Context Protocol) client. Pass any MCP server — radare2, IDA Pro, filesystem, web search, or any custom server — and the agent auto-discovers tools and runs the agentic loop for you.

import asyncio
from piai import agent
from piai.mcp import MCPServer
from piai.types import Context, UserMessage, TextDeltaEvent

async def main():
    ctx = Context(
        system_prompt="You are an expert reverse engineer.",
        messages=[UserMessage(content="Analyze /lib/target.so and report all JNI functions.")],
    )

    result = await agent(
        model_id="gpt-5.1-codex-mini",
        context=ctx,
        mcp_servers=[
            MCPServer.stdio("r2pm -r r2mcp"),             # radare2
            MCPServer.stdio("ida-mcp", name="ida"),        # IDA Pro headless
            MCPServer.http("http://127.0.0.1:13337/mcp"),  # IDA Pro HTTP server
        ],
        options={"reasoning_effort": "medium"},
        max_turns=30,
        on_event=lambda e: print(e.text, end="", flush=True) if isinstance(e, TextDeltaEvent) else None,
    )

asyncio.run(main())

Transport types:

  • MCPServer.stdio("command --args") — spawns a local subprocess
  • MCPServer.http("http://host/mcp") — Streamable HTTP (modern)
  • MCPServer.sse("http://host/sse") — legacy SSE transport

Auth shorthand:

MCPServer.http("https://api.example.com/mcp", bearer_token="my-token")
MCPServer.stdio("my-server", env_extra={"API_KEY": "secret"})

Load from a TOML config file:

Create ~/.piai/config.toml (or any path you prefer):

[mcp_servers.r2]
command = "r2pm"
args = ["-r", "r2mcp"]

[mcp_servers.ida]
command = "ida-mcp"

[mcp_servers.ida-http]
url = "http://127.0.0.1:13337/mcp"

[mcp_servers.remote]
url = "https://api.example.com/mcp"
bearer_token = "my-token"

[mcp_servers.with-env]
command = "my-server"

[mcp_servers.with-env.env_extra]
API_KEY = "secret"

Then load in one line:

from piai.mcp import MCPServer

servers = MCPServer.from_toml("~/.piai/config.toml")
result = await agent(model_id="gpt-5.1-codex-mini", context=ctx, mcp_servers=servers)

agent() options:

result = await agent(
    model_id="gpt-5.1-codex-mini",
    context=ctx,
    mcp_servers=[...],
    options={"reasoning_effort": "medium"},
    max_turns=20,                    # safety limit on agentic iterations
    on_event=my_callback,            # sync or async callback for every StreamEvent
    require_all_servers=False,       # True = raise if any server fails to connect
    connect_timeout=60.0,            # per-server connection timeout in seconds
    tool_result_max_chars=32_000,    # max chars per tool result (prevents context explosion)
)

Pre-defined tools + MCP: If you pass both context.tools and mcp_servers, they are merged. MCP tools take priority on name conflicts; pre-defined tools that don't collide are appended.
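The merge rule described above can be sketched as follows (assumed behavior for illustration, not piai's actual implementation; tools are represented here as plain dicts with a "name" key):

```python
def merge_tools(mcp_tools: list[dict], predefined: list[dict]) -> list[dict]:
    """MCP tools win on name conflicts; non-colliding
    pre-defined tools are appended afterwards."""
    merged = list(mcp_tools)
    seen = {t["name"] for t in mcp_tools}
    for tool in predefined:
        if tool["name"] not in seen:
            merged.append(tool)
            seen.add(tool["name"])
    return merged
```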

See docs/mcp.md for the full MCP reference.


LangChain integration

PiAIChatModel is a drop-in LangChain BaseChatModel backed by piai. Use it anywhere LangChain accepts a chat model — chains, agents, tools.

from piai.langchain import PiAIChatModel
from langchain_core.messages import HumanMessage

llm = PiAIChatModel(model_name="gpt-5.1-codex-mini")

# Invoke
result = llm.invoke([HumanMessage(content="What is 2+2?")])
print(result.content)

# Stream
async for chunk in llm.astream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)

# With tools (works with any LangChain agent or tool framework)
llm_with_tools = llm.bind_tools([my_tool])
result = llm_with_tools.invoke([HumanMessage(content="Use the tool")])

Project structure

src/piai/
├── __init__.py              # Public API: stream, complete, complete_text, agent, MCPServer
├── types.py                 # Context, messages, stream events
├── stream.py                # Entry points with auth handling
├── agent.py                 # Autonomous agentic loop with MCP support
├── cli.py                   # CLI commands
├── mcp/
│   ├── server.py            # MCPServer config (stdio/http/sse + from_toml)
│   ├── client.py            # MCPClient — persistent session per server
│   └── hub.py               # MCPHub — multi-server manager
├── langchain/
│   └── chat_model.py        # PiAIChatModel — LangChain BaseChatModel adapter
├── oauth/
│   ├── pkce.py              # PKCE verifier/challenge (RFC 7636)
│   ├── types.py             # OAuthCredentials, OAuthProviderInterface
│   ├── storage.py           # auth.json read/write
│   ├── openai_codex.py      # ChatGPT Plus OAuth login + refresh
│   └── __init__.py          # Provider registry + get_oauth_api_key()
└── providers/
    ├── message_transform.py # Context → OpenAI Responses API format
    └── openai_codex.py      # SSE streaming to chatgpt.com/backend-api

Running tests

uv run pytest tests/ -v
