A correct, simple, performant, and pythonic framework for building durable AI agents

Project description

PocketJoe

LLM Agents are just agents...

Agents are policies
A policy reasons over observations and chooses a batch of options
A policy can be any mix of LLM-based, human-in-the-loop, or heuristic
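
As a minimal sketch of this idea (not PocketJoe's actual API), a policy can be modeled as an async callable that takes a list of observations and returns a batch of actions. The `Message` dataclass and both policy functions below are hypothetical stand-ins, chosen only to show that a heuristic policy and a human-in-the-loop policy can share one signature.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for the framework's Message type."""
    policy: str  # name of the policy that produced the message
    text: str


# Any async callable with this shape counts as a policy in this sketch.
Policy = Callable[[list[Message]], Awaitable[list[Message]]]


async def keyword_policy(observations: list[Message]) -> list[Message]:
    """Heuristic policy: escalate when the latest observation mentions 'refund'."""
    latest = observations[-1].text.lower()
    action = "escalate" if "refund" in latest else "reply"
    return [Message(policy="keyword_policy", text=action)]


def make_human_policy(ask: Callable[[str], str]) -> Policy:
    """Human-in-the-loop policy: `ask` is any function that obtains a decision."""
    async def human_policy(observations: list[Message]) -> list[Message]:
        answer = ask(observations[-1].text)
        return [Message(policy="human_policy", text=answer)]
    return human_policy


async def run(policy: Policy, observations: list[Message]) -> list[Message]:
    """The caller does not need to know which kind of policy it is running."""
    return await policy(observations)


obs = [Message(policy="user", text="I want a refund")]
print(asyncio.run(run(keyword_policy, obs))[0].text)  # escalate
```

Because both policies expose the same shape, the caller can swap one for the other without changing the surrounding code.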

Semantics

An agent system grounded in reinforcement learning theory, with LLM semantics as a first-class concern

policy - all code, logic, and LLM calls are policies
observations - the set of observations for the policy to reason over
options - additional action spaces available to the policy
selected_actions - the set of concurrent actions the policy chose to take
Message - a shared dataclass for observations and actions that aligns with LLM semantics

LLM semantics as platform semantics

In LLM APIs, everything is a Message. We adopt this as our universal unit:

Input: observations: list[Message] (what the policy sees)
Output: selected_actions: list[Message] (what the policy chose to do; the policy owns its outputs)

Key insight: When options are provided, they expand the policy's action space. The runtime automatically invokes all option calls and injects the results back as observations.
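
The key insight above can be sketched as a small loop. This is an illustrative model, not PocketJoe's runtime: `OptionCall`, `OptionResult`, and `step` are hypothetical names, and real option calls would carry more metadata. One step runs the policy, executes every option call it emitted concurrently, and appends the results to the observations for the next step.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass
class OptionCall:
    """Hypothetical action: a request to invoke a named option."""
    name: str
    arguments: dict[str, Any]


@dataclass
class OptionResult:
    """Hypothetical observation: the result of one option call."""
    name: str
    result: Any


Option = Callable[..., Awaitable[Any]]


async def step(policy, observations: list, options: dict[str, Option]) -> list:
    """Run the policy once, then feed option results back as observations."""
    actions = await policy(observations, list(options))
    calls = [a for a in actions if isinstance(a, OptionCall)]
    # All option calls chosen in this step run concurrently.
    results = await asyncio.gather(*(options[c.name](**c.arguments) for c in calls))
    injected = [OptionResult(c.name, r) for c, r in zip(calls, results)]
    return observations + actions + injected


async def add(a: int, b: int) -> int:
    return a + b


async def toy_policy(observations: list, option_names: list[str]) -> list:
    """Asks for two additions on the first step, then stops calling options."""
    if any(isinstance(o, OptionResult) for o in observations):
        return ["done"]
    return [OptionCall("add", {"a": 1, "b": 2}), OptionCall("add", {"a": 3, "b": 4})]


history = asyncio.run(step(toy_policy, ["start"], {"add": add}))
print([o.result for o in history if isinstance(o, OptionResult)])  # [3, 7]
```

Calling `step` again with the returned history lets the policy see the results and decide whether to stop.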

Everything is a Policy

Universal Return Types: Policies can return any JSON-serializable type - the framework automatically wraps results when called as options.

An LLM policy using the adapter pattern:

@policy.tool(description="OpenAI-compatible chat completions")
async def llm_policy(
    observations: list[Message],
    options: list[OptionSchema] | None = None,
) -> list[Message]:
    """Call chat completions API and return option_call or text messages."""
    adapter = CompletionsAdapter(observations, options)
    client = CompletionsAdapter.client()
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=adapter.messages,
        tools=adapter.tools or [],
    )
    return adapter.decode(response, policy="llm_policy")

A simple helper policy returning primitives:

@policy.tool(description="Performs a web search and returns results.")
async def web_search_policy(query: str) -> str:
    """Performs a web search and returns results."""
    results = DDGS().text(query, max_results=5)
    return "\n\n".join([f"Title: {r['title']}\nURL: {r['href']}\nSnippet: {r['body']}" for r in results])

A policy returning structured data with Pydantic:

from pydantic import BaseModel

class TranscriptResult(BaseModel):
    title: str
    transcript: str
    thumbnail_url: str
    video_id: str
    error: str | None = None

@policy.tool(description="Transcribe YouTube video")
async def transcribe_youtube_policy(url: str) -> TranscriptResult:
    """Get video title, transcript and metadata from YouTube URL."""
    video_id = _extract_video_id(url)
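    transcript = YouTubeTranscriptApi().fetch(video_id)
    # Hypothetical helper (not shown), like _extract_video_id above
    title = _fetch_video_title(video_id)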
    return TranscriptResult(
        title=title,
        transcript=" ".join([snippet.text for snippet in transcript]),
        thumbnail_url=f"https://img.youtube.com/vi/{video_id}/maxresdefault.jpg",
        video_id=video_id
    )

An orchestrator policy that coordinates LLM + search:

@policy.tool(description="Orchestrates LLM with web search tool")
async def search_agent(prompt: str, max_iterations: int = 3) -> list[Message]:
    """Orchestrator that gives the LLM access to web search."""
    ctx = AppContext.get_ctx()

    system_builder = MessageBuilder(policy="system", role_hint_for_llm="system")
    system_builder.add_text("You are an AI assistant that can use tools.")
    system_message = system_builder.to_message()

    prompt_builder = MessageBuilder(policy="user", role_hint_for_llm="user")
    prompt_builder.add_text(prompt)
    prompt_message = prompt_builder.to_message()

    history = [system_message, prompt_message]
    for _ in range(max_iterations):
        selected_actions = await ctx.llm(
            observations=history,
            options=OptionSchema.from_func([ctx.web_search])
        )
        history.extend(selected_actions)
        if not any(msg.payload and isinstance(msg.payload, OptionCallPayload) for msg in selected_actions):
            break

    return history

Use AppContext for registry (gives IDE type hints):

class AppContext(BaseContext):
    def __init__(self, runner):
        super().__init__(runner)
        self.llm = self._bind(llm_policy)
        self.web_search = self._bind(web_search_policy)
        self.search_agent = self._bind(search_agent)
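
For readers curious how such a registry might route calls, here is a rough sketch. The `ToyBaseContext`, its `_bind` method, and `ToyRunner` are simplified stand-ins written for illustration; they are assumptions about the pattern and do not reproduce PocketJoe's classes. Binding each policy to an attribute in `__init__` is what lets an IDE resolve names such as `ctx.greet`.

```python
import asyncio
from typing import Any, Awaitable, Callable


class ToyRunner:
    """Hypothetical runner: executes a policy and records the call."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def execute(self, func: Callable[..., Awaitable[Any]], **kwargs: Any) -> Any:
        self.calls.append(func.__name__)
        return await func(**kwargs)


class ToyBaseContext:
    """Hypothetical base class: binds policies so calls go through the runner."""

    def __init__(self, runner: ToyRunner) -> None:
        self._runner = runner

    def _bind(self, func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        async def bound(**kwargs: Any) -> Any:
            return await self._runner.execute(func, **kwargs)
        return bound


async def greet(name: str) -> str:
    return f"hello {name}"


class ToyContext(ToyBaseContext):
    def __init__(self, runner: ToyRunner) -> None:
        super().__init__(runner)
        self.greet = self._bind(greet)  # attribute access gives IDE completion


runner = ToyRunner()
ctx = ToyContext(runner)
print(asyncio.run(ctx.greet(name="Ada")))  # hello Ada
```

Routing every call through one runner object is the design choice that would let a different runner be swapped in later without touching the policies.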

Enjoy:

import asyncio

async def main():
    runner = InMemoryRunner()
    ctx = AppContext(runner)
    result = await ctx.search_agent(prompt="What is the latest Python version?")

    # Take the last message that has parts (Message.__str__ extracts text automatically)
    final_msg = next((msg for msg in reversed(result) if msg.parts), '')
    print(f"\nFinal Result: {final_msg}")

if __name__ == "__main__":
    asyncio.run(main())

Why this matters:

Universal Composability: Decorate any function - it works like FastAPI/FastMCP endpoints
Flexible Return Types: Return primitives (str), Pydantic models, or list[Message] for complex flows
Auto-wrapping: Framework automatically wraps results in OptionResultPayload when called as options
Type-safe: Full IDE support with typed context and message payloads
Evolution-friendly: Start simple (primitives) → add complexity (messages) with no refactoring
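
One way to picture the auto-wrapping described above is the sketch below. It is an assumption about the general shape, not the framework's implementation: `OptionResult` and `wrap_result` are hypothetical names, and the standard-library `dataclasses` module stands in for Pydantic to keep the example dependency-free.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


@dataclass
class OptionResult:
    """Hypothetical wrapper, loosely analogous to OptionResultPayload."""
    option: str
    result_json: str  # result serialized as JSON


def wrap_result(option: str, value: Any) -> OptionResult:
    """Wrap any JSON-serializable return value in a uniform result payload."""
    if is_dataclass(value) and not isinstance(value, type):
        value = asdict(value)  # structured objects become plain dicts
    return OptionResult(option=option, result_json=json.dumps(value))


@dataclass
class Transcript:
    title: str
    video_id: str


print(wrap_result("web_search", "three results").result_json)
# "three results"
print(wrap_result("transcribe", Transcript("Intro", "abc123")).result_json)
# {"title": "Intro", "video_id": "abc123"}
```

The policy author returns whatever is natural for the function, and the caller always receives the same payload shape.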

A correct, simple, performant, and pythonic framework for building durable AI agents.

"There is no flow, only Policies and Actions."

Working with Media

MediaPart supports three mutually exclusive sources for image/audio/video content:

from pocket_joe import MessageBuilder, MediaPart, iter_parts

# URL source - for remote images
builder = MessageBuilder(policy="agent")
builder.add_image(url="https://example.com/photo.png", mime="image/png")

# Path source - for local files (adapter handles reading)
builder.add_image_path(path="/path/to/image.png")

# Bytes source - for generated/inline content (base64-encoded internally)
builder.add_image_bytes(data=image_bytes, mime="image/png", prompt_hint="Generated cat")
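
The "mutually exclusive sources" rule could be enforced along the lines of the sketch below. `ToyMediaPart` is a hypothetical stand-in, not the library's `MediaPart`; only the field name `data_b64` and the `get_bytes()` method name are taken from this document, and the validation logic is an assumption.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyMediaPart:
    """Hypothetical media part with exactly one of three sources set."""
    mime: str
    url: Optional[str] = None
    path: Optional[str] = None
    data_b64: Optional[str] = None

    def __post_init__(self) -> None:
        sources = [self.url, self.path, self.data_b64]
        if sum(s is not None for s in sources) != 1:
            raise ValueError("set exactly one of url, path, or data_b64")

    @classmethod
    def from_bytes(cls, data: bytes, mime: str) -> "ToyMediaPart":
        """Store raw bytes as base64 text so the part stays JSON-friendly."""
        return cls(mime=mime, data_b64=base64.b64encode(data).decode("ascii"))

    def get_bytes(self) -> bytes:
        """Decode inline data; URL and path sources are left to the caller."""
        if self.data_b64 is None:
            raise ValueError("no inline data on this part")
        return base64.b64decode(self.data_b64)


part = ToyMediaPart.from_bytes(b"\x89PNG", mime="image/png")
print(part.get_bytes() == b"\x89PNG")  # True
```

Validating at construction time means a part with two sources, or none, fails immediately rather than when an adapter later tries to read it.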

Iterating Over Parts

Use iter_parts() to iterate over all parts across multiple messages with optional type filtering:

from pocket_joe import iter_parts, MediaPart, TextPart

# Find first image with inline data
first_image = next(
    (p for p in iter_parts(messages, MediaPart) if p.data_b64),
    None
)
if first_image:
    raw_bytes = first_image.get_bytes()

# Check if any images exist
has_images = any(iter_parts(messages, MediaPart))

# Get all text content
all_text = [p.text for p in iter_parts(messages, TextPart)]
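
For intuition, a helper like `iter_parts()` can be written as a short generator. The version below is a sketch under assumptions: the `Toy*` classes are hypothetical stand-ins for the library's message and part types, and `iter_parts_sketch` only mirrors the behavior described above (flatten parts across messages, optionally filter by type).

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, Optional, Type, TypeVar


@dataclass
class ToyTextPart:
    text: str


@dataclass
class ToyMediaPart:
    url: str


@dataclass
class ToyMessage:
    parts: list = field(default_factory=list)


P = TypeVar("P")


def iter_parts_sketch(
    messages: Iterable[ToyMessage], part_type: Optional[Type[P]] = None
) -> Iterator[P]:
    """Yield parts from every message in order, optionally filtered by type."""
    for message in messages:
        for part in message.parts:
            if part_type is None or isinstance(part, part_type):
                yield part


messages = [
    ToyMessage(parts=[ToyTextPart("hi"), ToyMediaPart("https://example.com/a.png")]),
    ToyMessage(parts=[ToyTextPart("bye")]),
]
print([p.text for p in iter_parts_sketch(messages, ToyTextPart)])  # ['hi', 'bye']
print(any(iter_parts_sketch(messages, ToyMediaPart)))  # True
```

Because the helper is lazy, `next(...)` and `any(...)` stop at the first match instead of scanning every message.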

Getting Started

Prerequisites

Python 3.12+

Installation

uv add pocket-joe

Or with pip:

pip install pocket-joe

To install with example dependencies:

uv add pocket-joe --extra examples
# or
pip install "pocket-joe[examples]"

Development Setup

git clone https://github.com/Sohojoe/pocket-joe.git
cd pocket-joe
uv sync --dev --all-extras

Running Examples

Set your API key:

export OPENAI_API_KEY=sk-...

Search Agent (ReAct)

uv run python examples/search_agent.py

YouTube Summarizer

uv run python examples/youtube_summarizer.py

Dev Status

Still in prerelease; things will change.

Initial version

[] Tidy up code - add partly refactored code
[] Proper tests
[] Implement more examples from Pocket-Flow

Durable System:

[] Ledger - Temporal-style 'at least once, only one result' replay semantics
[] Durable Storage wrapper - For long running tasks & replay
[] Distributed - worker model

Background

Inspired by PocketFlow... I loved PocketFlow but it fell short in a couple of key areas. This is my rewrite that I can actually use.

Project details

Release history

This version

0.2.0.10

Dec 21, 2025

0.2.0.8

Dec 17, 2025

0.2.0.7

Dec 17, 2025

0.2.0.6

Dec 15, 2025

0.2.0.5

Dec 15, 2025

0.1.0.4

Dec 14, 2025

0.1.0.3

Dec 7, 2025

0.1.0.2

Dec 1, 2025

0.1.0.1

Dec 1, 2025

Download files

Source Distribution

pocket_joe-0.2.0.10.tar.gz (117.1 kB)

Uploaded Dec 21, 2025 Source

Built Distribution

pocket_joe-0.2.0.10-py3-none-any.whl (14.4 kB)

Uploaded Dec 21, 2025 Python 3

File details

Details for the file pocket_joe-0.2.0.10.tar.gz.

File metadata

Download URL: pocket_joe-0.2.0.10.tar.gz
Upload date: Dec 21, 2025
Size: 117.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pocket_joe-0.2.0.10.tar.gz
Algorithm	Hash digest
SHA256	`18368ee9ad363a54b48a610c9883a3ab744fdd558604741253cc66622c330e61`
MD5	`3fa367199067f00dabbf66a81268a3eb`
BLAKE2b-256	`a56ef12dd7692a2d6aea90747b95f3dda5359a82ef027fe11de347fb1ef70492`

Provenance

The following attestation bundles were made for pocket_joe-0.2.0.10.tar.gz:

Publisher: publish.yml on Sohojoe/pocket-joe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pocket_joe-0.2.0.10.tar.gz
- Subject digest: 18368ee9ad363a54b48a610c9883a3ab744fdd558604741253cc66622c330e61
- Sigstore transparency entry: 774626743
- Sigstore integration time: Dec 21, 2025
Source repository:
- Permalink: Sohojoe/pocket-joe@1fe983cef1133f541c468ac3d7ff4defc3c56e33
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Sohojoe
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1fe983cef1133f541c468ac3d7ff4defc3c56e33
- Trigger Event: push

File details

Details for the file pocket_joe-0.2.0.10-py3-none-any.whl.

File metadata

Download URL: pocket_joe-0.2.0.10-py3-none-any.whl
Upload date: Dec 21, 2025
Size: 14.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pocket_joe-0.2.0.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6d4fd930cae8a482e890c02b5090bcbf9d4272623c0e8fc43837b5c79f3f34bb`
MD5	`7d43818e560674b721069b40a4954bfc`
BLAKE2b-256	`bfbe372445631dcda70e0b68bb3da6678da78b0d007436c39572b145ff9558c4`

Provenance

The following attestation bundles were made for pocket_joe-0.2.0.10-py3-none-any.whl:

Publisher: publish.yml on Sohojoe/pocket-joe

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pocket_joe-0.2.0.10-py3-none-any.whl
- Subject digest: 6d4fd930cae8a482e890c02b5090bcbf9d4272623c0e8fc43837b5c79f3f34bb
- Sigstore transparency entry: 774626745
- Sigstore integration time: Dec 21, 2025
Source repository:
- Permalink: Sohojoe/pocket-joe@1fe983cef1133f541c468ac3d7ff4defc3c56e33
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Sohojoe
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@1fe983cef1133f541c468ac3d7ff4defc3c56e33
- Trigger Event: push
