# Cogency

Streaming agents with stateless context assembly.
## Architecture
Cogency enables stateful agent execution through:

- **Persist-then-rebuild**: every LLM output event is written to storage immediately, and context is rebuilt from storage on each execution
- **Delimiter protocol**: explicit state signaling (`§think`, `§call`, `§execute`, `§respond`, `§end`)
- **Stateless design**: the agent and context assembly are pure functions; all state is externalized to storage

This eliminates stale-state bugs, enables crash recovery, and provides concurrency safety by treating storage as the single source of truth.
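A minimal sketch of the persist-then-rebuild loop (illustrative only; the `storage.append` and `storage.load` method names are assumptions for this sketch, not necessarily the library's Storage API):

```python
# Sketch of persist-then-rebuild. Method names on `storage` are assumed.
async def run_turn(llm, storage, conversation_id: str, user_text: str):
    # Persist first: the user message hits storage before anything runs.
    await storage.append(conversation_id, {"type": "user", "content": user_text})

    # Rebuild: context is always derived from storage, never from memory.
    context = await storage.load(conversation_id)

    async for event in llm.stream(context):
        # Persist every output event immediately, so a crash mid-stream
        # loses nothing and a restart rebuilds exactly what was written.
        await storage.append(conversation_id, event)
```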
## Execution Modes
**Resume**: a WebSocket session persists between tool calls.

```python
agent = Agent(llm="openai", mode="resume")
# Maintains the LLM session; injects tool results without context replay
# Constant token usage per turn
```
**Replay**: a fresh HTTP request per iteration.

```python
agent = Agent(llm="openai", mode="replay")
# Rebuilds context from storage each iteration
# Context grows with the conversation
# Universal LLM compatibility
```
**Auto**: resume, with fallback to replay.

```python
agent = Agent(llm="openai", mode="auto")  # Default
# Uses WebSocket when available, falls back to HTTP
```
## Token Efficiency
Resume mode maintains LLM session state, eliminating context replay on every tool call:
| Turns | Replay (context replay) | Resume (session state) | Efficiency |
|---|---|---|---|
| 8 | 31,200 tokens | 6,000 tokens | 5.2x |
| 16 | 100,800 tokens | 10,800 tokens | 9.3x |
| 32 | 355,200 tokens | 20,400 tokens | 17.4x |
See docs/proof.md for the full mathematical derivation.
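The table follows a simple cost model. Assuming for illustration a fixed prompt of about 1,200 tokens and roughly 600 tokens per turn (constants fitted to the rows above; the rigorous analysis is in docs/proof.md), replay resends the entire history each turn while resume pays the prompt once:

```python
# Illustrative cost model; the constants are assumptions fitted to the table.
SYSTEM = 1200   # fixed prompt tokens, resent on every replay request
PER_TURN = 600  # tokens added by each turn

def replay_tokens(turns: int) -> int:
    # Every turn resends the system prompt plus all prior turns.
    return sum(SYSTEM + i * PER_TURN for i in range(1, turns + 1))

def resume_tokens(turns: int) -> int:
    # The system prompt is sent once; each turn adds only its own tokens.
    return SYSTEM + turns * PER_TURN

for n in (8, 16, 32):
    print(n, replay_tokens(n), resume_tokens(n))
# 8 31200 6000
# 16 100800 10800
# 32 355200 20400
```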
## Installation
```bash
pip install cogency
export OPENAI_API_KEY="your-key"
```
## Usage
```python
from cogency import Agent

agent = Agent(llm="openai")

async for event in agent("What files are in this directory?"):
    if event["type"] == "respond":
        print(event["content"])
```
## Event Streaming
**Semantic mode** (default): complete thoughts.

```python
async for event in agent("Debug this code", chunks=False):
    if event["type"] == "think":
        print(f"~ {event['content']}")
    elif event["type"] == "respond":
        print(f"> {event['content']}")
```
**Token mode**: real-time streaming.

```python
async for event in agent("Debug this code", chunks=True):
    if event["type"] == "respond":
        print(event["content"], end="", flush=True)
```
## Multi-turn Conversations
```python
# Stateless (default)
async for event in agent("What's in this directory?"):
    if event["type"] == "respond":
        print(event["content"])

# Stateful with profile learning
async for event in agent(
    "Continue our code review",
    conversation_id="review_session",
    user_id="developer",  # for profile learning and multi-tenancy
):
    if event["type"] == "respond":
        print(event["content"])
```
## Built-in Tools
`read`, `write`, `edit`, `list`, `grep`, `search`, `scrape`, `recall`, `shell`
## Custom Tools
```python
from cogency import Tool, ToolResult

class DatabaseTool(Tool):
    name = "query_db"
    description = "Execute SQL queries"

    async def execute(self, sql: str, user_id: str):
        # Your implementation
        return ToolResult(
            outcome="Query executed",
            content="Results...",
        )

agent = Agent(llm="openai", tools=[DatabaseTool()])
```
## Configuration
```python
agent = Agent(
    llm="openai",            # or "gemini", "anthropic"
    mode="auto",             # "resume", "replay", or "auto"
    storage=custom_storage,  # custom Storage implementation
    identity="Custom agent identity",
    instructions="Additional context",
    tools=[CustomTool()],
    max_iterations=10,
    history_window=None,     # None = full history (default), int = sliding window
    profile=True,            # enable automatic user learning
    learn_every=5,           # profile update frequency
    debug=False,
)
```
## Context Management
Cogency uses conversational message assembly for natural LLM interaction:
**Storage**: events are stored as typed records (clean content, no delimiters).

```python
{"type": "user", "content": "debug this"}
{"type": "think", "content": "checking logs"}
{"type": "call", "content": '{"name": "read", ...}'}
```
**Assembly**: transforms records into a proper conversational structure.

```python
[
    {"role": "system", "content": "PROTOCOL + TOOLS"},
    {"role": "user", "content": "debug this"},
    {"role": "assistant", "content": "§think: checking logs\n§call: {...}\n§execute"},
    {"role": "user", "content": "§result: ..."},
]
```
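As an illustration of that transformation (not the library's implementation; the `result` record type and the merging rules are assumptions here), assembly can be a fold over stored events that restores the delimiters on assistant segments:

```python
def assemble(system_prompt: str, events: list[dict]) -> list[dict]:
    """Fold typed storage records into chat messages, restoring § delimiters."""
    messages = [{"role": "system", "content": system_prompt}]
    for event in events:
        if event["type"] == "user":
            messages.append({"role": "user", "content": event["content"]})
        elif event["type"] in ("think", "call"):
            line = f"§{event['type']}: {event['content']}"
            if event["type"] == "call":
                line += "\n§execute"  # a call is followed by the execute signal
            if messages[-1]["role"] == "assistant":
                messages[-1]["content"] += "\n" + line
            else:
                messages.append({"role": "assistant", "content": line})
        elif event["type"] == "result":
            messages.append({"role": "user", "content": f"§result: {event['content']}"})
    return messages
```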
Cost control with `history_window`:

- `history_window=None` - full conversation history (default)
- `history_window=20` - last 20 messages (sliding window for cost control)
- Custom compaction: query storage directly and implement an app-level strategy
Considerations:

- Resume mode: context is sent once at connection time; minimal impact
- Replay mode: context grows with the conversation; windowing is recommended for long sessions (see the example below)
- Frontier models handle longer contexts well and can use `None`
- Weaker models may benefit from smaller windows (e.g., 10-20 messages)
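For example, a long-running replay-mode session with a sliding window keeps per-iteration context bounded:

```python
# Replay mode with a 20-message sliding window for cost control
agent = Agent(llm="openai", mode="replay", history_window=20)
```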
## Multi-Provider Support
agent = Agent(llm="openai") # GPT-4o Realtime API (WebSocket)
agent = Agent(llm="gemini") # Gemini Live (WebSocket)
agent = Agent(llm="anthropic") # Claude (HTTP only)
## Custom CLI
Build your own CLI with the exported `Renderer`:

```python
import asyncio

from cogency import Agent, Renderer

agent = Agent(llm="anthropic", tools=my_tools())
renderer = Renderer()

async def main():
    await renderer.render_stream(
        agent("your query", conversation_id="session")
    )

asyncio.run(main())
```
## CLI
```bash
# Install with poetry
poetry install

# Stateless (default)
cogency run "What files are in this directory?"

# Multi-turn conversation
cogency run "hi" --conv session1
cogency run "what did i say?" --conv session1

# Custom user for profile learning
cogency run "help me code" --user alice

# Custom agent
cogency run "analyze this" --agent my_agent.py

# View conversation history
cogency conv session1

# Debug commands
cogency context system  # Show system prompt
cogency context <id>    # Show assembled context
cogency stats           # Database statistics
cogency users           # User profiles
cogency nuke            # Delete the .cogency folder
```
Display format:

```text
> Agent response to user
~ Internal reasoning
○ Tool execution begins
● Tool execution complete
% 890➜67|4.8s (input→output tokens | duration)
```
## Memory System
**Passive profile**: automatic user preference learning.

```python
agent = Agent(llm="openai", profile=True)
# Learns patterns from interactions, embedded in the system prompt
```
**Active recall**: cross-conversation search.

```text
# The agent uses the recall tool to query past interactions
§call: {"name": "recall", "args": {"query": "previous python debugging"}}
§execute
[SYSTEM: Found 3 previous debugging sessions...]
§respond: Based on your previous Python work...
```
## Streaming Protocol
Agents signal execution state explicitly:

```text
§think: I need to examine the code structure first
§call: {"name": "read", "args": {"file": "main.py"}}
§execute
[SYSTEM: Found syntax error on line 15]
§respond: Fixed the missing semicolon. Code runs correctly now.
§end
```
The parser detects delimiters, the accumulator handles tool execution, and the persister writes events to storage. See docs/protocol.md for the complete specification.
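For a feel of the parsing step, here is a minimal delimiter detector (an illustration only, not the library's parser, which also streams incrementally and accumulates tool calls):

```python
import re

# The five protocol delimiters; a trailing colon is optional (§execute, §end).
DELIMS = ("think", "call", "execute", "respond", "end")
PATTERN = re.compile(r"§(" + "|".join(DELIMS) + r"):?")

def parse_events(transcript: str):
    """Yield (state, content) pairs from a §-delimited transcript."""
    parts = PATTERN.split(transcript)
    # With a capturing group, split alternates: [preamble, state, content, ...]
    for state, content in zip(parts[1::2], parts[2::2]):
        yield state, content.strip()

transcript = (
    "§think: I need to examine the code structure first\n"
    '§call: {"name": "read", "args": {"file": "main.py"}}\n'
    "§execute\n"
    "§respond: Fixed the missing semicolon.\n"
    "§end"
)
for state, content in parse_events(transcript):
    print(state, "->", content)
```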
## Documentation
- [architecture.md](docs/architecture.md) - core pipeline and design decisions
- [protocol.md](docs/protocol.md) - delimiter protocol specification
- [proof.md](docs/proof.md) - mathematical efficiency analysis
## License
Apache 2.0