Cogent
Build AI agents that actually work.
📚 Documentation: https://milad-o.github.io/cogent
Installation • Quick Start • Architecture • Capabilities • Examples
Cogent is a production AI agent framework built on recent research in memory control and semantic caching. Unlike frameworks focused on multi-agent orchestration, Cogent emphasizes bounded memory, reasoning-artifact caching, and tool augmentation for performance and reliability.
Why Cogent?
- 🧠 Memory Control — Bio-inspired bounded memory prevents context drift and poisoning
- ⚡ Semantic Caching — Cache reasoning artifacts (intents, plans) at 80%+ hit rates
- 🚀 Fast — Parallel tool execution, cached model binding, direct SDK calls
- 🔧 Simple — Define tools with `@tool`, create agents in 3 lines, no boilerplate
- 🏭 Production-ready — Built-in resilience, observability, and security interceptors
- 📦 Batteries included — File system, web search, code sandbox, browser, PDF, and more
```python
from cogent import Agent, tool

@tool
async def search(query: str) -> str:
    """Search the web."""
    # Your search implementation
    return results

agent = Agent(name="Assistant", model="gpt-4o-mini", tools=[search])
result = await agent.run("Find the latest news on AI agents")
```
🎉 Latest Changes (v1.19.0)
A2A Protocol + Observer Redesign + GPT-5.4 Defaults
- 🌐 Agent2Agent (A2A) — `A2AAgent` wraps any A2A HTTP endpoint as a drop-in subagent. `A2AServer` exposes any cogent agent as an A2A server with SSE streaming, persistent task store, push notifications, and auth schemes. `Agent.serve(port=...)` is a one-liner for scripts. Requires `cogent-ai[a2a]`.
- 🔭 Observer levels simplified — Four levels: `"off"`, `"progress"` (default), `"debug"`, `"trace"`. Old preset aliases (`minimal`, `verbose`, `normal`, `detailed`) removed from `Observer`; `verbosity=` string shortcuts are unchanged.
- 🎨 Console formatter overhaul — Semantic label colors, consistent indented body layout, `[type@address]` source labels for A2A and MCP tools (e.g. `a2a://analyst@localhost:10088`).
- 🤖 GPT-5.4 defaults — `"gpt4"`/`"gpt4o"` aliases now resolve to `gpt-5.4`; `"gpt4-mini"` resolves to `gpt-5.4-mini`.
- 🔗 `mcps=` parameter on `Agent` — First-class MCP argument alongside `tools=`, `capabilities=`, and `subagents=`.
See CHANGELOG.md for full version history and migration guide.
Features
- Native Executor — High-performance parallel tool execution with zero framework overhead
- Native Model Support — OpenAI, Azure, Anthropic, Gemini, Groq, Ollama, Custom endpoints
- Capabilities — Filesystem, Web Search, Code Sandbox, Browser, PDF, Shell, MCP, Spreadsheet, and more
- RAG Pipeline — Document loading, per-file-type splitting, embeddings, vector stores, retrievers
- Memory & Persistence — Conversation history, long-term memory with fuzzy matching (docs/memory.md)
- Memory Control (ACC) — Bio-inspired bounded memory prevents drift (docs/acc.md)
- Semantic Caching — Cache reasoning artifacts at 80%+ hit rates (docs/memory.md#semantic-cache)
- Observability — Event-based tracing, metrics, progress tracking, and runtime event history
- TaskBoard — Built-in task tracking for complex multi-step workflows
- Interceptors — Budget guards, rate limiting, PII protection, tool gates
- Resilience — Two-tier recovery: systematic retry + intelligent LLM-driven retry
- Human-in-the-Loop — Tool approval, guidance, interruption handling
- Agent2Agent (A2A) — `A2AAgent` connects to remote A2A agents as subagents; `A2AServer`/`Agent.serve()` expose agents as A2A endpoints
- Streaming — Real-time token streaming with callbacks
- Structured Output — Type-safe responses (Pydantic, dataclass, TypedDict, primitives, Literal, Union, Enum, collections, dict, None)
- Reasoning — Extended thinking mode with chain-of-thought
Modules
Cogent is organized into focused modules, each with multiple backends and implementations.
cogent.models — LLM Providers
Native SDK wrappers for all major LLM providers with zero abstraction overhead.
| Provider | Chat | Embeddings | String Alias | Notes |
|---|---|---|---|---|
| OpenAI | `OpenAIChat` | `OpenAIEmbedding` | `"gpt4"`, `"gpt-4o"`, `"gpt-4o-mini"` | GPT-5.4 series, o1, o3 |
| Azure | `AzureOpenAIChat` | `AzureOpenAIEmbedding` | — | Managed Identity, Entra ID auth |
| Azure AI Foundry | `AzureAIFoundryChat` | — | — | GitHub Models integration |
| Anthropic | `AnthropicChat` | — | `"claude"`, `"claude-opus"` | Claude 3.5 Sonnet, extended thinking |
| Gemini | `GeminiChat` | `GeminiEmbedding` | `"gemini"`, `"gemini-pro"` | Gemini 2.5 Pro/Flash |
| Groq | `GroqChat` | — | `"llama"`, `"mixtral"` | Fast inference, Llama 3.3, Mixtral |
| xAI | `XAIChat` | — | `"grok"` | Grok 4.20, Grok 4, vision models |
| DeepSeek | `DeepSeekChat` | — | `"deepseek"` | DeepSeek Chat, DeepSeek Reasoner |
| Cerebras | `CerebrasChat` | — | `"cerebras"` | Ultra-fast inference with WSE-3 |
| Mistral | `MistralChat` | `MistralEmbedding` | `"mistral"`, `"codestral"` | Mistral Large, Ministral |
| Cohere | `CohereChat` | `CohereEmbedding` | `"command"`, `"command-r"` | Command R+, Aya |
| OpenRouter | `OpenRouterChat` | — | `"or-gpt4o"`, `"or-claude"`, `"or-auto"` | 200+ models via OpenRouter |
| Cloudflare | `CloudflareChat` | `CloudflareEmbedding` | — | Workers AI (`@cf/...`) |
| Ollama | `OllamaChat` | `OllamaEmbedding` | `"ollama"` | Local models, any GGUF |
| Custom | `CustomChat` | `CustomEmbedding` | — | vLLM, Together AI, any OpenAI-compatible |
```python
# 3 ways to create models

# 1. Simple strings (recommended)
agent = Agent("Helper", model="gpt4")
agent = Agent("Helper", model="claude")
agent = Agent("Helper", model="gemini")

# 2. Factory functions
from cogent import create_chat, create_embedding

model = create_chat("gpt4")             # String alias
model = create_chat("gpt-4o-mini")      # Model name
model = create_chat("claude-sonnet-4")  # Auto-detects provider
model = create_chat("grok-4")           # xAI Grok
model = create_chat("deepseek-chat")    # DeepSeek

embeddings = create_embedding("openai:text-embedding-3-small")  # Explicit provider:model

# 3. Direct instantiation (full control)
from cogent.models import OpenAIChat, XAIChat, DeepSeekChat

model = OpenAIChat(model="gpt-4o", temperature=0.7, api_key="sk-...")
model = XAIChat(model="grok-4", api_key="xai-...")
model = DeepSeekChat(model="deepseek-reasoner", api_key="sk-...")
```
cogent.capabilities — Agent Capabilities
Composable tools that plug into any agent. Each capability adds related tools.
| Capability | Description | Tools Added |
|---|---|---|
| HTTPClient | Full-featured HTTP client | http_request, http_get, http_post with retries, timeouts |
| Database | Async SQL database access | execute_query, fetch_one, fetch_all with connection pooling |
| APITester | HTTP endpoint testing | test_endpoint, assert_status, assert_json |
| DataValidator | Schema validation | validate_data, validate_json, validate_dict with Pydantic |
| WebSearch | Web search with caching | web_search, news_search with semantic cache |
| Browser | Playwright automation | navigate, click, fill, screenshot |
| FileSystem | Sandboxed file operations | read_file, write_file, list_dir, search_files |
| CodeSandbox | Safe Python execution | execute_python, run_function |
| Shell | Sandboxed shell commands | run_command |
| PDF | PDF processing | read_pdf, create_pdf, merge_pdfs |
| Spreadsheet | Excel/CSV operations | read_spreadsheet, write_spreadsheet |
| MCP | Model Context Protocol | Dynamic tools from MCP servers |
```python
from cogent.capabilities import FileSystem, CodeSandbox, WebSearch, HTTPClient, Database
from cogent.capabilities import MCP

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    capabilities=[
        FileSystem(allowed_paths=["./project"]),
        CodeSandbox(timeout=30),
        WebSearch(),
        HTTPClient(),
        Database("sqlite:///data.db"),
    ],
    mcps=MCP(command="npx", args=["-y", "@modelcontextprotocol/server-filesystem", "."]),
)
```
cogent.document — Document Processing
Load, split, and process documents for RAG pipelines.
Loaders — Support for all common file formats:
| Loader | Formats | Notes |
|---|---|---|
| `TextLoader` | .txt, .rst | Plain text extraction |
| `MarkdownLoader` | .md | Markdown with structure |
| `PDFLoader` | .pdf | Basic text extraction (pypdf/pdfplumber) |
| `PDFMarkdownLoader` | .pdf | Clean markdown output (pymupdf4llm) |
| `PDFVisionLoader` | .pdf | Vision model-based extraction |
| `WordLoader` | .docx | Microsoft Word documents |
| `HTMLLoader` | .html, .htm | HTML documents |
| `CSVLoader` | .csv | CSV files |
| `JSONLoader` | .json, .jsonl | JSON documents |
| `XLSXLoader` | .xlsx | Excel spreadsheets |
| `CodeLoader` | .py, .js, .ts, .java, .go, .rs, .cpp, etc. | Source code files |
Splitters — Multiple chunking strategies:
| Splitter | Strategy |
|---|---|
| `RecursiveCharacterSplitter` | Hierarchical separators (default) |
| `SentenceSplitter` | Sentence boundary detection |
| `MarkdownSplitter` | Markdown structure-aware |
| `HTMLSplitter` | HTML tag-based |
| `CodeSplitter` | Language-aware code splitting |
| `SemanticSplitter` | Embedding-based semantic chunking |
| `TokenSplitter` | Token count-based |
```python
from cogent.document import DocumentLoader, SemanticSplitter

loader = DocumentLoader()
docs = await loader.load_directory("./documents")

splitter = SemanticSplitter(model=model)
chunks = splitter.split_documents(docs)
```
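The hierarchical-separator strategy behind the default recursive splitter can be illustrated in a few lines: try the coarsest separator first (paragraphs), and recurse with finer separators only for chunks that are still too large. This is a simplified sketch of the technique, not the library's implementation.

```python
def recursive_split(
    text: str,
    max_len: int = 200,
    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " "),
) -> list[str]:
    """Split on the coarsest separator that works, recursing on oversized parts."""
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separators left: hard-cut at the size limit.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks: list[str] = []
    for part in text.split(sep):
        if len(part) <= max_len:
            chunks.append(part)
        else:
            chunks.extend(recursive_split(part, max_len, rest))
    return [c for c in chunks if c.strip()]

doc = "Paragraph one is short.\n\n" + "A much longer paragraph. " * 20
for chunk in recursive_split(doc, max_len=120):
    assert len(chunk) <= 120  # every chunk respects the size budget
```

Splitting at natural boundaries first is what keeps chunks semantically coherent instead of cutting mid-sentence.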
cogent.vectorstore — Vector Storage
Semantic search with pluggable backends and embedding providers.
Backends:
| Backend | Use Case | Persistence |
|---|---|---|
| `InMemoryBackend` | Development, small datasets | No |
| `FAISSBackend` | Large-scale local search | Optional |
| `ChromaBackend` | Persistent vector database | Yes |
| `QdrantBackend` | Production vector database | Yes |
| `PgVectorBackend` | PostgreSQL integration | Yes |
Embedding Providers:
| Provider | Model Examples |
|---|---|
| OpenAI | `openai:text-embedding-3-small`, `openai:text-embedding-3-large` |
| Ollama | `ollama:nomic-embed-text`, `ollama:mxbai-embed-large` |
| Mock | Testing only |
```python
from cogent import create_embedding
from cogent.vectorstore import VectorStore
from cogent.vectorstore.backends import FAISSBackend

store = VectorStore(
    embeddings=create_embedding("openai:text-embedding-3-large"),
    backend=FAISSBackend(dimension=3072),
)
```
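Every backend answers the same underlying question: which stored vectors are nearest to the query embedding? A brute-force in-memory version makes the idea concrete (illustrative only; the real backends use optimized indexes):

```python
import math

class TinyVectorIndex:
    """Brute-force cosine-similarity search over stored vectors."""

    def __init__(self):
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._items.append((doc_id, vector))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        def cos(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        # Score every stored vector, return the k most similar.
        scored = [(doc_id, cos(query, vec)) for doc_id, vec in self._items]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

index = TinyVectorIndex()
index.add("python-doc", [1.0, 0.0, 0.2])
index.add("cooking-doc", [0.0, 1.0, 0.0])
print(index.search([0.9, 0.1, 0.1], k=1))  # "python-doc" ranks first
```

Backends like FAISS replace the linear scan with approximate-nearest-neighbor indexes so search stays fast at scale.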
cogent.memory — Memory & Persistence
Long-term memory with fuzzy matching (semantic fallback optional), conversation history, and scoped views.
Stores:
| Store | Backend | Features |
|---|---|---|
| `InMemoryStore` | Dict | Fast, no persistence |
| `SQLAlchemyStore` | SQLite, PostgreSQL, MySQL | Async, full SQL |
| `RedisStore` | Redis | Distributed, native TTL |
```python
from cogent.memory import Memory, SQLAlchemyStore

memory = Memory(store=SQLAlchemyStore("sqlite+aiosqlite:///./data.db"))

# Scoped views
user_mem = memory.scoped("user:alice")
team_mem = memory.scoped("team:research")
```
cogent.executors — Execution Strategies
Pluggable execution strategies that define HOW agents process tasks.
| Executor | Strategy | Use Case |
|---|---|---|
| `NativeExecutor` | Parallel tool execution | Default, high performance |
| `SequentialExecutor` | Sequential tool execution | Ordered dependencies |
Standalone execution — bypass Agent class entirely:
```python
from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize",
    tools=[search, summarize],
    model="gpt-4o-mini",
)
```
cogent.interceptors — Middleware
Composable middleware for cross-cutting concerns.
| Category | Interceptors |
|---|---|
| Budget | BudgetGuard (token/cost limits) |
| Security | PIIShield, ContentFilter |
| Rate Limiting | RateLimiter, ThrottleInterceptor |
| Context | ContextCompressor, TokenLimiter |
| Gates | ToolGate, PermissionGate, ConversationGate |
| Resilience | Failover, CircuitBreaker, ToolGuard |
| Audit | Auditor (event logging) |
| Prompt | PromptAdapter, ContextPrompt, LambdaPrompt |
```python
from cogent.interceptors import BudgetGuard, PIIShield, RateLimiter

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ],
)
```
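Conceptually, interceptors form a middleware chain around each model call: each layer can inspect, modify, or reject the request before passing it inward. A generic sketch of the pattern (not Cogent's actual interfaces; the handler names are illustrative):

```python
import re
from typing import Callable

Handler = Callable[[dict], str]

def budget_guard(max_calls: int) -> Callable[[Handler], Handler]:
    """Reject requests once the call budget is exhausted."""
    calls = {"n": 0}

    def wrap(next_handler: Handler) -> Handler:
        def handler(request: dict) -> str:
            if calls["n"] >= max_calls:
                raise RuntimeError("budget exhausted")
            calls["n"] += 1
            return next_handler(request)
        return handler
    return wrap

def pii_shield(next_handler: Handler) -> Handler:
    """Redact obvious email addresses before the model sees them."""
    def handler(request: dict) -> str:
        request = dict(request)
        request["prompt"] = re.sub(r"\S+@\S+", "[EMAIL]", request["prompt"])
        return next_handler(request)
    return handler

def model_call(request: dict) -> str:
    return f"echo: {request['prompt']}"  # stand-in for the real LLM call

# Compose: the outermost interceptor runs first.
pipeline = budget_guard(max_calls=2)(pii_shield(model_call))
print(pipeline({"prompt": "Contact alice@example.com"}))  # email redacted
```

Because each layer only knows about the next handler, guards can be added, removed, or reordered without touching the others.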
cogent.observability — Monitoring & Tracing
Comprehensive monitoring for understanding system behavior.
| Component | Purpose |
|---|---|
| `ExecutionTracer` | Deep execution tracing with spans |
| `MetricsCollector` | Counter, Gauge, Histogram, Timer |
| `ProgressTracker` | Real-time progress output |
| `Observer` | Unified observability with history capture |
| `Dashboard` | Visual inspection interface |
| Inspectors | Agent, Task, Event inspection |
Renderers: `TextRenderer`, `RichRenderer`, `JSONRenderer`, `MinimalRenderer`
```python
from cogent.observability import ExecutionTracer, ProgressTracker

tracer = ExecutionTracer()
async with tracer.trace("my-operation") as span:
    span.set_attribute("user_id", user_id)
    result = await do_work()
```
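The span pattern used above reduces to a context manager that records timing and attributes for a named operation. A toy illustration of the mechanism (not the real tracer):

```python
import time
from contextlib import contextmanager

@contextmanager
def trace(name: str, spans: list):
    """Record a named span's duration and attributes into `spans`."""
    span: dict = {"name": name, "attributes": {}}
    start = time.perf_counter()
    try:
        yield span["attributes"]
    finally:
        # The span is closed even if the traced block raises.
        span["duration_s"] = time.perf_counter() - start
        spans.append(span)

spans: list[dict] = []
with trace("my-operation", spans) as attrs:
    attrs["user_id"] = "alice"
    time.sleep(0.01)  # stand-in for real work

print(spans[0]["name"], spans[0]["attributes"])
```

Collecting closed spans into a shared list is what lets a tracer reconstruct the full execution timeline afterwards.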
Installation
Note: The package is published as `cogent-ai` on PyPI, but you import it as `cogent` in your code.
```bash
# Install from PyPI
uv add cogent-ai

# With extras
uv add "cogent-ai[vector-stores,retrieval]"
uv add "cogent-ai[database]"
uv add "cogent-ai[all-backend]"
uv add "cogent-ai[all]"

# Or install from source (latest)
uv add git+https://github.com/milad-o/cogent.git
uv add "cogent-ai[all] @ git+https://github.com/milad-o/cogent.git"
```
Optional dependency groups:
| Group | Purpose | Includes |
|---|---|---|
| `vector-stores` | Vector databases | FAISS, Qdrant, SciPy |
| `retrieval` | Retrieval libraries | BM25, sentence-transformers |
| `database` | SQL databases | SQLAlchemy, aiosqlite, asyncpg, psycopg2 |
| `infrastructure` | Infrastructure | Redis |
| `web` | Web tools | BeautifulSoup4, DuckDuckGo search |
| `browser` | Browser automation | Playwright |
| `document` | Document processing | PDF, Word, Markdown loaders |
| `api` | API framework | FastAPI, Uvicorn, Starlette |
| `visualization` | Graphs & charts | PyVis, Gravis, Matplotlib, Seaborn, Pandas |
| `anthropic` | Claude models | Anthropic SDK |
| `azure` | Azure models | Azure Identity, Azure AI Inference |
| `cerebras` | Cerebras models | Cerebras Cloud SDK |
| `cohere` | Cohere models | Cohere SDK |
| `gemini` | Gemini models | Google GenAI SDK |
| `groq` | Groq models | Groq SDK |
| `all-providers` | All LLM providers | anthropic, azure, cerebras, cohere, gemini, groq |
| `a2a` | Agent2Agent protocol | a2a-sdk, uvicorn |
| `all-backend` | All backends | vector-stores, retrieval, database, infrastructure |
| `all` | Everything | All above + visualization |
Development installation:
```bash
# Core dev tools (linting, type checking)
uv sync --group dev

# Add testing
uv sync --group dev --group test

# Add backend tests (vector stores, databases)
uv sync --group dev --group test --group test-backends

# Add documentation
uv sync --group dev --group test --group test-backends --group docs
```
Core Architecture
Cogent is built around a high-performance Native Executor that eliminates framework overhead while providing enterprise-grade features.
Native Executor
The executor uses a direct asyncio loop with parallel tool execution—no graph frameworks, no unnecessary abstractions:
```python
from cogent import Agent, tool

@tool
def search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Evaluate math expression."""
    return str(eval(expression))  # demo only: eval is unsafe on untrusted input

agent = Agent(
    name="Assistant",
    model="gpt4",  # Simple string model
    tools=[search, calculate],
)

# Tools execute in parallel when independent
result = await agent.run("Search for Python and calculate 2^10")
```
Key optimizations:
- Parallel tool execution — Multiple tool calls run concurrently via `asyncio.gather`
- Cached model binding — Tools bound once at construction, zero overhead per call
- Native SDK integration — Direct OpenAI/Anthropic SDK calls, no translation layers
- Automatic resilience — Rate limit retries with exponential backoff built-in
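The parallel-execution optimization boils down to scheduling independent tool calls with `asyncio.gather` instead of awaiting them one at a time. A standalone illustration (not Cogent's internals):

```python
import asyncio
import time

async def slow_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for I/O (HTTP, DB, ...)
    return f"{name}: done"

async def main() -> None:
    start = time.perf_counter()
    # Two independent tool calls run concurrently, not back to back.
    results = await asyncio.gather(
        slow_tool("search", 0.1),
        slow_tool("calculate", 0.1),
    )
    elapsed = time.perf_counter() - start
    print(results)         # ['search: done', 'calculate: done']
    assert elapsed < 0.18  # roughly one delay total, not the sum of both

asyncio.run(main())
```

The total latency is bounded by the slowest tool rather than the sum of all tools, which is where the speedup comes from.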
Tool System
Define tools with the `@tool` decorator — automatic schema extraction from type hints and docstrings:
```python
import httpx

from cogent import tool
from cogent.core.context import RunContext

@tool
def search(query: str, max_results: int = 10) -> str:
    """Search the web for information.

    Args:
        query: Search query string.
        max_results: Maximum results to return.
    """
    return f"Found {max_results} results for: {query}"

# With context injection for user/session data
@tool
def get_user_preferences(ctx: RunContext) -> str:
    """Get preferences for the current user."""
    return f"Preferences for user {ctx.user_id}"

# Async tools supported
@tool
async def fetch_data(url: str) -> str:
    """Fetch data from URL."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text
```
Tool features:
- Type hints → JSON schema conversion
- Docstring → description extraction
- Sync and async function support
- Context injection via `ctx: RunContext` parameter
- Automatic error handling and retries
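The schema-extraction step (type hints and docstrings to a JSON tool schema) can be sketched with the standard library's `inspect` module; this illustrates the technique, not Cogent's extractor:

```python
import inspect

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Build a JSON-schema-style tool description from hints and the docstring."""
    doc = inspect.getdoc(fn) or ""
    properties: dict = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        # Map the Python annotation to a JSON schema type (default: string).
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": fn.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

def search(query: str, max_results: int = 10) -> str:
    """Search the web for information."""
    return f"Found {max_results} results for: {query}"

print(tool_schema(search)["parameters"]["required"])  # ['query']
```

Parameters with defaults become optional in the schema, which is why `max_results` is absent from `required`.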
Standalone Execution
For maximum performance, bypass the Agent class entirely:
```python
from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize the top 3",
    tools=[search, summarize],
    model="gpt-4o-mini",
)
```
Quick Start
Simple Agent
```python
import asyncio
from cogent import Agent, tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, sunny"

async def main():
    agent = Agent(
        name="Assistant",
        model="gpt-4o-mini",
        tools=[get_weather],
    )
    result = await agent.run("What's the weather in Tokyo?")
    print(result)

asyncio.run(main())
```
Multi-Agent with Subagents
```python
from cogent import Agent

# Create specialist agents
data_analyst = Agent(
    name="data_analyst",
    model="gpt-4o-mini",
    instructions="Analyze data and provide statistical insights.",
)

market_researcher = Agent(
    name="market_researcher",
    model="gpt-4o-mini",
    instructions="Research market trends and competitive landscape.",
)

# Coordinator delegates to specialists
coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    instructions="""Coordinate research tasks:
    - Use data_analyst for numerical analysis
    - Use market_researcher for market trends
    Synthesize their findings.""",
    # Simply pass the agents - uses their names automatically
    subagents=[data_analyst, market_researcher],
)

# Full metadata preserved (tokens, duration, delegation chain)
result = await coordinator.run("Analyze Q4 2025 e-commerce growth")
print(f"Total tokens: {result.metadata.tokens.total_tokens}")  # Includes all subagents
print(f"  Prompt: {result.metadata.tokens.prompt_tokens}")
print(f"  Completion: {result.metadata.tokens.completion_tokens}")
if result.metadata.tokens.reasoning_tokens:
    print(f"  Reasoning: {result.metadata.tokens.reasoning_tokens}")
print(f"Subagent calls: {len(result.subagent_responses)}")
```
Agent2Agent (A2A)
Connect to remote agents and expose your own as A2A endpoints. Requires `cogent-ai[a2a]`.
```python
from cogent import Agent
from cogent.agent import A2AAgent, A2AServer

# Wrap a remote A2A endpoint as a subagent
remote_analyst = A2AAgent(
    name="analyst",
    url="http://localhost:10088",
    description="Analyzes financial data",
)

coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    subagents=[remote_analyst],  # mix local and remote freely
)

# Expose an agent as an A2A server (one-liner for scripts)
agent = Agent(name="Assistant", model="gpt-4o-mini")
agent.serve(port=10002)  # blocking

# Or async with background=True (default)
server = await A2AServer(agent, port=10002).start()
# ... do work ...
await server.stop()

# Serve multiple agents concurrently
group = await A2AServer.start_many((agent_a, 10001), (agent_b, 10002))
await group.stop_all()
```
Streaming
```python
agent = Agent(
    model="gpt-4o-mini",
    stream=True,
)

async for chunk in agent.run_stream("Write a poem"):
    print(chunk.content, end="", flush=True)
```
Human-in-the-Loop
```python
from cogent import Agent
from cogent.agent import InterruptedException

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    tools=[sensitive_tool],
    interrupt_on={"tools": ["sensitive_tool"]},  # Require approval
)

try:
    result = await agent.run("Do something sensitive")
except InterruptedException as e:
    # Handle approval flow
    decision = await get_human_decision(e.pending_action)
    result = await agent.resume(e.state, decision)
```
Observability
```python
from cogent import Agent
from cogent.observability import Observer, ObservabilityLevel

# Verbosity levels for agents
agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    verbosity="debug",  # off | result | progress | detailed | debug | trace
)

# Or use enum/int
agent = Agent(model=model, verbosity=ObservabilityLevel.DEBUG)  # Enum
agent = Agent(model=model, verbosity=4)  # Int (0-5)

# Boolean shorthand
agent = Agent(model=model, verbosity=True)  # → PROGRESS level

# With observer for history capture
observer = Observer(level="debug", capture=["tool.result", "agent.*"])
result = await agent.run("Query", observer=observer)

# Access captured events
for event in observer.history():
    print(event)
```
Interceptors
Control execution flow with middleware:
```python
from cogent.interceptors import (
    BudgetGuard,    # Token/cost limits
    RateLimiter,    # Request throttling
    PIIShield,      # Redact sensitive data
    ContentFilter,  # Block harmful content
    ToolGate,       # Conditional tool access
    PromptAdapter,  # Modify prompts dynamically
    Auditor,        # Audit logging
)

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100, max_tool_calls=500),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ],
)
```
Structured Output
Type-safe responses with comprehensive type support and automatic validation:
Supported Types:
- Structured Models: `BaseModel`, `dataclass`, `TypedDict`
- Primitives: `str`, `int`, `bool`, `float`
- Constrained: `Literal["A", "B", "C"]`
- Collections: `list[T]`, `set[T]`, `tuple[T, ...]` (wrap in models for reliability)
- Polymorphic: `Union[A, B]` (agent chooses schema)
- Enumerations: `Enum` types
- Dynamic: `dict` (agent decides structure)
- Confirmation: `None` type
```python
from pydantic import BaseModel
from typing import Literal, Union
from enum import Enum
from cogent import Agent

# Structured models
class Analysis(BaseModel):
    sentiment: str
    confidence: float
    topics: list[str]

# Configure on agent (all calls use schema)
agent = Agent(
    name="Analyzer",
    model="gpt-4o-mini",
    output=Analysis,  # Enforce schema on all runs
)

result = await agent.run("Analyze: I love this product!")
print(result.content.data.sentiment)   # "positive"
print(result.content.data.confidence)  # 0.95

# OR: Per-call override (more flexible)
agent = Agent(name="Analyzer", model="gpt-4o-mini")  # No default schema
result = await agent.run(
    "Analyze: I love this product!",
    returns=Analysis,  # Schema for this call only
)
print(result.content.data.sentiment)  # "positive"

# Bare types - return primitive values directly
agent = Agent(name="Reviewer", model="gpt-4o-mini")
result = await agent.run(
    "Review this code",
    returns=Literal["APPROVE", "REJECT"],  # Per-call schema
)
print(result.content.data)  # "APPROVE" (bare string)

# Collections - wrap in models for reliability
class Tags(BaseModel):
    items: list[str]

agent = Agent(name="Tagger", model="gpt-4o-mini", output=Tags)
result = await agent.run("Extract tags from: Python async FastAPI")
print(result.content.data.items)  # ["Python", "async", "FastAPI"]

# Union types - polymorphic responses
class Success(BaseModel):
    status: Literal["success"] = "success"
    result: str

class Error(BaseModel):
    status: Literal["error"] = "error"
    message: str

agent = Agent(name="Handler", model="gpt-4o-mini", output=Union[Success, Error])
# Agent chooses schema based on content

# Enum types
class Priority(str, Enum):
    LOW = "low"
    HIGH = "high"

agent = Agent(name="Prioritizer", model="gpt-4o-mini", output=Priority)
result = await agent.run("Server is down!")
print(result.content.data)  # Priority.HIGH

# Dynamic structure - agent decides fields
agent = Agent(name="Analyzer", model="gpt-4o-mini", output=dict)
result = await agent.run("Analyze user feedback")
print(result.content.data)  # {"sentiment": "positive", "score": 8, ...}

# Other bare types: str, int, bool, float
agent = Agent(name="Counter", model="gpt-4o-mini", output=int)
result = await agent.run("Count the items")
print(result.content.data)  # 5 (bare int)
```
Reasoning
Extended thinking for complex problems with AI-controlled rounds:
```python
from cogent import Agent
from cogent.agent.reasoning import ReasoningConfig, ReasoningStyle

# Simple: Enable with defaults
agent = Agent(
    name="Analyst",
    model="gpt-4o",
    reasoning=True,  # AI decides when ready (up to 10 rounds)
)

# Custom config
agent = Agent(
    name="DeepThinker",
    model="gpt-4o",
    reasoning=ReasoningConfig(
        max_thinking_rounds=15,  # Safety limit
        style=ReasoningStyle.CRITICAL,
    ),
)

# Per-call override
result = await agent.run(
    "Complex analysis task",
    reasoning=True,  # Enable for this call only
)
```
Reasoning Styles: ANALYTICAL, EXPLORATORY, CRITICAL, CREATIVE
Resilience
```python
from cogent.agent import ResilienceConfig

agent = Agent(
    name="Resilient",
    model="gpt-4o-mini",
    resilience=ResilienceConfig(
        max_retries=3,
        strategy="exponential_jitter",
        timeout_seconds=30.0,
        on_exhaustion="ask_agent",  # LLM decides how to recover
    ),
)
```
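The exponential_jitter strategy named above is a standard retry pattern: the delay doubles each attempt, with random jitter so concurrent clients do not retry in lockstep. A generic sketch of the pattern, separate from Cogent's implementation:

```python
import random
import time

def retry_with_jitter(fn, max_retries: int = 3, base: float = 0.5):
    """Call fn, retrying with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # budget exhausted: surface the error
            delay = base * (2 ** attempt)       # 0.5s, 1s, 2s, ...
            delay *= random.uniform(0.5, 1.5)   # jitter avoids thundering herd
            time.sleep(delay)

attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("transient failure")
    return "ok"

print(retry_with_jitter(flaky, base=0.01))  # succeeds on the third attempt
```

The jitter matters under load: without it, many clients that failed together would all retry at the same instant and collide again.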
Configuration
Use environment variables or .env:
```bash
# LLM Provider
OPENAI_API_KEY=sk-...

# Azure
AZURE_OPENAI_ENDPOINT=https://...
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_AUTH_TYPE=managed_identity
AZURE_OPENAI_CLIENT_ID=...  # optional (user-assigned managed identity)

# Azure (service principal / client secret)
# AZURE_OPENAI_AUTH_TYPE=client_secret
# AZURE_OPENAI_TENANT_ID=...
# AZURE_OPENAI_CLIENT_ID=...
# AZURE_OPENAI_CLIENT_SECRET=...

# Anthropic
ANTHROPIC_API_KEY=...

# Ollama (local)
OLLAMA_HOST=http://localhost:11434
```
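Reading these settings at runtime is plain `os.environ` usage; the variable names below mirror the ones above, while the helper itself is illustrative rather than part of Cogent's API:

```python
import os

def provider_config() -> dict:
    """Collect provider settings from the environment, with a local fallback."""
    return {
        "openai_api_key": os.environ.get("OPENAI_API_KEY"),
        "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
        # Default matches the local Ollama host shown above.
        "ollama_host": os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
    }

cfg = provider_config()
print(cfg["ollama_host"])  # falls back to http://localhost:11434 if unset
```

Keeping secrets in the environment (or a git-ignored `.env` file) keeps them out of source control.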
Examples
See examples/ for complete examples organized by category:
Basics (examples/basics/)
| Example | Description |
|---|---|
| `hello_world.py` | Simple agent with tools |
| `memory.py` | Conversation persistence |
| `memory_layers.py` | Multi-layer memory management |
| `memory_semantic_search.py` | Semantic memory search |
| `streaming.py` | Real-time token streaming |
| `structured_output.py` | Type-safe responses (12 patterns) |
Capabilities (examples/capabilities/)
| Example | Description |
|---|---|
| `browser.py` | Web browsing with Playwright |
| `code_sandbox.py` | Safe Python execution |
| `codebase_analyzer.py` | Code analysis agent |
| `data_validator.py` | Schema validation |
| `database_agent.py` | SQL database operations |
| `filesystem.py` | File system operations |
| `http_agent.py` | HTTP client capability |
| `kg_agent_viz.py` | Knowledge graph visualization |
| `knowledge_graph.py` | Knowledge graph construction |
| `mcp_example.py` | Model Context Protocol integration |
| `shell.py` | Shell command execution |
| `spreadsheet.py` | Excel/CSV operations |
| `web_search.py` | Web search with caching |
Advanced (examples/advanced/)
| Example | Description |
|---|---|
| `acc.py` | Adaptive Context Control (bounded memory) |
| `acc_comparison.py` | ACC vs standard memory comparison |
| `complex_task.py` | Multi-step task handling |
| `content_review.py` | Content moderation |
| `context_layer.py` | Context management |
| `deferred_tools.py` | Deferred tool execution |
| `executors_demo.py` | Executor strategies (Sequential, Tree Search) |
| `human_in_the_loop.py` | Approval workflows |
| `interceptors.py` | Middleware patterns |
| `model_thinking.py` | Extended thinking mode |
| `reasoning.py` | Reasoning strategies |
| `semantic_cache.py` | Semantic caching demo |
| `single_vs_multi_agent.py` | Single vs delegated agents |
| `tactical_delegation.py` | Dynamic agent spawning |
| `taskboard.py` | TaskBoard for complex workflows |
Retrieval (examples/retrieval/)
| Example | Description |
|---|---|
| `finance_table_example.py` | Financial data extraction |
| `hyde.py` | Hypothetical Document Embeddings |
| `pdf_summarizer.py` | PDF document summarization |
| `pdf_vision_showcase.py` | Vision-based PDF extraction |
| `retrievers.py` | 12 retriever strategies (Dense, BM25, Hybrid, etc.) |
| `summarizer.py` | Document summarization strategies |
Observability (examples/observability/)
| Example | Description |
|---|---|
| `observer.py` | Start here: live output plus captured event history |
| `shared_observer.py` | Share one observer across two agents and inspect the merged stream |
| `subagent_lineage.py` | Trace parent-child run IDs through real subagent delegation |
| `agent_lifecycle.py` | Build a lifecycle timeline from subscribed events |
| `custom_formatter.py` | Customize console output without losing structured events |
Development
```bash
# Install with dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Type checking
uv run mypy src/cogent

# Linting
uv run ruff check src/cogent
```
License
MIT License