Cogent
Build AI agents that actually work.
📚 Documentation: https://milad-o.github.io/cogent
Installation • Quick Start • Architecture • Capabilities • Examples
Cogent is a production AI agent framework built on recent research in memory control and semantic caching. Unlike frameworks focused on multi-agent orchestration, Cogent emphasizes bounded memory, reasoning-artifact caching, and tool augmentation for performance and reliability.
Why Cogent?
- 🧠 Memory Control — Bio-inspired bounded memory prevents context drift and poisoning
- ⚡ Semantic Caching — Cache reasoning artifacts (intents, plans) at 80%+ hit rates
- 🚀 Fast — Parallel tool execution, cached model binding, direct SDK calls
- 🔧 Simple — Define tools with `@tool`, create agents in 3 lines, no boilerplate
- 🏭 Production-ready — Built-in resilience, observability, and security interceptors
- 📦 Batteries included — File system, web search, code sandbox, browser, PDF, and more
```python
from cogent import Agent, tool

@tool
async def search(query: str) -> str:
    """Search the web."""
    # Your search implementation
    return results

agent = Agent(name="Assistant", model="gpt-4o-mini", tools=[search])
result = await agent.run("Find the latest news on AI agents")
```
🎉 Latest Changes (v1.19.0)
A2A Protocol + Observer Redesign + GPT-5.4 Defaults
- 🌐 Agent2Agent (A2A) — `A2AAgent` wraps any A2A HTTP endpoint as a drop-in subagent. `A2AServer` exposes any cogent agent as an A2A server with SSE streaming, persistent task store, push notifications, and auth schemes. `Agent.serve(port=...)` is a one-liner for scripts. Requires `cogent-ai[a2a]`.
- 🔭 Observer levels simplified — Four levels: `"off"`, `"progress"` (default), `"debug"`, `"trace"`. Old preset aliases (`minimal`, `verbose`, `normal`, `detailed`) removed from `Observer`; `verbosity=` string shortcuts are unchanged.
- 🎨 Console formatter overhaul — Semantic label colors, consistent indented body layout, `[type@address]` source labels for A2A and MCP tools (e.g. `a2a://analyst@localhost:10088`).
- 🤖 GPT-5.4 defaults — `"gpt4"`/`"gpt4o"` aliases now resolve to `gpt-5.4`; `"gpt4-mini"` resolves to `gpt-5.4-mini`.
- 🔗 `mcps=` parameter on `Agent` — First-class MCP argument alongside `tools=`, `capabilities=`, and `subagents=`.
See CHANGELOG.md for full version history and migration guide.
Features
- Native Executor — High-performance parallel tool execution with zero framework overhead
- Native Model Support — OpenAI, Azure, Anthropic, Gemini, Groq, Ollama, Custom endpoints
- Capabilities — Filesystem, Web Search, Code Sandbox, Browser, PDF, Shell, MCP, Spreadsheet, and more
- RAG Pipeline — Document loading, per-file-type splitting, embeddings, vector stores, retrievers
- Memory & Persistence — Conversation history, long-term memory with fuzzy matching (docs/memory.md)
- Memory Control (ACC) — Bio-inspired bounded memory prevents drift (docs/acc.md)
- Semantic Caching — Cache reasoning artifacts at 80%+ hit rates (docs/memory.md#semantic-cache)
- Observability — Event-based tracing, metrics, progress tracking, and runtime event history
- TaskBoard — Built-in task tracking for complex multi-step workflows
- Interceptors — Budget guards, rate limiting, PII protection, tool gates
- Resilience — Two-tier recovery: systematic retry + intelligent LLM-driven retry
- Human-in-the-Loop — Tool approval, guidance, interruption handling
- Agent2Agent (A2A) — `A2AAgent` connects to remote A2A agents as subagents; `A2AServer`/`Agent.serve()` expose agents as A2A endpoints
- Streaming — Real-time token streaming with callbacks
- Structured Output — Type-safe responses (Pydantic, dataclass, TypedDict, primitives, Literal, Union, Enum, collections, dict, None)
- Reasoning — Extended thinking mode with chain-of-thought
Modules
Cogent is organized into focused modules, each with multiple backends and implementations.
cogent.models — LLM Providers
Native SDK wrappers for all major LLM providers with zero abstraction overhead.
| Provider | Chat | Embeddings | String Alias | Notes |
|---|---|---|---|---|
| OpenAI | `OpenAIChat` | `OpenAIEmbedding` | `"gpt4"`, `"gpt-4o"`, `"gpt-4o-mini"` | GPT-5.4 series, o1, o3 |
| Azure | `AzureOpenAIChat` | `AzureOpenAIEmbedding` | — | Managed Identity, Entra ID auth |
| Azure AI Foundry | `AzureAIFoundryChat` | — | — | GitHub Models integration |
| Anthropic | `AnthropicChat` | — | `"claude"`, `"claude-opus"` | Claude 3.5 Sonnet, extended thinking |
| Gemini | `GeminiChat` | `GeminiEmbedding` | `"gemini"`, `"gemini-pro"` | Gemini 2.5 Pro/Flash |
| Groq | `GroqChat` | — | `"llama"`, `"mixtral"` | Fast inference, Llama 3.3, Mixtral |
| xAI | `XAIChat` | — | `"grok"` | Grok 4.20, Grok 4, vision models |
| DeepSeek | `DeepSeekChat` | — | `"deepseek"` | DeepSeek Chat, DeepSeek Reasoner |
| Cerebras | `CerebrasChat` | — | `"cerebras"` | Ultra-fast inference with WSE-3 |
| Mistral | `MistralChat` | `MistralEmbedding` | `"mistral"`, `"codestral"` | Mistral Large, Ministral |
| Cohere | `CohereChat` | `CohereEmbedding` | `"command"`, `"command-r"` | Command R+, Aya |
| OpenRouter | `OpenRouterChat` | — | `"or-gpt4o"`, `"or-claude"`, `"or-auto"` | 200+ models via OpenRouter |
| Cloudflare | `CloudflareChat` | `CloudflareEmbedding` | — | Workers AI (`@cf/...`) |
| Ollama | `OllamaChat` | `OllamaEmbedding` | `"ollama"` | Local models, any GGUF |
| Custom | `CustomChat` | `CustomEmbedding` | — | vLLM, Together AI, any OpenAI-compatible |
```python
# 3 ways to create models

# 1. Simple strings (recommended)
agent = Agent("Helper", model="gpt4")
agent = Agent("Helper", model="claude")
agent = Agent("Helper", model="gemini")

# 2. Factory functions
from cogent import create_chat, create_embedding

model = create_chat("gpt4")             # String alias
model = create_chat("gpt-4o-mini")      # Model name
model = create_chat("claude-sonnet-4")  # Auto-detects provider
model = create_chat("grok-4")           # xAI Grok
model = create_chat("deepseek-chat")    # DeepSeek

embeddings = create_embedding("openai:text-embedding-3-small")  # Explicit provider:model

# 3. Direct instantiation (full control)
from cogent.models import OpenAIChat, XAIChat, DeepSeekChat

model = OpenAIChat(model="gpt-4o", temperature=0.7, api_key="sk-...")
model = XAIChat(model="grok-4", api_key="xai-...")
model = DeepSeekChat(model="deepseek-reasoner", api_key="sk-...")
```
cogent.capabilities — Agent Capabilities
Composable tools that plug into any agent. Each capability adds related tools.
| Capability | Description | Tools Added |
|---|---|---|
| HTTPClient | Full-featured HTTP client | http_request, http_get, http_post with retries, timeouts |
| Database | Async SQL database access | execute_query, fetch_one, fetch_all with connection pooling |
| APITester | HTTP endpoint testing | test_endpoint, assert_status, assert_json |
| DataValidator | Schema validation | validate_data, validate_json, validate_dict with Pydantic |
| WebSearch | Web search with caching | web_search, news_search with semantic cache |
| Browser | Playwright automation | navigate, click, fill, screenshot |
| FileSystem | Sandboxed file operations | read_file, write_file, list_dir, search_files |
| CodeSandbox | Safe Python execution | execute_python, run_function |
| Shell | Sandboxed shell commands | run_command |
| PDF | PDF processing | read_pdf, create_pdf, merge_pdfs |
| Spreadsheet | Excel/CSV operations | read_spreadsheet, write_spreadsheet |
| MCP | Model Context Protocol | Dynamic tools from MCP servers |
```python
from cogent.capabilities import FileSystem, CodeSandbox, WebSearch, HTTPClient, Database
from cogent.capabilities import MCP

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    capabilities=[
        FileSystem(allowed_paths=["./project"]),
        CodeSandbox(timeout=30),
        WebSearch(),
        HTTPClient(),
        Database("sqlite:///data.db"),
    ],
    mcps=MCP(command="npx", args=["-y", "@modelcontextprotocol/server-filesystem", "."]),
)
```
cogent.document — Document Processing
Load, split, and process documents for RAG pipelines.
Loaders — Support for all common file formats:
| Loader | Formats | Notes |
|---|---|---|
| `TextLoader` | .txt, .rst | Plain text extraction |
| `MarkdownLoader` | .md | Markdown with structure |
| `PDFLoader` | .pdf | Basic text extraction (pypdf/pdfplumber) |
| `PDFMarkdownLoader` | .pdf | Clean markdown output (pymupdf4llm) |
| `PDFVisionLoader` | .pdf | Vision model-based extraction |
| `WordLoader` | .docx | Microsoft Word documents |
| `HTMLLoader` | .html, .htm | HTML documents |
| `CSVLoader` | .csv | CSV files |
| `JSONLoader` | .json, .jsonl | JSON documents |
| `XLSXLoader` | .xlsx | Excel spreadsheets |
| `CodeLoader` | .py, .js, .ts, .java, .go, .rs, .cpp, etc. | Source code files |
Splitters — Multiple chunking strategies:
| Splitter | Strategy |
|---|---|
| `RecursiveCharacterSplitter` | Hierarchical separators (default) |
| `SentenceSplitter` | Sentence boundary detection |
| `MarkdownSplitter` | Markdown structure-aware |
| `HTMLSplitter` | HTML tag-based |
| `CodeSplitter` | Language-aware code splitting |
| `SemanticSplitter` | Embedding-based semantic chunking |
| `TokenSplitter` | Token count-based |
```python
from cogent.document import DocumentLoader, SemanticSplitter

loader = DocumentLoader()
docs = await loader.load_directory("./documents")

splitter = SemanticSplitter(model=model)
chunks = splitter.split_documents(docs)
```
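The hierarchical-separator strategy behind the default recursive splitter can be illustrated in a few lines: try the coarsest separator first (paragraphs), and recurse with finer separators only for chunks that are still too large. This is a simplified sketch of the technique, not the library's implementation.

```python
def recursive_split(
    text: str,
    max_len: int = 200,
    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " "),
) -> list[str]:
    """Split on the coarsest separator that works, recursing on oversized parts."""
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separators left: hard-cut at the size limit.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks: list[str] = []
    for part in text.split(sep):
        if len(part) <= max_len:
            chunks.append(part)
        else:
            chunks.extend(recursive_split(part, max_len, rest))
    return [c for c in chunks if c.strip()]

doc = "Paragraph one is short.\n\n" + "A much longer paragraph. " * 20
for chunk in recursive_split(doc, max_len=120):
    assert len(chunk) <= 120  # every chunk respects the size budget
```

Splitting at natural boundaries first is what keeps chunks semantically coherent instead of cutting mid-sentence.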
cogent.vectorstore — Vector Storage
Semantic search with pluggable backends and embedding providers.
Backends:
| Backend | Use Case | Persistence |
|---|---|---|
| `InMemoryBackend` | Development, small datasets | No |
| `FAISSBackend` | Large-scale local search | Optional |
| `ChromaBackend` | Persistent vector database | Yes |
| `QdrantBackend` | Production vector database | Yes |
| `PgVectorBackend` | PostgreSQL integration | Yes |
Embedding Providers:
| Provider | Model Examples |
|---|---|
| OpenAI | `openai:text-embedding-3-small`, `openai:text-embedding-3-large` |
| Ollama | `ollama:nomic-embed-text`, `ollama:mxbai-embed-large` |
| Mock | Testing only |
```python
from cogent import create_embedding
from cogent.vectorstore import VectorStore
from cogent.vectorstore.backends import FAISSBackend

store = VectorStore(
    embeddings=create_embedding("openai:text-embedding-3-large"),
    backend=FAISSBackend(dimension=3072),
)
```
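Every backend answers the same underlying question: which stored vectors are nearest to the query embedding? A brute-force in-memory version makes the idea concrete (illustrative only; the real backends use optimized indexes):

```python
import math

class TinyVectorIndex:
    """Brute-force cosine-similarity search over stored vectors."""

    def __init__(self):
        self._items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._items.append((doc_id, vector))

    def search(self, query: list[float], k: int = 3) -> list[tuple[str, float]]:
        def cos(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        # Score every stored vector, return the k most similar.
        scored = [(doc_id, cos(query, vec)) for doc_id, vec in self._items]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

index = TinyVectorIndex()
index.add("python-doc", [1.0, 0.0, 0.2])
index.add("cooking-doc", [0.0, 1.0, 0.0])
print(index.search([0.9, 0.1, 0.1], k=1))  # "python-doc" ranks first
```

Backends like FAISS replace the linear scan with approximate-nearest-neighbor indexes so search stays fast at scale.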
cogent.memory — Memory & Persistence
Long-term memory with fuzzy matching (semantic fallback optional), conversation history, and scoped views.
Stores:
| Store | Backend | Features |
|---|---|---|
| `InMemoryStore` | Dict | Fast, no persistence |
| `SQLAlchemyStore` | SQLite, PostgreSQL, MySQL | Async, full SQL |
| `RedisStore` | Redis | Distributed, native TTL |
```python
from cogent.memory import Memory, SQLAlchemyStore

memory = Memory(store=SQLAlchemyStore("sqlite+aiosqlite:///./data.db"))

# Scoped views
user_mem = memory.scoped("user:alice")
team_mem = memory.scoped("team:research")
```
cogent.executors — Execution Strategies
Pluggable execution strategies that define HOW agents process tasks.
| Executor | Strategy | Use Case |
|---|---|---|
| `NativeExecutor` | Parallel tool execution | Default, high performance |
| `SequentialExecutor` | Sequential tool execution | Ordered dependencies |
Standalone execution — bypass Agent class entirely:
```python
from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize",
    tools=[search, summarize],
    model="gpt-4o-mini",
)
```
cogent.interceptors — Middleware
Composable middleware for cross-cutting concerns.
| Category | Interceptors |
|---|---|
| Budget | BudgetGuard (token/cost limits) |
| Security | PIIShield, ContentFilter |
| Rate Limiting | RateLimiter, ThrottleInterceptor |
| Context | ContextCompressor, TokenLimiter |
| Gates | ToolGate, PermissionGate, ConversationGate |
| Resilience | Failover, CircuitBreaker, ToolGuard |
| Audit | Auditor (event logging) |
| Prompt | PromptAdapter, ContextPrompt, LambdaPrompt |
```python
from cogent.interceptors import BudgetGuard, PIIShield, RateLimiter

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ],
)
```
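Conceptually, interceptors form a middleware chain around each model call: each layer can inspect, modify, or reject the request before passing it inward. A generic sketch of the pattern (not Cogent's actual interfaces; the handler names are illustrative):

```python
import re
from typing import Callable

Handler = Callable[[dict], str]

def budget_guard(max_calls: int) -> Callable[[Handler], Handler]:
    """Reject requests once the call budget is exhausted."""
    calls = {"n": 0}

    def wrap(next_handler: Handler) -> Handler:
        def handler(request: dict) -> str:
            if calls["n"] >= max_calls:
                raise RuntimeError("budget exhausted")
            calls["n"] += 1
            return next_handler(request)
        return handler
    return wrap

def pii_shield(next_handler: Handler) -> Handler:
    """Redact obvious email addresses before the model sees them."""
    def handler(request: dict) -> str:
        request = dict(request)
        request["prompt"] = re.sub(r"\S+@\S+", "[EMAIL]", request["prompt"])
        return next_handler(request)
    return handler

def model_call(request: dict) -> str:
    return f"echo: {request['prompt']}"  # stand-in for the real LLM call

# Compose: the outermost interceptor runs first.
pipeline = budget_guard(max_calls=2)(pii_shield(model_call))
print(pipeline({"prompt": "Contact alice@example.com"}))  # email redacted
```

Because each layer only knows about the next handler, guards can be added, removed, or reordered without touching the others.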
cogent.observability — Monitoring & Tracing
Comprehensive monitoring for understanding system behavior.
| Component | Purpose |
|---|---|
| `ExecutionTracer` | Deep execution tracing with spans |
| `MetricsCollector` | Counter, Gauge, Histogram, Timer |
| `ProgressTracker` | Real-time progress output |
| `Observer` | Unified observability with history capture |
| `Dashboard` | Visual inspection interface |
| Inspectors | Agent, Task, Event inspection |
Renderers: `TextRenderer`, `RichRenderer`, `JSONRenderer`, `MinimalRenderer`
```python
from cogent.observability import ExecutionTracer, ProgressTracker

tracer = ExecutionTracer()
async with tracer.trace("my-operation") as span:
    span.set_attribute("user_id", user_id)
    result = await do_work()
```
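The span pattern used above reduces to a context manager that records timing and attributes for a named operation. A toy illustration of the mechanism (not the real tracer):

```python
import time
from contextlib import contextmanager

@contextmanager
def trace(name: str, spans: list):
    """Record a named span's duration and attributes into `spans`."""
    span: dict = {"name": name, "attributes": {}}
    start = time.perf_counter()
    try:
        yield span["attributes"]
    finally:
        # The span is closed even if the traced block raises.
        span["duration_s"] = time.perf_counter() - start
        spans.append(span)

spans: list[dict] = []
with trace("my-operation", spans) as attrs:
    attrs["user_id"] = "alice"
    time.sleep(0.01)  # stand-in for real work

print(spans[0]["name"], spans[0]["attributes"])
```

Collecting closed spans into a shared list is what lets a tracer reconstruct the full execution timeline afterwards.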
Installation
Note: The package is published as `cogent-ai` on PyPI, but you import it as `cogent` in your code.
```bash
# Install from PyPI
uv add cogent-ai

# With extras
uv add "cogent-ai[vector-stores,retrieval]"
uv add "cogent-ai[database]"
uv add "cogent-ai[all-backend]"
uv add "cogent-ai[all]"

# Or install from source (latest)
uv add git+https://github.com/milad-o/cogent.git
uv add "cogent-ai[all] @ git+https://github.com/milad-o/cogent.git"
```
Optional dependency groups:
| Group | Purpose | Includes |
|---|---|---|
| `vector-stores` | Vector databases | FAISS, Qdrant, SciPy |
| `retrieval` | Retrieval libraries | BM25, sentence-transformers |
| `database` | SQL databases | SQLAlchemy, aiosqlite, asyncpg, psycopg2 |
| `infrastructure` | Infrastructure | Redis |
| `web` | Web tools | BeautifulSoup4, DuckDuckGo search |
| `browser` | Browser automation | Playwright |
| `document` | Document processing | PDF, Word, Markdown loaders |
| `api` | API framework | FastAPI, Uvicorn, Starlette |
| `visualization` | Graphs & charts | PyVis, Gravis, Matplotlib, Seaborn, Pandas |
| `anthropic` | Claude models | Anthropic SDK |
| `azure` | Azure models | Azure Identity, Azure AI Inference |
| `cerebras` | Cerebras models | Cerebras Cloud SDK |
| `cohere` | Cohere models | Cohere SDK |
| `gemini` | Gemini models | Google GenAI SDK |
| `groq` | Groq models | Groq SDK |
| `all-providers` | All LLM providers | anthropic, azure, cerebras, cohere, gemini, groq |
| `a2a` | Agent2Agent protocol | a2a-sdk, uvicorn |
| `all-backend` | All backends | vector-stores, retrieval, database, infrastructure |
| `all` | Everything | All above + visualization |
Development installation:
```bash
# Core dev tools (linting, type checking)
uv sync --group dev

# Add testing
uv sync --group dev --group test

# Add backend tests (vector stores, databases)
uv sync --group dev --group test --group test-backends

# Add documentation
uv sync --group dev --group test --group test-backends --group docs
```
Core Architecture
Cogent is built around a high-performance Native Executor that eliminates framework overhead while providing enterprise-grade features.
Native Executor
The executor uses a direct asyncio loop with parallel tool execution—no graph frameworks, no unnecessary abstractions:
```python
from cogent import Agent, tool

@tool
def search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Evaluate math expression."""
    return str(eval(expression))  # demo only: eval is unsafe on untrusted input

agent = Agent(
    name="Assistant",
    model="gpt4",  # Simple string model
    tools=[search, calculate],
)

# Tools execute in parallel when independent
result = await agent.run("Search for Python and calculate 2^10")
```
Key optimizations:
- Parallel tool execution — Multiple tool calls run concurrently via `asyncio.gather`
- Cached model binding — Tools bound once at construction, zero overhead per call
- Native SDK integration — Direct OpenAI/Anthropic SDK calls, no translation layers
- Automatic resilience — Rate limit retries with exponential backoff built-in
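The parallel-execution optimization boils down to scheduling independent tool calls with `asyncio.gather` instead of awaiting them one at a time. A standalone illustration (not Cogent's internals):

```python
import asyncio
import time

async def slow_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for I/O (HTTP, DB, ...)
    return f"{name}: done"

async def main() -> None:
    start = time.perf_counter()
    # Two independent tool calls run concurrently, not back to back.
    results = await asyncio.gather(
        slow_tool("search", 0.1),
        slow_tool("calculate", 0.1),
    )
    elapsed = time.perf_counter() - start
    print(results)         # ['search: done', 'calculate: done']
    assert elapsed < 0.18  # roughly one delay total, not the sum of both

asyncio.run(main())
```

The total latency is bounded by the slowest tool rather than the sum of all tools, which is where the speedup comes from.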
Tool System
Define tools with the `@tool` decorator — automatic schema extraction from type hints and docstrings:
```python
import httpx

from cogent import tool
from cogent.core.context import RunContext

@tool
def search(query: str, max_results: int = 10) -> str:
    """Search the web for information.

    Args:
        query: Search query string.
        max_results: Maximum results to return.
    """
    return f"Found {max_results} results for: {query}"

# With context injection for user/session data
@tool
def get_user_preferences(ctx: RunContext) -> str:
    """Get preferences for the current user."""
    return f"Preferences for user {ctx.user_id}"

# Async tools supported
@tool
async def fetch_data(url: str) -> str:
    """Fetch data from URL."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text
```
Tool features:
- Type hints → JSON schema conversion
- Docstring → description extraction
- Sync and async function support
- Context injection via `ctx: RunContext` parameter
- Automatic error handling and retries
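The schema-extraction step (type hints and docstrings to a JSON tool schema) can be sketched with the standard library's `inspect` module; this illustrates the technique, not Cogent's extractor:

```python
import inspect

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Build a JSON-schema-style tool description from hints and the docstring."""
    doc = inspect.getdoc(fn) or ""
    properties: dict = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        # Map the Python annotation to a JSON schema type (default: string).
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "name": fn.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

def search(query: str, max_results: int = 10) -> str:
    """Search the web for information."""
    return f"Found {max_results} results for: {query}"

print(tool_schema(search)["parameters"]["required"])  # ['query']
```

Parameters with defaults become optional in the schema, which is why `max_results` is absent from `required`.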
Standalone Execution
For maximum performance, bypass the Agent class entirely:
```python
from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize the top 3",
    tools=[search, summarize],
    model="gpt-4o-mini",
)
```
Quick Start
Simple Agent
```python
import asyncio
from cogent import Agent, tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, sunny"

async def main():
    agent = Agent(
        name="Assistant",
        model="gpt-4o-mini",
        tools=[get_weather],
    )
    result = await agent.run("What's the weather in Tokyo?")
    print(result)

asyncio.run(main())
```
Multi-Agent with Subagents
```python
from cogent import Agent

# Create specialist agents
data_analyst = Agent(
    name="data_analyst",
    model="gpt-4o-mini",
    instructions="Analyze data and provide statistical insights.",
)

market_researcher = Agent(
    name="market_researcher",
    model="gpt-4o-mini",
    instructions="Research market trends and competitive landscape.",
)

# Coordinator delegates to specialists
coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    instructions="""Coordinate research tasks:
    - Use data_analyst for numerical analysis
    - Use market_researcher for market trends
    Synthesize their findings.""",
    # Simply pass the agents - uses their names automatically
    subagents=[data_analyst, market_researcher],
)

# Full metadata preserved (tokens, duration, delegation chain)
result = await coordinator.run("Analyze Q4 2025 e-commerce growth")
print(f"Total tokens: {result.metadata.tokens.total_tokens}")  # Includes all subagents
print(f"  Prompt: {result.metadata.tokens.prompt_tokens}")
print(f"  Completion: {result.metadata.tokens.completion_tokens}")
if result.metadata.tokens.reasoning_tokens:
    print(f"  Reasoning: {result.metadata.tokens.reasoning_tokens}")
print(f"Subagent calls: {len(result.subagent_responses)}")
```
Agent2Agent (A2A)
Connect to remote agents and expose your own as A2A endpoints. Requires `cogent-ai[a2a]`.
```python
from cogent import Agent
from cogent.agent import A2AAgent, A2AServer

# Wrap a remote A2A endpoint as a subagent
remote_analyst = A2AAgent(
    name="analyst",
    url="http://localhost:10088",
    description="Analyzes financial data",
)

coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    subagents=[remote_analyst],  # mix local and remote freely
)

# Expose an agent as an A2A server (one-liner for scripts)
agent = Agent(name="Assistant", model="gpt-4o-mini")
agent.serve(port=10002)  # blocking

# Or async with background=True (default)
server = await A2AServer(agent, port=10002).start()
# ... do work ...
await server.stop()

# Serve multiple agents concurrently
group = await A2AServer.start_many((agent_a, 10001), (agent_b, 10002))
await group.stop_all()
```
Streaming
```python
agent = Agent(
    model="gpt-4o-mini",
    stream=True,
)

async for chunk in agent.run_stream("Write a poem"):
    print(chunk.content, end="", flush=True)
```
Human-in-the-Loop
```python
from cogent import Agent
from cogent.agent import InterruptedException

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    tools=[sensitive_tool],
    interrupt_on={"tools": ["sensitive_tool"]},  # Require approval
)

try:
    result = await agent.run("Do something sensitive")
except InterruptedException as e:
    # Handle approval flow
    decision = await get_human_decision(e.pending_action)
    result = await agent.resume(e.state, decision)
```
Observability
```python
from cogent import Agent
from cogent.observability import Observer, ObservabilityLevel

# Verbosity levels for agents
agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    verbosity="debug",  # off | result | progress | detailed | debug | trace
)

# Or use enum/int
agent = Agent(model=model, verbosity=ObservabilityLevel.DEBUG)  # Enum
agent = Agent(model=model, verbosity=4)  # Int (0-5)

# Boolean shorthand
agent = Agent(model=model, verbosity=True)  # → PROGRESS level

# With observer for history capture
observer = Observer(level="debug", capture=["tool.result", "agent.*"])
result = await agent.run("Query", observer=observer)

# Access captured events
for event in observer.history():
    print(event)
```
Interceptors
Control execution flow with middleware:
```python
from cogent.interceptors import (
    BudgetGuard,    # Token/cost limits
    RateLimiter,    # Request throttling
    PIIShield,      # Redact sensitive data
    ContentFilter,  # Block harmful content
    ToolGate,       # Conditional tool access
    PromptAdapter,  # Modify prompts dynamically
    Auditor,        # Audit logging
)

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100, max_tool_calls=500),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ],
)
```
Structured Output
Type-safe responses with comprehensive type support and automatic validation:
Supported Types:
- Structured Models: `BaseModel`, `dataclass`, `TypedDict`
- Primitives: `str`, `int`, `bool`, `float`
- Constrained: `Literal["A", "B", "C"]`
- Collections: `list[T]`, `set[T]`, `tuple[T, ...]` (wrap in models for reliability)
- Polymorphic: `Union[A, B]` (agent chooses schema)
- Enumerations: `Enum` types
- Dynamic: `dict` (agent decides structure)
- Confirmation: `None` type
```python
from pydantic import BaseModel
from typing import Literal, Union
from enum import Enum
from cogent import Agent

# Structured models
class Analysis(BaseModel):
    sentiment: str
    confidence: float
    topics: list[str]

# Configure on agent (all calls use schema)
agent = Agent(
    name="Analyzer",
    model="gpt-4o-mini",
    output=Analysis,  # Enforce schema on all runs
)

result = await agent.run("Analyze: I love this product!")
print(result.content.data.sentiment)   # "positive"
print(result.content.data.confidence)  # 0.95

# OR: Per-call override (more flexible)
agent = Agent(name="Analyzer", model="gpt-4o-mini")  # No default schema
result = await agent.run(
    "Analyze: I love this product!",
    returns=Analysis,  # Schema for this call only
)
print(result.content.data.sentiment)  # "positive"

# Bare types - return primitive values directly
agent = Agent(name="Reviewer", model="gpt-4o-mini")
result = await agent.run(
    "Review this code",
    returns=Literal["APPROVE", "REJECT"],  # Per-call schema
)
print(result.content.data)  # "APPROVE" (bare string)

# Collections - wrap in models for reliability
class Tags(BaseModel):
    items: list[str]

agent = Agent(name="Tagger", model="gpt-4o-mini", output=Tags)
result = await agent.run("Extract tags from: Python async FastAPI")
print(result.content.data.items)  # ["Python", "async", "FastAPI"]

# Union types - polymorphic responses
class Success(BaseModel):
    status: Literal["success"] = "success"
    result: str

class Error(BaseModel):
    status: Literal["error"] = "error"
    message: str

agent = Agent(name="Handler", model="gpt-4o-mini", output=Union[Success, Error])
# Agent chooses schema based on content

# Enum types
class Priority(str, Enum):
    LOW = "low"
    HIGH = "high"

agent = Agent(name="Prioritizer", model="gpt-4o-mini", output=Priority)
result = await agent.run("Server is down!")
print(result.content.data)  # Priority.HIGH

# Dynamic structure - agent decides fields
agent = Agent(name="Analyzer", model="gpt-4o-mini", output=dict)
result = await agent.run("Analyze user feedback")
print(result.content.data)  # {"sentiment": "positive", "score": 8, ...}

# Other bare types: str, int, bool, float
agent = Agent(name="Counter", model="gpt-4o-mini", output=int)
result = await agent.run("Count the items")
print(result.content.data)  # 5 (bare int)
```
Reasoning
Extended thinking for complex problems with AI-controlled rounds:
```python
from cogent import Agent
from cogent.agent.reasoning import ReasoningConfig, ReasoningStyle

# Simple: Enable with defaults
agent = Agent(
    name="Analyst",
    model="gpt-4o",
    reasoning=True,  # AI decides when ready (up to 10 rounds)
)

# Custom config
agent = Agent(
    name="DeepThinker",
    model="gpt-4o",
    reasoning=ReasoningConfig(
        max_thinking_rounds=15,  # Safety limit
        style=ReasoningStyle.CRITICAL,
    ),
)

# Per-call override
result = await agent.run(
    "Complex analysis task",
    reasoning=True,  # Enable for this call only
)
```
Reasoning Styles: ANALYTICAL, EXPLORATORY, CRITICAL, CREATIVE
Resilience
```python
from cogent.agent import ResilienceConfig

agent = Agent(
    name="Resilient",
    model="gpt-4o-mini",
    resilience=ResilienceConfig(
        max_retries=3,
        strategy="exponential_jitter",
        timeout_seconds=30.0,
        on_exhaustion="ask_agent",  # LLM decides how to recover
    ),
)
```
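The exponential_jitter strategy named above is a standard retry pattern: the delay doubles each attempt, with random jitter so concurrent clients do not retry in lockstep. A generic sketch of the pattern, separate from Cogent's implementation:

```python
import random
import time

def retry_with_jitter(fn, max_retries: int = 3, base: float = 0.5):
    """Call fn, retrying with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # budget exhausted: surface the error
            delay = base * (2 ** attempt)       # 0.5s, 1s, 2s, ...
            delay *= random.uniform(0.5, 1.5)   # jitter avoids thundering herd
            time.sleep(delay)

attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("transient failure")
    return "ok"

print(retry_with_jitter(flaky, base=0.01))  # succeeds on the third attempt
```

The jitter matters under load: without it, many clients that failed together would all retry at the same instant and collide again.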
Configuration
Use environment variables or .env:
```bash
# LLM Provider
OPENAI_API_KEY=sk-...

# Azure
AZURE_OPENAI_ENDPOINT=https://...
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_AUTH_TYPE=managed_identity
AZURE_OPENAI_CLIENT_ID=...  # optional (user-assigned managed identity)

# Azure (service principal / client secret)
# AZURE_OPENAI_AUTH_TYPE=client_secret
# AZURE_OPENAI_TENANT_ID=...
# AZURE_OPENAI_CLIENT_ID=...
# AZURE_OPENAI_CLIENT_SECRET=...

# Anthropic
ANTHROPIC_API_KEY=...

# Ollama (local)
OLLAMA_HOST=http://localhost:11434
```
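Reading these settings at runtime is plain `os.environ` usage; the variable names below mirror the ones above, while the helper itself is illustrative rather than part of Cogent's API:

```python
import os

def provider_config() -> dict:
    """Collect provider settings from the environment, with a local fallback."""
    return {
        "openai_api_key": os.environ.get("OPENAI_API_KEY"),
        "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
        # Default matches the local Ollama host shown above.
        "ollama_host": os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
    }

cfg = provider_config()
print(cfg["ollama_host"])  # falls back to http://localhost:11434 if unset
```

Keeping secrets in the environment (or a git-ignored `.env` file) keeps them out of source control.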
Examples
See examples/ for complete examples organized by category:
Basics (examples/basics/)
| Example | Description |
|---|---|
| `hello_world.py` | Simple agent with tools |
| `memory.py` | Conversation persistence |
| `memory_layers.py` | Multi-layer memory management |
| `memory_semantic_search.py` | Semantic memory search |
| `streaming.py` | Real-time token streaming |
| `structured_output.py` | Type-safe responses (12 patterns) |
Capabilities (examples/capabilities/)
| Example | Description |
|---|---|
| `browser.py` | Web browsing with Playwright |
| `code_sandbox.py` | Safe Python execution |
| `codebase_analyzer.py` | Code analysis agent |
| `data_validator.py` | Schema validation |
| `database_agent.py` | SQL database operations |
| `filesystem.py` | File system operations |
| `http_agent.py` | HTTP client capability |
| `kg_agent_viz.py` | Knowledge graph visualization |
| `knowledge_graph.py` | Knowledge graph construction |
| `mcp_example.py` | Model Context Protocol integration |
| `shell.py` | Shell command execution |
| `spreadsheet.py` | Excel/CSV operations |
| `web_search.py` | Web search with caching |
Advanced (examples/advanced/)
| Example | Description |
|---|---|
| `acc.py` | Adaptive Context Control (bounded memory) |
| `acc_comparison.py` | ACC vs standard memory comparison |
| `complex_task.py` | Multi-step task handling |
| `content_review.py` | Content moderation |
| `context_layer.py` | Context management |
| `deferred_tools.py` | Deferred tool execution |
| `executors_demo.py` | Executor strategies (Sequential, Tree Search) |
| `human_in_the_loop.py` | Approval workflows |
| `interceptors.py` | Middleware patterns |
| `model_thinking.py` | Extended thinking mode |
| `reasoning.py` | Reasoning strategies |
| `semantic_cache.py` | Semantic caching demo |
| `single_vs_multi_agent.py` | Single vs delegated agents |
| `tactical_delegation.py` | Dynamic agent spawning |
| `taskboard.py` | TaskBoard for complex workflows |
Retrieval (examples/retrieval/)
| Example | Description |
|---|---|
| `finance_table_example.py` | Financial data extraction |
| `hyde.py` | Hypothetical Document Embeddings |
| `pdf_summarizer.py` | PDF document summarization |
| `pdf_vision_showcase.py` | Vision-based PDF extraction |
| `retrievers.py` | 12 retriever strategies (Dense, BM25, Hybrid, etc.) |
| `summarizer.py` | Document summarization strategies |
Observability (examples/observability/)
| Example | Description |
|---|---|
| `observer.py` | Start here: live output plus captured event history |
| `shared_observer.py` | Share one observer across two agents and inspect the merged stream |
| `subagent_lineage.py` | Trace parent-child run IDs through real subagent delegation |
| `agent_lifecycle.py` | Build a lifecycle timeline from subscribed events |
| `custom_formatter.py` | Customize console output without losing structured events |
Development
```bash
# Install with dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Type checking
uv run mypy src/cogent

# Linting
uv run ruff check src/cogent
```
License
MIT License