
Cogent

Build AI agents that actually work.

📚 Documentation: https://milad-o.github.io/cogent


Installation · Quick Start · Architecture · Capabilities · Examples


Cogent is a production AI agent framework built on recent research in memory control and semantic caching. Unlike frameworks focused on multi-agent orchestration, Cogent emphasizes bounded memory, reasoning-artifact caching, and tool augmentation for better performance and reliability.

Why Cogent?

  • 🧠 Memory Control — Bio-inspired bounded memory prevents context drift and poisoning
  • Semantic Caching — Cache reasoning artifacts (intents, plans) at 80%+ hit rates
  • 🚀 Fast — Parallel tool execution, cached model binding, direct SDK calls
  • 🔧 Simple — Define tools with @tool, create agents in 3 lines, no boilerplate
  • 🏭 Production-ready — Built-in resilience, observability, and security interceptors
  • 📦 Batteries included — File system, web search, code sandbox, browser, PDF, and more

from cogent import Agent, tool

@tool
async def search(query: str) -> str:
    """Search the web."""
    # Your search implementation
    return results

agent = Agent(name="Assistant", model="gpt-4o-mini", tools=[search])
result = await agent.run("Find the latest news on AI agents")

🎉 Latest Changes (v1.19.0)

A2A Protocol + Observer Redesign + GPT-5.4 Defaults

  • 🌐 Agent2Agent (A2A) — A2AAgent wraps any A2A HTTP endpoint as a drop-in subagent. A2AServer exposes any cogent agent as an A2A server with SSE streaming, persistent task store, push notifications, and auth schemes. Agent.serve(port=...) is a one-liner for scripts. Requires cogent-ai[a2a].
  • 🔭 Observer levels simplified — Four levels: "off", "progress" (default), "debug", "trace". Old preset aliases (minimal, verbose, normal, detailed) removed from Observer; verbosity= string shortcuts are unchanged.
  • 🎨 Console formatter overhaul — Semantic label colors, consistent indented body layout, [type@address] source labels for A2A and MCP tools (e.g. a2a://analyst@localhost:10088).
  • 🤖 GPT-5.4 defaults — "gpt4" / "gpt4o" aliases now resolve to gpt-5.4; "gpt4-mini" resolves to gpt-5.4-mini.
  • 🔗 mcps= parameter on Agent — First-class MCP argument alongside tools=, capabilities=, and subagents=.

See CHANGELOG.md for full version history and migration guide.


Features

  • Native Executor — High-performance parallel tool execution with zero framework overhead
  • Native Model Support — OpenAI, Azure, Anthropic, Gemini, Groq, Ollama, Custom endpoints
  • Capabilities — Filesystem, Web Search, Code Sandbox, Browser, PDF, Shell, MCP, Spreadsheet, and more
  • RAG Pipeline — Document loading, per-file-type splitting, embeddings, vector stores, retrievers
  • Memory & Persistence — Conversation history, long-term memory with fuzzy matching (docs/memory.md)
  • Memory Control (ACC) — Bio-inspired bounded memory prevents drift (docs/acc.md)
  • Semantic Caching — Cache reasoning artifacts at 80%+ hit rates (docs/memory.md#semantic-cache)
  • Observability — Event-based tracing, metrics, progress tracking, and runtime event history
  • TaskBoard — Built-in task tracking for complex multi-step workflows
  • Interceptors — Budget guards, rate limiting, PII protection, tool gates
  • Resilience — Two-tier recovery: systematic retry + intelligent LLM-driven retry
  • Human-in-the-Loop — Tool approval, guidance, interruption handling
  • Agent2Agent (A2A) — A2AAgent connects to remote A2A agents as subagents; A2AServer / Agent.serve() expose agents as A2A endpoints
  • Streaming — Real-time token streaming with callbacks
  • Structured Output — Type-safe responses (Pydantic, dataclass, TypedDict, primitives, Literal, Union, Enum, collections, dict, None)
  • Reasoning — Extended thinking mode with chain-of-thought
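The bounded-memory idea behind Memory Control (ACC) can be sketched with a plain deque. ACC's actual policy is more selective than simple FIFO eviction; `BoundedContext` here is a hypothetical illustration of the invariant only.

```python
from collections import deque

class BoundedContext:
    """Keep at most `capacity` recent messages so the working context
    cannot grow without bound. ACC's real policy is selective, not
    plain FIFO; this shows only the bounded-memory invariant."""

    def __init__(self, capacity: int):
        self.messages: deque[str] = deque(maxlen=capacity)

    def add(self, message: str) -> None:
        self.messages.append(message)  # oldest entry evicted at capacity

    def window(self) -> list[str]:
        return list(self.messages)

ctx = BoundedContext(capacity=3)
for i in range(5):
    ctx.add(f"msg-{i}")
window = ctx.window()  # only the 3 most recent messages remain
```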

Modules

Cogent is organized into focused modules, each with multiple backends and implementations.

cogent.models — LLM Providers

Native SDK wrappers for all major LLM providers with zero abstraction overhead.

| Provider | Chat | Embeddings | String Alias | Notes |
|---|---|---|---|---|
| OpenAI | OpenAIChat | OpenAIEmbedding | "gpt4", "gpt-4o", "gpt-4o-mini" | GPT-5.4 series, o1, o3 |
| Azure | AzureOpenAIChat | AzureOpenAIEmbedding | | Managed Identity, Entra ID auth |
| Azure AI Foundry | AzureAIFoundryChat | | | GitHub Models integration |
| Anthropic | AnthropicChat | | "claude", "claude-opus" | Claude 3.5 Sonnet, extended thinking |
| Gemini | GeminiChat | GeminiEmbedding | "gemini", "gemini-pro" | Gemini 2.5 Pro/Flash |
| Groq | GroqChat | | "llama", "mixtral" | Fast inference, Llama 3.3, Mixtral |
| xAI | XAIChat | | "grok" | Grok 4.20, Grok 4, vision models |
| DeepSeek | DeepSeekChat | | "deepseek" | DeepSeek Chat, DeepSeek Reasoner |
| Cerebras | CerebrasChat | | "cerebras" | Ultra-fast inference with WSE-3 |
| Mistral | MistralChat | MistralEmbedding | "mistral", "codestral" | Mistral Large, Ministral |
| Cohere | CohereChat | CohereEmbedding | "command", "command-r" | Command R+, Aya |
| OpenRouter | OpenRouterChat | | "or-gpt4o", "or-claude", "or-auto" | 200+ models via OpenRouter |
| Cloudflare | CloudflareChat | CloudflareEmbedding | | Workers AI (@cf/...) |
| Ollama | OllamaChat | OllamaEmbedding | "ollama" | Local models, any GGUF |
| Custom | CustomChat | CustomEmbedding | | vLLM, Together AI, any OpenAI-compatible |

# 3 ways to create models

# 1. Simple strings (recommended)
agent = Agent("Helper", model="gpt4")
agent = Agent("Helper", model="claude")
agent = Agent("Helper", model="gemini")

# 2. Factory functions
from cogent import create_chat, create_embedding
model = create_chat("gpt4")  # String alias
model = create_chat("gpt-4o-mini")  # Model name
model = create_chat("claude-sonnet-4")  # Auto-detects provider
model = create_chat("grok-4")  # xAI Grok
model = create_chat("deepseek-chat")  # DeepSeek
embeddings = create_embedding("openai:text-embedding-3-small")  # Explicit provider:model

# 3. Direct instantiation (full control)
from cogent.models import OpenAIChat, XAIChat, DeepSeekChat
model = OpenAIChat(model="gpt-4o", temperature=0.7, api_key="sk-...")
model = XAIChat(model="grok-4", api_key="xai-...")
model = DeepSeekChat(model="deepseek-reasoner", api_key="sk-...")

cogent.capabilities — Agent Capabilities

Composable tools that plug into any agent. Each capability adds related tools.

| Capability | Description | Tools Added |
|---|---|---|
| HTTPClient | Full-featured HTTP client | http_request, http_get, http_post with retries, timeouts |
| Database | Async SQL database access | execute_query, fetch_one, fetch_all with connection pooling |
| APITester | HTTP endpoint testing | test_endpoint, assert_status, assert_json |
| DataValidator | Schema validation | validate_data, validate_json, validate_dict with Pydantic |
| WebSearch | Web search with caching | web_search, news_search with semantic cache |
| Browser | Playwright automation | navigate, click, fill, screenshot |
| FileSystem | Sandboxed file operations | read_file, write_file, list_dir, search_files |
| CodeSandbox | Safe Python execution | execute_python, run_function |
| Shell | Sandboxed shell commands | run_command |
| PDF | PDF processing | read_pdf, create_pdf, merge_pdfs |
| Spreadsheet | Excel/CSV operations | read_spreadsheet, write_spreadsheet |
| MCP | Model Context Protocol | Dynamic tools from MCP servers |

from cogent.capabilities import FileSystem, CodeSandbox, WebSearch, HTTPClient, Database
from cogent.capabilities import MCP

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    capabilities=[
        FileSystem(allowed_paths=["./project"]),
        CodeSandbox(timeout=30),
        WebSearch(),
        HTTPClient(),
        Database("sqlite:///data.db"),
    ],
    mcps=MCP(command="npx", args=["-y", "@modelcontextprotocol/server-filesystem", "."]),
)

cogent.document — Document Processing

Load, split, and process documents for RAG pipelines.

Loaders — Support for all common file formats:

| Loader | Formats | Notes |
|---|---|---|
| TextLoader | .txt, .rst | Plain text extraction |
| MarkdownLoader | .md | Markdown with structure |
| PDFLoader | .pdf | Basic text extraction (pypdf/pdfplumber) |
| PDFMarkdownLoader | .pdf | Clean markdown output (pymupdf4llm) |
| PDFVisionLoader | .pdf | Vision model-based extraction |
| WordLoader | .docx | Microsoft Word documents |
| HTMLLoader | .html, .htm | HTML documents |
| CSVLoader | .csv | CSV files |
| JSONLoader | .json, .jsonl | JSON documents |
| XLSXLoader | .xlsx | Excel spreadsheets |
| CodeLoader | .py, .js, .ts, .java, .go, .rs, .cpp, etc. | Source code files |

Splitters — Multiple chunking strategies:

| Splitter | Strategy |
|---|---|
| RecursiveCharacterSplitter | Hierarchical separators (default) |
| SentenceSplitter | Sentence boundary detection |
| MarkdownSplitter | Markdown structure-aware |
| HTMLSplitter | HTML tag-based |
| CodeSplitter | Language-aware code splitting |
| SemanticSplitter | Embedding-based semantic chunking |
| TokenSplitter | Token count-based |

from cogent.document import DocumentLoader, SemanticSplitter

loader = DocumentLoader()
docs = await loader.load_directory("./documents")

splitter = SemanticSplitter(model=model)
chunks = splitter.split_documents(docs)

cogent.vectorstore — Vector Storage

Semantic search with pluggable backends and embedding providers.

Backends:

| Backend | Use Case | Persistence |
|---|---|---|
| InMemoryBackend | Development, small datasets | No |
| FAISSBackend | Large-scale local search | Optional |
| ChromaBackend | Persistent vector database | Yes |
| QdrantBackend | Production vector database | Yes |
| PgVectorBackend | PostgreSQL integration | Yes |

Embedding Providers:

| Provider | Model Examples |
|---|---|
| OpenAI | openai:text-embedding-3-small, openai:text-embedding-3-large |
| Ollama | ollama:nomic-embed-text, ollama:mxbai-embed-large |
| Mock | Testing only |

from cogent import create_embedding
from cogent.vectorstore import VectorStore
from cogent.vectorstore.backends import FAISSBackend

store = VectorStore(
    embeddings=create_embedding("openai:text-embedding-3-large"),
    backend=FAISSBackend(dimension=3072),
)
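Under the hood, every backend answers the same question: which stored vectors are closest to the query embedding? A minimal cosine-similarity sketch (illustrative only, independent of any backend's actual API):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    # Rank stored vectors by similarity to the query embedding.
    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:k]

index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}
top = nearest([0.9, 0.1, 0.0], index, k=2)  # most similar first
```

Production backends (FAISS, Qdrant, pgvector) do the same ranking with approximate-nearest-neighbor indexes instead of a linear scan.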

cogent.memory — Memory & Persistence

Long-term memory with fuzzy matching (semantic fallback optional), conversation history, and scoped views.

Stores:

| Store | Backend | Features |
|---|---|---|
| InMemoryStore | Dict | Fast, no persistence |
| SQLAlchemyStore | SQLite, PostgreSQL, MySQL | Async, full SQL |
| RedisStore | Redis | Distributed, native TTL |

from cogent.memory import Memory, SQLAlchemyStore

memory = Memory(store=SQLAlchemyStore("sqlite+aiosqlite:///./data.db"))

# Scoped views
user_mem = memory.scoped("user:alice")
team_mem = memory.scoped("team:research")

cogent.executors — Execution Strategies

Pluggable execution strategies that define how agents process tasks.

| Executor | Strategy | Use Case |
|---|---|---|
| NativeExecutor | Parallel tool execution | Default, high performance |
| SequentialExecutor | Sequential tool execution | Ordered dependencies |

Standalone execution — bypass Agent class entirely:

from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize",
    tools=[search, summarize],
    model="gpt-4o-mini",
)

cogent.interceptors — Middleware

Composable middleware for cross-cutting concerns.

| Category | Interceptors |
|---|---|
| Budget | BudgetGuard (token/cost limits) |
| Security | PIIShield, ContentFilter |
| Rate Limiting | RateLimiter, ThrottleInterceptor |
| Context | ContextCompressor, TokenLimiter |
| Gates | ToolGate, PermissionGate, ConversationGate |
| Resilience | Failover, CircuitBreaker, ToolGuard |
| Audit | Auditor (event logging) |
| Prompt | PromptAdapter, ContextPrompt, LambdaPrompt |

from cogent.interceptors import BudgetGuard, PIIShield, RateLimiter

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ]
)

cogent.observability — Monitoring & Tracing

Comprehensive monitoring for understanding system behavior.

| Component | Purpose |
|---|---|
| ExecutionTracer | Deep execution tracing with spans |
| MetricsCollector | Counter, Gauge, Histogram, Timer |
| ProgressTracker | Real-time progress output |
| Observer | Unified observability with history capture |
| Dashboard | Visual inspection interface |
| Inspectors | Agent, Task, Event inspection |

Renderers: TextRenderer, RichRenderer, JSONRenderer, MinimalRenderer

from cogent.observability import ExecutionTracer, ProgressTracker

tracer = ExecutionTracer()
async with tracer.trace("my-operation") as span:
    span.set_attribute("user_id", user_id)
    result = await do_work()

Installation

Note: The package is published as cogent-ai on PyPI, but you import it as cogent in your code.

# Install from PyPI
uv add cogent-ai

# With extras
uv add "cogent-ai[vector-stores,retrieval]"
uv add "cogent-ai[database]"
uv add "cogent-ai[all-backend]"
uv add "cogent-ai[all]"

# Or install from source (latest)
uv add git+https://github.com/milad-o/cogent.git
uv add "cogent-ai[all] @ git+https://github.com/milad-o/cogent.git"

Optional dependency groups:

| Group | Purpose | Includes |
|---|---|---|
| vector-stores | Vector databases | FAISS, Qdrant, SciPy |
| retrieval | Retrieval libraries | BM25, sentence-transformers |
| database | SQL databases | SQLAlchemy, aiosqlite, asyncpg, psycopg2 |
| infrastructure | Infrastructure | Redis |
| web | Web tools | BeautifulSoup4, DuckDuckGo search |
| browser | Browser automation | Playwright |
| document | Document processing | PDF, Word, Markdown loaders |
| api | API framework | FastAPI, Uvicorn, Starlette |
| visualization | Graphs & charts | PyVis, Gravis, Matplotlib, Seaborn, Pandas |
| anthropic | Claude models | Anthropic SDK |
| azure | Azure models | Azure Identity, Azure AI Inference |
| cerebras | Cerebras models | Cerebras Cloud SDK |
| cohere | Cohere models | Cohere SDK |
| gemini | Gemini models | Google GenAI SDK |
| groq | Groq models | Groq SDK |
| all-providers | All LLM providers | anthropic, azure, cerebras, cohere, gemini, groq |
| a2a | Agent2Agent protocol | a2a-sdk, uvicorn |
| all-backend | All backends | vector-stores, retrieval, database, infrastructure |
| all | Everything | All above + visualization |

Development installation:

# Core dev tools (linting, type checking)
uv sync --group dev

# Add testing
uv sync --group dev --group test

# Add backend tests (vector stores, databases)
uv sync --group dev --group test --group test-backends

# Add documentation
uv sync --group dev --group test --group test-backends --group docs

Core Architecture

Cogent is built around a high-performance Native Executor that eliminates framework overhead while providing enterprise-grade features.

Native Executor

The executor uses a direct asyncio loop with parallel tool execution—no graph frameworks, no unnecessary abstractions:

from cogent import Agent, tool
from cogent.models import ChatModel

@tool
def search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Evaluate math expression."""
    return str(eval(expression))  # demo only; never eval untrusted input

agent = Agent(
    name="Assistant",
    model="gpt4",  # Simple string model
    tools=[search, calculate],
)

# Tools execute in parallel when independent
result = await agent.run("Search for Python and calculate 2^10")

Key optimizations:

  • Parallel tool execution — Multiple tool calls run concurrently via asyncio.gather
  • Cached model binding — Tools bound once at construction, zero overhead per call
  • Native SDK integration — Direct OpenAI/Anthropic SDK calls, no translation layers
  • Automatic resilience — Rate limit retries with exponential backoff built-in
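The parallel-execution optimization is plain asyncio: independent tool calls from one model turn are awaited together. A self-contained sketch (`fake_tool` stands in for a real tool call):

```python
import asyncio
import time

async def fake_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for real I/O (API call, DB query)
    return f"{name}: done"

async def run_round(calls: list[tuple[str, float]]) -> list[str]:
    # Independent tool calls from one model turn are awaited together.
    return await asyncio.gather(*(fake_tool(name, delay) for name, delay in calls))

start = time.perf_counter()
results = asyncio.run(run_round([("search", 0.1), ("calculate", 0.1)]))
elapsed = time.perf_counter() - start  # ~0.1s total, not 0.2s: the calls overlap
```

asyncio.gather preserves input order in its result list, so tool results can be matched back to the model's tool-call IDs positionally.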

Tool System

Define tools with the @tool decorator—automatic schema extraction from type hints and docstrings:

import httpx  # used by the async example below

from cogent import tool
from cogent.core.context import RunContext

@tool
def search(query: str, max_results: int = 10) -> str:
    """Search the web for information.
    
    Args:
        query: Search query string.
        max_results: Maximum results to return.
    """
    return f"Found {max_results} results for: {query}"

# With context injection for user/session data
@tool
def get_user_preferences(ctx: RunContext) -> str:
    """Get preferences for the current user."""
    return f"Preferences for user {ctx.user_id}"

# Async tools supported
@tool
async def fetch_data(url: str) -> str:
    """Fetch data from URL."""
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

Tool features:

  • Type hints → JSON schema conversion
  • Docstring → description extraction
  • Sync and async function support
  • Context injection via ctx: RunContext parameter
  • Automatic error handling and retries
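The type-hints-to-schema step can be illustrated with `inspect`. This sketch is not Cogent's extractor; it handles only flat primitive parameters and takes the first docstring line as the description.

```python
import inspect

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn) -> dict:
    """Derive a minimal JSON-schema-style tool description from type
    hints and the docstring (flat primitive parameters only)."""
    props: dict[str, dict] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> the model must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip().split("\n")[0],
        "parameters": {"type": "object", "properties": props, "required": required},
    }

def search(query: str, max_results: int = 10) -> str:
    """Search the web for information."""
    return f"Found {max_results} results for: {query}"

schema = tool_schema(search)
# schema["parameters"]["required"] == ["query"]; max_results is optional
```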

Standalone Execution

For maximum performance, bypass the Agent class entirely:

from cogent.executors import run

result = await run(
    "Search for Python tutorials and summarize the top 3",
    tools=[search, summarize],
    model="gpt-4o-mini",
)

Quick Start

Simple Agent

import asyncio
from cogent import Agent, tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, sunny"

async def main():
    agent = Agent(
        name="Assistant",
        model="gpt-4o-mini",
        tools=[get_weather],
    )
    
    result = await agent.run("What's the weather in Tokyo?")
    print(result)

asyncio.run(main())

Multi-Agent with Subagents

from cogent import Agent

# Create specialist agents
data_analyst = Agent(
    name="data_analyst",
    model="gpt-4o-mini",
    instructions="Analyze data and provide statistical insights.",
)

market_researcher = Agent(
    name="market_researcher",
    model="gpt-4o-mini",
    instructions="Research market trends and competitive landscape.",
)

# Coordinator delegates to specialists
coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    instructions="""Coordinate research tasks:
- Use data_analyst for numerical analysis
- Use market_researcher for market trends
Synthesize their findings.""",
    # Simply pass the agents - uses their names automatically
    subagents=[data_analyst, market_researcher],
)

# Full metadata preserved (tokens, duration, delegation chain)
result = await coordinator.run("Analyze Q4 2025 e-commerce growth")
print(f"Total tokens: {result.metadata.tokens.total_tokens}")  # Includes all subagents
print(f"  Prompt: {result.metadata.tokens.prompt_tokens}")
print(f"  Completion: {result.metadata.tokens.completion_tokens}")
if result.metadata.tokens.reasoning_tokens:
    print(f"  Reasoning: {result.metadata.tokens.reasoning_tokens}")
print(f"Subagent calls: {len(result.subagent_responses)}")

Agent2Agent (A2A)

Connect to remote agents and expose your own as A2A endpoints. Requires cogent-ai[a2a].

from cogent import Agent
from cogent.agent import A2AAgent, A2AServer

# Wrap a remote A2A endpoint as a subagent
remote_analyst = A2AAgent(
    name="analyst",
    url="http://localhost:10088",
    description="Analyzes financial data",
)

coordinator = Agent(
    name="coordinator",
    model="gpt-4o-mini",
    subagents=[remote_analyst],  # mix local and remote freely
)

# Expose an agent as an A2A server (one-liner for scripts)
agent = Agent(name="Assistant", model="gpt-4o-mini")
agent.serve(port=10002)  # blocking

# Or async with background=True (default)
server = await A2AServer(agent, port=10002).start()
# ... do work ...
await server.stop()

# Serve multiple agents concurrently
group = await A2AServer.start_many((agent_a, 10001), (agent_b, 10002))
await group.stop_all()

Streaming

agent = Agent(
    model="gpt-4o-mini",
    stream=True,
)

async for chunk in agent.run_stream("Write a poem"):
    print(chunk.content, end="", flush=True)

Human-in-the-Loop

from cogent import Agent
from cogent.agent import InterruptedException

agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    tools=[sensitive_tool],
    interrupt_on={"tools": ["sensitive_tool"]},  # Require approval
)

try:
    result = await agent.run("Do something sensitive")
except InterruptedException as e:
    # Handle approval flow
    decision = await get_human_decision(e.pending_action)
    result = await agent.resume(e.state, decision)

Observability

from cogent import Agent
from cogent.observability import Observer, ObservabilityLevel

# Verbosity levels for agents
agent = Agent(
    name="Assistant",
    model="gpt-4o-mini",
    verbosity="debug",  # off | result | progress | detailed | debug | trace
)

# Or use enum/int
agent = Agent(model=model, verbosity=ObservabilityLevel.DEBUG)  # Enum
agent = Agent(model=model, verbosity=4)  # Int (0-5)

# Boolean shorthand
agent = Agent(model=model, verbosity=True)  # → PROGRESS level

# With observer for history capture
observer = Observer(level="debug", capture=["tool.result", "agent.*"])
result = await agent.run("Query", observer=observer)

# Access captured events
for event in observer.history():
    print(event)

Interceptors

Control execution flow with middleware:

from cogent.interceptors import (
    BudgetGuard,      # Token/cost limits
    RateLimiter,      # Request throttling
    PIIShield,        # Redact sensitive data
    ContentFilter,    # Block harmful content
    ToolGate,         # Conditional tool access
    PromptAdapter,    # Modify prompts dynamically
    Auditor,          # Audit logging
)

agent = Agent(
    name="Safe",
    model="gpt-4o-mini",
    intercept=[
        BudgetGuard(max_model_calls=100, max_tool_calls=500),
        PIIShield(patterns=["email", "ssn"]),
        RateLimiter(requests_per_minute=60),
    ],
)

Structured Output

Type-safe responses with comprehensive type support and automatic validation:

Supported Types:

  • Structured Models: BaseModel, dataclass, TypedDict
  • Primitives: str, int, bool, float
  • Constrained: Literal["A", "B", "C"]
  • Collections: list[T], set[T], tuple[T, ...] (wrap in models for reliability)
  • Polymorphic: Union[A, B] (agent chooses schema)
  • Enumerations: Enum types
  • Dynamic: dict (agent decides structure)
  • Confirmation: None type

from pydantic import BaseModel
from typing import Literal, Union
from enum import Enum
from cogent import Agent

# Structured models
class Analysis(BaseModel):
    sentiment: str
    confidence: float
    topics: list[str]

# Configure on agent (all calls use schema)
agent = Agent(
    name="Analyzer",
    model="gpt-4o-mini",
    output=Analysis,  # Enforce schema on all runs
)

result = await agent.run("Analyze: I love this product!")
print(result.content.data.sentiment)   # "positive"
print(result.content.data.confidence)  # 0.95

# OR: Per-call override (more flexible)
agent = Agent(name="Analyzer", model="gpt-4o-mini")  # No default schema
result = await agent.run(
    "Analyze: I love this product!",
    returns=Analysis,  # Schema for this call only
)
print(result.content.data.sentiment)   # "positive"

# Bare types - return primitive values directly
agent = Agent(name="Reviewer", model="gpt-4o-mini")
result = await agent.run(
    "Review this code",
    returns=Literal["APPROVE", "REJECT"],  # Per-call schema
)
print(result.content.data)  # "APPROVE" (bare string)

# Collections - wrap in models for reliability
class Tags(BaseModel):
    items: list[str]

agent = Agent(name="Tagger", model="gpt-4o-mini", output=Tags)
result = await agent.run("Extract tags from: Python async FastAPI")
print(result.content.data.items)  # ["Python", "async", "FastAPI"]

# Union types - polymorphic responses
from typing import Union

class Success(BaseModel):
    status: Literal["success"] = "success"
    result: str

class Error(BaseModel):
    status: Literal["error"] = "error"
    message: str

agent = Agent(name="Handler", model="gpt-4o-mini", output=Union[Success, Error])
# Agent chooses schema based on content

# Enum types
from enum import Enum

class Priority(str, Enum):
    LOW = "low"
    HIGH = "high"

agent = Agent(name="Prioritizer", model="gpt-4o-mini", output=Priority)
result = await agent.run("Server is down!")
print(result.content.data)  # Priority.HIGH

# Dynamic structure - agent decides fields
agent = Agent(name="Analyzer", model="gpt-4o-mini", output=dict)
result = await agent.run("Analyze user feedback")
print(result.content.data)  # {"sentiment": "positive", "score": 8, ...}

# Other bare types: str, int, bool, float
agent = Agent(name="Counter", model="gpt-4o-mini", output=int)
result = await agent.run("Count the items")
print(result.content.data)  # 5 (bare int)

Reasoning

Extended thinking for complex problems with AI-controlled rounds:

from cogent import Agent
from cogent.agent.reasoning import ReasoningConfig, ReasoningStyle

# Simple: Enable with defaults
agent = Agent(
    name="Analyst",
    model="gpt-4o",
    reasoning=True,  # AI decides when ready (up to 10 rounds)
)

# Custom config
agent = Agent(
    name="DeepThinker",
    model="gpt-4o",
    reasoning=ReasoningConfig(
        max_thinking_rounds=15,  # Safety limit
        style=ReasoningStyle.CRITICAL,
    ),
)

# Per-call override
result = await agent.run(
    "Complex analysis task",
    reasoning=True,  # Enable for this call only
)

Reasoning Styles: ANALYTICAL, EXPLORATORY, CRITICAL, CREATIVE

Resilience

from cogent.agent import ResilienceConfig

agent = Agent(
    name="Resilient",
    model="gpt-4o-mini",
    resilience=ResilienceConfig(
        max_retries=3,
        strategy="exponential_jitter",
        timeout_seconds=30.0,
        on_exhaustion="ask_agent",  # LLM decides how to recover
    ),
)

Configuration

Use environment variables or .env:

# LLM Provider
OPENAI_API_KEY=sk-...

# Azure
AZURE_OPENAI_ENDPOINT=https://...
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_AUTH_TYPE=managed_identity
AZURE_OPENAI_CLIENT_ID=...  # optional (user-assigned managed identity)

# Azure (service principal / client secret)
# AZURE_OPENAI_AUTH_TYPE=client_secret
# AZURE_OPENAI_TENANT_ID=...
# AZURE_OPENAI_CLIENT_ID=...
# AZURE_OPENAI_CLIENT_SECRET=...

# Anthropic
ANTHROPIC_API_KEY=...

# Ollama (local)
OLLAMA_HOST=http://localhost:11434

Examples

See examples/ for complete examples organized by category:

Basics (examples/basics/)

| Example | Description |
|---|---|
| hello_world.py | Simple agent with tools |
| memory.py | Conversation persistence |
| memory_layers.py | Multi-layer memory management |
| memory_semantic_search.py | Semantic memory search |
| streaming.py | Real-time token streaming |
| structured_output.py | Type-safe responses (12 patterns) |

Capabilities (examples/capabilities/)

| Example | Description |
|---|---|
| browser.py | Web browsing with Playwright |
| code_sandbox.py | Safe Python execution |
| codebase_analyzer.py | Code analysis agent |
| data_validator.py | Schema validation |
| database_agent.py | SQL database operations |
| filesystem.py | File system operations |
| http_agent.py | HTTP client capability |
| kg_agent_viz.py | Knowledge graph visualization |
| knowledge_graph.py | Knowledge graph construction |
| mcp_example.py | Model Context Protocol integration |
| shell.py | Shell command execution |
| spreadsheet.py | Excel/CSV operations |
| web_search.py | Web search with caching |

Advanced (examples/advanced/)

| Example | Description |
|---|---|
| acc.py | Adaptive Context Control (bounded memory) |
| acc_comparison.py | ACC vs standard memory comparison |
| complex_task.py | Multi-step task handling |
| content_review.py | Content moderation |
| context_layer.py | Context management |
| deferred_tools.py | Deferred tool execution |
| executors_demo.py | Executor strategies (Sequential, Tree Search) |
| human_in_the_loop.py | Approval workflows |
| interceptors.py | Middleware patterns |
| model_thinking.py | Extended thinking mode |
| reasoning.py | Reasoning strategies |
| semantic_cache.py | Semantic caching demo |
| single_vs_multi_agent.py | Single vs delegated agents |
| tactical_delegation.py | Dynamic agent spawning |
| taskboard.py | TaskBoard for complex workflows |

Retrieval (examples/retrieval/)

| Example | Description |
|---|---|
| finance_table_example.py | Financial data extraction |
| hyde.py | Hypothetical Document Embeddings |
| pdf_summarizer.py | PDF document summarization |
| pdf_vision_showcase.py | Vision-based PDF extraction |
| retrievers.py | 12 retriever strategies (Dense, BM25, Hybrid, etc.) |
| summarizer.py | Document summarization strategies |

Observability (examples/observability/)

| Example | Description |
|---|---|
| observer.py | Start here: live output plus captured event history |
| shared_observer.py | Share one observer across two agents and inspect the merged stream |
| subagent_lineage.py | Trace parent-child run IDs through real subagent delegation |
| agent_lifecycle.py | Build a lifecycle timeline from subscribed events |
| custom_formatter.py | Customize console output without losing structured events |

Development

# Install with dev dependencies
uv sync --group dev

# Run tests
uv run pytest

# Type checking
uv run mypy src/cogent

# Linting
uv run ruff check src/cogent

License

MIT License
