Core utilities for CNOE agents including LLM factory, tracing, and base agent classes
🤖 cnoe-agent-utils
- Reusable utilities and abstractions for building agent-based (LLM-powered) systems.
- Centralized LLM Factory supporting major providers (AWS, Azure, GCP, OpenAI, Gemini, Anthropic).
- Centralized Tracing Utilities (since v0.2.0) to eliminate duplicated tracing code across CNOE agents.
- Agent Base Classes (since v0.4.0) for LangGraph and Strands agent frameworks with A2A protocol support.
Key Features
Core Utilities
- 🏭 Unified LLM Factory interface for seamless model instantiation across multiple clouds and vendors:
  - ☁️ AWS
  - ☁️ Azure
  - ☁️ GCP Vertex
  - 🤖 Google Gemini
  - 🤖 Anthropic Claude
  - 🤖 OpenAI
  - 🤖 Groq
- Simple, environment-variable-driven configuration.
- Example scripts for each LLM provider with setup instructions.
Agent Tracing (since v0.2.0)
- Centralized tracing logic: Removes 350+ lines of repeated code per agent.
- Single import/decorator: No more copy-pasting tracing logic.
- Environment-based toggling: use the ENABLE_TRACING env var to control all tracing.
- A2A tracing disabling: a single method to monkey-patch/disable agent-to-agent tracing everywhere.
- Graceful fallback: Works with or without Langfuse; tracing is zero-overhead when disabled.
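The environment-based toggle can be pictured as a decorator that becomes a no-op when tracing is off. The sketch below is illustrative only (the decorator name and internals are hypothetical, not the library's actual API); it demonstrates the zero-overhead-when-disabled pattern the bullets above describe:

```python
import os
from functools import wraps

def trace_agent(fn):
    """Hypothetical tracing decorator: wraps calls only when
    ENABLE_TRACING is set to a truthy value; otherwise it returns
    the original function untouched (zero overhead when disabled)."""
    if os.getenv("ENABLE_TRACING", "").lower() not in ("1", "true", "yes"):
        return fn  # tracing disabled: no wrapper at all

    @wraps(fn)
    def wrapper(*args, **kwargs):
        print(f"[trace] entering {fn.__name__}")
        try:
            return fn(*args, **kwargs)
        finally:
            print(f"[trace] exiting {fn.__name__}")
    return wrapper

@trace_agent
def answer(question: str) -> str:
    return f"echo: {question}"

print(answer("hi"))  # with ENABLE_TRACING unset, no trace lines are printed
```

Because the decorator returns the undecorated function when tracing is off, there is no per-call cost at all in the disabled case.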
Agent Base Classes (since v0.4.0)
- Multi-Framework Support: Base classes for LangGraph and Strands agent frameworks
- A2A Protocol Integration: Seamless integration with Agent-to-Agent protocol for distributed agent systems
- Context Management: Automatic context window management with token counting and intelligent message trimming
- Streaming Support: Built-in streaming capabilities for real-time agent responses with tool notifications
- Optional Dependencies: Graceful handling of missing dependencies - install only what you need
- MCP Integration: Built-in support for Model Context Protocol (MCP) with multi-server configurations
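The context-management behavior can be sketched as follows. This is a simplified stand-in (the token counter and trimming policy here are illustrative, not the library's actual implementation): older messages are dropped to fit a token budget while the system message is always kept.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in token counter: ~1 token per whitespace-separated word."""
    return len(text.split())

def trim_messages(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep the system message (if first) plus as many of the most
    recent messages as fit within max_tokens."""
    system = messages[:1] if messages and messages[0]["role"] == "system" else []
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)
    kept: list[dict] = []
    for msg in reversed(rest):  # walk newest-first
        cost = count_tokens(msg["content"])
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return system + list(reversed(kept))

history = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "first question about deployments"},
    {"role": "assistant", "content": "first answer"},
    {"role": "user", "content": "second question"},
]
print(trim_messages(history, max_tokens=10))
```

With a budget of 10 tokens, the oldest user message is dropped while the system message and the two most recent messages survive.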
Note:
- Check out this tutorial on Tracing
- See Agent Base Classes Documentation for detailed agent utilities guide
🚀 LLM Factory Getting Started
🛡️ Create and Activate a Virtual Environment
It is recommended to use a virtual environment to manage dependencies:
python3 -m venv .venv
source .venv/bin/activate
⚡ Prerequisite: Install uv
Before running the examples, install uv:
pip install uv
📦 Installation
Installation Options
Default Installation (recommended for most users):
pip install cnoe-agent-utils
This installs all dependencies and provides full functionality. It's equivalent to pip install 'cnoe-agent-utils[all]'.
Minimal Installation (specific functionality only): Use these when you only need specific functionality or want to minimize package size:
# Anthropic Claude support only
pip install "cnoe-agent-utils[anthropic]"
# OpenAI support (openai.com GPT models) only
pip install "cnoe-agent-utils[openai]"
# Azure OpenAI support (Azure-hosted GPT models) only
pip install "cnoe-agent-utils[azure]"
# AWS support (Bedrock, etc.) only
pip install "cnoe-agent-utils[aws]"
# Google Cloud support (Vertex AI, Gemini) only
pip install "cnoe-agent-utils[gcp]"
# Groq support only
pip install "cnoe-agent-utils[groq]"
# Advanced tracing and observability (Langfuse, OpenTelemetry) only
pip install "cnoe-agent-utils[tracing]"
# Agent base classes and utilities only
pip install "cnoe-agent-utils[agents]"
# LangGraph agent framework support
pip install "cnoe-agent-utils[langgraph]"
# Strands agent framework support
pip install "cnoe-agent-utils[strands]"
# A2A protocol support for agent executors
pip install "cnoe-agent-utils[a2a]"
# Complete agent stack (all agent frameworks)
pip install "cnoe-agent-utils[agents-all]"
# Development dependencies (testing, linting, etc.)
pip install "cnoe-agent-utils[dev]"
Using uv
# Default installation (all dependencies)
uv add cnoe-agent-utils
# Minimal installation (specific functionality only)
uv add "cnoe-agent-utils[anthropic]"
uv add "cnoe-agent-utils[openai]"
uv add "cnoe-agent-utils[azure]"
uv add "cnoe-agent-utils[aws]"
uv add "cnoe-agent-utils[groq]"
uv add "cnoe-agent-utils[gcp]"
uv add "cnoe-agent-utils[tracing]"
uv add "cnoe-agent-utils[agents]"
uv add "cnoe-agent-utils[langgraph]"
uv add "cnoe-agent-utils[strands]"
uv add "cnoe-agent-utils[a2a]"
uv add "cnoe-agent-utils[agents-all]"
Local Development
If you are developing locally:
git clone https://github.com/cnoe-agent-utils/cnoe-agent-utils.git
cd cnoe-agent-utils
uv sync
🧑‍💻 Usage
To test integration with different LLM providers, configure the required environment variables for each provider as shown below. Then, run the corresponding example script using uv.
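Each provider section below follows the same pattern: export the provider's variables, then run its example script. As an illustration of what environment-driven configuration means in practice, the helper below (hypothetical, not part of the library; the provider keys are just labels for this sketch) checks that a provider's required variables are present before running an example:

```python
import os

# Required variables per provider, taken from the sections below.
REQUIRED_ENV = {
    "anthropic": ["ANTHROPIC_API_KEY", "ANTHROPIC_MODEL_NAME"],
    "aws-bedrock": ["AWS_PROFILE", "AWS_REGION",
                    "AWS_BEDROCK_MODEL_ID", "AWS_BEDROCK_PROVIDER"],
    "azure-openai": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_API_VERSION",
                     "AZURE_OPENAI_DEPLOYMENT", "AZURE_OPENAI_ENDPOINT"],
    "openai": ["OPENAI_API_KEY", "OPENAI_ENDPOINT", "OPENAI_MODEL_NAME"],
    "google-gemini": ["GOOGLE_API_KEY"],
}

def missing_env(provider: str) -> list[str]:
    """Return the required variables that are not yet set for a provider."""
    return [var for var in REQUIRED_ENV[provider] if not os.getenv(var)]

os.environ["GOOGLE_API_KEY"] = "dummy-key-for-demo"
print(missing_env("google-gemini"))  # [] once the key is set
```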
🤖 Anthropic
Set the following environment variables:
export ANTHROPIC_API_KEY=<your_anthropic_api_key>
export ANTHROPIC_MODEL_NAME=<model_name>
# Optional: Enable extended thinking for Claude 4+ models
export ANTHROPIC_THINKING_ENABLED=true
export ANTHROPIC_THINKING_BUDGET=1024 # Default: 1024, Min: 1024
Run the example:
uv run examples/test_anthropic.py
☁️ AWS Bedrock (Anthropic Claude)
Set the following environment variables:
export AWS_PROFILE=<your_aws_profile>
export AWS_REGION=<your_aws_region>
export AWS_BEDROCK_MODEL_ID="us.anthropic.claude-3-7-sonnet-20250219-v1:0"
export AWS_BEDROCK_PROVIDER="anthropic"
# Optional: Enable extended thinking for Claude 4+ models
export AWS_BEDROCK_THINKING_ENABLED=true
export AWS_BEDROCK_THINKING_BUDGET=1024 # Default: 1024, Min: 1024
Run the example:
uv run examples/test_aws_bedrock_claude.py
🤖 Groq
Set the following environment variables:
export GROQ_API_KEY=<your_groq_api_key>
export GROQ_MODEL_NAME=<model_name>
export GROQ_TEMPERATURE=<temperature>
Run the example:
uv run examples/groq_stream.py
AWS Bedrock Prompt Caching
AWS Bedrock supports prompt caching to reduce latency and costs by caching repeated context across requests. This feature is particularly beneficial for:
- Multi-turn conversations with long system prompts
- Repeated use of large context documents
- Agent systems with consistent instructions
Enable prompt caching:
export AWS_BEDROCK_ENABLE_PROMPT_CACHE=true
Supported Models:
For the latest list of models that support prompt caching and their minimum token requirements, see the AWS Bedrock Prompt Caching documentation.
Implementation Note: When AWS_BEDROCK_ENABLE_PROMPT_CACHE=true, the library uses ChatBedrockConverse which has native prompt caching support. If your model doesn't support caching, AWS Bedrock will return a clear error message. There's no need to validate model compatibility in advance—AWS handles this automatically.
Note: Model IDs may include regional prefixes (us., eu., ap., etc.) depending on your AWS account configuration. Pass the full model ID as provided by AWS:
- Example: us.anthropic.claude-3-7-sonnet-20250219-v1:0
- Example: anthropic.claude-opus-4-1-20250805-v1:0
Benefits:
- Up to 85% reduction in latency for cached content
- Up to 90% reduction in costs for cached tokens
- 5-minute cache TTL (automatically managed by AWS)
- Maximum 4 cache checkpoints per request
Usage Example:
import os
from cnoe_agent_utils.llm_factory import LLMFactory
from langchain_core.messages import SystemMessage, HumanMessage

# Enable caching
os.environ["AWS_BEDROCK_ENABLE_PROMPT_CACHE"] = "true"

# Initialize LLM
llm = LLMFactory("aws-bedrock").get_llm()

# Create cache point for system message
cache_point = llm.create_cache_point()

# Build messages with cache control
messages = [
    SystemMessage(content=[
        {"text": "You are a helpful AI assistant with expertise in..."},
        cache_point,  # Marks cache checkpoint
    ]),
    HumanMessage(content="What is your primary function?"),
]

# Invoke with caching
response = llm.invoke(messages)

# Check cache statistics in response metadata
if hasattr(response, "response_metadata"):
    usage = response.response_metadata.get("usage", {})
    print(f"Cache read tokens: {usage.get('cacheReadInputTokens', 0)}")
    print(f"Cache creation tokens: {usage.get('cacheCreationInputTokens', 0)}")
Run the caching example:
uv run examples/aws_bedrock_cache_example.py
Monitoring Cache Performance:
Cache hit/miss statistics are available in:
- Response metadata: cacheReadInputTokens and cacheCreationInputTokens
- CloudWatch metrics: track cache performance across all requests
- Application logs: enable via AWS_CREDENTIALS_DEBUG=true
Best Practices:
- Use cache for system prompts and context that remain consistent across requests
- Ensure cached content meets minimum token requirements (see AWS documentation for model-specific limits)
- Place cache points strategically (after system messages, large context documents, or tool definitions)
- Monitor cache hit rates to optimize placement
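To make the monitoring advice concrete, the helper below computes a cache hit rate from a response's usage block. Only the cacheReadInputTokens and cacheCreationInputTokens keys come from the response metadata described above; the function itself is an illustrative sketch, not part of the library:

```python
def cache_hit_rate(usage: dict) -> float:
    """Fraction of cache-eligible input tokens served from cache."""
    read = usage.get("cacheReadInputTokens", 0)
    created = usage.get("cacheCreationInputTokens", 0)
    total = read + created
    return read / total if total else 0.0

# Example usage block as it might appear in response.response_metadata["usage"]
usage = {"cacheReadInputTokens": 1800, "cacheCreationInputTokens": 200}
print(f"cache hit rate: {cache_hit_rate(usage):.0%}")  # cache hit rate: 90%
```

A persistently low hit rate suggests the cache points sit on content that changes between requests and should be moved earlier in the message list.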
☁️ Azure OpenAI
Set the following environment variables:
export AZURE_OPENAI_API_KEY=<your_azure_openai_api_key>
export AZURE_OPENAI_API_VERSION=<api_version>
export AZURE_OPENAI_DEPLOYMENT=gpt-4.1
export AZURE_OPENAI_ENDPOINT=<your_azure_openai_endpoint>
Run the example:
uv run examples/test_azure_openai.py
🤖 OpenAI
Set the following environment variables:
export OPENAI_API_KEY=<your_openai_api_key>
export OPENAI_ENDPOINT=https://api.openai.com/v1
export OPENAI_MODEL_NAME=gpt-4.1
Optional configuration:
export OPENAI_DEFAULT_HEADERS='{"my-header-key":"my-value"}'
export OPENAI_USER=user-identifier
Run the example:
uv run examples/test_openai.py
🤖 Google Gemini
Set the following environment variable:
export GOOGLE_API_KEY=<your_google_api_key>
Run the example:
uv run examples/test_google_gemini.py
☁️ GCP Vertex AI
Set the following environment variables:
export GOOGLE_APPLICATION_CREDENTIALS=~/.config/gcp.json
export VERTEXAI_MODEL_NAME="gemini-2.0-flash-001"
# Optional: Enable extended thinking for Claude 4+ models on Vertex AI
export VERTEXAI_THINKING_ENABLED=true
export VERTEXAI_THINKING_BUDGET=1024 # Default: 1024, Min: 1024
Run the example:
uv run examples/test_gcp_vertexai.py
This demonstrates how to use the LLM Factory and other utilities provided by the library.
🔧 Middleware
The cnoe_agent_utils.middleware module provides a collection of reusable middleware components for LangGraph agents, extending the DeepAgents library from LangChain. Middleware allows you to intercept and modify agent behavior at various stages of execution without changing the core agent logic.
[!NOTE] The middleware listed below extends the default DeepAgents middleware (such as PlanningMiddleware, FilesystemMiddleware, and SubAgentMiddleware) with additional specialized capabilities for advanced agent workflows.
Extended Middleware
CallToolWithFileArgMiddleware
Automatically substitutes file paths with their contents when calling non-filesystem tools.
Features:
- Intercepts tool calls after model generation
- Replaces file path arguments with actual file contents from the in-memory FS
- Preserves original behavior for filesystem-specific tools
- Generates acknowledgment messages for transformed calls
How it works:
- Agent calls a tool with a file path as an argument
- Middleware detects the file path and replaces it with file contents
- Creates a ToolMessage acknowledging the original call
- Emits a rewritten AIMessage with the actual tool call using file contents
Usage:
from cnoe_agent_utils.middleware import CallToolWithFileArgMiddleware
middleware = [CallToolWithFileArgMiddleware()]
agent = create_agent(model, tools=tools, middleware=middleware)
QuickActionTasksAnnouncementMiddleware
Manages task announcements and execution flow for quick action scenarios.
Features:
- Announces the next task via AIMessage without immediate execution
- Updates todo status to "in_progress" for the current task
- Removes and replaces previous write_todos tool calls
- Coordinates with SubAgentMiddleware for task execution
RemoveToolsForSubagentMiddleware
Conditionally removes tools when an agent is called as a sub-agent.
Features:
- Detects when the agent is running as a sub-agent
- Removes write_todos and task tools in sub-agent mode
- Prevents recursive task management in nested agent hierarchies
Middleware Execution Flow
Middleware hooks are executed at different stages:
- before_model: called before the LLM is invoked
  - Modify state before the model sees it
  - Inject messages or update context
- modify_model_request: called to modify the model request
  - Change system prompts
  - Filter or add tools
  - Adjust model parameters
- after_model: called after the LLM generates a response
  - Transform tool calls
  - Add acknowledgment messages
  - Update state based on model output
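The hook sequence above can be sketched with a minimal driver. Everything here is a simplified illustration (the base class, state shape, and driver loop are hypothetical stand-ins, not the DeepAgents API): each middleware gets a chance to act before the model call, on the request itself, and after the response.

```python
class Middleware:
    """Hypothetical middleware base: override only the hooks you need."""
    def before_model(self, state: dict) -> None: ...
    def modify_model_request(self, request: dict) -> dict:
        return request
    def after_model(self, state: dict) -> None: ...

class DropTaskToolsMiddleware(Middleware):
    """Mimics RemoveToolsForSubagentMiddleware: strips task-management
    tools from the request when the agent runs as a sub-agent."""
    def modify_model_request(self, request: dict) -> dict:
        if request.get("is_subagent"):
            request["tools"] = [t for t in request["tools"]
                                if t not in ("write_todos", "task")]
        return request

def run_model_step(state: dict, request: dict, middlewares: list) -> dict:
    for mw in middlewares:          # stage 1: before_model
        mw.before_model(state)
    for mw in middlewares:          # stage 2: modify_model_request
        request = mw.modify_model_request(request)
    state["last_request"] = request  # stand-in for the actual LLM call
    for mw in middlewares:          # stage 3: after_model
        mw.after_model(state)
    return state

state = run_model_step(
    {},
    {"is_subagent": True, "tools": ["write_todos", "task", "search"]},
    [DropTaskToolsMiddleware()],
)
print(state["last_request"]["tools"])  # ['search']
```

The same driver shape also accommodates before_model middleware that injects messages and after_model middleware that rewrites tool calls, as described above.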
📜 License
Apache 2.0 (see LICENSE)
👥 Maintainers
See MAINTAINERS.md
- Contributions welcome via PR or issue!