
Production-ready AI agents with tool calling, structured output, execution traces, and RAG. Provider-agnostic (OpenAI, Anthropic, Gemini, Ollama) with fallback chains, batch processing, tool policies, streaming, caching, and cost tracking.

Project description

Selectools

License: LGPL v3 | Python 3.9+

Production-ready AI agents with tool calling, RAG, and hybrid search. Connect LLMs to your Python functions, embed and search your documents with vector + keyword fusion, stream responses in real time, and dynamically manage tools at runtime. Works with OpenAI, Anthropic, Gemini, and Ollama. Tracks costs automatically.

Why Selectools

Provider Agnostic: Switch between OpenAI, Anthropic, Gemini, Ollama with one line. Your tools stay identical.
Structured Output: Pydantic or JSON Schema response_format with auto-retry on validation failure.
Execution Traces: Every run() returns result.trace, a structured timeline of LLM calls, tool picks, and executions.
Reasoning Visibility: result.reasoning surfaces why the agent chose a tool, extracted from LLM responses.
Provider Fallback: FallbackProvider tries providers in priority order with a circuit breaker on failure.
Batch Processing: agent.batch() / agent.abatch() for concurrent multi-prompt classification.
Tool Policy Engine: Declarative allow/review/deny rules with glob patterns. Human-in-the-loop approval callbacks.
Hybrid Search: BM25 keyword + vector semantic search with RRF/weighted fusion and cross-encoder reranking.
Advanced Chunking: Fixed, recursive, semantic (embedding-based), and contextual (LLM-enriched) chunking strategies.
E2E Streaming: Token-level astream() with native tool call support. Parallel tool execution via asyncio.gather.
Dynamic Tools: Load tools from files/directories at runtime. Add, remove, replace tools without restarting.
Response Caching: LRU + TTL in-memory cache and Redis backend. Avoid redundant LLM calls for identical requests.
Routing Mode: Agent selects a tool without executing it. Use for intent classification and request routing.
Production Hardened: Retries with backoff, per-tool timeouts, iteration caps, cost warnings, observability hooks.
Library-First: Not a framework. No magic globals, no hidden state. Use as much or as little as you need.

What's Included

  • 5 LLM Providers: OpenAI, Anthropic, Gemini, Ollama + FallbackProvider (auto-failover)
  • Structured Output: Pydantic / JSON Schema response_format with auto-retry
  • Execution Traces: result.trace with typed timeline of every agent step
  • Reasoning Visibility: result.reasoning explains why the agent chose a tool
  • Batch Processing: agent.batch() / agent.abatch() for concurrent classification
  • Tool Policy Engine: Declarative allow/review/deny rules with human-in-the-loop
  • 4 Embedding Providers: OpenAI, Anthropic/Voyage, Gemini (free!), Cohere
  • 4 Vector Stores: In-memory, SQLite, Chroma, Pinecone
  • Hybrid Search: BM25 + vector fusion with Cohere/Jina reranking
  • Advanced Chunking: Semantic + contextual chunking for better retrieval
  • Dynamic Tool Loading: Plugin system with hot-reload support
  • Response Caching: InMemoryCache and RedisCache with stats tracking
  • 120-Model Registry: Type-safe constants with pricing and metadata
  • Pre-built Toolbox: 22 tools for files, data, text, datetime, web
  • 27 Examples: RAG, hybrid search, streaming, structured output, traces, batch, policy, and more
  • 880+ Tests: Unit, integration, and E2E with real API calls

Install

pip install selectools                    # Core + basic RAG
pip install selectools[rag]               # + Chroma, Pinecone, Voyage, Cohere, PyPDF
pip install selectools[cache]             # + Redis cache
pip install selectools[rag,cache]         # Everything

Set your API key:

export OPENAI_API_KEY="sk-..."

Quick Start

New to Selectools? Follow the 5-minute Quickstart tutorial — no API key needed.

Tool Calling Agent (No API Key)

from selectools import Agent, AgentConfig, tool
from selectools.providers.stubs import LocalProvider

@tool(description="Look up the price of a product")
def get_price(product: str) -> str:
    prices = {"laptop": "$999", "phone": "$699", "headphones": "$149"}
    return prices.get(product.lower(), f"No price found for {product}")

agent = Agent(
    tools=[get_price],
    provider=LocalProvider(),
    config=AgentConfig(max_iterations=3),
)

result = agent.ask("How much is a laptop?")
print(result.content)

Tool Calling Agent (OpenAI)

from selectools import Agent, AgentConfig, OpenAIProvider, tool
from selectools.models import OpenAI

@tool(description="Search the web for information")
def search(query: str) -> str:
    return f"Results for: {query}"

agent = Agent(
    tools=[search],
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id),
    config=AgentConfig(max_iterations=5),
)

result = agent.ask("Search for Python tutorials")
print(result.content)

RAG Agent

from selectools import OpenAIProvider
from selectools.embeddings import OpenAIEmbeddingProvider
from selectools.models import OpenAI
from selectools.rag import RAGAgent, VectorStore

embedder = OpenAIEmbeddingProvider(model=OpenAI.Embeddings.TEXT_EMBEDDING_3_SMALL.id)
store = VectorStore.create("memory", embedder=embedder)

agent = RAGAgent.from_directory(
    directory="./docs",
    provider=OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id),
    vector_store=store,
    chunk_size=500, top_k=3,
)

result = agent.ask("What are the main features?")
print(result.content)
print(agent.get_usage_summary())  # LLM + embedding costs

Hybrid Search (Keyword + Semantic)

from selectools import Agent
from selectools.rag import HybridSearcher, FusionMethod, HybridSearchTool, VectorStore

# Reuses embedder, provider, and chunked_docs from the RAG example above
store = VectorStore.create("memory", embedder=embedder)
store.add_documents(chunked_docs)

searcher = HybridSearcher(
    vector_store=store,
    vector_weight=0.6,
    keyword_weight=0.4,
    fusion=FusionMethod.RRF,
)
searcher.add_documents(chunked_docs)

# Use with agent
hybrid_tool = HybridSearchTool(searcher=searcher, top_k=5)
agent = Agent(tools=[hybrid_tool.search_knowledge_base], provider=provider)

Streaming with Parallel Tools

import asyncio
from selectools import Agent, AgentConfig
from selectools.types import StreamChunk, AgentResult

agent = Agent(
    tools=[tool_a, tool_b, tool_c],
    provider=provider,
    config=AgentConfig(parallel_tool_execution=True),  # Default: enabled
)

async def main():
    async for item in agent.astream("Run all tasks"):
        if isinstance(item, StreamChunk):
            print(item.content, end="", flush=True)
        elif isinstance(item, AgentResult):
            print(f"\nDone in {item.iterations} iterations")

asyncio.run(main())

Key Features

Hybrid Search & Reranking

Combine semantic search with BM25 keyword matching for better recall on exact terms, names, and acronyms:

from selectools.rag import HybridSearcher, CohereReranker, FusionMethod

searcher = HybridSearcher(
    vector_store=store,
    fusion=FusionMethod.RRF,
    reranker=CohereReranker(),  # Optional cross-encoder reranking
)
results = searcher.search("GDPR compliance", top_k=5)

See docs/modules/HYBRID_SEARCH.md for full documentation.

Advanced Chunking

Go beyond fixed-size splitting with embedding-aware and LLM-enriched chunking:

from selectools.rag import SemanticChunker, ContextualChunker

# Split at topic boundaries using embedding similarity
semantic = SemanticChunker(embedder=embedder, similarity_threshold=0.75)

# Enrich each chunk with LLM-generated context (Anthropic-style contextual retrieval)
contextual = ContextualChunker(base_chunker=semantic, provider=provider)
enriched_docs = contextual.split_documents(documents)

See docs/modules/ADVANCED_CHUNKING.md for full documentation.

Dynamic Tool Loading

Discover and load @tool functions from files and directories at runtime:

from selectools.tools import ToolLoader

# Load tools from a plugin directory
tools = ToolLoader.from_directory("./plugins", recursive=True)
agent.add_tools(tools)

# Hot-reload after editing a plugin
updated = ToolLoader.reload_file("./plugins/search.py")
agent.replace_tool(updated[0])

# Remove tools the agent no longer needs
agent.remove_tool("deprecated_search")

See docs/modules/DYNAMIC_TOOLS.md for full documentation.

Response Caching

Avoid redundant LLM calls with pluggable caching:

from selectools import Agent, AgentConfig, InMemoryCache

cache = InMemoryCache(max_size=1000, default_ttl=300)
agent = Agent(
    tools=[...],
    provider=provider,
    config=AgentConfig(cache=cache),
)

# Same question twice -> second call is instant (cache hit)
agent.ask("What is Python?")
agent.reset()
agent.ask("What is Python?")

print(cache.stats)  # CacheStats(hits=1, misses=1, hit_rate=50.00%)

For distributed setups: from selectools.cache_redis import RedisCache
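
A minimal sketch of the Redis setup (the constructor arguments shown here are assumptions; check the caching docs for the exact signature):

from selectools import Agent, AgentConfig
from selectools.cache_redis import RedisCache  # pip install selectools[cache]

# url and default_ttl are illustrative; consult the caching docs for the real signature
cache = RedisCache(url="redis://localhost:6379/0", default_ttl=300)
agent = Agent(tools=[...], provider=provider, config=AgentConfig(cache=cache))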

Routing Mode

Agent selects a tool without executing it; use it for intent classification:

config = AgentConfig(routing_only=True)
agent = Agent(tools=[send_email, schedule_meeting, search_kb], provider=provider, config=config)

result = agent.ask("Book a meeting with Alice tomorrow")
print(result.tool_name)  # "schedule_meeting"
print(result.tool_args)  # {"attendee": "Alice", "date": "tomorrow"}
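
Because the tool is never executed, you can dispatch on the selection yourself. A sketch (handle_email and friends are hypothetical application handlers):

handlers = {
    "send_email": handle_email,
    "schedule_meeting": handle_meeting,
    "search_kb": handle_search,
}
handler = handlers.get(result.tool_name)
if handler:
    handler(**result.tool_args)  # route with the extracted arguments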

Structured Output

Get typed, validated results from the LLM:

from pydantic import BaseModel
from typing import Literal

class Classification(BaseModel):
    intent: Literal["billing", "support", "sales", "cancel"]
    confidence: float
    priority: Literal["low", "medium", "high"]

result = agent.ask("I want to cancel my account", response_format=Classification)
print(result.parsed)  # Classification(intent="cancel", confidence=0.95, priority="high")

Auto-retries with error feedback when validation fails.
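
If you prefer raw JSON Schema over Pydantic, a sketch of the same classification (the exact dict shape accepted by response_format is an assumption; see the agent docs):

# The dict shape passed to response_format is assumed; see docs/modules/AGENT.md
schema = {
    "type": "object",
    "properties": {
        "intent": {"type": "string", "enum": ["billing", "support", "sales", "cancel"]},
        "confidence": {"type": "number"},
    },
    "required": ["intent", "confidence"],
}
result = agent.ask("I want to cancel my account", response_format=schema)
print(result.parsed)  # dict matching the schema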

Execution Traces & Reasoning

See exactly what your agent did and why:

result = agent.run("Classify this ticket")

# Structured timeline of every step
for step in result.trace:
    print(f"{step.type} | {step.duration_ms:.0f}ms | {step.summary}")

# Why the agent chose a tool
print(result.reasoning)  # "Customer is asking about billing, routing to billing_support"

# Export for dashboards
result.trace.to_json("trace.json")

Provider Fallback

Automatic failover with circuit breaker:

from selectools import FallbackProvider, OpenAIProvider, AnthropicProvider

provider = FallbackProvider([
    OpenAIProvider(default_model="gpt-4o-mini"),
    AnthropicProvider(default_model="claude-haiku"),
])
agent = Agent(tools=[...], provider=provider)
# If OpenAI is down → tries Anthropic automatically

Batch Processing

Classify multiple requests concurrently:

# Inside an async function:
results = await agent.abatch(
    ["Cancel my subscription", "How do I upgrade?", "My payment failed"],
    max_concurrency=10,
)
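
A synchronous agent.batch() is also available; a sketch assuming it mirrors abatch():

# Assumes batch() takes the same arguments as abatch()
results = agent.batch(
    ["Cancel my subscription", "How do I upgrade?", "My payment failed"],
    max_concurrency=10,
)
for r in results:
    print(r.content)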

Tool Policy & Human-in-the-Loop

Declarative safety rules with approval callbacks:

from selectools import ToolPolicy

policy = ToolPolicy(
    allow=["search_*", "read_*"],
    review=["send_*", "create_*"],
    deny=["delete_*"],
)

async def confirm(tool_name, tool_args, reason):
    # get_user_approval is your application's own prompt/UI function
    return await get_user_approval(tool_name, tool_args)

config = AgentConfig(tool_policy=policy, confirm_action=confirm)

E2E Streaming & Parallel Execution

  • agent.astream() yields StreamChunk (text deltas) then AgentResult (final)
  • Multiple tool calls execute concurrently via asyncio.gather() (3 tools @ 0.15s each = ~0.15s total)
  • Fallback chain: astream -> acomplete -> complete via executor
  • Context propagation with contextvars for tracing/auth (see the sketch below)
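
A minimal sketch of that context propagation (request_id and the whoami tool are illustrative):

import contextvars
from selectools import tool

# Illustrative request-scoped value; tool executions launched by astream()/arun()
# inherit the caller's context, so they observe the value set below.
request_id = contextvars.ContextVar("request_id", default="unknown")

@tool(description="Report the current request id")
def whoami() -> str:
    return f"Handling request {request_id.get()}"

request_id.set("req-42")  # set before streaming; tool calls see "req-42"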

See docs/modules/STREAMING.md for full documentation.

Providers

Provider   Streaming  Vision  Native Tools  Cost
OpenAI     Yes        Yes     Yes           Paid
Anthropic  Yes        Yes     Yes           Paid
Gemini     Yes        Yes     Yes           Free tier
Ollama     Yes        No      No            Free (local)
Fallback   Yes        Yes     Yes           Varies (wraps others)
Local      No         No      No            Free (testing)

from selectools.models import OpenAI, Anthropic, Gemini, Ollama

# IDE autocomplete for all 120 models with pricing metadata
model = OpenAI.GPT_4O_MINI
print(f"Cost: ${model.prompt_cost}/${model.completion_cost} per 1M tokens")
print(f"Context: {model.context_window:,} tokens")

Embedding Providers

from selectools.embeddings import (
    OpenAIEmbeddingProvider,     # text-embedding-3-small/large
    AnthropicEmbeddingProvider,  # Voyage AI (voyage-3, voyage-3-lite)
    GeminiEmbeddingProvider,     # FREE (text-embedding-001/004)
    CohereEmbeddingProvider,     # embed-english-v3.0
)
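
Each drops into a vector store the same way. For example, with the free Gemini embedder (the model keyword mirrors the OpenAI example above and is an assumption):

from selectools.embeddings import GeminiEmbeddingProvider
from selectools.rag import VectorStore

embedder = GeminiEmbeddingProvider(model="text-embedding-004")  # free tier
store = VectorStore.create("memory", embedder=embedder)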

Vector Stores

from selectools.rag import VectorStore

store = VectorStore.create("memory", embedder=embedder)           # Fast, no persistence
store = VectorStore.create("sqlite", embedder=embedder, db_path="docs.db")  # Persistent
store = VectorStore.create("chroma", embedder=embedder, persist_directory="./chroma")
store = VectorStore.create("pinecone", embedder=embedder, index_name="my-index")

Agent Configuration

config = AgentConfig(
    model="gpt-4o-mini",
    temperature=0.0,
    max_tokens=2000,
    max_iterations=6,
    max_retries=3,
    retry_backoff_seconds=2.0,
    request_timeout=60.0,
    tool_timeout_seconds=30.0,
    cost_warning_threshold=0.50,
    parallel_tool_execution=True,
    routing_only=False,
    stream=False,
    cache=None,                  # InMemoryCache or RedisCache
    tool_policy=None,            # ToolPolicy with allow/review/deny rules
    confirm_action=None,         # Human-in-the-loop approval callback
    approval_timeout=60.0,       # Seconds before auto-deny
    enable_analytics=True,
    verbose=False,
    hooks={                      # Lifecycle callbacks
        "on_tool_start": lambda name, args: ...,
        "on_tool_end": lambda name, result, duration: ...,
        "on_llm_end": lambda response, usage: ...,
    },
    system_prompt="You are a helpful assistant...",
)

Tool Definition

@tool Decorator (Recommended)

from selectools import tool

@tool(description="Calculate compound interest")
def calculate_interest(principal: float, rate: float, years: int) -> str:
    amount = principal * (1 + rate / 100) ** years
    return f"After {years} years: ${amount:.2f}"

Tool Registry

from selectools import ToolRegistry

registry = ToolRegistry()

@registry.tool(description="Search the knowledge base")
def search_kb(query: str, max_results: int = 5) -> str:
    return f"Results for: {query}"

agent = Agent(tools=registry.all(), provider=provider)

Injected Parameters

Keep secrets out of the LLM's view:

from selectools import Tool, ToolParameter  # import path assumed to match other core names

db_tool = Tool(
    name="query_db",
    description="Execute SQL query",
    parameters=[ToolParameter(name="sql", param_type=str, description="SQL query")],
    function=query_database,                     # your database query function
    injected_kwargs={"db_connection": db_conn},  # Hidden from LLM
)

Streaming Tools

from typing import Generator

@tool(description="Process large file", streaming=True)
def process_file(filepath: str) -> Generator[str, None, None]:
    with open(filepath) as f:
        for i, line in enumerate(f, 1):
            yield f"[Line {i}] {line.strip()}\n"

config = AgentConfig(hooks={"on_tool_chunk": lambda name, chunk: print(chunk, end="")})

Conversation Memory

from selectools import Agent, ConversationMemory

memory = ConversationMemory(max_messages=20)
agent = Agent(tools=[...], provider=provider, memory=memory)

agent.ask("My name is Alice")
agent.ask("What's my name?")  # Remembers "Alice"

Cost Tracking

result = agent.ask("Search and summarize")

print(f"Total cost: ${agent.total_cost:.6f}")
print(f"Total tokens: {agent.total_tokens:,}")
print(agent.get_usage_summary())
# Includes LLM + embedding costs, per-tool breakdown

Examples

Examples are numbered by difficulty. Start from 01 and work your way up.

Example                       Features                                                 API key?
01_hello_world.py             First agent, @tool, ask()                                No
02_search_weather.py          ToolRegistry, multiple tools                             No
03_toolbox.py                 22 pre-built tools (file, data, text, datetime)          No
04_conversation_memory.py     Multi-turn memory                                        Yes
05_cost_tracking.py           Token counting, cost warnings                            Yes
06_async_agent.py             arun(), concurrent agents, FastAPI                       Yes
07_streaming_tools.py         Generator-based streaming                                Yes
08_streaming_parallel.py      astream(), parallel execution, StreamChunk               Yes
09_caching.py                 InMemoryCache, RedisCache, cache stats                   Yes
10_routing_mode.py            Routing mode, intent classification                      Yes
11_tool_analytics.py          Call counts, success rates, timing                       Yes
12_observability_hooks.py     Lifecycle hooks, tool validation                         Yes
13_dynamic_tools.py           ToolLoader, plugins, hot-reload                          Yes
14_rag_basic.py               RAG pipeline, document loading, vector search            Yes + [rag]
15_semantic_search.py         Pure semantic search, metadata filtering                 Yes + [rag]
16_rag_advanced.py            PDFs, SQLite persistence, custom chunking                Yes + [rag]
17_rag_multi_provider.py      Embedding/store/chunk-size comparisons                   Yes + [rag]
18_hybrid_search.py           BM25 + vector fusion, RRF, reranking                     Yes + [rag]
19_advanced_chunking.py       Semantic and contextual chunking                         Yes + [rag]
20_customer_support_bot.py    Multi-tool customer support workflow                     Yes
21_data_analysis_agent.py     Data exploration and analysis                            Yes
22_ollama_local.py            Fully local LLM via Ollama                               No (Ollama)
23_structured_output.py       Pydantic response_format, auto-retry, JSON extraction    No
24_traces_and_reasoning.py    AgentTrace timeline, reasoning visibility, JSON export   No
25_provider_fallback.py       FallbackProvider, circuit breaker, failover chain        No
26_batch_processing.py        batch(), abatch(), structured batch, error isolation     No
27_tool_policy.py             ToolPolicy, deny_when, HITL approval, memory trimming    No

Run any example:

python examples/01_hello_world.py   # No API key needed
python examples/14_rag_basic.py     # Needs OPENAI_API_KEY

Documentation

Comprehensive technical documentation is available in docs/:

Module             Description
AGENT              Agent loop, structured output, traces, reasoning, batch, policy
STREAMING          E2E streaming, parallel execution, routing
TOOLS              Tool definition, validation, registry
DYNAMIC_TOOLS      ToolLoader, plugins, hot-reload
HYBRID_SEARCH      BM25, fusion, reranking
ADVANCED_CHUNKING  Semantic & contextual chunking
RAG                Complete RAG pipeline
EMBEDDINGS         Embedding providers
VECTOR_STORES      Storage backends
PROVIDERS          LLM provider adapters + FallbackProvider
MEMORY             Conversation memory + tool-pair trimming
USAGE              Cost tracking & analytics
MODELS             Model registry & pricing
PARSER             Tool call parsing
PROMPT             System prompt generation

Tests

pytest tests/ -x -q          # All tests
pytest tests/ -k "not e2e"   # Skip E2E (no API keys needed)

400+ tests covering parsing, agent loop, providers, RAG pipeline, hybrid search, advanced chunking, dynamic tools, caching, streaming, and E2E integration.

License

LGPL-3.0-or-later - Use freely in commercial applications. Only modifications to the library itself must be shared. See LICENSE.

Contributing

See CONTRIBUTING.md. We welcome contributions for new tools, providers, vector stores, examples, and documentation.


Roadmap | Changelog | Documentation | Feature Proposals



Download files

Download the file for your platform.

Source Distribution

selectools-0.13.0.tar.gz (125.2 kB)


Built Distribution


selectools-0.13.0-py3-none-any.whl (134.6 kB)


File details

Details for the file selectools-0.13.0.tar.gz.

File metadata

  • Download URL: selectools-0.13.0.tar.gz
  • Size: 125.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for selectools-0.13.0.tar.gz
Algorithm Hash digest
SHA256 eaf1f321481b3b9a0e9d37113da498e3864eccf58bc269c775de76c6bc3bb6db
MD5 8aba10bf7e840325650d36e980f36941
BLAKE2b-256 be8a1271f1b55d6342cf4bfbd1a1e747fec957d32c1cf0755299e2440cff745c


File details

Details for the file selectools-0.13.0-py3-none-any.whl.

File metadata

  • Download URL: selectools-0.13.0-py3-none-any.whl
  • Size: 134.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for selectools-0.13.0-py3-none-any.whl
Algorithm Hash digest
SHA256 da7f6cc3c39d109aac9bf28b0a4d3da6ec32a740dd26aaeaeaa5c78fb2ac4816
MD5 30c5734963aa3937248f92b0ac8b7295
BLAKE2b-256 da95c7c18bf0f23c2fa0b2348f6cf43e6f31e28cdde9b792e84ec1254b00c297

