Production-Ready Agentic AI Framework with Enterprise Safety
Kite
Build Production-Ready AI Agents That Actually Work
Fast • Safe • Simple • Powerful
What is Kite? A lightweight Python framework that turns LLMs into reliable AI agents you can deploy with confidence. No PhD required.
Installation
Via pip (recommended):
pip install kite-agent
From source:
git clone https://github.com/thienzz/Kite.git
cd Kite
pip install -e .
Quick Start • Examples • Features • Documentation
Why Developers Choose Kite
Most AI frameworks overwhelm you with complexity. Kite gives you production-grade reliability with dead-simple APIs:
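import asyncio
from kite import Kite
# Initialize once
ai = Kite()
# Create a specialist agent
# (search_orders and process_refunds are tool functions defined elsewhere in your app)
support_agent = ai.create_agent(
    name="CustomerSupport",
    system_prompt="You are a helpful e-commerce support agent.",
    tools=[search_orders, process_refunds],
    agent_type="react",  # Autonomous reasoning loop
)
# Run it (agent.run is awaited, so it must be called from an async function)
async def main():
    result = await support_agent.run("Where is order ORD-12345?")
    print(result['response'])
asyncio.run(main())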
That's it. Behind the scenes, Kite handles:
- Circuit breakers (prevent cascading failures)
- Retry logic (auto-recovery from API errors)
- Memory management (RAG, sessions, graph knowledge)
- Multi-provider support (OpenAI, Anthropic, Groq, local models)
- Cost tracking & monitoring
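To make the retry item above concrete, here is a minimal plain-Python sketch of retrying a flaky call with exponential backoff. It is not Kite's implementation: `retry_with_backoff`, its parameters, and `flaky_api` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,  # must be >= 1
    base_delay: float = 0.01,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call func, retrying transient errors with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical API call that fails twice before succeeding.
calls = {"count": 0}


def flaky_api() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = retry_with_backoff(flaky_api)
print(result, calls["count"])
```

Only errors listed in `retry_on` are retried; anything else propagates immediately, which keeps genuine bugs from being masked as transient failures.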
Built for Real-World Problems
Stop building MVP demos. Start shipping production systems:
| Your Challenge | Kite's Solution |
|---|---|
| "LLMs hallucinate in production" | Vector RAG + Graph RAG for grounded responses |
| "API failures crash my agents" | Circuit breakers auto-pause failing services |
| "Too slow & expensive" | Smart/Fast model routing - use cheap models when possible |
| "Can't track what agents are doing" | Event bus + metrics for full observability |
| "Hard to prevent dangerous actions" | Guardrails & shell whitelisting built-in |
| "Need human approval for critical tasks" | HITL workflows with checkpoints |
Quick Start
Installation
git clone https://github.com/thienzz/Kite.git
cd Kite
pip install -r requirements.txt
Setup Environment
cp .env.example .env
# Edit .env with your API keys
Minimum config:
LLM_PROVIDER=openai # or anthropic, groq, ollama
OPENAI_API_KEY=sk-...
Your First Agent (30 seconds)
import asyncio
from kite import Kite
async def main():
    # Auto-loads from .env
    ai = Kite()
    # Create a tool
    def get_weather(city: str) -> str:
        return f"Sunny, 72°F in {city}"
    weather_tool = ai.create_tool("get_weather", get_weather,
                                  "Get current weather for a city")
    # Create agent
    agent = ai.create_agent(
        name="WeatherBot",
        system_prompt="You help users check weather. Always use the tool.",
        tools=[weather_tool],
        agent_type="react"
    )
    # Run
    result = await agent.run("What's the weather in San Francisco?")
    print(result['response'])
asyncio.run(main())
Architecture Overview
Kite's modular design lets you use what you need:
kite/
├── agents/      # Reasoning patterns (ReAct, ReWOO, ToT, Plan-Execute)
├── memory/      # Vector RAG, Graph RAG, Session Memory
├── safety/      # Circuit Breakers, Idempotency, Kill Switches
├── routing/     # Semantic Routing, Aggregator Routing, Smart/Fast Model Selection
├── tools/       # Built-in utilities (Web Search, Code Execution, Shell, MCP integrations)
├── pipeline/    # Deterministic workflows with HITL support
└── monitoring/  # Metrics, Tracing, Event Bus
Core Components (Lazy-Loaded)
ai = Kite()
# These initialize only when accessed:
ai.llm # LLM provider (OpenAI, Anthropic, Groq, Ollama)
ai.embeddings # Embedding provider (FastEmbed, OpenAI)
ai.vector_memory # Vector similarity search (FAISS, ChromaDB, or in-memory)
ai.graph_rag # Knowledge graph for relationships
ai.session_memory # Conversation history
ai.semantic_router # Intent-based routing
ai.circuit_breaker # Fault tolerance
ai.idempotency # Duplicate request prevention
ai.tools # Tool registry
ai.pipeline # Workflow manager
Core Features
1. Multiple Reasoning Patterns
Choose the right "brain" for your task:
# ReAct: Standard loop (Think → Act → Observe → Repeat)
agent = ai.create_agent(..., agent_type="react")
# ReWOO: Plan everything upfront, execute in parallel (FAST!)
agent = ai.create_agent(..., agent_type="rewoo")
# Tree-of-Thoughts: Explore multiple solutions (creative tasks)
agent = ai.create_agent(..., agent_type="tot")
# Plan-Execute: Classic two-phase planning
agent = ai.create_agent(..., agent_type="plan_execute")
See them in action: examples/case6_reasoning_architectures.py
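For readers new to the ReAct pattern, the Think → Act → Observe → Repeat loop can be sketched without any LLM at all. The code below is an illustration only, not Kite's agent code: `react_loop` and `scripted_policy` are hypothetical, and the scripted policy stands in for the model's reasoning step.

```python
from typing import Callable, Dict, List, Tuple

# (thought, action, argument): what a model would emit at each step.
Step = Tuple[str, str, str]


def react_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run Think -> Act -> Observe until the policy finishes or steps run out."""
    observations: List[str] = []
    for _ in range(max_steps):
        thought, action, argument = policy(observations)  # Think
        if action == "finish":
            return argument
        observation = tools[action](argument)  # Act
        observations.append(observation)  # Observe, then repeat
    return "stopped: step limit reached"


def scripted_policy(observations: List[str]) -> Step:
    """Stand-in for an LLM: look up the order once, then answer."""
    if not observations:
        return ("I need the order status.", "lookup_order", "ORD-12345")
    return ("I have what I need.", "finish", f"Order status: {observations[-1]}")


tools = {"lookup_order": lambda order_id: f"{order_id} shipped"}
answer = react_loop(scripted_policy, tools)
print(answer)
```

The `max_steps` cap matters in practice: without it, a model that never emits a finishing action would loop indefinitely.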
2. Production Safety Mechanisms
Circuit Breakers prevent cascading failures:
ai.circuit_breaker.config.failure_threshold = 3 # Open after 3 failures
ai.circuit_breaker.config.timeout_seconds = 60 # Cool-down period
# Circuit auto-opens if LLM/tool fails 3x, preventing waste
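The two settings above map onto a small state machine. The following sketch shows one plausible reading of them in plain Python; it is not Kite's circuit breaker. `SimpleCircuitBreaker`, `CircuitOpenError`, and the injectable `clock` are hypothetical names used for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class SimpleCircuitBreaker:
    def __init__(self, failure_threshold: int = 3, timeout_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.timeout_seconds:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open the circuit
            raise
        self._failures = 0  # any success closes the circuit fully
        return result


def failing() -> str:
    raise ConnectionError("provider down")


fake_now = [0.0]  # fake clock so the demo needs no real waiting
breaker = SimpleCircuitBreaker(failure_threshold=3, timeout_seconds=60,
                               clock=lambda: fake_now[0])
outcomes = []
for _ in range(4):
    try:
        breaker.call(failing)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")

fake_now[0] = 61.0  # cool-down has passed
recovered = breaker.call(lambda: "ok")
print(outcomes, recovered)
```

The fourth call never reaches the failing provider, which is the point: once the circuit is open, no time or tokens are spent on a service that is known to be down.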
Idempotency prevents duplicate operations:
# Same operation_id within TTL returns cached result
result = ai.idempotency.execute(
    operation_id="order_123_refund",
    func=process_refund,
    args=(order_id,)
)
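The idea behind that call can be sketched as a TTL cache keyed by operation ID. This is an illustrative stand-in rather than Kite's implementation; `IdempotencyCache` is a hypothetical class, and only the `execute(operation_id, func, args)` shape is borrowed from the snippet above.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class IdempotencyCache:
    """Return the cached result when the same operation_id repeats within the TTL."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self._results: Dict[str, Tuple[float, Any]] = {}

    def execute(self, operation_id: str, func: Callable[..., Any], args: tuple = ()) -> Any:
        now = time.monotonic()
        cached = self._results.get(operation_id)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]  # duplicate request: skip the side effect
        result = func(*args)
        self._results[operation_id] = (now, result)
        return result


refunds_issued: List[str] = []


def process_refund(order_id: str) -> dict:
    refunds_issued.append(order_id)  # the side effect we must not repeat
    return {"order_id": order_id, "status": "refunded"}


cache = IdempotencyCache(ttl_seconds=3600)
first = cache.execute("order_123_refund", process_refund, args=("order_123",))
second = cache.execute("order_123_refund", process_refund, args=("order_123",))
print(first == second, refunds_issued)
```

An in-process dictionary like this only protects a single worker; a multi-process deployment would need a shared store for the same guarantee.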
Guardrails for dangerous operations:
from kite.tools.system_tools import ShellTool
# Whitelist safe commands only
shell = ShellTool(allowed_commands=["ls", "git", "df", "uptime"])
# Blocks 'rm -rf', 'sudo', etc. automatically
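A command allowlist of this kind can be approximated with the standard library. The sketch below is an assumption about how such a check might work, not `ShellTool`'s actual logic; `is_command_allowed` is a hypothetical helper.

```python
import shlex
from typing import Iterable


def is_command_allowed(command_line: str, allowed_commands: Iterable[str]) -> bool:
    """Allow a command only if its executable is on the allowlist."""
    # Reject shell metacharacters outright so "ls; rm -rf /" cannot piggyback.
    if any(ch in command_line for ch in ";|&`$><\n"):
        return False
    try:
        parts = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(parts) and parts[0] in set(allowed_commands)


allowed = ["ls", "git", "df", "uptime"]
checks = {
    cmd: is_command_allowed(cmd, allowed)
    for cmd in ["git status", "rm -rf /", "ls; rm -rf /", "sudo ls", ""]
}
print(checks)
```

Checking only the first token is deliberately strict: `sudo ls` is rejected because `sudo`, not `ls`, is the program that would run.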
3. Advanced Memory Systems
Vector Memory for semantic search:
# Add knowledge
ai.vector_memory.add_document("policy_001", "Returns accepted within 30 days...")
# Semantic search
results = ai.vector_memory.search("What's the return policy?", top_k=3)
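Conceptually, a vector store embeds each document, embeds the query, and ranks by similarity. The toy below uses word-count vectors and cosine similarity in place of real embeddings, so it only illustrates the ranking step. `TinyVectorStore` and `embed` are hypothetical, and the second document is invented sample data.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system uses dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[str, Counter]] = {}

    def add_document(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = (text, embed(text))

    def search(self, query: str, top_k: int = 3) -> List[Tuple[str, float]]:
        q = embed(query)
        scored = [(doc_id, cosine(q, vec)) for doc_id, (_, vec) in self._docs.items()]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


store = TinyVectorStore()
store.add_document("policy_001", "Returns accepted within 30 days of delivery")
store.add_document("policy_002", "Shipping takes 3 to 5 business days")
results = store.search("returns within 30 days", top_k=2)
print(results)
```

Word overlap misses paraphrases ("refund window" would not match "returns"), which is exactly the gap that learned embeddings are meant to close.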
Graph RAG for relationship-aware knowledge:
ai.graph_rag.add_entity("Kite", "framework", {"language": "Python"})
ai.graph_rag.add_relationship("Kite", "uses", "OpenAI")
# Query walks the graph
answer = ai.graph_rag.query("What providers does Kite support?")
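Underneath a natural-language graph query sits a structure of entities and typed edges. A minimal sketch of that structure follows, using hypothetical names (`TinyGraph`, `neighbors`) and one invented extra edge for the demo. It shows only the traversal, not the LLM-driven query step.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class TinyGraph:
    """Entities plus typed, directed relationships."""

    def __init__(self) -> None:
        self.entities: Dict[str, dict] = {}
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_entity(self, name: str, entity_type: str, properties: Optional[dict] = None) -> None:
        self.entities[name] = {"type": entity_type, **(properties or {})}

    def add_relationship(self, source: str, relation: str, target: str) -> None:
        self.edges[source].append((relation, target))

    def neighbors(self, source: str, relation: str) -> List[str]:
        """Follow every edge of the given type out of source."""
        return [target for rel, target in self.edges.get(source, []) if rel == relation]


graph = TinyGraph()
graph.add_entity("Kite", "framework", {"language": "Python"})
graph.add_relationship("Kite", "uses", "OpenAI")
graph.add_relationship("Kite", "uses", "Anthropic")  # sample edge for the demo
providers = graph.neighbors("Kite", "uses")
print(providers)
```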
Session Memory for conversations:
ai.session_memory.add_message(session_id="user_123", role="user", content="Hi!")
history = ai.session_memory.get_history(session_id="user_123")
4. Smart Multi-Provider Support
Switch between providers without changing code:
# OpenAI
ai.config['llm_provider'] = 'openai'
ai.config['llm_model'] = 'gpt-4o'
# Anthropic
ai.config['llm_provider'] = 'anthropic'
ai.config['llm_model'] = 'claude-3-5-sonnet-20241022'
# Groq (ultra-fast inference)
ai.config['llm_provider'] = 'groq'
ai.config['llm_model'] = 'llama-3.3-70b-versatile'
# Local with Ollama
ai.config['llm_provider'] = 'ollama'
ai.config['llm_model'] = 'qwen2.5:1.5b'
Cost Optimization: Use resource-aware routing:
from kite.optimization.resource_router import ResourceAwareRouter
router = ResourceAwareRouter(ai.config)
# Automatically uses:
# - FAST model (cheap) for routing, simple tasks
# - SMART model (powerful) for complex reasoning
analyst = ai.create_agent(
    name="Analyst",
    model=router.smart_model,  # gpt-4o for hard problems
    ...
)
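How a router decides between the fast and smart model is not spelled out here. One simple possibility is a heuristic on the task text, sketched below. `pick_model`, the marker list, and the word-count cutoff are all assumptions for illustration and say nothing about how `ResourceAwareRouter` actually decides.

```python
from typing import Tuple


def pick_model(
    task: str,
    fast_model: str,
    smart_model: str,
    max_fast_words: int = 12,
    complex_markers: Tuple[str, ...] = ("analyze", "compare", "step by step"),
) -> str:
    """Send long or analysis-heavy tasks to the smart model, everything else to the fast one."""
    text = task.lower()
    if len(text.split()) > max_fast_words or any(marker in text for marker in complex_markers):
        return smart_model
    return fast_model


FAST = "groq/llama-3.1-8b-instant"
SMART = "openai/gpt-4o"

simple_choice = pick_model("Where is order ORD-12345?", FAST, SMART)
complex_choice = pick_model("Analyze Q3 churn and compare it with Q2", FAST, SMART)
print(simple_choice, complex_choice)
```

A keyword heuristic is cheap but brittle; a production router would more likely classify the request with the fast model itself or with an embedding-based intent router.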
5. Human-in-the-Loop Workflows
Build approval workflows for critical operations:
from kite.pipeline import DeterministicPipeline
# Define workflow
def draft_email(state):
    return {"draft": "Dear Customer, ..."}
def send_email(state):
    return {"status": "sent"}
# Create pipeline with checkpoint
pipeline = ai.pipeline.create("approval_flow")
pipeline.add_step("draft", draft_email)
pipeline.add_checkpoint("draft")  # Pauses here for approval
pipeline.add_step("send", send_email)
# Execute (stops at checkpoint); the awaits below must run inside an async function
state = await pipeline.execute_async({"to": "user@example.com"})
# Human reviews, then resume
final = await pipeline.resume_async(state.task_id, approved=True)
Real example: case4_multi_agent_collab.py
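The checkpoint mechanics can be illustrated with a small synchronous pipeline that stops after a marked step and continues only on approval. This is a simplified sketch, not `DeterministicPipeline`: `TinyPipeline` is hypothetical, it is synchronous where Kite's API is async, and it passes the state back explicitly instead of using a task ID.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

State = Dict[str, Any]


class TinyPipeline:
    def __init__(self) -> None:
        self._steps: List[Tuple[str, Callable[[State], State]]] = []
        self._checkpoints: Set[str] = set()
        self._paused_at: Optional[int] = None  # index of the next step to run

    def add_step(self, name: str, func: Callable[[State], State]) -> None:
        self._steps.append((name, func))

    def add_checkpoint(self, after_step: str) -> None:
        self._checkpoints.add(after_step)

    def execute(self, initial_state: State) -> State:
        return self._run_from(0, dict(initial_state))

    def resume(self, state: State, approved: bool) -> State:
        if self._paused_at is None:
            raise RuntimeError("pipeline is not waiting at a checkpoint")
        if not approved:
            self._paused_at = None
            return {**state, "rejected": True}  # remaining steps never run
        return self._run_from(self._paused_at, state)

    def _run_from(self, index: int, state: State) -> State:
        for i in range(index, len(self._steps)):
            name, func = self._steps[i]
            state = {**state, **func(state)}
            if name in self._checkpoints and i + 1 < len(self._steps):
                self._paused_at = i + 1  # pause for human review
                return state
        self._paused_at = None
        return state


sent_log: List[str] = []


def draft_email(state: State) -> State:
    return {"draft": "Dear Customer, ..."}


def send_email(state: State) -> State:
    sent_log.append(state["draft"])
    return {"status": "sent"}


pipeline = TinyPipeline()
pipeline.add_step("draft", draft_email)
pipeline.add_checkpoint("draft")
pipeline.add_step("send", send_email)

paused_state = pipeline.execute({"to": "user@example.com"})
sent_before_approval = list(sent_log)  # still empty: nothing sent yet
final = pipeline.resume(paused_state, approved=True)
print(sent_before_approval, final["status"])
```

The key property is that the side-effecting step (`send_email`) cannot run until a human has explicitly approved the draft.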
Production Examples
We built 6 real-world case studies to show you exactly how to use Kite:
| Case | Scenario | Key Concepts | Difficulty |
|---|---|---|---|
| Case 1 | E-commerce Support Bot | LLM Routing, Tools, Multi-Agent | Beginner |
| Case 2 | Data Analyst Agent | SQL + Python Execution, Charts | Intermediate |
| Case 3 | Deep Research System | Web Scraping, Multi-Step Planning | Intermediate |
| Case 4 | Multi-Agent Collaboration | Supervisor Pattern, HITL, Iterative Refinement | Advanced |
| Case 5 | DevOps Automation | Shell Tools, Safety Guardrails | Intermediate |
| Case 6 | Reasoning Pattern Comparison | ReAct vs ReWOO vs ToT | Advanced |
Run an Example
# E-commerce support demo
PYTHONPATH=. python3 examples/case1_ecommerce_support.py
# Data analyst with charts
PYTHONPATH=. python3 examples/case2_enterprise_analytics.py
See the detailed tutorials for each case.
Performance Benchmarks
| Metric | Value |
|---|---|
| Framework Startup | ~50ms (lazy loading) |
| Memory Footprint | <100MB (base) |
| Agent Latency | 500ms - 2s (depends on LLM provider) |
| Throughput | 100+ req/s with caching |
Real data (M1 Mac, Ollama qwen2.5:1.5b):
- Simple completion: 50-200ms
- ReAct agent (3 tool calls): 800ms-1.5s
- Plan-Execute (5 steps): 3-5s
Production Deployment
Docker Compose (Recommended)
docker-compose up -d
Includes:
- Kite API server (FastAPI)
- Redis (caching)
- PostgreSQL (session storage)
- Prometheus + Grafana (monitoring)
Environment Variables
See .env.example for all options. Key configs:
# LLM Provider
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o
OPENAI_API_KEY=sk-...
# Embeddings
EMBEDDING_PROVIDER=fastembed
EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
# Safety
CIRCUIT_BREAKER_FAILURE_THRESHOLD=3
CIRCUIT_BREAKER_TIMEOUT_SECONDS=60
IDEMPOTENCY_TTL=3600
# Memory
VECTOR_BACKEND=faiss
VECTOR_DIMENSION=384
# Optimization
FAST_LLM_MODEL=groq/llama-3.1-8b-instant # Cheap routing
SMART_LLM_MODEL=openai/gpt-4o # Complex tasks
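The `FAST_LLM_MODEL` and `SMART_LLM_MODEL` values above appear to use a `provider/model` format. Assuming that reading is correct, a loader for a few of these settings might look like the sketch below. `ModelRef`, `parse_model_ref`, and `load_settings` are hypothetical, and the defaults simply mirror the sample values rather than any documented Kite defaults.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class ModelRef:
    provider: str
    model: str


def parse_model_ref(value: str) -> ModelRef:
    """Split 'provider/model' at the first slash (assumed format)."""
    provider, sep, model = value.strip().partition("/")
    if not (sep and provider and model):
        raise ValueError(f"expected 'provider/model', got {value!r}")
    return ModelRef(provider, model)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Read a subset of the settings, falling back to the sample values."""
    return {
        "failure_threshold": int(env.get("CIRCUIT_BREAKER_FAILURE_THRESHOLD", "3")),
        "timeout_seconds": int(env.get("CIRCUIT_BREAKER_TIMEOUT_SECONDS", "60")),
        "idempotency_ttl": int(env.get("IDEMPOTENCY_TTL", "3600")),
        "fast_model": parse_model_ref(env.get("FAST_LLM_MODEL", "groq/llama-3.1-8b-instant")),
        "smart_model": parse_model_ref(env.get("SMART_LLM_MODEL", "openai/gpt-4o")),
    }


settings = load_settings({
    "CIRCUIT_BREAKER_FAILURE_THRESHOLD": "5",
    "SMART_LLM_MODEL": "openai/gpt-4o",
})
print(settings)
```

Parsing and validating these values once at startup surfaces a malformed model string immediately instead of at the first routed request.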
Documentation
Guides
- Quick Start Guide - Get running in 5 minutes
- Architecture Deep Dive - How Kite works internally
- API Reference - Complete API docs
- Deployment Guide - Docker, scaling, monitoring
- Safety Patterns - Circuit breakers, guardrails, idempotency
- Memory Systems - Vector, Graph RAG, sessions
Examples
All examples include detailed inline comments and step-by-step walkthroughs:
- E-commerce Support - Multi-agent routing
- Enterprise Analytics - SQL + Python
- Research Assistant - Web research
- Multi-Agent Workflow - Supervisor pattern
- DevOps Automation - Safe shell execution
- Reasoning Patterns - ReAct/ReWOO/ToT
Testing
# Run all tests
pytest tests/
# Specific suites
pytest tests/test_framework.py # Core functionality
pytest tests/test_async_concurrency.py # Async patterns
pytest tests/test_exports.py # Module exports
# With coverage
pytest --cov=kite tests/
Contributing
We welcome contributions! See CONTRIBUTING.md for:
- Bug reports & feature requests
- Documentation improvements
- New reasoning patterns
- Additional LLM integrations
- Performance optimizations
Priority areas:
- More agent architectures (LATS, Reflexion)
- Streaming response support
- Multi-agent orchestration patterns
- Integration tests for all examples
Roadmap
- v0.1.0: Core framework, ReAct/ReWOO/ToT agents
- v0.2.0: Streaming responses, async batch processing
- v0.3.0: Multi-agent coordination primitives
- v0.4.0: Fine-tuning integration
- v1.0.0: Production-ready release with full test coverage
License
MIT License - see LICENSE for details.
TLDR: Use it however you want. Commercial use welcome. No warranty.
Acknowledgments
Built with amazing open-source tools:
- Ollama - Local LLM runtime
- FastEmbed - Lightning-fast embeddings
- FAISS - Facebook's vector search
- ChromaDB - Vector database
- LangChain - Inspiration for tool abstractions
Community & Support
- Bug Reports: GitHub Issues
- Feature Requests: GitHub Discussions
- Contact: thien@beevr.ai
Stop building demos. Start shipping AI agents to production.
Star this repo if Kite helps you build better AI systems!
Made with ❤️ by developers who ship production AI