Unified AI interface with cost optimization and failover
Project description
Cost Katana Python SDK
A simple, unified interface for AI models with built-in cost optimization, failover, and analytics. Use any AI provider through one consistent API - no need to manage API keys or worry about provider-specific implementations!
🚀 Quick Start
Installation
pip install cost-katana
Get Your API Key
- Visit Cost Katana Dashboard
- Create an account or sign in
- Go to API Keys section
- Generate a new API key (starts with
dak_)
Basic Usage
import cost_katana as ck
# Configure once with your API key
ck.configure(api_key='dak_your_key_here')
# Use any AI model with the same simple interface
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
print(f"Cost: ${response.usage_metadata.cost:.4f}")
Chat Sessions
import cost_katana as ck
ck.configure(api_key='dak_your_key_here')
# Start a conversation
model = ck.GenerativeModel('claude-3-sonnet')
chat = model.start_chat()
# Send messages back and forth
response1 = chat.send_message("Hello! What's your name?")
print("AI:", response1.text)
response2 = chat.send_message("Can you help me write a Python function?")
print("AI:", response2.text)
# Get total conversation cost
total_cost = sum(msg.get('metadata', {}).get('cost', 0) for msg in chat.history)
print(f"Total conversation cost: ${total_cost:.4f}")
🎯 Why Cost Katana?
Simple Interface, Powerful Backend
- One API for all providers: Use Google Gemini, Anthropic Claude, OpenAI GPT, AWS Bedrock models through one interface
- No API key juggling: Store your provider keys securely in Cost Katana, use one key in your code
- Automatic failover: If one provider is down, automatically switch to alternatives
- Cost optimization: Intelligent routing to minimize costs while maintaining quality
Enterprise Features
- Cost tracking: Real-time cost monitoring and budgets
- Usage analytics: Detailed insights into model performance and usage patterns
- Team management: Share projects and manage API usage across teams
- Approval workflows: Set spending limits with approval requirements
📚 Configuration Options
Using Configuration File (Recommended)
Create config.json:
{
"api_key": "dak_your_key_here",
"default_model": "gemini-2.0-flash",
"default_temperature": 0.7,
"cost_limit_per_day": 50.0,
"enable_optimization": true,
"enable_failover": true,
"model_mappings": {
"gemini": "gemini-2.0-flash-exp",
"claude": "anthropic.claude-3-sonnet-20240229-v1:0",
"gpt4": "gpt-4-turbo-preview"
},
"providers": {
"google": {
"priority": 1,
"models": ["gemini-2.0-flash", "gemini-pro"]
},
"anthropic": {
"priority": 2,
"models": ["claude-3-sonnet", "claude-3-haiku"]
}
}
}
import cost_katana as ck
# Configure from file
ck.configure(config_file='config.json')
# Now use any model
model = ck.GenerativeModel('gemini') # Uses mapping from config
Environment Variables
export API_KEY=dak_your_key_here
export COST_KATANA_DEFAULT_MODEL=claude-3-sonnet
import cost_katana as ck
# Automatically loads from environment
ck.configure()
model = ck.GenerativeModel() # Uses default model from env
🤖 Supported Models
Amazon Nova Models (Primary Recommendation)
nova-micro- Ultra-fast and cost-effective for simple tasksnova-lite- Balanced performance and cost for general usenova-pro- High-performance model for complex tasks
Anthropic Claude Models
claude-3-haiku- Fast and cost-effective responsesclaude-3-sonnet- Balanced performance for complex tasksclaude-3-opus- Most capable Claude model for advanced reasoningclaude-3.5-haiku- Latest fast model with enhanced capabilitiesclaude-3.5-sonnet- Advanced reasoning and analysis
Meta Llama Models
llama-3.1-8b- Good balance of performance and efficiencyllama-3.1-70b- Large model for complex reasoningllama-3.1-405b- Most capable Llama modelllama-3.2-1b- Compact and efficientllama-3.2-3b- Efficient for general tasks
Mistral Models
mistral-7b- Efficient open-source modelmixtral-8x7b- High-quality mixture of expertsmistral-large- Advanced reasoning capabilities
Cohere Models
command- General purpose text generationcommand-light- Lighter, faster versioncommand-r- Retrieval-augmented generationcommand-r-plus- Enhanced RAG with better reasoning
Friendly Aliases
fast→ Nova Micro (optimized for speed)balanced→ Nova Lite (balanced cost/performance)powerful→ Nova Pro (maximum capabilities)
⚙️ Advanced Usage
Generation Configuration
from cost_katana import GenerativeModel, GenerationConfig
config = GenerationConfig(
temperature=0.3,
max_output_tokens=1000,
top_p=0.9
)
model = GenerativeModel('claude-3-sonnet', generation_config=config)
response = model.generate_content("Write a haiku about programming")
Multi-Agent Processing
# Enable multi-agent processing for complex queries
model = GenerativeModel('gemini-2.0-flash')
response = model.generate_content(
"Analyze the economic impact of AI on job markets",
use_multi_agent=True,
chat_mode='balanced'
)
# See which agents were involved
print("Agent path:", response.usage_metadata.agent_path)
print("Optimizations applied:", response.usage_metadata.optimizations_applied)
Cost Optimization Modes
# Different optimization strategies
fast_response = model.generate_content(
"Quick summary of today's news",
chat_mode='fastest' # Prioritize speed
)
cheap_response = model.generate_content(
"Detailed analysis of market trends",
chat_mode='cheapest' # Prioritize cost
)
balanced_response = model.generate_content(
"Help me debug this Python code",
chat_mode='balanced' # Balance speed and cost
)
🖥️ Command Line Interface
Cost Katana includes a CLI for easy interaction:
# Initialize configuration
cost-katana init
# Test your setup
cost-katana test
# List available models
cost-katana models
# Start interactive chat
cost-katana chat --model gemini-2.0-flash
# Use specific config file
cost-katana chat --config my-config.json
📊 Usage Analytics
Track your AI usage and costs:
import cost_katana as ck
ck.configure(config_file='config.json')
model = ck.GenerativeModel('claude-3-sonnet')
response = model.generate_content("Explain machine learning")
# Detailed usage information
metadata = response.usage_metadata
print(f"Model used: {metadata.model}")
print(f"Cost: ${metadata.cost:.4f}")
print(f"Latency: {metadata.latency:.2f}s")
print(f"Tokens: {metadata.total_tokens}")
print(f"Cache hit: {metadata.cache_hit}")
print(f"Risk level: {metadata.risk_level}")
🔧 Error Handling
from cost_katana import GenerativeModel
from cost_katana.exceptions import (
CostLimitExceededError,
ModelNotAvailableError,
RateLimitError
)
try:
model = GenerativeModel('expensive-model')
response = model.generate_content("Complex analysis task")
except CostLimitExceededError:
print("Cost limit reached! Check your budget settings.")
except ModelNotAvailableError:
print("Model is currently unavailable. Trying fallback...")
model = GenerativeModel('backup-model')
response = model.generate_content("Complex analysis task")
except RateLimitError:
print("Rate limit hit. Please wait before retrying.")
🌟 Comparison with Direct Provider SDKs
Before (Google Gemini)
import google.generativeai as genai
# Need to manage API key
genai.configure(api_key="your-google-api-key")
# Provider-specific code
model = genai.GenerativeModel('gemini-2.0-flash')
response = model.generate_content("Hello")
# No cost tracking, no failover, provider lock-in
After (Cost Katana)
import cost_katana as ck
# One API key for all providers
ck.configure(api_key='dak_your_key_here')
# Same interface, any provider
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Hello")
# Built-in cost tracking, failover, optimization
print(f"Cost: ${response.usage_metadata.cost:.4f}")
🏢 Enterprise Features
- Team Management: Share configurations across team members
- Cost Centers: Track usage by project or department
- Approval Workflows: Require approval for high-cost operations
- Analytics Dashboard: Web interface for usage insights
- Custom Models: Support for fine-tuned and custom models
- SLA Monitoring: Track model availability and performance
🔒 Security & Privacy
- Secure Key Storage: API keys encrypted at rest
- No Data Retention: Your prompts and responses are not stored
- Audit Logs: Complete audit trail of API usage
- GDPR Compliant: Full compliance with data protection regulations
📖 API Reference
GenerativeModel
class GenerativeModel:
def __init__(self, model_name: str, generation_config: GenerationConfig = None)
def generate_content(self, prompt: str, **kwargs) -> GenerateContentResponse
def start_chat(self, history: List = None) -> ChatSession
def count_tokens(self, prompt: str) -> Dict[str, int]
ChatSession
class ChatSession:
def send_message(self, message: str, **kwargs) -> GenerateContentResponse
def get_history(self) -> List[Dict]
def clear_history(self) -> None
def delete_conversation(self) -> None
GenerateContentResponse
class GenerateContentResponse:
text: str # Generated text
usage_metadata: UsageMetadata # Cost, tokens, latency info
thinking: Dict # AI reasoning (if available)
🤝 Support
- Documentation: docs.costkatana.com
- Discord Community: discord.gg/costkatana
- Email Support: abdul@hypothesize.tech
- GitHub Issues: github.com/cost-katana/python-sdk
- GitHub Repository: github.com/Hypothesize-Tech/cost-katana-python
📄 License
MIT License - see LICENSE for details.
Ready to optimize your AI costs? Get started at costkatana.com 🚀# cost-katana-python
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cost_katana-1.0.2.tar.gz.
File metadata
- Download URL: cost_katana-1.0.2.tar.gz
- Upload date:
- Size: 44.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
aa315b5a02cf783dfd54f4ecee5df8bdee346762c63baad799e0fb35a48be493
|
|
| MD5 |
88aab9dac8f608b30ccaa3139741feea
|
|
| BLAKE2b-256 |
ab73d6762c33e4a1e853df8c984be820c018de5401228e33178360ea69217e11
|
File details
Details for the file cost_katana-1.0.2-py3-none-any.whl.
File metadata
- Download URL: cost_katana-1.0.2-py3-none-any.whl
- Upload date:
- Size: 23.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
40f40933e3b4ae4d52a2f423642553be9e85e9ff92b9d82df018a7aee0899cbd
|
|
| MD5 |
6e62689b2fbe3b5ff11b5b698392fb1f
|
|
| BLAKE2b-256 |
28ebcb336d82c306c3f45ea55c494ead15a44ec7627776214aeec1b3316b4b8e
|