Skip to main content

Unified AI interface with cost optimization and failover

Project description

Cost Katana Python SDK

A simple, unified interface for AI models with built-in cost optimization, failover, and analytics. Use any AI provider through one consistent API - no need to manage API keys or worry about provider-specific implementations!

🚀 Quick Start

Installation

pip install cost-katana

Get Your API Key

  1. Visit Cost Katana Dashboard
  2. Create an account or sign in
  3. Go to API Keys section
  4. Generate a new API key (starts with dak_)

Basic Usage

import cost_katana as ck

# Configure once with your API key
ck.configure(api_key='dak_your_key_here')

# Use any AI model with the same simple interface
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
print(f"Cost: ${response.usage_metadata.cost:.4f}")

Chat Sessions

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')

# Start a conversation
model = ck.GenerativeModel('claude-3-sonnet')
chat = model.start_chat()

# Send messages back and forth
response1 = chat.send_message("Hello! What's your name?")
print("AI:", response1.text)

response2 = chat.send_message("Can you help me write a Python function?")
print("AI:", response2.text)

# Get total conversation cost
total_cost = sum(msg.get('metadata', {}).get('cost', 0) for msg in chat.history)
print(f"Total conversation cost: ${total_cost:.4f}")

🎯 Why Cost Katana?

Simple Interface, Powerful Backend

  • One API for all providers: Use Google Gemini, Anthropic Claude, OpenAI GPT, AWS Bedrock models through one interface
  • No API key juggling: Store your provider keys securely in Cost Katana, use one key in your code
  • Automatic failover: If one provider is down, automatically switch to alternatives
  • Cost optimization: Intelligent routing to minimize costs while maintaining quality

Enterprise Features

  • Cost tracking: Real-time cost monitoring and budgets
  • Usage analytics: Detailed insights into model performance and usage patterns
  • Team management: Share projects and manage API usage across teams
  • Approval workflows: Set spending limits with approval requirements

📚 Configuration Options

Using Configuration File (Recommended)

Create config.json:

{
  "api_key": "dak_your_key_here",
  "default_model": "gemini-2.0-flash",
  "default_temperature": 0.7,
  "cost_limit_per_day": 50.0,
  "enable_optimization": true,
  "enable_failover": true,
  "model_mappings": {
    "gemini": "gemini-2.0-flash-exp",
    "claude": "anthropic.claude-3-sonnet-20240229-v1:0",
    "gpt4": "gpt-4-turbo-preview"
  },
  "providers": {
    "google": {
      "priority": 1,
      "models": ["gemini-2.0-flash", "gemini-pro"]
    },
    "anthropic": {
      "priority": 2, 
      "models": ["claude-3-sonnet", "claude-3-haiku"]
    }
  }
}
import cost_katana as ck

# Configure from file
ck.configure(config_file='config.json')

# Now use any model
model = ck.GenerativeModel('gemini')  # Uses mapping from config

Environment Variables

export API_KEY=dak_your_key_here
export COST_KATANA_DEFAULT_MODEL=claude-3-sonnet
import cost_katana as ck

# Automatically loads from environment
ck.configure()

model = ck.GenerativeModel()  # Uses default model from env

🤖 Supported Models

Amazon Nova Models (Primary Recommendation)

  • nova-micro - Ultra-fast and cost-effective for simple tasks
  • nova-lite - Balanced performance and cost for general use
  • nova-pro - High-performance model for complex tasks

Anthropic Claude Models

  • claude-3-haiku - Fast and cost-effective responses
  • claude-3-sonnet - Balanced performance for complex tasks
  • claude-3-opus - Most capable Claude model for advanced reasoning
  • claude-3.5-haiku - Latest fast model with enhanced capabilities
  • claude-3.5-sonnet - Advanced reasoning and analysis

Meta Llama Models

  • llama-3.1-8b - Good balance of performance and efficiency
  • llama-3.1-70b - Large model for complex reasoning
  • llama-3.1-405b - Most capable Llama model
  • llama-3.2-1b - Compact and efficient
  • llama-3.2-3b - Efficient for general tasks

Mistral Models

  • mistral-7b - Efficient open-source model
  • mixtral-8x7b - High-quality mixture of experts
  • mistral-large - Advanced reasoning capabilities

Cohere Models

  • command - General purpose text generation
  • command-light - Lighter, faster version
  • command-r - Retrieval-augmented generation
  • command-r-plus - Enhanced RAG with better reasoning

Friendly Aliases

  • fast → Nova Micro (optimized for speed)
  • balanced → Nova Lite (balanced cost/performance)
  • powerful → Nova Pro (maximum capabilities)

⚙️ Advanced Usage

Generation Configuration

from cost_katana import GenerativeModel, GenerationConfig

config = GenerationConfig(
    temperature=0.3,
    max_output_tokens=1000,
    top_p=0.9
)

model = GenerativeModel('claude-3-sonnet', generation_config=config)
response = model.generate_content("Write a haiku about programming")

Multi-Agent Processing

# Enable multi-agent processing for complex queries
model = GenerativeModel('gemini-2.0-flash')
response = model.generate_content(
    "Analyze the economic impact of AI on job markets",
    use_multi_agent=True,
    chat_mode='balanced'
)

# See which agents were involved
print("Agent path:", response.usage_metadata.agent_path)
print("Optimizations applied:", response.usage_metadata.optimizations_applied)

Cost Optimization Modes

# Different optimization strategies
fast_response = model.generate_content(
    "Quick summary of today's news",
    chat_mode='fastest'  # Prioritize speed
)

cheap_response = model.generate_content(
    "Detailed analysis of market trends", 
    chat_mode='cheapest'  # Prioritize cost
)

balanced_response = model.generate_content(
    "Help me debug this Python code",
    chat_mode='balanced'  # Balance speed and cost
)

🖥️ Command Line Interface

Cost Katana includes a CLI for easy interaction:

# Initialize configuration
cost-katana init

# Test your setup
cost-katana test

# List available models
cost-katana models

# Start interactive chat
cost-katana chat --model gemini-2.0-flash

# Use specific config file
cost-katana chat --config my-config.json

📊 Usage Analytics

Track your AI usage and costs:

import cost_katana as ck

ck.configure(config_file='config.json')

model = ck.GenerativeModel('claude-3-sonnet')
response = model.generate_content("Explain machine learning")

# Detailed usage information
metadata = response.usage_metadata
print(f"Model used: {metadata.model}")
print(f"Cost: ${metadata.cost:.4f}")
print(f"Latency: {metadata.latency:.2f}s")
print(f"Tokens: {metadata.total_tokens}")
print(f"Cache hit: {metadata.cache_hit}")
print(f"Risk level: {metadata.risk_level}")

🔧 Error Handling

from cost_katana import GenerativeModel
from cost_katana.exceptions import (
    CostLimitExceededError,
    ModelNotAvailableError,
    RateLimitError
)

try:
    model = GenerativeModel('expensive-model')
    response = model.generate_content("Complex analysis task")
    
except CostLimitExceededError:
    print("Cost limit reached! Check your budget settings.")
    
except ModelNotAvailableError:
    print("Model is currently unavailable. Trying fallback...")
    model = GenerativeModel('backup-model')
    response = model.generate_content("Complex analysis task")
    
except RateLimitError:
    print("Rate limit hit. Please wait before retrying.")

🌟 Comparison with Direct Provider SDKs

Before (Google Gemini)

import google.generativeai as genai

# Need to manage API key
genai.configure(api_key="your-google-api-key")

# Provider-specific code
model = genai.GenerativeModel('gemini-2.0-flash')
response = model.generate_content("Hello")

# No cost tracking, no failover, provider lock-in

After (Cost Katana)

import cost_katana as ck

# One API key for all providers
ck.configure(api_key='dak_your_key_here')

# Same interface, any provider
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Hello")

# Built-in cost tracking, failover, optimization
print(f"Cost: ${response.usage_metadata.cost:.4f}")

🏢 Enterprise Features

  • Team Management: Share configurations across team members
  • Cost Centers: Track usage by project or department
  • Approval Workflows: Require approval for high-cost operations
  • Analytics Dashboard: Web interface for usage insights
  • Custom Models: Support for fine-tuned and custom models
  • SLA Monitoring: Track model availability and performance

🔒 Security & Privacy

  • Secure Key Storage: API keys encrypted at rest
  • No Data Retention: Your prompts and responses are not stored
  • Audit Logs: Complete audit trail of API usage
  • GDPR Compliant: Full compliance with data protection regulations

📖 API Reference

GenerativeModel

class GenerativeModel:
    def __init__(self, model_name: str, generation_config: GenerationConfig = None)
    def generate_content(self, prompt: str, **kwargs) -> GenerateContentResponse
    def start_chat(self, history: List = None) -> ChatSession
    def count_tokens(self, prompt: str) -> Dict[str, int]

ChatSession

class ChatSession:
    def send_message(self, message: str, **kwargs) -> GenerateContentResponse
    def get_history(self) -> List[Dict]
    def clear_history(self) -> None
    def delete_conversation(self) -> None

GenerateContentResponse

class GenerateContentResponse:
    text: str                           # Generated text
    usage_metadata: UsageMetadata       # Cost, tokens, latency info
    thinking: Dict                      # AI reasoning (if available)

🤝 Support

📄 License

MIT License - see LICENSE for details.


Ready to optimize your AI costs? Get started at costkatana.com 🚀# cost-katana-python

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cost_katana-1.0.2.tar.gz (44.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cost_katana-1.0.2-py3-none-any.whl (23.7 kB view details)

Uploaded Python 3

File details

Details for the file cost_katana-1.0.2.tar.gz.

File metadata

  • Download URL: cost_katana-1.0.2.tar.gz
  • Upload date:
  • Size: 44.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.2.tar.gz
Algorithm Hash digest
SHA256 aa315b5a02cf783dfd54f4ecee5df8bdee346762c63baad799e0fb35a48be493
MD5 88aab9dac8f608b30ccaa3139741feea
BLAKE2b-256 ab73d6762c33e4a1e853df8c984be820c018de5401228e33178360ea69217e11

See more details on using hashes here.

File details

Details for the file cost_katana-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: cost_katana-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 23.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 40f40933e3b4ae4d52a2f423642553be9e85e9ff92b9d82df018a7aee0899cbd
MD5 6e62689b2fbe3b5ff11b5b698392fb1f
BLAKE2b-256 28ebcb336d82c306c3f45ea55c494ead15a44ec7627776214aeec1b3316b4b8e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page