Unified AI interface with cost optimization and failover

These details have not been verified by PyPI

Project links

Project description

Cost Katana Python SDK

A simple, unified interface for AI models with built-in cost optimization, failover, and analytics. Use any AI provider through one consistent API - no need to manage API keys or worry about provider-specific implementations!

🚀 Quick Start

Installation

pip install cost-katana

Get Your API Key

Visit Cost Katana Dashboard
Create an account or sign in
Go to API Keys section
Generate a new API key (starts with dak_)

Basic Usage

import cost_katana as ck

# Configure once with your API key
ck.configure(api_key='dak_your_key_here')

# Use any AI model with the same simple interface
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
print(f"Cost: ${response.usage_metadata.cost:.4f}")

Chat Sessions

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')

# Start a conversation
model = ck.GenerativeModel('claude-3-sonnet')
chat = model.start_chat()

# Send messages back and forth
response1 = chat.send_message("Hello! What's your name?")
print("AI:", response1.text)

response2 = chat.send_message("Can you help me write a Python function?")
print("AI:", response2.text)

# Get total conversation cost
total_cost = sum(msg.get('metadata', {}).get('cost', 0) for msg in chat.history)
print(f"Total conversation cost: ${total_cost:.4f}")

🎯 Why Cost Katana?

Simple Interface, Powerful Backend

One API for all providers: Use Google Gemini, Anthropic Claude, OpenAI GPT, AWS Bedrock models through one interface
No API key juggling: Store your provider keys securely in Cost Katana, use one key in your code
Automatic failover: If one provider is down, automatically switch to alternatives
Cost optimization: Intelligent routing to minimize costs while maintaining quality

Enterprise Features

Cost tracking: Real-time cost monitoring and budgets
Usage analytics: Detailed insights into model performance and usage patterns
Team management: Share projects and manage API usage across teams
Approval workflows: Set spending limits with approval requirements

📚 Configuration Options

Using Configuration File (Recommended)

Create config.json:

{
  "api_key": "dak_your_key_here",
  "default_model": "gemini-2.0-flash",
  "default_temperature": 0.7,
  "cost_limit_per_day": 50.0,
  "enable_optimization": true,
  "enable_failover": true,
  "model_mappings": {
    "gemini": "gemini-2.0-flash-exp",
    "claude": "anthropic.claude-3-sonnet-20240229-v1:0",
    "gpt4": "gpt-4-turbo-preview"
  },
  "providers": {
    "google": {
      "priority": 1,
      "models": ["gemini-2.0-flash", "gemini-pro"]
    },
    "anthropic": {
      "priority": 2, 
      "models": ["claude-3-sonnet", "claude-3-haiku"]
    }
  }
}

import cost_katana as ck

# Configure from file
ck.configure(config_file='config.json')

# Now use any model
model = ck.GenerativeModel('gemini')  # Uses mapping from config

Environment Variables

export API_KEY=dak_your_key_here
export COST_KATANA_DEFAULT_MODEL=claude-3-sonnet

import cost_katana as ck

# Automatically loads from environment
ck.configure()

model = ck.GenerativeModel()  # Uses default model from env

🤖 Supported Models

Amazon Nova Models (Primary Recommendation)

nova-micro - Ultra-fast and cost-effective for simple tasks
nova-lite - Balanced performance and cost for general use
nova-pro - High-performance model for complex tasks

Anthropic Claude Models

claude-3-haiku - Fast and cost-effective responses
claude-3-sonnet - Balanced performance for complex tasks
claude-3-opus - Most capable Claude model for advanced reasoning
claude-3.5-haiku - Latest fast model with enhanced capabilities
claude-3.5-sonnet - Advanced reasoning and analysis

Meta Llama Models

llama-3.1-8b - Good balance of performance and efficiency
llama-3.1-70b - Large model for complex reasoning
llama-3.1-405b - Most capable Llama model
llama-3.2-1b - Compact and efficient
llama-3.2-3b - Efficient for general tasks

Mistral Models

mistral-7b - Efficient open-source model
mixtral-8x7b - High-quality mixture of experts
mistral-large - Advanced reasoning capabilities

Cohere Models

command - General purpose text generation
command-light - Lighter, faster version
command-r - Retrieval-augmented generation
command-r-plus - Enhanced RAG with better reasoning

Friendly Aliases

fast → Nova Micro (optimized for speed)
balanced → Nova Lite (balanced cost/performance)
powerful → Nova Pro (maximum capabilities)

⚙️ Advanced Usage

Generation Configuration

from cost_katana import GenerativeModel, GenerationConfig

config = GenerationConfig(
    temperature=0.3,
    max_output_tokens=1000,
    top_p=0.9
)

model = GenerativeModel('claude-3-sonnet', generation_config=config)
response = model.generate_content("Write a haiku about programming")

Multi-Agent Processing

# Enable multi-agent processing for complex queries
model = GenerativeModel('gemini-2.0-flash')
response = model.generate_content(
    "Analyze the economic impact of AI on job markets",
    use_multi_agent=True,
    chat_mode='balanced'
)

# See which agents were involved
print("Agent path:", response.usage_metadata.agent_path)
print("Optimizations applied:", response.usage_metadata.optimizations_applied)

Cost Optimization Modes

# Different optimization strategies
fast_response = model.generate_content(
    "Quick summary of today's news",
    chat_mode='fastest'  # Prioritize speed
)

cheap_response = model.generate_content(
    "Detailed analysis of market trends", 
    chat_mode='cheapest'  # Prioritize cost
)

balanced_response = model.generate_content(
    "Help me debug this Python code",
    chat_mode='balanced'  # Balance speed and cost
)

🖥️ Command Line Interface

Cost Katana includes a CLI for easy interaction:

# Initialize configuration
cost-katana init

# Test your setup
cost-katana test

# List available models
cost-katana models

# Start interactive chat
cost-katana chat --model gemini-2.0-flash

# Use specific config file
cost-katana chat --config my-config.json

📊 Usage Analytics

Track your AI usage and costs:

import cost_katana as ck

ck.configure(config_file='config.json')

model = ck.GenerativeModel('claude-3-sonnet')
response = model.generate_content("Explain machine learning")

# Detailed usage information
metadata = response.usage_metadata
print(f"Model used: {metadata.model}")
print(f"Cost: ${metadata.cost:.4f}")
print(f"Latency: {metadata.latency:.2f}s")
print(f"Tokens: {metadata.total_tokens}")
print(f"Cache hit: {metadata.cache_hit}")
print(f"Risk level: {metadata.risk_level}")

🔧 Error Handling

from cost_katana import GenerativeModel
from cost_katana.exceptions import (
    CostLimitExceededError,
    ModelNotAvailableError,
    RateLimitError
)

try:
    model = GenerativeModel('expensive-model')
    response = model.generate_content("Complex analysis task")
    
except CostLimitExceededError:
    print("Cost limit reached! Check your budget settings.")
    
except ModelNotAvailableError:
    print("Model is currently unavailable. Trying fallback...")
    model = GenerativeModel('backup-model')
    response = model.generate_content("Complex analysis task")
    
except RateLimitError:
    print("Rate limit hit. Please wait before retrying.")

🌟 Comparison with Direct Provider SDKs

Before (Google Gemini)

import google.generativeai as genai

# Need to manage API key
genai.configure(api_key="your-google-api-key")

# Provider-specific code
model = genai.GenerativeModel('gemini-2.0-flash')
response = model.generate_content("Hello")

# No cost tracking, no failover, provider lock-in

After (Cost Katana)

import cost_katana as ck

# One API key for all providers
ck.configure(api_key='dak_your_key_here')

# Same interface, any provider
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Hello")

# Built-in cost tracking, failover, optimization
print(f"Cost: ${response.usage_metadata.cost:.4f}")

🏢 Enterprise Features

Team Management: Share configurations across team members
Cost Centers: Track usage by project or department
Approval Workflows: Require approval for high-cost operations
Analytics Dashboard: Web interface for usage insights
Custom Models: Support for fine-tuned and custom models
SLA Monitoring: Track model availability and performance

🔒 Security & Privacy

Secure Key Storage: API keys encrypted at rest
No Data Retention: Your prompts and responses are not stored
Audit Logs: Complete audit trail of API usage
GDPR Compliant: Full compliance with data protection regulations

📖 API Reference

GenerativeModel

class GenerativeModel:
    def __init__(self, model_name: str, generation_config: GenerationConfig = None)
    def generate_content(self, prompt: str, **kwargs) -> GenerateContentResponse
    def start_chat(self, history: List = None) -> ChatSession
    def count_tokens(self, prompt: str) -> Dict[str, int]

ChatSession

class ChatSession:
    def send_message(self, message: str, **kwargs) -> GenerateContentResponse
    def get_history(self) -> List[Dict]
    def clear_history(self) -> None
    def delete_conversation(self) -> None

GenerateContentResponse

class GenerateContentResponse:
    text: str                           # Generated text
    usage_metadata: UsageMetadata       # Cost, tokens, latency info
    thinking: Dict                      # AI reasoning (if available)

🤝 Support

Documentation: docs.costkatana.com
Discord Community: discord.gg/costkatana
Email Support: abdul@hypothesize.tech
GitHub Issues: github.com/cost-katana/python-sdk
GitHub Repository: github.com/Hypothesize-Tech/cost-katana-python

📄 License

MIT License - see LICENSE for details.

Ready to optimize your AI costs? Get started at costkatana.com 🚀# cost-katana-python

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.5.7

Apr 29, 2026

2.5.5

Apr 29, 2026

2.5.4

Apr 21, 2026

2.5.3

Mar 29, 2026

2.5.2

Mar 27, 2026

2.5.1

Mar 25, 2026

2.5.0

Mar 25, 2026

2.4.0

Feb 21, 2026

2.2.6

Jan 31, 2026

2.2.5

Jan 17, 2026

2.2.4

Dec 2, 2025

2.2.3

Nov 26, 2025

2.2.2

Nov 24, 2025

2.2.1

Nov 19, 2025

2.2.0

Nov 19, 2025

2.1.0

Nov 17, 2025

2.0.8

Nov 15, 2025

2.0.7

Oct 23, 2025

2.0.6

Oct 23, 2025

2.0.5

Oct 23, 2025

2.0.4

Oct 12, 2025

2.0.2

Oct 9, 2025

2.0.1

Oct 2, 2025

2.0.0

Sep 11, 2025

1.0.3

Sep 8, 2025

This version

1.0.2

Sep 8, 2025

1.0.1

Aug 4, 2025

1.0.0

Aug 4, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cost_katana-1.0.2.tar.gz (44.4 kB view details)

Uploaded Sep 8, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

cost_katana-1.0.2-py3-none-any.whl (23.7 kB view details)

Uploaded Sep 8, 2025 Python 3

File details

Details for the file cost_katana-1.0.2.tar.gz.

File metadata

Download URL: cost_katana-1.0.2.tar.gz
Upload date: Sep 8, 2025
Size: 44.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`aa315b5a02cf783dfd54f4ecee5df8bdee346762c63baad799e0fb35a48be493`
MD5	`88aab9dac8f608b30ccaa3139741feea`
BLAKE2b-256	`ab73d6762c33e4a1e853df8c984be820c018de5401228e33178360ea69217e11`

See more details on using hashes here.

File details

Details for the file cost_katana-1.0.2-py3-none-any.whl.

File metadata

Download URL: cost_katana-1.0.2-py3-none-any.whl
Upload date: Sep 8, 2025
Size: 23.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`40f40933e3b4ae4d52a2f423642553be9e85e9ff92b9d82df018a7aee0899cbd`
MD5	`6e62689b2fbe3b5ff11b5b698392fb1f`
BLAKE2b-256	`28ebcb336d82c306c3f45ea55c494ead15a44ec7627776214aeec1b3316b4b8e`

See more details on using hashes here.

cost-katana 1.0.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Cost Katana Python SDK

🚀 Quick Start

Installation

Get Your API Key

Basic Usage

Chat Sessions

🎯 Why Cost Katana?

Simple Interface, Powerful Backend

Enterprise Features

📚 Configuration Options

Using Configuration File (Recommended)

Environment Variables

🤖 Supported Models

Amazon Nova Models (Primary Recommendation)

Anthropic Claude Models

Meta Llama Models

Mistral Models

Cohere Models

Friendly Aliases

⚙️ Advanced Usage

Generation Configuration

Multi-Agent Processing

Cost Optimization Modes

🖥️ Command Line Interface

📊 Usage Analytics

🔧 Error Handling

🌟 Comparison with Direct Provider SDKs

Before (Google Gemini)

After (Cost Katana)

🏢 Enterprise Features

🔒 Security & Privacy

📖 API Reference

GenerativeModel

ChatSession

GenerateContentResponse

🤝 Support

📄 License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes