Unified AI interface with cost optimization and failover

These details have not been verified by PyPI

Project links

Project description

Cost Katana Python SDK

A simple, unified interface for AI models with built-in cost optimization, failover, and analytics. Use any AI provider through one consistent API - no need to manage API keys or worry about provider-specific implementations!

🚀 Quick Start

Installation

pip install cost-katana

Get Your API Key

Visit Cost Katana Dashboard
Create an account or sign in
Go to API Keys section
Generate a new API key (starts with dak_)

Basic Usage

import cost_katana as ck

# Configure once with your API key
ck.configure(api_key='dak_your_key_here')

# Use any AI model with the same simple interface
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Explain quantum computing in simple terms")
print(response.text)
print(f"Cost: ${response.usage_metadata.cost:.4f}")

Chat Sessions

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')

# Start a conversation
model = ck.GenerativeModel('claude-3-sonnet')
chat = model.start_chat()

# Send messages back and forth
response1 = chat.send_message("Hello! What's your name?")
print("AI:", response1.text)

response2 = chat.send_message("Can you help me write a Python function?")
print("AI:", response2.text)

# Get total conversation cost
total_cost = sum(msg.get('metadata', {}).get('cost', 0) for msg in chat.history)
print(f"Total conversation cost: ${total_cost:.4f}")

🎯 Why Cost Katana?

Simple Interface, Powerful Backend

One API for all providers: Use Google Gemini, Anthropic Claude, OpenAI GPT, AWS Bedrock models through one interface
No API key juggling: Store your provider keys securely in Cost Katana, use one key in your code
Automatic failover: If one provider is down, automatically switch to alternatives
Cost optimization: Intelligent routing to minimize costs while maintaining quality

Enterprise Features

Cost tracking: Real-time cost monitoring and budgets
Usage analytics: Detailed insights into model performance and usage patterns
Team management: Share projects and manage API usage across teams
Approval workflows: Set spending limits with approval requirements

📚 Configuration Options

Using Configuration File (Recommended)

Create config.json:

{
  "api_key": "dak_your_key_here",
  "default_model": "gemini-2.0-flash",
  "default_temperature": 0.7,
  "cost_limit_per_day": 50.0,
  "enable_optimization": true,
  "enable_failover": true,
  "model_mappings": {
    "gemini": "gemini-2.0-flash-exp",
    "claude": "anthropic.claude-3-sonnet-20240229-v1:0",
    "gpt4": "gpt-4-turbo-preview"
  },
  "providers": {
    "google": {
      "priority": 1,
      "models": ["gemini-2.0-flash", "gemini-pro"]
    },
    "anthropic": {
      "priority": 2, 
      "models": ["claude-3-sonnet", "claude-3-haiku"]
    }
  }
}

import cost_katana as ck

# Configure from file
ck.configure(config_file='config.json')

# Now use any model
model = ck.GenerativeModel('gemini')  # Uses mapping from config

Environment Variables

export API_KEY=dak_your_key_here
export COST_KATANA_DEFAULT_MODEL=claude-3-sonnet

import cost_katana as ck

# Automatically loads from environment
ck.configure()

model = ck.GenerativeModel()  # Uses default model from env

🤖 Supported Models

Amazon Nova Models (Primary Recommendation)

nova-micro - Ultra-fast and cost-effective for simple tasks
nova-lite - Balanced performance and cost for general use
nova-pro - High-performance model for complex tasks

Anthropic Claude Models

claude-3-haiku - Fast and cost-effective responses
claude-3-sonnet - Balanced performance for complex tasks
claude-3-opus - Most capable Claude model for advanced reasoning
claude-3.5-haiku - Latest fast model with enhanced capabilities
claude-3.5-sonnet - Advanced reasoning and analysis

Meta Llama Models

llama-3.1-8b - Good balance of performance and efficiency
llama-3.1-70b - Large model for complex reasoning
llama-3.1-405b - Most capable Llama model
llama-3.2-1b - Compact and efficient
llama-3.2-3b - Efficient for general tasks

Mistral Models

mistral-7b - Efficient open-source model
mixtral-8x7b - High-quality mixture of experts
mistral-large - Advanced reasoning capabilities

Cohere Models

command - General purpose text generation
command-light - Lighter, faster version
command-r - Retrieval-augmented generation
command-r-plus - Enhanced RAG with better reasoning

Friendly Aliases

fast → Nova Micro (optimized for speed)
balanced → Nova Lite (balanced cost/performance)
powerful → Nova Pro (maximum capabilities)

⚙️ Advanced Usage

Generation Configuration

from cost_katana import GenerativeModel, GenerationConfig

config = GenerationConfig(
    temperature=0.3,
    max_output_tokens=1000,
    top_p=0.9
)

model = GenerativeModel('claude-3-sonnet', generation_config=config)
response = model.generate_content("Write a haiku about programming")

Multi-Agent Processing

# Enable multi-agent processing for complex queries
model = GenerativeModel('gemini-2.0-flash')
response = model.generate_content(
    "Analyze the economic impact of AI on job markets",
    use_multi_agent=True,
    chat_mode='balanced'
)

# See which agents were involved
print("Agent path:", response.usage_metadata.agent_path)
print("Optimizations applied:", response.usage_metadata.optimizations_applied)

Cost Optimization Modes

# Different optimization strategies
fast_response = model.generate_content(
    "Quick summary of today's news",
    chat_mode='fastest'  # Prioritize speed
)

cheap_response = model.generate_content(
    "Detailed analysis of market trends", 
    chat_mode='cheapest'  # Prioritize cost
)

balanced_response = model.generate_content(
    "Help me debug this Python code",
    chat_mode='balanced'  # Balance speed and cost
)

🖥️ Command Line Interface

Cost Katana includes a comprehensive CLI for easy interaction:

# Initialize configuration
cost-katana init

# Test your setup
cost-katana test

# List available models
cost-katana models

# Start interactive chat
cost-katana chat --model gemini-2.0-flash

# Use specific config file
cost-katana chat --config my-config.json

🧬 SAST (Semantic Abstract Syntax Tree) Features

Cost Katana includes advanced SAST capabilities for semantic optimization and analysis:

SAST Optimization

# Optimize a prompt using SAST
cost-katana sast optimize "Write a detailed analysis of market trends"

# Optimize from file
cost-katana sast optimize --file prompt.txt --output optimized.txt

# Cross-lingual optimization
cost-katana sast optimize "Analyze data" --cross-lingual --language en

# Preserve ambiguity for analysis
cost-katana sast optimize "Complex query" --preserve-ambiguity

SAST Comparison

# Compare traditional vs SAST optimization
cost-katana sast compare "Your prompt here"

# Compare with specific language
cost-katana sast compare --file prompt.txt --language en

SAST Vocabulary & Analytics

# Explore SAST vocabulary
cost-katana sast vocabulary

# Search semantic primitives
cost-katana sast vocabulary --search "analysis" --category "action"

# Get SAST performance statistics
cost-katana sast stats

# View SAST showcase with examples
cost-katana sast showcase

# Telescope ambiguity demonstration
cost-katana sast telescope

# Test universal semantics across languages
cost-katana sast universal "concept" --languages "en,es,fr"

SAST Python API

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')
client = ck.CostKatanaClient()

# Optimize with SAST
result = client.optimize_with_sast(
    prompt="Your prompt here",
    language="en",
    cross_lingual=True,
    preserve_ambiguity=False
)

# Compare SAST vs traditional
comparison = client.compare_sast_vs_traditional(
    prompt="Your prompt here",
    language="en"
)

# Get SAST vocabulary stats
stats = client.get_sast_vocabulary_stats()

# Search semantic primitives
primitives = client.search_semantic_primitives(
    term="analysis",
    category="action",
    limit=10
)

# Test universal semantics
universal_test = client.test_universal_semantics(
    concept="love",
    languages=["en", "es", "fr"]
)

🧠 Cortex Engine Features

Cost Katana's Cortex engine provides intelligent processing capabilities:

Cortex Operations

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')
client = ck.CostKatanaClient()

# Enable Cortex with SAST processing
result = client.optimize_with_sast(
    prompt="Your prompt",
    service="openai",
    model="gpt-4o-mini",
    # Cortex features
    enableCortex=True,
    cortexOperation="sast",
    cortexStyle="conversational",
    cortexFormat="plain",
    cortexSemanticCache=True,
    cortexPreserveSemantics=True,
    cortexIntelligentRouting=True,
    cortexSastProcessing=True,
    cortexAmbiguityResolution=True,
    cortexCrossLingualMode=False
)

Cortex Capabilities

Semantic Caching: Intelligent caching of semantic representations
Intelligent Routing: Smart routing based on content analysis
Ambiguity Resolution: Automatic resolution of ambiguous language
Cross-lingual Processing: Multi-language semantic understanding
Semantic Preservation: Maintains semantic meaning during optimization

🌐 Gateway Features

Cost Katana acts as a unified gateway to multiple AI providers:

Provider Abstraction

import cost_katana as ck

ck.configure(api_key='dak_your_key_here')

# Same interface, different providers
models = [
    'nova-lite',           # Amazon Nova
    'claude-3-sonnet',     # Anthropic Claude
    'gemini-2.0-flash',    # Google Gemini
    'gpt-4',               # OpenAI GPT
    'llama-3.1-70b'        # Meta Llama
]

for model in models:
    response = ck.GenerativeModel(model).generate_content("Hello!")
    print(f"{model}: {response.text[:50]}...")

Intelligent Routing

# Cost Katana automatically routes to the best provider
model = ck.GenerativeModel('balanced')  # Uses intelligent routing

# Different optimization modes
fast_response = model.generate_content(
    "Quick summary",
    chat_mode='fastest'    # Routes to fastest provider
)

cheap_response = model.generate_content(
    "Detailed analysis",
    chat_mode='cheapest'   # Routes to most cost-effective provider
)

balanced_response = model.generate_content(
    "Complex reasoning",
    chat_mode='balanced'   # Balances speed and cost
)

Failover & Redundancy

# Automatic failover if primary provider is down
model = ck.GenerativeModel('claude-3-sonnet')

try:
    response = model.generate_content("Your prompt")
except ck.ModelNotAvailableError:
    # Cost Katana automatically tries alternative providers
    print("Primary model unavailable, using fallback...")
    response = model.generate_content("Your prompt")

📊 Usage Analytics

Track your AI usage and costs:

import cost_katana as ck

ck.configure(config_file='config.json')

model = ck.GenerativeModel('claude-3-sonnet')
response = model.generate_content("Explain machine learning")

# Detailed usage information
metadata = response.usage_metadata
print(f"Model used: {metadata.model}")
print(f"Cost: ${metadata.cost:.4f}")
print(f"Latency: {metadata.latency:.2f}s")
print(f"Tokens: {metadata.total_tokens}")
print(f"Cache hit: {metadata.cache_hit}")
print(f"Risk level: {metadata.risk_level}")

🔧 Error Handling

from cost_katana import GenerativeModel
from cost_katana.exceptions import (
    CostLimitExceededError,
    ModelNotAvailableError,
    RateLimitError
)

try:
    model = GenerativeModel('expensive-model')
    response = model.generate_content("Complex analysis task")
    
except CostLimitExceededError:
    print("Cost limit reached! Check your budget settings.")
    
except ModelNotAvailableError:
    print("Model is currently unavailable. Trying fallback...")
    model = GenerativeModel('backup-model')
    response = model.generate_content("Complex analysis task")
    
except RateLimitError:
    print("Rate limit hit. Please wait before retrying.")

🌟 Comparison with Direct Provider SDKs

Before (Google Gemini)

import google.generativeai as genai

# Need to manage API key
genai.configure(api_key="your-google-api-key")

# Provider-specific code
model = genai.GenerativeModel('gemini-2.0-flash')
response = model.generate_content("Hello")

# No cost tracking, no failover, provider lock-in

After (Cost Katana)

import cost_katana as ck

# One API key for all providers
ck.configure(api_key='dak_your_key_here')

# Same interface, any provider
model = ck.GenerativeModel('nova-lite')
response = model.generate_content("Hello")

# Built-in cost tracking, failover, optimization
print(f"Cost: ${response.usage_metadata.cost:.4f}")

🏢 Enterprise Features

Team Management: Share configurations across team members
Cost Centers: Track usage by project or department
Approval Workflows: Require approval for high-cost operations
Analytics Dashboard: Web interface for usage insights
Custom Models: Support for fine-tuned and custom models
SLA Monitoring: Track model availability and performance

🔒 Security & Privacy

Secure Key Storage: API keys encrypted at rest
No Data Retention: Your prompts and responses are not stored
Audit Logs: Complete audit trail of API usage
GDPR Compliant: Full compliance with data protection regulations

📖 API Reference

GenerativeModel

class GenerativeModel:
    def __init__(self, model_name: str, generation_config: GenerationConfig = None)
    def generate_content(self, prompt: str, **kwargs) -> GenerateContentResponse
    def start_chat(self, history: List = None) -> ChatSession
    def count_tokens(self, prompt: str) -> Dict[str, int]

ChatSession

class ChatSession:
    def send_message(self, message: str, **kwargs) -> GenerateContentResponse
    def get_history(self) -> List[Dict]
    def clear_history(self) -> None
    def delete_conversation(self) -> None

CostKatanaClient

class CostKatanaClient:
    def __init__(self, api_key: str = None, base_url: str = None, config_file: str = None)
    
    # Core Methods
    def send_message(self, message: str, model_id: str, **kwargs) -> Dict[str, Any]
    def get_available_models(self) -> List[Dict[str, Any]]
    def create_conversation(self, title: str = None, model_id: str = None) -> Dict[str, Any]
    def get_conversation_history(self, conversation_id: str) -> Dict[str, Any]
    def delete_conversation(self, conversation_id: str) -> Dict[str, Any]
    
    # SAST Methods
    def optimize_with_sast(self, prompt: str, **kwargs) -> Dict[str, Any]
    def compare_sast_vs_traditional(self, prompt: str, **kwargs) -> Dict[str, Any]
    def get_sast_vocabulary_stats(self) -> Dict[str, Any]
    def search_semantic_primitives(self, term: str = None, **kwargs) -> Dict[str, Any]
    def get_telescope_demo(self) -> Dict[str, Any]
    def test_universal_semantics(self, concept: str, languages: List[str] = None) -> Dict[str, Any]
    def get_sast_stats(self) -> Dict[str, Any]
    def get_sast_showcase(self) -> Dict[str, Any]

GenerateContentResponse

class GenerateContentResponse:
    text: str                           # Generated text
    usage_metadata: UsageMetadata       # Cost, tokens, latency info
    thinking: Dict                      # AI reasoning (if available)

UsageMetadata

class UsageMetadata:
    model: str                          # Model used
    cost: float                         # Cost in USD
    latency: float                      # Response time in seconds
    total_tokens: int                   # Total tokens used
    cache_hit: bool                     # Whether response was cached
    risk_level: str                     # Risk assessment level
    agent_path: List[str]               # Multi-agent processing path
    optimizations_applied: List[str]    # Applied optimizations

🤝 Support

Documentation: docs.costkatana.com
Discord Community: discord.gg/costkatana
Email Support: abdul@hypothesize.tech
GitHub Issues: github.com/cost-katana/python-sdk
GitHub Repository: github.com/Hypothesize-Tech/cost-katana-python

📄 License

MIT License - see LICENSE for details.

Ready to optimize your AI costs? Get started at costkatana.com 🚀# cost-katana-python

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.5.7

Apr 29, 2026

2.5.5

Apr 29, 2026

2.5.4

Apr 21, 2026

2.5.3

Mar 29, 2026

2.5.2

Mar 27, 2026

2.5.1

Mar 25, 2026

2.5.0

Mar 25, 2026

2.4.0

Feb 21, 2026

2.2.6

Jan 31, 2026

2.2.5

Jan 17, 2026

2.2.4

Dec 2, 2025

2.2.3

Nov 26, 2025

2.2.2

Nov 24, 2025

2.2.1

Nov 19, 2025

2.2.0

Nov 19, 2025

2.1.0

Nov 17, 2025

2.0.8

Nov 15, 2025

2.0.7

Oct 23, 2025

2.0.6

Oct 23, 2025

2.0.5

Oct 23, 2025

2.0.4

Oct 12, 2025

2.0.2

Oct 9, 2025

2.0.1

Oct 2, 2025

2.0.0

Sep 11, 2025

This version

1.0.3

Sep 8, 2025

1.0.2

Sep 8, 2025

1.0.1

Aug 4, 2025

1.0.0

Aug 4, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cost_katana-1.0.3.tar.gz (47.5 kB view details)

Uploaded Sep 8, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

cost_katana-1.0.3-py3-none-any.whl (25.3 kB view details)

Uploaded Sep 8, 2025 Python 3

File details

Details for the file cost_katana-1.0.3.tar.gz.

File metadata

Download URL: cost_katana-1.0.3.tar.gz
Upload date: Sep 8, 2025
Size: 47.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`611bd94ab3a4424540da8268f3f87d77dd352c15a680c90476a993cd0e8c92b4`
MD5	`42ad9e3ce900c3c69f4201b08333e8e8`
BLAKE2b-256	`ef95bb9e3a3626c3f6dc0a669c80ac0f19f3cf515d37295bad994b747e488371`

See more details on using hashes here.

File details

Details for the file cost_katana-1.0.3-py3-none-any.whl.

File metadata

Download URL: cost_katana-1.0.3-py3-none-any.whl
Upload date: Sep 8, 2025
Size: 25.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cost_katana-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fc4c8061c59897576bfa35f07e7ad5521e669695a4c70bee699e8c3c081158a1`
MD5	`e4cd2ec538bc106f144095bc15631548`
BLAKE2b-256	`2dcbd5bd16f2aae941ae8aff9bb1638eb430cc87349384048dd58c5e38c1ea1b`

See more details on using hashes here.

cost-katana 1.0.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Cost Katana Python SDK

🚀 Quick Start

Installation

Get Your API Key

Basic Usage

Chat Sessions

🎯 Why Cost Katana?

Simple Interface, Powerful Backend

Enterprise Features

📚 Configuration Options

Using Configuration File (Recommended)

Environment Variables

🤖 Supported Models

Amazon Nova Models (Primary Recommendation)

Anthropic Claude Models

Meta Llama Models

Mistral Models

Cohere Models

Friendly Aliases

⚙️ Advanced Usage

Generation Configuration

Multi-Agent Processing

Cost Optimization Modes

🖥️ Command Line Interface

🧬 SAST (Semantic Abstract Syntax Tree) Features

SAST Optimization

SAST Comparison

SAST Vocabulary & Analytics

SAST Python API

🧠 Cortex Engine Features

Cortex Operations

Cortex Capabilities

🌐 Gateway Features

Provider Abstraction

Intelligent Routing

Failover & Redundancy

📊 Usage Analytics

🔧 Error Handling

🌟 Comparison with Direct Provider SDKs

Before (Google Gemini)

After (Cost Katana)

🏢 Enterprise Features

🔒 Security & Privacy

📖 API Reference

GenerativeModel

ChatSession

CostKatanaClient

GenerateContentResponse

UsageMetadata

🤝 Support

📄 License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes