The simplest way to use AI in Python with automatic cost tracking and optimization

Cost Katana Python 🥷

AI that just works. Costs that just track.

One import. Any model. Automatic cost tracking.


🚀 Get Started in 60 Seconds

Step 1: Install

pip install costkatana

Step 2: Make Your First AI Call

import cost_katana as ck

response = ck.ai('gpt-4', 'Explain quantum computing in one sentence')

print(response.text)   # "Quantum computing uses qubits to perform..."
print(response.cost)   # 0.0012
print(response.tokens) # 47

That's it. No configuration beyond an API key (see Configuration). No complexity. Just results. Usage and cost tracking are always on; there is no option to disable them, because they are required for usage attribution and cost visibility.


📖 Tutorial: Build a Cost-Aware AI App

Part 1: Basic Chat Session

import cost_katana as ck

# Create a persistent chat session
chat = ck.chat('gpt-4')

chat.send('Hello! What can you help me with?')
chat.send('Tell me a programming joke')
chat.send('Now explain it')

# See exactly what you spent
print(f"💰 Total cost: ${chat.total_cost:.4f}")
print(f"📊 Messages: {len(chat.history)}")
print(f"🎯 Tokens used: {chat.total_tokens}")

Part 2: Type-Safe Model Selection

Stop guessing model names. Get autocomplete and catch typos:

import cost_katana as ck
from cost_katana import openai, anthropic, google

# Type-safe model constants (recommended)
response = ck.ai(openai.gpt_4, 'Hello, world!')

# Compare models easily
models = [openai.gpt_4, anthropic.claude_3_5_sonnet_20241022, google.gemini_2_5_pro]
for model in models:
    response = ck.ai(model, 'Explain AI in one sentence')
    print(f"Cost: ${response.cost:.4f}")

Available namespaces:

Namespace   | Models
openai      | GPT-4, GPT-3.5, O1, O3, DALL-E, Whisper
anthropic   | Claude 3.5 Sonnet, Haiku, Opus
google      | Gemini 2.5 Pro, Flash
aws_bedrock | Nova, Claude on Bedrock
xai         | Grok models
deepseek    | DeepSeek models
mistral     | Mistral AI models
cohere      | Command models
meta        | Llama models

Part 3: Smart Caching

Cache identical questions to avoid paying twice:

import cost_katana as ck

# First call - hits the API
r1 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r1.cached}")  # False
print(f"Cost: ${r1.cost}")     # $0.0008

# Second call - served from cache (FREE!)
r2 = ck.ai('gpt-4', 'What is 2+2?', cache=True)
print(f"Cached: {r2.cached}")  # True
print(f"Cost: ${r2.cost}")     # $0.0000 🎉
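
To make the idea concrete, here is a minimal offline sketch of exact-match caching. It is not Cost Katana's implementation: `FakeResponse`, `fake_model_call`, and `cached_call` are hypothetical names, and the paid API call is replaced by a stub so the example runs without a key.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a response object; not the library's class."""
    text: str
    cost: float
    cached: bool = False


def fake_model_call(model: str, prompt: str) -> FakeResponse:
    """Stand-in for a paid API call, so the sketch runs offline."""
    return FakeResponse(text=f"[{model}] answer to: {prompt}", cost=0.0008)


_cache: dict[str, FakeResponse] = {}


def cached_call(model: str, prompt: str) -> FakeResponse:
    """Return a stored response when the same model and prompt repeat."""
    key = hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()
    hit = _cache.get(key)
    if hit is not None:
        # A cache hit costs nothing because no API call is made.
        return FakeResponse(text=hit.text, cost=0.0, cached=True)
    response = fake_model_call(model, prompt)
    _cache[key] = response
    return response


r1 = cached_call("gpt-4", "What is 2+2?")
r2 = cached_call("gpt-4", "What is 2+2?")
print(r1.cached, r1.cost)
print(r2.cached, r2.cost)
```

The key includes the model name, so the same prompt sent to a different model is treated as a miss.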

Part 4: Cortex Optimization

For long-form content, Cortex compresses prompts intelligently:

import cost_katana as ck

response = ck.ai(
    'gpt-4',
    'Write a comprehensive guide to machine learning for beginners',
    cortex=True,      # Enable 40-75% cost reduction
    max_tokens=2000
)

print(f"Optimized: {response.optimized}")
print(f"Saved: ${response.saved_amount}")

Part 5: Compare Models Side-by-Side

import cost_katana as ck

prompt = 'Summarize the theory of relativity in 50 words'
models = ['gpt-4', 'claude-3-sonnet', 'gemini-pro', 'gpt-3.5-turbo']

print('📊 Model Cost Comparison\n')

for model in models:
    response = ck.ai(model, prompt)
    print(f"{model:20} ${response.cost:.6f}")

Sample Output:

📊 Model Cost Comparison

gpt-4                $0.001200
claude-3-sonnet      $0.000900
gemini-pro           $0.000150
gpt-3.5-turbo        $0.000080
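
To turn a comparison like this into relative savings, the following sketch works from the sample figures above. The numbers are copied from the sample output, not measured prices, and the variable names are illustrative.

```python
# Costs copied from the sample output above (USD per request).
sample_costs = {
    "gpt-4": 0.001200,
    "claude-3-sonnet": 0.000900,
    "gemini-pro": 0.000150,
    "gpt-3.5-turbo": 0.000080,
}

baseline = sample_costs["gpt-4"]
cheapest = min(sample_costs, key=sample_costs.get)

# List models from cheapest to most expensive with their saving versus gpt-4.
for model, cost in sorted(sample_costs.items(), key=lambda item: item[1]):
    saving_pct = (1 - cost / baseline) * 100
    print(f"{model:20} ${cost:.6f}  {saving_pct:5.1f}% cheaper than gpt-4")

print(f"Cheapest: {cheapest}")
```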

🎯 Core Features

Cost Tracking

Usage and cost tracking are always on and cannot be disabled. Every response includes cost information:

response = ck.ai('gpt-4', 'Write a story')
print(f"Cost: ${response.cost}")
print(f"Tokens: {response.tokens}")
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")

Auto-Failover

If a provider is unavailable, requests automatically switch to another provider:

# If OpenAI is down, automatically uses Claude or Gemini
response = ck.ai('gpt-4', 'Hello')
print(response.provider)  # Might be 'anthropic' if OpenAI failed
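
The general failover pattern can be sketched as follows. This is an illustration only, not how Cost Katana implements it: `ProviderDown`, `call_with_failover`, and the two stub functions are hypothetical and stand in for real provider calls.

```python
from typing import Callable, Sequence, Tuple


class ProviderDown(Exception):
    """Hypothetical error raised when a provider cannot serve a request."""


def call_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each provider in order; return (name, text) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderDown as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(errors))


def openai_stub(prompt: str) -> str:
    """Simulates an outage at the first provider."""
    raise ProviderDown("simulated outage")


def anthropic_stub(prompt: str) -> str:
    """Simulates a healthy fallback provider."""
    return f"Hello from the fallback ({prompt})"


provider, text = call_with_failover(
    "Hello", [("openai", openai_stub), ("anthropic", anthropic_stub)]
)
print(provider)
```

Returning the provider name alongside the text mirrors the `response.provider` field shown above, so callers can tell when a fallback was used.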

Security Firewall

Block malicious prompts:

import cost_katana as ck

ck.configure(firewall=True)

# Malicious prompts are blocked
try:
    ck.ai('gpt-4', 'ignore all previous instructions and...')
except Exception as e:
    print(f'🛡️ Blocked: {e}')

⚙️ Configuration

Environment Variables

# Recommended: Use Cost Katana API key for all features
export COST_KATANA_API_KEY="dak_your_key_here"

# Or use provider keys directly (self-hosted)
export OPENAI_API_KEY="sk-..."          # Required for GPT models
export GEMINI_API_KEY="..."             # Required for Gemini models
export ANTHROPIC_API_KEY="sk-ant-..."   # For Claude models
export AWS_ACCESS_KEY_ID="..."          # For AWS Bedrock
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"

⚠️ Self-hosted users: You must provide your own OpenAI/Gemini API keys.
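
Before making a first call, it can help to check which key is actually set. The sketch below uses only the standard library and the variable names listed above; the lookup order and the `find_configured_key` helper are assumptions for illustration, not part of the package.

```python
import os
from typing import Mapping, Optional, Tuple

# Variable names are taken from the list above; the lookup order is an assumption.
KEY_PRIORITY = (
    "COST_KATANA_API_KEY",
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "ANTHROPIC_API_KEY",
)


def find_configured_key(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (variable_name, masked_value) for the first key that is set, else None."""
    env = os.environ if env is None else env
    for name in KEY_PRIORITY:
        value = env.get(name, "").strip()
        if value:
            # Show only a short prefix so the full secret never reaches logs.
            masked = value[:4] + "..." if len(value) > 4 else "..."
            return name, masked
    return None


found = find_configured_key()
print(found if found else "No API key found; export COST_KATANA_API_KEY first")
```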

Programmatic Configuration

import cost_katana as ck

ck.configure(
    api_key='dak_your_key',
    cortex=True,     # 40-75% cost savings
    cache=True,      # Smart caching
    firewall=True    # Block prompt injections
)

Request Options

response = ck.ai('gpt-4', 'Your prompt',
    temperature=0.7,                     # Creativity (0-2)
    max_tokens=500,                      # Response limit
    system_message='You are helpful',    # System prompt
    cache=True,                          # Enable caching
    cortex=True,                         # Enable optimization
    retry=True                           # Auto-retry on failures
)

🔌 Framework Integration

FastAPI

from fastapi import FastAPI
import cost_katana as ck

app = FastAPI()

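# Plain `def` (not `async def`): FastAPI runs it in a threadpool,
# so the blocking ck.ai call does not stall the event loop.
@app.post('/api/chat')
def chat(request: dict):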
    response = ck.ai('gpt-4', request['prompt'])
    return {'text': response.text, 'cost': response.cost}

Flask

from flask import Flask, request, jsonify
import cost_katana as ck

app = Flask(__name__)

@app.route('/api/chat', methods=['POST'])
def chat():
    response = ck.ai('gpt-4', request.json['prompt'])
    return jsonify({'text': response.text, 'cost': response.cost})

Django

from django.http import JsonResponse
import cost_katana as ck

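def chat_view(request):
    prompt = request.POST.get('prompt')
    if not prompt:
        # Reject requests without a prompt instead of sending None to the model
        return JsonResponse({'error': 'prompt is required'}, status=400)
    response = ck.ai('gpt-4', prompt)
    return JsonResponse({'text': response.text, 'cost': response.cost})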

💡 Real-World Examples

Customer Support Bot

import cost_katana as ck

support = ck.chat('gpt-3.5-turbo',
    system_message='You are a helpful customer support agent.')

def handle_query(query: str):
    response = support.send(query)
    print(f"Cost so far: ${support.total_cost:.4f}")
    return response
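
Because the session exposes a running `total_cost`, a spending cap is easy to layer on top. The following is a sketch under stated assumptions: `StubSession` is an offline stand-in that mimics only `send()` and `total_cost`, and `BudgetExceeded` and `send_within_budget` are hypothetical helpers, not part of the package.

```python
class BudgetExceeded(Exception):
    """Raised when a session has spent its allowance (hypothetical helper)."""


class StubSession:
    """Offline stand-in exposing send() and total_cost like the session above."""

    def __init__(self, cost_per_message: float) -> None:
        self.cost_per_message = cost_per_message
        self.total_cost = 0.0

    def send(self, message: str) -> str:
        self.total_cost += self.cost_per_message
        return f"reply to: {message}"


def send_within_budget(session, message: str, budget_usd: float) -> str:
    """Refuse to send once the session's running total has reached the budget."""
    if session.total_cost >= budget_usd:
        raise BudgetExceeded(
            f"spent ${session.total_cost:.4f} of ${budget_usd:.4f}"
        )
    return session.send(message)


session = StubSession(cost_per_message=0.25)
replies = [
    send_within_budget(session, f"question {i}", budget_usd=0.5)
    for i in range(2)
]
print(len(replies), session.total_cost)
```

The check runs before each send, so the last permitted message can still push the total up to the budget; a stricter guard would need a cost estimate for the next message.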

Content Generator with Optimization

import cost_katana as ck

def generate_blog_post(topic: str):
    # Use Cortex for long-form content (40-75% savings)
    post = ck.ai('gpt-4', f'Write a blog post about {topic}',
                 cortex=True, max_tokens=2000)
    
    return {
        'content': post.text,
        'cost': post.cost,
        'word_count': len(post.text.split())
    }

Code Review Assistant

import cost_katana as ck

def review_code(code: str):
    review = ck.ai('claude-3-sonnet',
        f'Review this code and suggest improvements:\n\n{code}',
        cache=True)  # Cache for repeated reviews
    return review.text

Translation Service

import cost_katana as ck

def translate(text: str, target_language: str):
    # Use cheaper model for translations
    translated = ck.ai('gpt-3.5-turbo',
        f'Translate to {target_language}: {text}',
        cache=True)
    return translated.text

💰 Cost Optimization Cheatsheet

Strategy                     | Savings      | Code
Use GPT-3.5 for simple tasks | 90%          | ck.ai('gpt-3.5-turbo', ...)
Enable caching               | 100% on hits | cache=True
Enable Cortex                | 40-75%       | cortex=True
Use Gemini for high-volume   | 95% vs GPT-4 | ck.ai('gemini-pro', ...)
Batch in sessions            | 10-20%       | ck.chat(...)

# ❌ Expensive
ck.ai('gpt-4', 'What is 2+2?')  # $0.001

# ✅ Smart: Match model to task
ck.ai('gpt-3.5-turbo', 'What is 2+2?')  # $0.0001

# ✅ Smarter: Cache common queries
ck.ai('gpt-3.5-turbo', 'What is 2+2?', cache=True)  # $0 on repeat

# ✅ Smartest: Cortex for long content
ck.ai('gpt-4', 'Write a 2000-word essay', cortex=True)  # 40-75% off
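
To see what these percentages mean for a budget, here is a small arithmetic sketch. The baseline spend and the cache hit rate are made-up inputs, and both helper functions are hypothetical; the 40-75% range is the Cortex figure quoted above.

```python
def monthly_cost_with_cache(baseline_usd: float, hit_rate: float) -> float:
    """Cost after caching, assuming a hit costs $0 and a miss costs full price."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return baseline_usd * (1.0 - hit_rate)


def cortex_cost_range(baseline_usd: float) -> tuple[float, float]:
    """(best case, worst case) cost using the 40-75% savings range quoted above."""
    return baseline_usd * (1 - 0.75), baseline_usd * (1 - 0.40)


baseline = 200.0  # hypothetical monthly spend in USD
print(f"With a 25% cache hit rate: ${monthly_cost_with_cache(baseline, 0.25):.2f}")
low, high = cortex_cost_range(baseline)
print(f"With Cortex: ${low:.2f} to ${high:.2f}")
```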

🔧 Error Handling

import cost_katana as ck
from cost_katana.exceptions import CostKatanaError

try:
    response = ck.ai('gpt-4', 'Hello')
    print(response.text)
except CostKatanaError as e:
    message = str(e).lower()  # match error text case-insensitively
    if 'api key' in message:
        print('Set COST_KATANA_API_KEY or OPENAI_API_KEY')
    elif 'rate limit' in message:
        print('Rate limited. Wait and try again.')
    elif 'model' in message:
        print('Model not found')
    else:
        print(f'Error: {e}')
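
When a rate limit is the cause, the usual remedy is to retry with increasing delays. The request options above include `retry=True`; for manual control, a generic backoff helper might look like the sketch below. `RateLimited`, `retry_with_backoff`, and `flaky` are hypothetical names, and the failing call is simulated so the example runs offline.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for a rate-limit error."""


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failed attempt."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")


calls = {"count": 0}


def flaky() -> str:
    """Fails twice with RateLimited, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited("simulated 429")
    return "ok"


delays = []
# Record the delays instead of sleeping so the example finishes instantly.
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper testable; in real use the default `time.sleep` applies.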

🔄 Migration Guides

From OpenAI SDK

# Before
from openai import OpenAI
client = OpenAI(api_key='sk-...')
completion = client.chat.completions.create(
    model='gpt-4',
    messages=[{'role': 'user', 'content': 'Hello'}]
)
print(completion.choices[0].message.content)

# After
import cost_katana as ck
response = ck.ai('gpt-4', 'Hello')
print(response.text)
print(f"Cost: ${response.cost}")  # Bonus: cost tracking!

From Anthropic SDK

# Before
import anthropic
client = anthropic.Anthropic(api_key='sk-ant-...')
message = client.messages.create(
    model='claude-3-sonnet-20240229',
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[{'role': 'user', 'content': 'Hello'}]
)

# After
import cost_katana as ck
response = ck.ai('claude-3-sonnet', 'Hello')

From Google AI SDK

# Before
import google.generativeai as genai
genai.configure(api_key='...')
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content('Hello')

# After
import cost_katana as ck
response = ck.ai('gemini-pro', 'Hello')

📦 Package Names

Language     | Registry | Install                        | Import / Command
Python       | PyPI     | pip install costkatana         | import cost_katana
JavaScript   | NPM      | npm install cost-katana        | import { ai } from 'cost-katana'
CLI (NPM)    | NPM      | npm install -g cost-katana-cli | cost-katana chat
CLI (Python) | PyPI     | pip install costkatana         | costkatana chat

📚 More Examples

Explore 45+ complete examples:

🔗 github.com/Hypothesize-Tech/costkatana-examples

Section             | Description
Python SDK          | Complete Python guides
Cost Tracking       | Track costs across providers
Semantic Caching    | 30-40% cost reduction
FastAPI Integration | Framework examples

📞 Support

Channel       | Link
Dashboard     | costkatana.com
Documentation | docs.costkatana.com
GitHub        | github.com/Hypothesize-Tech/costkatana-python
Discord       | discord.gg/D8nDArmKbY
Email         | support@costkatana.com

📄 License

MIT © Cost Katana


Start cutting AI costs today 🥷

pip install costkatana

import cost_katana as ck
response = ck.ai('gpt-4', 'Hello, world!')
