
Measure and analyze Python logging costs


LogCost

Your cloud logging bill is $2,000/month. One debug statement in a hot path is responsible for $800 of it.

LogCost finds expensive log statements by tracking and aggregating logs at the source code level. Drop-in instrumentation (just import logcost) pinpoints which lines generate the most data, helping you cut cloud logging costs by 40-60% without guessing.

See exactly what's costing you:

src/memory_utils.py:338
  DEBUG: Processing step: %s
  315 GB  |  $157.50  |  1.2M calls

Instead of wondering where your logging costs go, LogCost shows the exact file:line, bytes logged, cost, and call count. Fix the top offenders and save hundreds monthly.

Features

  • Zero-config tracking - Monkey-patches logging and print to measure file/line, level, message template, call count, and bytes
  • Aggregation by location - Logs from the same file:line:level are aggregated together, regardless of message content. This means a loop logging 1000 times shows as one entry with count=1000, not 1000 separate entries
  • Thread-safe - Lock-protected tracking works across concurrent requests
  • Framework support - Examples for Flask, FastAPI, Django, Kubernetes
  • Export options - JSON, CSV, Prometheus, HTML reports
  • Cost analysis - Compute GCP/AWS/Azure cost estimates and identify anti-patterns
  • Performance - Low overhead design for production use
  • GCloud source attribution - Preserves actual source file:line in logs, not wrapper attribution
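The aggregation by file:line:level described in the list above can be sketched with the standard library alone. This is not LogCost's implementation (which patches logging directly); it is a minimal illustration, and the `LocationStats` handler and its `stats` layout are hypothetical names chosen for this example.

```python
import logging
import threading
from collections import defaultdict


class LocationStats(logging.Handler):
    """Hypothetical handler that aggregates records by (file, line, level)."""

    def __init__(self):
        super().__init__()
        self._stats_lock = threading.Lock()
        self.stats = defaultdict(lambda: {"count": 0, "bytes": 0, "template": ""})

    def emit(self, record):
        # The key ignores message content, so a loop collapses into one entry.
        key = (record.pathname, record.lineno, record.levelname)
        size = len(record.getMessage().encode("utf-8"))
        with self._stats_lock:
            entry = self.stats[key]
            entry["count"] += 1
            entry["bytes"] += size
            entry["template"] = str(record.msg)  # unformatted message template


logger = logging.getLogger("aggregation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo from printing to the console
handler = LocationStats()
logger.addHandler(handler)

for i in range(1000):
    logger.info("Processing step: %s", i)  # one call site, 1000 calls
```

After the loop, `handler.stats` holds a single entry with a count of 1000 rather than 1000 separate entries.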

Quick Start

pip install logcost

import logcost
import logging

logging.getLogger().setLevel(logging.INFO)
logging.info("Processing user %s", 123)

stats_file = logcost.export("/tmp/logcost_stats.json")
print("Exported to", stats_file)

Analyze the results:

python -m logcost.cli analyze /tmp/logcost_stats.json --provider gcp --top 5

Example Output:

Provider: GCP  Currency: USD
Total bytes: 900,000,000,000  Estimated cost: 450.00 USD

Top 5 cost drivers:
- src/memory_utils.py:338 [DEBUG] Processing step: %s... 157.5000 USD
- _trace.py:87 [INFO] connect_tcp.started host='api.github...' 112.5000 USD
- _base_client.py:452 [DEBUG] Request options: %s... 67.5000 USD
- streamable_http.py:385 [DEBUG] Sending client message: root=JSONRPCRequest... 67.5000 USD
- connectionpool.py:544 [DEBUG] %s://%s:%s "%s %s %s" %s %s... 45.0000 USD

Detected anti-patterns:
  * DEBUG level logs producing non-zero cost
  * High-frequency logs (>1000 calls) in hot paths
  * Large payload logging (>5KB per call)

Recommendations:
  * Remove DEBUG statement at src/memory_utils.py:338 - potential $157.50/month savings
  * Silence httpcore DEBUG logging - potential $112.50/month savings

Real-world example: A service logging 900 GB/month (30 GB/day on average) would see costs like these on GCP, where DEBUG statements and library tracing account for most of the bill.

Installation

From PyPI (Recommended)

pip install logcost

From Source

git clone https://github.com/ubermorgenland/LogCost.git
cd LogCost
pip install -e .

# Run tests
pytest tests/

# Install with development dependencies
pip install -e ".[dev]"

Usage

Basic Tracking

import logcost  # auto-installs tracker on import
import logging

logger = logging.getLogger(__name__)

order_id, user_id = 1001, 42  # sample values for illustration

# Your normal logging code
logger.info("Processing order %s", order_id)
logger.debug("User data for %s", user_id)
print("Debug output")  # print() is also tracked

# Export stats (automatically on exit, or manually)
stats_path = logcost.export("/tmp/logcost_stats.json")

Controlling External Library Logging

External libraries often emit DEBUG-level logs that can inflate logging costs without providing production value. Silence them selectively:

import logging
import logcost

# Silence external library debug logs (production best practice)
logging.getLogger("httpcore").setLevel(logging.WARNING)      # HTTP connection tracing
logging.getLogger("httpx").setLevel(logging.WARNING)         # HTTP client
logging.getLogger("anthropic").setLevel(logging.WARNING)     # Anthropic SDK
logging.getLogger("urllib3").setLevel(logging.WARNING)       # urllib3

# Your code's logs are still tracked; only the noise from dependencies is reduced
logger = logging.getLogger(__name__)
logger.info("Important app event")  # Still tracked by LogCost

Records from these libraries that still pass the configured level (WARNING and above) are tracked by LogCost as usual. Raising the level stops the noisy DEBUG and INFO records from being generated in the first place, so they never reach your logging backend.
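The effect of raising a logger's level can be checked with plain `logging`. The sketch below uses a hypothetical logger name (`noisy_library_demo`) and a counting handler; it does not involve LogCost itself.

```python
import logging


class CountingHandler(logging.Handler):
    """Counts every record that reaches it."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def emit(self, record):
        self.count += 1


noisy = logging.getLogger("noisy_library_demo")  # hypothetical library logger
noisy.propagate = False
counter = CountingHandler()
noisy.addHandler(counter)

noisy.setLevel(logging.DEBUG)
noisy.debug("connection trace")    # record is created and counted

noisy.setLevel(logging.WARNING)
noisy.debug("connection trace")    # dropped before a record is built
noisy.warning("retrying request")  # WARNING still passes

print(counter.count)  # prints 2
```

Because the second `debug` call is rejected by the level check, no record is constructed for it, which is why nothing downstream (handlers or trackers hooked into record creation) sees it.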

Skipping Helper Modules

If you wrap logging in helper utilities:

import logcost

# Ignore helper frames to attribute cost to original caller
logcost.ignore_module("myapp.logging_helpers")
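To illustrate the idea of skipping helper frames, here is a rough sketch of caller attribution that walks the call stack and ignores frames from listed modules. It is an assumption about how such a feature could work, not LogCost's actual code; `find_caller`, `IGNORED_PREFIXES`, and the dynamically built helper module exist only for this example.

```python
import sys
import types

IGNORED_PREFIXES = set()


def ignore_module(prefix):
    """Register a module prefix whose frames should be skipped."""
    IGNORED_PREFIXES.add(prefix)


def find_caller():
    """Return (module, function) of the first frame not in an ignored module."""
    frame = sys._getframe(1)
    while frame is not None:
        module = frame.f_globals.get("__name__", "")
        ignored = any(
            module == p or module.startswith(p + ".") for p in IGNORED_PREFIXES
        )
        if not ignored:
            return module, frame.f_code.co_name
        frame = frame.f_back
    return "<unknown>", "<unknown>"


# Build a stand-in for myapp/logging_helpers.py so the demo is one file.
helpers = types.ModuleType("myapp.logging_helpers")
helpers.find_caller = find_caller
exec("def log_event(message):\n    return find_caller()\n", helpers.__dict__)


def handle_request():
    return helpers.log_event("order received")


before = handle_request()  # attributed to the helper's log_event
ignore_module("myapp.logging_helpers")
after = handle_request()   # attributed to handle_request instead
```

Before the prefix is registered, the cost lands on the wrapper; afterwards it lands on the function that actually called the wrapper.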

Long-Running Services

For services that don't exit:

import signal
import logcost

def handle_sigusr1(signum, frame):
    logcost.export("/tmp/logcost_snapshot.json")

signal.signal(signal.SIGUSR1, handle_sigusr1)

Or use the CLI:

python -m logcost.cli capture /tmp/logcost_stats.json

Slack Notifications

Get proactive alerts about logging costs in your Slack channel:

Setup:

  1. Create a Slack Incoming Webhook:

    • Go to https://api.slack.com/messaging/webhooks
    • Click "Create your Slack app" → "Incoming Webhooks"
    • Activate and create a webhook for your channel
    • Copy the webhook URL (e.g., https://hooks.slack.com/services/T00.../B00.../XXX...)
  2. Configure environment variables:

export LOGCOST_SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
export LOGCOST_PROVIDER="gcp"  # or "aws", "azure"
export LOGCOST_NOTIFICATION_TOP_N="5"  # number of top logs to show

Usage:

Automatic notifications with periodic flush:

import logcost

# Start periodic flush - automatically sends Slack notifications
logcost.start_periodic_flush("/var/log/logcost/stats.json")
# Stats flushed every 5 minutes (LOGCOST_FLUSH_INTERVAL=300)
# Notifications sent every 1 hour (LOGCOST_NOTIFICATION_INTERVAL=3600, configurable)
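For readers curious what a periodic flush involves, the following is a minimal standard-library sketch, not the code behind `logcost.start_periodic_flush`. The names `write_snapshot` and `start_flush_thread` are hypothetical. It writes through a temporary file and renames it, so a reader such as a sidecar should never see a half-written file.

```python
import json
import os
import tempfile
import threading


def write_snapshot(stats, path):
    """Write stats as JSON via a temp file plus atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(stats, handle)
    # mkstemp creates the file as 0600; widen it if another user must read it.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    return path


def start_flush_thread(get_stats, path, interval_seconds=300.0):
    """Flush every interval_seconds until the returned event is set."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            write_snapshot(get_stats(), path)

    thread = threading.Thread(target=loop, daemon=True, name="stats-flush")
    thread.start()
    return stop, thread
```

Calling `stop.set()` ends the loop; a daemon thread is used so the flusher never blocks interpreter shutdown.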

Manual notification:

import logcost
from logcost import send_notification_if_configured

stats = logcost.get_stats()
send_notification_if_configured(stats)  # Uses LOGCOST_SLACK_WEBHOOK env var

Notification includes:

  • Total logging cost and volume
  • Top N most expensive log statements with file:line references
  • Anti-pattern warnings (DEBUG in production, high-frequency loops, large payloads)
  • Week-over-week trend (if available)

Example Slack Notification:

LogCost Report - GCP
Total: 900.00 GB ($450.00)
Log calls: 2,847,000
Trend: 📈 +12% from previous period

🔥 Top 5 Most Expensive Logs:
1. src/memory_utils.py:338 - $157.50 (315.00 GB, 1.2M calls)
   Processing step: %s...
2. _trace.py:87 - $112.50 (225.00 GB, 2.8M calls)
   connect_tcp.started host='api.github...'
3. _base_client.py:452 - $67.50 (135.00 GB, 8.4K calls)
   Request options: %s...
4. streamable_http.py:385 - $67.50 (135.00 GB, 850K calls)
   Sending client message: root=JSONRPCRequest...
5. connectionpool.py:544 - $45.00 (90.00 GB, 1.1M calls)
   %s://%s:%s "%s %s %s" %s %s...

⚠️  Warnings:
• DEBUG level logs producing non-zero cost
• High-frequency logs (>1000 calls) in hot paths
• Large payload logging (>5KB per call)

Total logs tracked: 45 unique locations | Analyzed with LogCost
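A report like the one above could be assembled and posted roughly as follows. This is a sketch under assumptions: the entry fields (`location`, `cost`, `gb`) and both function names are invented for illustration and are not LogCost's notification code. The only external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```python
import json
import os
import urllib.request


def build_payload(provider, total_gb, total_cost, top_entries):
    """Format a plain-text report as a Slack webhook payload."""
    lines = [
        f"LogCost Report - {provider.upper()}",
        f"Total: {total_gb:.2f} GB (${total_cost:.2f})",
    ]
    for rank, entry in enumerate(top_entries, start=1):
        lines.append(
            f"{rank}. {entry['location']} - ${entry['cost']:.2f} "
            f"({entry['gb']:.2f} GB)"
        )
    return {"text": "\n".join(lines)}


def post_to_slack(payload, webhook_url=None):
    """POST the payload; the URL comes from the environment by default."""
    url = webhook_url or os.environ["LOGCOST_SLACK_WEBHOOK"]
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping payload construction separate from the network call makes the formatting easy to test without a real webhook.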

Security Note: The webhook URL is a credential - treat it like a password. Never commit it to version control. Use environment variables, Kubernetes secrets, or secrets managers.

CLI Commands

Analyze

Show top expensive log statements:

python -m logcost.cli analyze stats.json --top 10 --provider gcp

Report

Export analysis to JSON:

python -m logcost.cli report stats.json reports/analysis.json

Estimate ROI

Calculate potential savings:

python -m logcost.cli estimate stats.json --reduction 0.4 --hours 12 --rate 120

  • --reduction: Expected cost reduction (0.4 = 40%)
  • --hours: Engineering hours to fix
  • --rate: Hourly rate in USD

Diff

Compare before/after:

python -m logcost.cli diff stats_before.json stats_after.json

Capture

Snapshot running service:

python -m logcost.cli capture /tmp/logcost_stats.json

Framework Integration

Flask / WSGI

import logcost
from flask import Flask

app = Flask(__name__)
logger = app.logger

@app.route("/")
def hello():
    logger.info("Homepage accessed")
    return "Hello"

if __name__ == "__main__":
    app.run()
    # Stats exported automatically on exit

See examples/flask_app/ for full example.

FastAPI / ASGI

import logging

import logcost
from fastapi import FastAPI

app = FastAPI()
logger = logging.getLogger(__name__)

@app.get("/")
async def root():
    logger.info("Root endpoint hit")
    return {"message": "Hello"}

The tracker works with async code since it hooks the core logging machinery. See examples/fastapi_app/ for complete demo.

Django

Import logcost in settings.py so the tracker attaches before any middleware loads:

# settings.py
import logcost

# ... rest of settings

Run your app and export stats:

python manage.py runserver
# In another terminal
python -m logcost.cli capture /tmp/django_logcost.json

See examples/django_app/ for full setup.

Docker & Kubernetes (Sidecar Pattern)

For production deployments, LogCost uses a sidecar architecture that separates logging from monitoring:

Architecture:

  • App Container: Your application with LogCost library installed, writes stats to shared volume
  • Sidecar Container: LogCost monitoring container that watches stats, aggregates data, stores history, and sends notifications

Benefits: Separation of concerns, reusable sidecar, no application code changes after setup

Build and Publish Docker Image

Build locally:

cd LogCost/
docker build -t logcost/logcost:latest .

Publish to Docker Hub (requires Docker Hub account):

# Login to Docker Hub
docker login

# Build and push
docker build -t your-username/logcost:latest .
docker push your-username/logcost:latest

# Or build for multiple architectures (recommended)
docker buildx build --platform linux/amd64,linux/arm64 \
  -t your-username/logcost:latest \
  -t your-username/logcost:v0.1.0 \
  --push .

See DOCKER.md for complete publishing guide including GitHub Actions automation, other registries (GCR, ECR, ACR), security scanning, and versioning strategy.

Kubernetes Deployment

Add LogCost sidecar to your deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-with-logcost
spec:
  template:
    spec:
      containers:
      # Your application
      - name: app
        image: your-registry/myapp:latest
        env:
        - name: LOGCOST_OUTPUT
          value: /var/log/logcost/stats.json
        - name: LOGCOST_FLUSH_INTERVAL
          value: "300"  # 5 minutes
        volumeMounts:
        - name: logcost-data
          mountPath: /var/log/logcost

      # LogCost sidecar
      - name: logcost-sidecar
        image: logcost/logcost:latest
        env:
        - name: LOGCOST_NOTIFICATION_INTERVAL
          value: "3600"  # 1 hour
        - name: LOGCOST_PROVIDER
          value: gcp  # or aws, azure
        - name: LOGCOST_SLACK_WEBHOOK
          valueFrom:
            secretKeyRef:
              name: logcost-slack-webhook
              key: webhook-url
        volumeMounts:
        - name: logcost-data
          mountPath: /var/log/logcost
        resources:
          requests:
            memory: "64Mi"
            cpu: "50m"
          limits:
            memory: "128Mi"
            cpu: "100m"

      volumes:
      - name: logcost-data
        emptyDir: {}

Your app code needs only an import and one call:

import logcost
logcost.start_periodic_flush("/var/log/logcost/stats.json")

The sidecar will automatically:

  • Watch for stats updates
  • Store historical snapshots (7 days retention)
  • Send hourly Slack notifications with trends
  • Detect anti-patterns (DEBUG in production, high-frequency logs, large payloads)

File Permissions (Important): The sidecar container runs as UID 1000 and needs read access to stats.json. Use an init container to create the shared volume directory with permissive permissions:

initContainers:
- name: setup-logcost-perms
  image: busybox:latest
  command: ['sh', '-c', 'mkdir -p /var/log/logcost && chmod 777 /var/log/logcost']
  volumeMounts:
  - name: logcost-data
    mountPath: /var/log/logcost

This ensures both containers can read/write to the shared volume. See examples/kubernetes/deployment-with-init-container.yaml for a complete working example.

Cost Calculation

The analyzer estimates cost using:

cost = (bytes_emitted / 1GB) × price_per_gb

Default Pricing:

  • GCP: $0.50/GB
  • AWS: $0.57/GB
  • Azure: $0.63/GB
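As a worked example of the formula and default prices above, here is a small standalone calculation. It assumes 1 GB means 10^9 bytes, which is what the example output earlier implies (900,000,000,000 bytes at $0.50/GB gives $450.00). The function name `estimate_cost` is hypothetical.

```python
PRICE_PER_GB = {"gcp": 0.50, "aws": 0.57, "azure": 0.63}
BYTES_PER_GB = 1_000_000_000  # decimal GB, inferred from the example output


def estimate_cost(bytes_emitted, provider="gcp", price_per_gb=None):
    """cost = (bytes_emitted / 1 GB) * price_per_gb"""
    price = PRICE_PER_GB[provider] if price_per_gb is None else price_per_gb
    return bytes_emitted / BYTES_PER_GB * price


print(f"{estimate_cost(900_000_000_000, 'gcp'):.2f}")  # prints 450.00
print(f"{estimate_cost(315_000_000_000, 'gcp'):.2f}")  # prints 157.50
```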

Override pricing:

import logcost
from logcost.analyzer import CostAnalyzer

stats = logcost.get_stats()
analyzer = CostAnalyzer(stats, price_per_gb=0.75)

Or select a provider's default pricing via the CLI:

python -m logcost.cli analyze stats.json --provider gcp

Anti-Pattern Detection

The analyzer flags:

  • High-frequency logs - Statements executed >1,000 times (likely tight loops)
  • Debug logs in production - DEBUG level logs producing non-zero cost
  • Large payloads - Messages exceeding 5 KB per call
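The three checks listed above can be expressed in a few lines. This sketch assumes each stats entry is a dict with `level`, `count`, and `bytes` keys, and that 5 KB means 5 × 1024 bytes; neither the entry layout nor the names below are confirmed by LogCost's analyzer.

```python
HIGH_FREQUENCY_CALLS = 1_000
LARGE_PAYLOAD_BYTES = 5 * 1024  # assumption: KB taken as 1024 bytes


def detect_anti_patterns(entry):
    """Return the anti-pattern labels that apply to one stats entry."""
    findings = []
    if entry["count"] > HIGH_FREQUENCY_CALLS:
        findings.append("high-frequency")
    if entry["level"] == "DEBUG" and entry["bytes"] > 0:
        findings.append("debug-in-production")
    # Average payload size per call; skip entries that were never called.
    if entry["count"] and entry["bytes"] / entry["count"] > LARGE_PAYLOAD_BYTES:
        findings.append("large-payload")
    return findings


hot_debug = {"level": "DEBUG", "count": 1_200_000, "bytes": 315_000_000_000}
print(detect_anti_patterns(hot_debug))
```

An entry can match several checks at once, so the function returns a list rather than a single label.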

ROI Calculation

potential_savings = total_cost × reduction_percent
effort_cost = hours_to_fix × hourly_rate
roi = (potential_savings - effort_cost) / effort_cost

Example:

python -m logcost.cli estimate stats.json --reduction 0.5 --hours 8 --rate 100

Output:

Potential monthly savings: $250.00
Effort cost: $800.00
ROI: -68.75% (not worth it)
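The numbers in this output can be reproduced directly from the formulas. The sketch below assumes a total monthly cost of $500, inferred from the $250.00 savings at a 50% reduction; `estimate_roi` is a hypothetical name. The payback line is an extra illustration and not something the CLI is known to print.

```python
def estimate_roi(total_cost, reduction, hours, rate):
    """Apply the ROI formulas to one month of logging cost."""
    potential_savings = total_cost * reduction
    effort_cost = hours * rate
    roi = (potential_savings - effort_cost) / effort_cost
    return potential_savings, effort_cost, roi


savings, effort, roi = estimate_roi(total_cost=500.0, reduction=0.5, hours=8, rate=100)
print(f"Potential monthly savings: ${savings:.2f}")  # $250.00
print(f"Effort cost: ${effort:.2f}")                 # $800.00
print(f"ROI: {roi:.2%}")                             # -68.75%

# One-off effort divided by recurring monthly savings.
payback_months = effort / savings
print(f"Payback: {payback_months:.1f} months")       # 3.2 months
```

Note that the ROI figure compares a one-off effort against a single month of savings, so a negative value means the fix does not pay for itself within the first month.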

FAQ

Does LogCost change my logging behavior? No. It wraps logging.Logger._log and print but always calls the original implementation after recording stats.

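What about other logging libs (structlog, loguru)? Libraries that delegate to Python's logging module are tracked automatically; structlog can be configured to do so, while loguru writes to its own sinks by default. For anything that bypasses logging, you can manually call logcost.tracker._track_call(). Adapters are planned.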

How often should I export? For scripts, rely on the built-in atexit export. For long-running services, export on intervals (cron, signal handler, or sidecar) to avoid losing stats on crashes.

Is tracking configurable? Use logcost.ignore_module("module.prefix") to skip helper frames. Anti-pattern thresholds are constants in logcost/analyzer.py (PRs welcome for config support).

Performance impact? Designed for low overhead (lock-protected dict updates + string formatting). Run the benchmark to measure on your hardware:

python benchmarks/tracker_benchmark.py --iterations 100000

Examples

  • examples/flask_app/ - Classic Flask app with tracked routes
  • examples/fastapi_app/ - Async FastAPI integration
  • examples/django_app/ - Minimal Django project with LogCost
  • examples/kubernetes/ - K8s deployment + sidecar pattern

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE file for details.

Download files


Source Distribution

logcost-0.1.5.tar.gz (36.6 kB view details)

Uploaded Source

Built Distribution


logcost-0.1.5-py3-none-any.whl (21.8 kB view details)

Uploaded Python 3

File details

Details for the file logcost-0.1.5.tar.gz.

File metadata

  • Download URL: logcost-0.1.5.tar.gz
  • Upload date:
  • Size: 36.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for logcost-0.1.5.tar.gz
Algorithm Hash digest
SHA256 d0e0178234bbe07290617dcad2613b9f0e1810de0cd0f4c615770d8cfe84fd16
MD5 8c8b62338fd32f628c85f9dfb4b31268
BLAKE2b-256 e4b256aa7864cfb676d37e83eb9cc66f73c7353cc88ddeda71c3f68eb44136f5


File details

Details for the file logcost-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: logcost-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 21.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for logcost-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 578b98286a01e3758375c34a82c10f19adea72ca5b23a2330153eaddf99db121
MD5 01512172d68441521f2757a2e69457aa
BLAKE2b-256 2dcee2a0386124e801ab0573cb42f0e065a318c20f00685e848b49e346bbefa1

