dbbasic-logs
Structured logging with TSV storage and compression.
Philosophy: "Log everything. Query anything. Compress the rest."
Part of the DBBasic family of tiny Python modules.
Features
- Simple API: One-line logging, no setup required
- Structured: TSV format with JSON context
- Unix-Native: Plain text files, grep-able, compressible
- Zero Config: Works out of the box
- Searchable: Built-in search with regex support
- Compressed: Automatic gzip compression for old logs
- Multiple Log Types: Separate files for app, errors, and access logs
Installation
pip install dbbasic-logs
Quick Start
from dbbasic_logs import log
# Basic logging
log.info("User logged in", user_id=42, ip="192.168.1.1")
log.warning("Rate limit approaching", user_id=42, count=95)
log.error("Payment failed", order_id=123, error="Timeout")
log.debug("Cache miss", key="user:42")
# Exception logging with automatic stack traces
try:
    process_payment(order)
except Exception as e:
    log.exception("Payment processing failed", order_id=order.id)
    raise
# HTTP access logging
log.access(
    method="GET",
    path="/api/users",
    status=200,
    duration=0.05,
    ip=request.remote_addr
)
Storage Format
Logs are stored as TSV (tab-separated values) files organized by type and date:
data/logs/
  app/
    2025-10-09.tsv       # Today's application logs
    2025-10-08.tsv.gz    # Yesterday's logs (compressed)
    2025-10-07.tsv.gz    # Older logs (compressed)
  errors/
    2025-10-09.tsv       # Today's error logs with stack traces
    2025-10-08.tsv.gz
  access/
    2025-10-09.tsv       # Today's HTTP access logs
    2025-10-08.tsv.gz
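Because a rotated file differs from a live one only by its .gz suffix, reading across days stays simple. A minimal sketch of walking one log type's directory (iter_log_lines is a hypothetical helper, not part of the module's API):

import gzip
from pathlib import Path

def iter_log_lines(log_type, log_dir='data/logs'):
    # Yield raw TSV lines for one log type, oldest file first.
    # Handles both live .tsv files and rotated .tsv.gz files.
    for path in sorted(Path(log_dir, log_type).glob('*.tsv*')):
        opener = gzip.open if path.suffix == '.gz' else open
        with opener(path, 'rt', encoding='utf-8') as f:
            for line in f:
                yield line.rstrip('\n')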
TSV Format
Each log entry is a single line with 4 tab-separated columns:
timestamp level message context
Example:
1696886400 INFO User logged in {"user_id":42,"ip":"192.168.1.1"}
1696886401 ERROR Payment failed {"order_id":123,"error":"Timeout"}
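A consequence of the fixed four-column layout: one str.split plus json.loads recovers the full entry. A sketch (parse_line is illustrative, not the module's API):

import json

def parse_line(line):
    # maxsplit=3 keeps the JSON context intact even if it contains tabs.
    timestamp, level, message, context = line.split('\t', 3)
    return {
        'timestamp': int(timestamp),
        'level': level,
        'message': message,
        'context': json.loads(context),
    }

entry = parse_line('1696886400\tINFO\tUser logged in\t{"user_id":42}')
assert entry['context']['user_id'] == 42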
Querying Logs
Python API
from dbbasic_logs import log
# Search for errors in last 7 days
errors = log.search("ERROR", log_type='errors', days=7)
for error in errors:
    print(f"{error['timestamp']}: {error['message']}")
    print(f"  Context: {error['context']}")
# Find user activity (regex search)
user_logs = log.search(r"user_id.*42", log_type='all', days=30)
# Get recent logs
recent = log.tail('app', lines=100)
# Find slow requests
slow_requests = log.search(r'duration.*[5-9]\.', log_type='access', days=1)
Command Line (Unix Way)
# All errors today
grep ERROR data/logs/app/2025-10-09.tsv
# All errors this month (including compressed)
zgrep ERROR data/logs/app/2025-10-*.tsv*
# Specific user activity
zgrep 'user_id.*42' data/logs/app/*.tsv*
# Count 500 errors
grep "status.*500" data/logs/access/2025-10-09.tsv | wc -l
# Slow requests (> 1 second)
grep "duration.*[1-9]\." data/logs/access/2025-10-09.tsv
# View logs with the included script
./scripts/view-logs.sh tail app 100
./scripts/view-logs.sh search ERROR app 7
./scripts/view-logs.sh today errors
./scripts/view-logs.sh errors 1
Log Rotation & Compression
The included rotation script automatically (see the sketch after this list):
- Compresses yesterday's logs (text logs typically compress around 10:1)
- Deletes old logs based on retention policy
- Keeps errors longer than access logs
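The shipped script is shell; the compress-then-prune idea it implements looks roughly like this in Python (retention numbers mirror the defaults below; treat the names and details as assumptions, not the script's actual code):

import gzip
import shutil
import time
from pathlib import Path

RETENTION_DAYS = {'app': 30, 'access': 30, 'errors': 90}  # per-type policy

def rotate(log_dir='data/logs'):
    today = time.strftime('%Y-%m-%d')
    for log_type, keep_days in RETENTION_DAYS.items():
        # Compress every finished day's file, leaving today's alone.
        for tsv in Path(log_dir, log_type).glob('*.tsv'):
            if tsv.stem == today:
                continue  # still being written
            with open(tsv, 'rb') as src, gzip.open(f'{tsv}.gz', 'wb') as dst:
                shutil.copyfileobj(src, dst)
            tsv.unlink()
        # Prune compressed files older than the retention window.
        cutoff = time.time() - keep_days * 86400
        for gz in Path(log_dir, log_type).glob('*.tsv.gz'):
            if gz.stat().st_mtime < cutoff:
                gz.unlink()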
Setup Automatic Rotation
# Install to cron (runs daily)
sudo cp scripts/rotate-logs.sh /etc/cron.daily/dbbasic-logs-rotate
sudo chmod +x /etc/cron.daily/dbbasic-logs-rotate
Manual Rotation
./scripts/rotate-logs.sh
Custom Retention Policies
# Set custom retention (in days)
export APP_RETENTION_DAYS=30 # App logs (default: 30)
export ACCESS_RETENTION_DAYS=7 # Access logs (default: 30)
export ERROR_RETENTION_DAYS=90 # Error logs (default: 90)
./scripts/rotate-logs.sh
Configuration
Custom Log Directory
from dbbasic_logs import DBBasicLogger
# Use custom directory
logger = DBBasicLogger(log_dir='/var/log/myapp')
logger.info("Custom location", app="myapp")
Environment Variable
# Set globally via environment
export LOG_DIR=/var/log/myapp
# Now all logs go to /var/log/myapp
python your_app.py
Integration Examples
Flask Application
from flask import Flask, request
from dbbasic_logs import log
import time

app = Flask(__name__)

@app.before_request
def log_request_start():
    request.start_time = time.time()

@app.after_request
def log_request_end(response):
    duration = time.time() - request.start_time
    log.access(
        method=request.method,
        path=request.path,
        status=response.status_code,
        duration=duration,
        ip=request.remote_addr,
        user_agent=request.user_agent.string
    )
    return response

@app.route('/api/users/<user_id>')
def get_user(user_id):
    log.info("Fetching user", user_id=user_id, ip=request.remote_addr)
    try:
        user = User.get(user_id)
        if not user:
            log.warning("User not found", user_id=user_id)
            return {"error": "Not found"}, 404
        return user.to_dict()
    except Exception as e:
        log.exception("Error fetching user", user_id=user_id)
        return {"error": "Internal error"}, 500
Django Middleware
from dbbasic_logs import log
import time

class LoggingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        start_time = time.time()
        response = self.get_response(request)
        duration = time.time() - start_time
        log.access(
            method=request.method,
            path=request.path,
            status=response.status_code,
            duration=duration,
            ip=request.META.get('REMOTE_ADDR'),
        )
        return response
Background Job Queue
from dbbasic_logs import log
import time

def process_jobs(handlers):
    jobs = get_pending_jobs()
    log.info("Processing jobs", count=len(jobs))
    for job in jobs:
        try:
            log.info("Starting job", job_id=job['id'], type=job['type'])
            start = time.time()
            result = handlers[job['type']](job['payload'])
            duration = time.time() - start
            log.info("Job completed", job_id=job['id'], duration=duration)
        except Exception as e:
            log.exception("Job failed", job_id=job['id'], type=job['type'])
Performance
Benchmarks
| Operation | Time | Notes |
|---|---|---|
| Write log | 0.1ms | Append to file |
| Search today | 0.5s | grep 10MB file |
| Search compressed | 2s | zgrep 1MB file |
| Tail recent | 0.01s | Read last N lines |
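The tail figure stays flat regardless of file size because you can read backwards from the end instead of scanning forward. A sketch of that trick for uncompressed files (tail_lines is illustrative, not log.tail's actual implementation):

import os

def tail_lines(path, n=100, chunk=8192):
    # Read fixed-size chunks backwards until n newlines are buffered,
    # so cost scales with the output size, not the file size.
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        buf = b''
        while pos > 0 and buf.count(b'\n') <= n:
            step = min(chunk, pos)
            pos -= step
            f.seek(pos)
            buf = f.read(step) + buf
    return [line.decode('utf-8') for line in buf.splitlines()[-n:]]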
Storage
| Daily Logs | Uncompressed | Compressed | 30 Days Total |
|---|---|---|---|
| Low traffic | 1MB | 100KB | 3MB |
| Medium traffic | 10MB | 1MB | 30MB |
| High traffic | 100MB | 10MB | 300MB |
Even high-traffic sites: < 1GB for 30 days of logs
Comparison to Alternatives
vs. Sentry (Error Tracking SaaS)
Sentry:
- Setup: SDK integration, API keys
- Cost: $29-$299/month
- Features: Grouping, alerts, dashboards
- Privacy: Sends errors to third-party
dbbasic-logs:
- Setup: Import and use
- Cost: $0
- Features: TSV storage, grep search
- Privacy: All local
vs. Python logging + RotatingFileHandler
Python stdlib:
# 8 lines just for config
import logging
from logging.handlers import RotatingFileHandler
handler = RotatingFileHandler('app.log', maxBytes=10000, backupCount=5)
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger = logging.getLogger('app')
logger.addHandler(handler)
logger.info("User logged in", extra={'user_id': 42}) # Awkward
dbbasic-logs:
from dbbasic_logs import log
log.info("User logged in", user_id=42) # Natural
vs. ELK Stack
ELK:
- Setup: Docker compose with 3 services
- Memory: 4GB+ RAM
- Complexity: High
dbbasic-logs:
- Setup: Import
- Memory: 0 (just files)
- Complexity: Low
Design Principles
- Foundational: All other DBBasic modules use this
- Simple: One-line logging, no setup
- Structured: TSV format, queryable
- Unix-Native: Plain text, grep-able, compressible
- Zero Config: Works out of the box
Security & Privacy
What NOT to Log
Never log sensitive data:
# BAD - Don't do this
log.info("User logged in", password=password) # ❌
log.info("Payment", credit_card=card_number) # ❌
# GOOD - Log safely
log.info("User logged in", user_id=user.id) # ✓
log.info("Payment", order_id=order.id) # ✓
Log File Permissions
Protect your logs:
# Restrict access to logs
chmod 700 data/logs
chmod 600 data/logs/*/*.tsv*
Contributing
This module is part of the DBBasic project. Contributions welcome!
- GitHub: https://github.com/askrobots/dbbasic-logs
- Documentation: https://dbbasic.com/logs-spec
License
MIT License - see LICENSE file for details
Credits
Created as part of the DBBasic project - a collection of tiny, composable Python web framework modules.
Philosophy: Keep it simple. Keep it readable. Keep it under 500 lines.