A lightweight Python package for collecting LLM traces

Burt Logger

A lightweight, production-ready Python package for collecting LLM training data. Automatically pipe LLM request/response data to your backend for model fine-tuning and dataset creation.

Features

✨ Non-blocking & Asynchronous - Uses a background worker thread and a queue so that logging adds negligible latency to your application

🔄 Intelligent Batching - Automatically batches logs by size or time interval for optimal network efficiency

🛡️ Production-Ready - Thread-safe, graceful error handling, and automatic retry with exponential backoff

🚀 Minimal Dependencies - Requires only the requests library; everything else comes from the Python standard library

⚙️ Highly Configurable - Customize batch sizes, flush intervals, queue sizes, retry logic, and more

🔌 Provider Agnostic - Works with OpenAI, Anthropic, or any LLM provider

Installation

pip install burt-logger

Or install from source:

git clone https://github.com/trainburt/burt-logger-python.git
cd burt-logger-python
pip install -e .

Quick Start

import openai  # Legacy (pre-1.0) OpenAI SDK, matching the call below
from burt_logger import LLMLogger

# Initialize the logger
logger = LLMLogger(
    endpoint="https://your-api.com/logs",
    api_key="your-api-key"
)

# Log your LLM requests and responses
response = openai.ChatCompletion.create(...)  # Your existing LLM call
usage = response.get("usage", {})  # Token counts; assumes a dict-like response

logger.log(
    request={
        "model": "gpt-3.5-turbo",
        "messages": [...],
    },
    response={
        "content": response.choices[0].message.content,
        "usage": {
            "prompt_tokens": usage.get("prompt_tokens", 0),
            "completion_tokens": usage.get("completion_tokens", 0),
            "total_tokens": usage.get("total_tokens", 0),
        },
    }
)

# Gracefully shutdown (flushes remaining logs)
logger.shutdown()

That's it! The logger handles everything asynchronously in the background.
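
Because log() takes plain dictionaries, it can help to normalize provider output in one place before logging. The helper below is a minimal sketch of that idea; `build_log_entry` is a hypothetical name, not part of burt-logger, and the field names simply mirror the Quick Start example.

```python
def build_log_entry(model, messages, content, usage=None):
    """Return (request, response) dicts shaped like the Quick Start example.

    Hypothetical helper: it only rearranges values you already have.
    """
    usage = usage or {}
    request = {"model": model, "messages": list(messages)}
    response = {
        "content": content,
        "usage": {
            # Missing counts default to 0, as in the Quick Start example.
            "prompt_tokens": usage.get("prompt_tokens", 0),
            "completion_tokens": usage.get("completion_tokens", 0),
            "total_tokens": usage.get("total_tokens", 0),
        },
    }
    return request, response
```

The two dictionaries can then be passed straight through, for example `logger.log(request=request, response=response)`.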

Using Context Manager

The logger supports context managers for automatic cleanup:

with LLMLogger(endpoint="...", api_key="...") as logger:
    # Your code here
    logger.log(request=..., response=...)
    # Automatic shutdown and flush on exit

Configuration

The LLMLogger class accepts the following parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `endpoint` | str | Required | Backend API endpoint to send logs to |
| `api_key` | str | Required | API key for authentication |
| `batch_size` | int | 10 | Number of logs to batch before sending |
| `flush_interval` | float | 5.0 | Seconds to wait before flushing an incomplete batch |
| `max_queue_size` | int | 10000 | Maximum number of logs to queue |
| `max_retries` | int | 3 | Maximum number of retry attempts |
| `initial_retry_delay` | float | 1.0 | Initial delay for exponential backoff (seconds) |
| `max_retry_delay` | float | 60.0 | Maximum retry delay (seconds) |
| `timeout` | float | 10.0 | HTTP request timeout (seconds) |
| `debug` | bool | False | Enable debug logging |

Example with Custom Configuration

logger = LLMLogger(
    endpoint="https://your-api.com/logs",
    api_key="your-api-key",
    batch_size=20,           # Send in batches of 20
    flush_interval=10.0,     # Or every 10 seconds
    max_queue_size=50000,    # Large queue for high-volume apps
    max_retries=5,           # More retries for flaky networks
    debug=True,              # See what's happening
)
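
One way to keep these settings out of application code is to read them from environment variables. The sketch below is an assumption, not a burt-logger feature: the `BURT_*` variable names and the `config_from_env` helper are hypothetical, and only the keyword names come from the parameter table.

```python
import os

# Hypothetical environment variable names mapped to documented LLMLogger
# keyword arguments and the type each value should be converted to.
_ENV_FIELDS = {
    "BURT_ENDPOINT": ("endpoint", str),
    "BURT_API_KEY": ("api_key", str),
    "BURT_BATCH_SIZE": ("batch_size", int),
    "BURT_FLUSH_INTERVAL": ("flush_interval", float),
    "BURT_MAX_QUEUE_SIZE": ("max_queue_size", int),
    "BURT_MAX_RETRIES": ("max_retries", int),
}


def config_from_env(env=None):
    """Build LLMLogger keyword arguments from environment variables.

    Unset variables are skipped so the library defaults still apply.
    """
    env = os.environ if env is None else env
    kwargs = {}
    for var, (name, cast) in _ENV_FIELDS.items():
        if var in env:
            kwargs[name] = cast(env[var])
    # endpoint and api_key are the two required parameters.
    missing = {"endpoint", "api_key"} - kwargs.keys()
    if missing:
        raise ValueError(f"missing required settings: {sorted(missing)}")
    return kwargs
```

With this in place, the logger could be created as `LLMLogger(**config_from_env())`.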

API Reference

`log(request, response, metadata=None)`

Log an LLM request/response pair.

Parameters:

request (dict): The LLM request data (prompt, model, parameters, etc.)
response (dict): The LLM response data (completion, tokens, etc.)
metadata (dict, optional): Additional metadata (user_id, session_id, etc.)

Returns:

bool: True if log was queued successfully, False if queue is full

Example:

success = logger.log(
    request={"model": "gpt-4", "prompt": "..."},
    response={"completion": "...", "tokens": 150},
    metadata={"user_id": "123", "environment": "production"}
)
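
Since log() reports a full queue by returning False rather than raising, a caller may want to keep its own count of rejected entries. The wrapper below is a minimal sketch; `DropCounter` is a hypothetical name, and it relies only on the log() signature and return value documented above.

```python
import threading


class DropCounter:
    """Wrap a logger-like object and count entries that log() rejects."""

    def __init__(self, logger):
        self._logger = logger
        self._lock = threading.Lock()  # log() may be called from many threads
        self.dropped = 0

    def log(self, request, response, metadata=None):
        ok = self._logger.log(request=request, response=response, metadata=metadata)
        if not ok:
            with self._lock:
                self.dropped += 1
        return ok
```

The counter can be exported to whatever metrics system the application already uses.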

`flush(timeout=None)`

Flush all queued logs and wait for them to be sent.

Parameters:

timeout (float, optional): Maximum time to wait in seconds. None means wait indefinitely.

Example:

logger.flush(timeout=5.0)  # Wait up to 5 seconds

`shutdown(timeout=10.0)`

Gracefully shutdown the logger, flushing all remaining logs.

Parameters:

timeout (float): Maximum time to wait for shutdown in seconds

Example:

logger.shutdown(timeout=10.0)

`get_stats()`

Get statistics about logger performance.

Returns:

dict: Dictionary containing statistics

Example:

stats = logger.get_stats()
print(stats)
# {
#     'logs_queued': 150,
#     'logs_sent': 145,
#     'logs_failed': 5,
#     'batches_sent': 15,
#     'batches_failed': 1
# }
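
The counters can be reduced to a single health signal. The function below is a sketch that assumes `logs_sent` and `logs_failed` are disjoint counts, which the example output is consistent with but the documentation does not state; `failure_rate` is a hypothetical helper name.

```python
def failure_rate(stats):
    """Fraction of delivery attempts that failed, from a get_stats() dict.

    Assumes logs_sent and logs_failed do not overlap.
    """
    attempted = stats["logs_sent"] + stats["logs_failed"]
    if attempted == 0:
        return 0.0  # Nothing has been attempted yet.
    return stats["logs_failed"] / attempted
```

For the example output above this gives 5 / 150, or roughly 3.3%.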

How It Works

Queueing: When you call log(), the entry is immediately added to a thread-safe queue and the method returns instantly (non-blocking)
Batching: A background worker thread monitors the queue and batches logs based on:
- Batch size (e.g., 10 logs)
- Time interval (e.g., every 5 seconds)
Sending: Batches are sent to your backend API via HTTP POST with proper authentication headers
Retry Logic: If sending fails:
- 5xx errors: Retries with exponential backoff
- 429 (rate limit): Retries with exponential backoff
- Other 4xx errors: No retry (client error)
- Network errors: Retries with exponential backoff
Shutdown: On program exit or explicit shutdown, all remaining logs are flushed
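
The queueing and batching steps can be illustrated with the standard library alone. This is a sketch of the general technique, not burt-logger's actual implementation; `batch_worker`, its parameters, and the `send` callback are hypothetical names.

```python
import queue
import threading
import time


def batch_worker(q, send, batch_size=10, flush_interval=5.0, stop=None):
    """Drain q into batches and pass each batch to send().

    A batch is sent when it reaches batch_size entries or when
    flush_interval seconds have passed since its first entry arrived.
    """
    stop = stop or threading.Event()
    batch, deadline = [], None
    while not (stop.is_set() and q.empty()):
        # Poll in short steps and never wait past the batch deadline.
        wait = 0.1 if deadline is None else max(0.0, deadline - time.monotonic())
        try:
            item = q.get(timeout=min(wait, 0.1))
        except queue.Empty:
            pass
        else:
            if not batch:
                deadline = time.monotonic() + flush_interval
            batch.append(item)
        full = len(batch) >= batch_size
        expired = deadline is not None and time.monotonic() >= deadline
        if batch and (full or expired):
            send(batch)
            batch, deadline = [], None
    if batch:  # Flush whatever is left once a stop was requested.
        send(batch)
```

A producer only calls `q.put_nowait(entry)`, so the caller never waits on the network; the worker would typically run in a `threading.Thread` and be stopped by setting the `stop` event.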

Backend API Expected Format

Your backend should expect POST requests with the following format:

Headers:

Content-Type: application/json
Authorization: Bearer <api_key>

Payload:

{
  "logs": [
    {
      "request": { /* your request data */ },
      "response": { /* your response data */ },
      "metadata": { /* optional metadata */ },
      "timestamp": 1234567890.123
    },
    ...
  ],
  "timestamp": 1234567890.456
}

Expected Response:

Success: HTTP 200, 201, or 202
Server Error: HTTP 5xx (will retry)
Client Error: HTTP 4xx other than 429 (will not retry)
Rate Limited: HTTP 429 (will retry)
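
On the receiving side, a backend needs to check the Authorization header, parse the payload, and answer with one of the status codes above. The function below is a framework-independent sketch under those assumptions; `handle_batch` is a hypothetical name, and the specific choice of 401, 400, and 202 is illustrative.

```python
import json


def handle_batch(headers, body, expected_key):
    """Validate one POST from the logger and return (status, accepted_logs).

    headers is a dict of request headers; body is the raw JSON text.
    Header lookup here is case-sensitive, which real frameworks may not be.
    """
    if headers.get("Authorization") != f"Bearer {expected_key}":
        return 401, []  # Client error: the logger will not retry.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, []
    logs = payload.get("logs") if isinstance(payload, dict) else None
    if not isinstance(logs, list):
        return 400, []
    for entry in logs:
        # Every entry must carry at least a request and a response.
        if not isinstance(entry, dict) or "request" not in entry or "response" not in entry:
            return 400, []
    return 202, logs  # Accepted for processing.
```

Returning a 5xx status only for genuinely transient failures keeps the logger's retries useful, because malformed batches answered with 4xx are dropped instead of being resent.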

Error Handling

The logger is designed to be resilient and never crash your application:

Queue Full: If the queue is full, log() returns False and the log is dropped
Network Errors: Automatic retry with exponential backoff
Backend Down: Retries up to max_retries times, then drops the batch
Thread Crashes: The worker thread is monitored and restarted if needed

All errors are logged to Python's logging system. Enable debug mode to see detailed logs:

logger = LLMLogger(..., debug=True)
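
The retry policy can be written down in a few lines. This is a sketch, not the library's code: the function names are hypothetical, and doubling the delay on each attempt is an assumption, since the documentation only says "exponential backoff" bounded by `initial_retry_delay` and `max_retry_delay`.

```python
def should_retry(status_code):
    """Documented policy: retry network errors, 429, and 5xx; not other 4xx.

    A status_code of None stands for a network error with no response.
    """
    if status_code is None:
        return True
    return status_code == 429 or 500 <= status_code <= 599


def backoff_delays(max_retries=3, initial_retry_delay=1.0, max_retry_delay=60.0):
    """Seconds to wait before each retry, doubling up to the cap (assumed)."""
    return [
        min(initial_retry_delay * 2 ** attempt, max_retry_delay)
        for attempt in range(max_retries)
    ]
```

Under these assumptions the defaults give waits of 1, 2, and 4 seconds before the batch is dropped.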

Testing

Run the test suite:

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=burt_logger --cov-report=html

Development

# Clone the repository
git clone https://github.com/trainburt/burt-logger-python.git
cd burt-logger-python

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black burt_logger/ tests/

# Lint
flake8 burt_logger/ tests/

Performance Considerations

Non-blocking: log() calls take ~0.001 ms (just a queue insertion)
Memory: Each log entry is ~1-5 KB, so a full queue at the default max_queue_size of 10,000 holds ~10-50 MB
Network: Batching reduces network overhead. At 1,000 logs/second with batch_size=10, the logger sends 100 batches per second
Threads: Uses a single background worker thread
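
The same arithmetic can be reused when sizing a deployment. The helpers below are a sketch with hypothetical names; they treat 1 MB as 1,000 KB and ignore time-based flushes, which would add batches at low traffic.

```python
import math


def queue_memory_mb(max_queue_size=10_000, entry_kb=(1, 5)):
    """Lower and upper bound on memory held by a full queue, in MB."""
    low_kb, high_kb = entry_kb
    return (max_queue_size * low_kb / 1000, max_queue_size * high_kb / 1000)


def batches_per_second(logs_per_second, batch_size=10):
    """HTTP requests per second needed to keep up with the log rate."""
    return math.ceil(logs_per_second / batch_size)
```

For example, raising batch_size from 10 to 50 at 1,000 logs/second cuts the request rate from 100 to 20 per second.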

Production Recommendations

Set appropriate batch_size: Larger batches are more efficient but increase memory usage
```
logger = LLMLogger(..., batch_size=50)  # For high-volume apps
```

Monitor queue size: If logs are being dropped, increase max_queue_size or reduce traffic

stats = logger.get_stats()
if stats['logs_failed'] > 0:
    # Handle appropriately, for example by reporting the failure count
    print(f"{stats['logs_failed']} logs failed to send")

Use metadata: Add user_id, session_id, etc. for better data analysis
```
logger.log(..., metadata={"user_id": user_id, "env": "prod"})
```
Graceful shutdown: Always call shutdown() or use the context manager
```
import atexit
atexit.register(logger.shutdown)
```

License

MIT License - see LICENSE file for details

Support

Issues: https://github.com/trainburt/burt-logger-python/issues
Email: support@burt.ai

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass
Submit a pull request

Changelog

0.1.0 (Initial Release)

✅ Non-blocking asynchronous logging
✅ Intelligent batching (by size and time)
✅ Thread-safe operations
✅ Retry with exponential backoff
✅ Graceful shutdown and cleanup
✅ Comprehensive test suite
✅ Context manager support
✅ Statistics tracking

Built with ❤️ for the LLM training data collection community

Release history

0.1.1

Nov 23, 2025

This version

0.1.0

Nov 23, 2025

Download files

Source Distribution

burt_logger-0.1.0.tar.gz (32.4 kB)

Uploaded Nov 23, 2025 Source

Built Distribution

burt_logger-0.1.0-py3-none-any.whl (13.1 kB)

Uploaded Nov 23, 2025 Python 3

File details

Details for the file burt_logger-0.1.0.tar.gz.

File metadata

Download URL: burt_logger-0.1.0.tar.gz
Upload date: Nov 23, 2025
Size: 32.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for burt_logger-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`147001dd39f9438e3b13817c21afdeec4b524cd4d0da9e5520ec5a2f6ac1a370`
MD5	`2f4e61aa50f9b88c175849afbca2d44d`
BLAKE2b-256	`566854f1dd9685db5eb4d9af42c54710fc38abc93d56b6be92504cec7ce43b03`

File details

Details for the file burt_logger-0.1.0-py3-none-any.whl.

File metadata

Download URL: burt_logger-0.1.0-py3-none-any.whl
Upload date: Nov 23, 2025
Size: 13.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for burt_logger-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`424a74247d2c7efe416b4fcca5afc90a0fe2920217155349fb561b391c9e16c3`
MD5	`a8123bb3caf1194bbc95661c785e0e96`
BLAKE2b-256	`bab42a6b54559845337b36fbd25c89e4ccf2cc389f1ccc1cec0336652657a07f`
