tracelet
Tracelet is an automagic PyTorch metric exporter.
Tracelet is an intelligent experiment tracking library that automatically captures PyTorch and PyTorch Lightning metrics, seamlessly integrating with popular experiment tracking platforms through a modular plugin system.
Key Features
Modular Plugin System
- Dynamic plugin discovery and lifecycle management
- Easy to extend with custom backends and collectors
- Thread-safe metric routing with configurable workers
- Dependency resolution for complex plugin hierarchies
Automatic Metric Capture
- PyTorch TensorBoard integration - automatically captures `writer.add_scalar()` calls
- PyTorch Lightning support - seamlessly tracks trainer metrics
- System metrics monitoring (CPU and memory; GPU support planned)
- Automatic Git repository and environment tracking
Production-Ready Backends
- MLflow - Local and remote server support with full experiment tracking
- ClearML - Enterprise-grade experiment management with artifact storage
- Weights & Biases - Cloud-based tracking with rich visualizations
- AIM - Open-source experiment tracking with powerful UI
Robust Architecture
- Thread-safe data flow orchestration
- Backpressure handling for high-frequency metrics
- Configurable metric routing and filtering
- Comprehensive error handling and logging
Installation
Install the base package (includes PyTorch, TensorBoard, and W&B):
```bash
pip install tracelet
```
Additional Backend Dependencies
Install specific backends as needed:
```bash
# Additional backend integrations
pip install tracelet[mlflow]     # MLflow backend
pip install tracelet[clearml]    # ClearML backend
pip install tracelet[aim]        # AIM backend (Python <3.13)

# Framework integrations
pip install tracelet[lightning]  # PyTorch Lightning support

# Install multiple extras
pip install tracelet[mlflow,clearml]  # Multiple backends
pip install tracelet[backends]        # All backends
pip install tracelet[all]             # Everything
```
Base dependencies included: PyTorch, TorchVision, TensorBoard, Weights & Biases, GitPython, psutil
Supported Python versions: 3.9, 3.10, 3.11, 3.12, 3.13
Note: The AIM backend currently requires Python <3.13 due to dependency constraints.
Demo
See Tracelet in action! Click the link above to download and watch our demo video showing how easy it is to get started with automatic experiment tracking.
Note: GitHub doesn't support embedded video playback in README files. The link above will download the MP4 file directly.
Quick Start
Basic Usage
```python
import tracelet
import torch
from torch.utils.tensorboard import SummaryWriter

# Start experiment tracking with your preferred backend
tracelet.start_logging(
    exp_name="my_experiment",
    project="my_project",
    backend="mlflow"  # or "clearml", "wandb", "aim"
)

# Use TensorBoard as usual - metrics are automatically captured
writer = SummaryWriter()
for epoch in range(100):
    loss = train_one_epoch()  # Your training logic
    writer.add_scalar('Loss/train', loss, epoch)
    # Metrics are automatically sent to MLflow!

# Stop tracking when done
tracelet.stop_logging()
```
PyTorch Lightning Integration
```python
import tracelet
import pytorch_lightning as pl

# Start Tracelet before training
tracelet.start_logging("lightning_experiment", backend="clearml")

# Train your model - all Lightning metrics are captured
trainer = pl.Trainer(max_epochs=10)
trainer.fit(model, datamodule)

# Experiment data is automatically tracked!
tracelet.stop_logging()
```
Advanced Configuration
```python
import tracelet
from tracelet import get_active_experiment

# Start with custom configuration
experiment = tracelet.start_logging(
    exp_name="advanced_example",
    project="my_project",
    backend="mlflow",
    config={
        "track_system": True,       # System monitoring
        "metrics_interval": 5.0,    # Log every 5 seconds
        "track_git": True,          # Git info tracking
        "track_env": True,          # Environment capture
        "track_tensorboard": True,  # Auto-capture TB metrics
        "track_lightning": True,    # PyTorch Lightning support
    }
)

# Log custom parameters
experiment.log_params({
    "model": "resnet50",
    "batch_size": 32,
    "learning_rate": 0.001
})

# Log custom metrics programmatically
for epoch in range(10):
    metrics = train_epoch()
    experiment.log_metric("accuracy", metrics["acc"], epoch)
    experiment.log_metric("loss", metrics["loss"], epoch)
```
Configuration
Tracelet can be configured via environment variables or through the settings interface:
```python
from tracelet.settings import TraceletSettings

settings = TraceletSettings(
    project="my_project",    # or project_name (alias)
    backend=["mlflow"],      # List of backends
    track_system=True,       # System metrics tracking
    metrics_interval=10.0,   # Collection interval
    track_tensorboard=True,  # TensorBoard integration
    track_lightning=True,    # PyTorch Lightning support
    track_git=True,          # Git repository info
    track_env=True           # Environment capture
)
```
Key environment variables:
- `TRACELET_PROJECT`: Project name
- `TRACELET_BACKEND`: Comma-separated backends ("mlflow,wandb")
- `TRACELET_BACKEND_URL`: Backend server URL
- `TRACELET_API_KEY`: API key for backend service
- `TRACELET_TRACK_SYSTEM`: Enable system metrics tracking
- `TRACELET_METRICS_INTERVAL`: System metrics collection interval
- `TRACELET_TRACK_TENSORBOARD`: Enable TensorBoard integration
- `TRACELET_TRACK_LIGHTNING`: Enable PyTorch Lightning support
- `TRACELET_TRACK_GIT`: Enable Git repository tracking
- `TRACELET_TRACK_ENV`: Enable environment capture
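For example, the same setup shown in Quick Start can be driven entirely from the environment before launching a training script (`train.py` here is a hypothetical entry point):

```shell
# Configure Tracelet via environment variables
export TRACELET_PROJECT="my_project"
export TRACELET_BACKEND="mlflow,wandb"
export TRACELET_TRACK_SYSTEM="true"
export TRACELET_METRICS_INTERVAL="10.0"
export TRACELET_TRACK_GIT="true"

# python train.py  # your training script picks these up via start_logging()
```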
Plugin Development
Tracelet's plugin system makes it easy to add new backends or metric collectors:
```python
from tracelet.core.plugins import BackendPlugin, PluginMetadata, PluginType

class MyCustomBackend(BackendPlugin):
    @classmethod
    def get_metadata(cls) -> PluginMetadata:
        return PluginMetadata(
            name="my_backend",
            version="1.0.0",
            type=PluginType.BACKEND,
            description="My custom experiment tracking backend"
        )

    def initialize(self, config: dict):
        # Set up your backend connection
        self.client = MyBackendClient(config["api_key"])

    def log_metric(self, name: str, value: float, iteration: int):
        # Send metrics to your backend
        self.client.log(name, value, iteration)
```
Plugins are automatically discovered from:
- Built-in: the `tracelet/plugins/` directory
- User: the `~/.tracelet/plugins/` directory
- Custom: set the `TRACELET_PLUGIN_PATH` environment variable
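For instance, a user-level plugin can be installed by dropping its module into the user discovery path (the file name `my_backend.py` and custom directory below are illustrative):

```shell
# Make the user plugin directory and install the custom backend module
mkdir -p ~/.tracelet/plugins
touch my_backend.py                 # stand-in for your plugin module
cp my_backend.py ~/.tracelet/plugins/

# Or point Tracelet at an additional custom plugin directory
export TRACELET_PLUGIN_PATH="$HOME/work/tracelet-plugins"
```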
Documentation
For more detailed documentation, visit:
Architecture
Tracelet uses a multi-threaded architecture:
```
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Framework  │────▶│ Orchestrator │────▶│   Backend   │
│  (PyTorch)  │     │   (Router)   │     │  (MLflow)   │
└─────────────┘     └──────────────┘     └─────────────┘
       │                   │                    │
       ▼                   ▼                    ▼
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│  Collector  │────▶│    Queue     │────▶│   Plugin    │
│  (System)   │     │  (Threaded)  │     │  (ClearML)  │
└─────────────┘     └──────────────┘     └─────────────┘
```
- Metric Sources: Frameworks and collectors that generate metrics
- Orchestrator: Routes metrics to appropriate backends based on rules
- Backends: Plugins that handle experiment tracking and storage
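As a minimal sketch of this flow (not Tracelet's actual internal API; every class and method name here is invented for illustration), a bounded queue drained by a worker thread can route each metric to the backends whose rule matches, dropping under backpressure rather than stalling the training loop:

```python
# Illustrative sketch of the metric flow -- NOT Tracelet's real internals;
# all names here are invented for the example.
import queue
import threading


class MiniOrchestrator:
    """Routes metrics from a bounded queue to matching backends."""

    def __init__(self, maxsize: int = 1000):
        self._queue = queue.Queue(maxsize=maxsize)  # bounded: enables backpressure
        self._routes = []                           # (predicate, backend) pairs
        self.dropped = 0
        self._worker = threading.Thread(target=self._drain, daemon=True)

    def add_route(self, predicate, backend):
        self._routes.append((predicate, backend))

    def start(self):
        self._worker.start()

    def emit(self, name: str, value: float, step: int):
        try:
            # Non-blocking put: under backpressure, drop instead of blocking training
            self._queue.put_nowait((name, value, step))
        except queue.Full:
            self.dropped += 1

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: shut down the worker
                break
            name, value, step = item
            for predicate, backend in self._routes:
                if predicate(name):
                    backend.log_metric(name, value, step)

    def stop(self):
        self._queue.put(None)
        self._worker.join()
```

A backend in this sketch is anything with a `log_metric(name, value, step)` method; the real plugin system adds lifecycle management and dependency resolution on top of this idea.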
Performance
Tracelet is designed for minimal overhead:
- Non-blocking metric collection using thread-safe queues
- Configurable worker threads for parallel processing
- Automatic backpressure handling to prevent memory issues
- Efficient metric batching for reduced network calls
Troubleshooting
Common Issues
Import errors for backends: Make sure you've installed the appropriate extras:
```bash
# If you see: ImportError: MLflow is not installed
pip install tracelet[mlflow]
```
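A quick way to check up front whether a backend's client library is importable (a generic Python pattern, not a Tracelet API):

```python
import importlib.util

def backend_available(module_name: str) -> bool:
    """Return True if the backend's client library can be imported."""
    return importlib.util.find_spec(module_name) is not None

# Guard before starting, e.g.:
# if not backend_available("mlflow"):
#     raise SystemExit("Run: pip install 'tracelet[mlflow]'")
```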
ClearML offline mode: For testing or CI environments without ClearML credentials:
```python
import os

os.environ["CLEARML_WEB_HOST"] = ""
os.environ["CLEARML_API_HOST"] = ""
os.environ["CLEARML_FILES_HOST"] = ""
```
High memory usage: Disable unnecessary tracking features:
```python
experiment = tracelet.start_logging(
    config={
        "track_system": False,     # Disable system metrics
        "track_git": False,        # Disable git tracking
        "metrics_interval": 30.0,  # Reduce collection frequency
    }
)
```
Roadmap
- AWS SageMaker integration
- Prometheus metrics export
- Real-time metric streaming
- Web UI for local experiments
- Distributed training support
Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Setup
```bash
# Clone the repository
git clone https://github.com/prassanna-ravishankar/tracelet.git
cd tracelet

# Install with development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
make check
```
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Acknowledgments
- Built with the excellent uv package manager
- Repository initiated with fpgmaas/cookiecutter-uv
- Thanks to all contributors and the open-source community!
File details
Details for the file tracelet-0.0.2.tar.gz.
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d5f72160ea497d0bbc046fdad3d10684a486dd720e57c706828646f3704b8d0 |
| MD5 | cb41b1056a5aa61d5ffe867448921b52 |
| BLAKE2b-256 | 784ff5b877a456ac2bf679a821580d56c8743ff56a9c7c06b491ef6b47c135a9 |
File details
Details for the file tracelet-0.0.2-py3-none-any.whl.
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | e3579546b275b62625376496659df67fc0b374c409e9d5636b39348e021cee93 |
| MD5 | 9800c32f1df750ffe9d07d9037b04502 |
| BLAKE2b-256 | 242c2f23fb6589ae0c79308f7e279d1ca97ac2bc467837774669d53c1b4df259 |