Observability framework for robotics research
😎 Goggles - Observability for Robotics Research
A lightweight, flexible Python observability framework designed for robotics research. Goggles provides structured logging, experiment tracking, performance profiling, and device-resident temporal memory management for JAX-based pipelines.
✨ Features
- 🤖 Multi-process logging on a single machine - Synchronize logs across spawned processes via a Unix-domain-socket transport; large numpy payloads travel through shared memory when they exceed the size threshold.
- 🎯 Multi-output support - Log to console, files, and remote services simultaneously.
- 📊 Experiment tracking - Native integration with Weights & Biases for metrics, images, and videos.
- 🕒 Performance profiling - @goggles.timeit decorator for automatic runtime measurement.
- 🐞 Error tracing - @goggles.trace_on_error auto-logs full stack traces on exceptions.
- 🧠 Device-resident histories - JAX-based GPU memory management for efficient metric tracking in long-running experiments.
- 🚦 Graceful shutdown - Automatic cleanup of resources and handlers.
- ⚙️ Structured configuration - YAML-based config loading with validation.
- 🔌 Extensible handlers - Plugin architecture for custom logging backends.
🏗️ Projects Built with Goggles
This framework has been battle-tested across multiple research projects:
🚀 Quick Start
Installation
# Basic installation
uv add robo-goggles # or pip install robo-goggles
# With Weights & Biases support
uv add "robo-goggles[wandb]"
# With JAX device-resident histories
uv add "robo-goggles[jax]"
For the development installation, see our How to contribute page.
[!WARNING] Socket selection: Goggles uses a Unix domain socket to route events to a single host process per machine. The first process to bind becomes the host; later processes connect as clients. Two unrelated projects sharing the same socket path will end up sharing a bus, which is not what you want. Pin a per-project path in your .env via GOGGLES_SOCKET=/tmp/goggles-<project>.sock.
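If you prefer to pin the path from Python rather than from a .env file, the warning above can be sketched as follows. The helper name socket_path_for is hypothetical (it is not part of Goggles), and setting the variable before importing goggles is an assumption about when the library reads it, so doing it as early as possible is the cautious choice.

```python
import os
import re


def socket_path_for(project: str, base_dir: str = "/tmp") -> str:
    """Build a per-project socket path (hypothetical helper, not part of Goggles)."""
    # Replace anything that is awkward in a file name with a dash.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "-", project)
    return f"{base_dir}/goggles-{safe}.sock"


# Assumption: the variable must be set before goggles is imported or first used.
os.environ["GOGGLES_SOCKET"] = socket_path_for("my project")
print(os.environ["GOGGLES_SOCKET"])  # /tmp/goggles-my-project.sock
```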
Basic usage
import goggles as gg
import logging
# Set up console logging
logger = gg.get_logger("my_experiment")
gg.attach(
    gg.ConsoleHandler(name="console", level=logging.INFO),
)
# Basic logging
logger.info("Experiment started")
logger.warning("This is a warning")
logger.error("An error occurred")
# Goggles works by default in async mode,
# to ensure all the jobs are finished use
gg.finish()
See also Example 1, which you can run after cloning the repo with
uv run examples/01_basic_run.py
Experiment tracking with W&B
import goggles as gg
import numpy as np
# Enable metrics logging. `group` and `tags` are forwarded straight to
# `wandb.init`: use `group` to keep related runs together in the W&B UI
# (see "Multiple runs in WandB" below) and `tags` to make a run easy to
# find or filter on later. Pass `tags` as a list — a bare string is
# rejected because W&B would silently iterate it character by character.
logger = gg.get_logger("experiment", with_metrics=True)
gg.attach(
    gg.WandBHandler(
        project="my_project",
        run_name="run_1",
        group="experiment_v2",
        tags=["baseline", "smoke-test"],
    ),
)
# Log metrics, images, and videos
for step in range(100):
    logger.scalar("loss", np.random.random(), step=step)
    logger.scalar("accuracy", 0.8 + 0.2 * np.random.random(), step=step)
# Log images and videos
image = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
logger.image(image, name="sample_image", step=100)
video = np.random.randint(0, 255, (30, 3, 64, 64), dtype=np.uint8)
logger.video(video, name="sample_video", fps=10, step=100)
gg.finish()
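The comment about tags in the example above is easy to check in isolation. The snippet below is only an illustration of the failure mode and of one way to guard against it; validate_tags is a hypothetical helper, not the check Goggles performs internally.

```python
def validate_tags(tags):
    """Return tags as a list of strings, rejecting a bare string (hypothetical helper)."""
    if isinstance(tags, str):
        raise TypeError(
            f"tags must be a list of strings, got the bare string {tags!r}"
        )
    return [str(tag) for tag in tags]


# Iterating a bare string yields its characters, which is the failure mode to avoid.
print(list("baseline"))  # ['b', 'a', 's', 'e', 'l', 'i', 'n', 'e']
print(validate_tags(["baseline", "smoke-test"]))  # ['baseline', 'smoke-test']
```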
Performance profiling and error tracking
import goggles as gg
import logging
class Trainer:
    @gg.timeit(severity=logging.INFO)
    def train_step(self, batch):
        # Your training logic here
        return {"loss": 0.1}

    @gg.trace_on_error()
    def risky_operation(self, data):
        # This will log full traceback on any exception
        return data / 0  # Will trigger trace logging

trainer = Trainer()
trainer.train_step({"x": [1, 2, 3]})  # Logs execution time
try:
    trainer.risky_operation(10)
except ZeroDivisionError:
    pass  # Full traceback was automatically logged
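For readers curious about what decorators of this kind do, here is a minimal sketch built only on the standard library. It is not Goggles' implementation: the logger name "sketch" and the message formats are assumptions, and the real decorators route through Goggles handlers rather than the logging module.

```python
import functools
import logging
import time
import traceback

log = logging.getLogger("sketch")  # stand-in for a Goggles logger


def timeit(severity=logging.INFO):
    """Log the wall-clock duration of each call at the given severity."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs whether the call returned or raised.
                elapsed = time.perf_counter() - start
                log.log(severity, "%s took %.6f s", fn.__qualname__, elapsed)
        return wrapper
    return decorator


def trace_on_error():
    """Log the full traceback of any exception, then re-raise it."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                log.error("Exception in %s:\n%s", fn.__qualname__, traceback.format_exc())
                raise
        return wrapper
    return decorator


@timeit()
def square(x):
    return x * x


@trace_on_error()
def divide(a, b):
    return a / b
```

Note that the exception is re-raised after logging, which is why the example above still needs its own try/except around risky_operation.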
Configuration Management
Load and validate YAML configurations:
import goggles
# Load configuration with automatic validation
config = goggles.load_configuration("config.yaml")
print(config) # Pretty print
print(config["learning_rate"]) # Access as dict
# Save the configuration to a YAML file
goggles.save_configuration(config, "output.yaml")
Supported Platforms 💻
| Platform | Basic | W&B | JAX/GPU | Development |
|---|---|---|---|---|
| Linux | ✅ | ✅ | ✅ | ✅ |
| macOS | ✅ | ✅ | ✅ | ✅ |
| Windows | ✅ | ✅ | ❌ | ✅ |
GPU support requires CUDA-compatible hardware and drivers
🔥 Examples
Explore the examples/ directory for comprehensive usage patterns:
# Basic logging setup
uv run examples/01_basic_run.py
# Advanced: Multi-scope logging
uv run examples/02_multi_scope.py
# File-based logging (local storage)
uv run examples/03_local_storage.py
# Weights & Biases integration
uv run examples/04_wandb.py
# Advanced: Weights & Biases multi-run setup
uv run examples/05_wandb_multiple_runs.py
# Advanced: Custom handler
uv run examples/06_custom_handler.py
# Graceful shutdown utils
uv run examples/100_interrupt.py
# Pretty and convenient utils for configuration loading
uv run examples/101_config.py
# Advanced: Performance decorators
uv run examples/102_decorators.py
# Advanced: JAX device-resident histories
uv run examples/103_history.py
# Filters: smoothing, outlier rejection, composition
uv run examples/104_filters.py
# Benchmark: producer-side logging latency under Hydra presets
uv run examples/105_benchmark.py
🧠 For Goggles power users
This section covers some of the more advanced features of Goggles. Enjoy!
Multi-scope logging
Goggles makes it easy to set up different handlers for different scopes: a handler can be attached to multiple scopes, and a scope can have multiple handlers. Each logger is associated with a single scope (by default: global), and logging with that logger invokes all the handlers attached to that scope.
Why?
Within the same run, we may have logs that belong to different scopes. An example is training in Reinforcement Learning, where in a single training run there are multiple episodes. A complete example for this is provided in the multiple runs in WandB section.
Usage
import goggles as gg
import logging
# In this example, we set up handlers associated
# with different scopes.
handler1 = gg.ConsoleHandler(name="examples.basic.console.1", level=logging.INFO)
gg.attach(handler1, scopes=["global", "scope1"])
handler2 = gg.ConsoleHandler(name="examples.basic.console.2", level=logging.INFO)
gg.attach(handler2, scopes=["global", "scope2"])
# We need to get separate loggers for each scope
logger_scope1 = gg.get_logger("examples.basic.scope1", scope="scope1")
logger_scope2 = gg.get_logger("examples.basic.scope2")
logger_scope2.bind(scope="scope2") # You can also bind the scope after creation
logger_global = gg.get_logger("examples.basic.global", scope="global")
# Now we can log messages to different scopes, so that only the interested
# handlers will process them.
logger_scope1.info(f"This will be logged only by {handler1.name}")
logger_scope2.info(f"This will be logged only by {handler2.name}")
logger_global.info("This will be logged by both handlers.")
# The same result can be achieved using namespaces,
# which are indicated by dot notation.
logger_namespace = gg.get_logger("examples.basic.namespace", scope="namespace")
logger_namespace.info("This will be logged by both handlers.")
gg.finish()
See also examples/02_multi_scope.py for a running example.
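The many-to-many relation between scopes and handlers can be pictured as a dictionary from scope name to a list of handlers. The toy model below is only a mental picture under that assumption; ScopeRouter is a hypothetical class and says nothing about how Goggles stores or dispatches events internally.

```python
from collections import defaultdict


class ScopeRouter:
    """Toy model of scope-based dispatch (not Goggles' actual implementation)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # scope name -> attached handlers

    def attach(self, handler, scopes=("global",)):
        # One handler may be attached to several scopes.
        for scope in scopes:
            self._handlers[scope].append(handler)

    def emit(self, scope, message):
        # Only handlers attached to this scope see the message.
        for handler in self._handlers.get(scope, ()):
            handler(message)


seen_1, seen_2 = [], []
router = ScopeRouter()
router.attach(seen_1.append, scopes=["global", "scope1"])
router.attach(seen_2.append, scopes=["global", "scope2"])

router.emit("scope1", "only handler 1")
router.emit("scope2", "only handler 2")
router.emit("global", "both handlers")

print(seen_1)  # ['only handler 1', 'both handlers']
print(seen_2)  # ['only handler 2', 'both handlers']
```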
Multiple runs in WandB
An example of the benefit of scopes is given by the WandBHandler, which instantiates a different WandB run for each scope and groups them together:
import goggles as gg
from goggles import WandBHandler
# In this example, we set up multiple runs in Weights & Biases (W&B).
# All runs created by the handler will be grouped under
# the same project and group.
logger: gg.GogglesLogger = gg.get_logger("examples.basic", with_metrics=True)
handler = WandBHandler(
    project="goggles_example", reinit="create_new", group="multiple_runs"
)
# In particular, we set up multiple runs in an RL training loop, with each
# episode being a separate W&B run and a global run tracking all episodes.
num_episodes = 3
episode_length = 10
scopes = [f"episode_{episode}" for episode in range(num_episodes + 1)]
scopes.append("global")
gg.attach(handler, scopes=scopes)
def my_episode(index: int):
    episode_logger = gg.get_logger(scope=f"episode_{index}", with_metrics=True)
    for step in range(episode_length):
        # Supports scopes transparently
        # and has its own step counter
        episode_logger.scalar("env/reward", index * episode_length + step, step=step)

for i in range(num_episodes):
    my_episode(i)
    logger.scalar("total_reward", i, step=i)
gg.finish()
Fully asynchronous logging
As in the WandB example, all the handlers work in the background. By default, the logging calls are not blocking, but can be made blocking by setting the environment variable GOGGLES_ASYNC to 0 or false. When you use the async mode, remember to call gg.finish() at the end from your host machine!
[!WARNING] This functionality still needs thorough testing, as well as better documentation. Help is appreciated! 🤗
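To see why a final gg.finish() matters in async mode, consider the generic producer/worker pattern below. It is a standard-library sketch of non-blocking logging in general, not Goggles' internals: BackgroundLogger and its methods are hypothetical names chosen for illustration.

```python
import queue
import threading


class BackgroundLogger:
    """Generic non-blocking logger: calls enqueue, a worker thread delivers."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self, sink):
        self._queue = queue.Queue()
        self._sink = sink
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, record):
        # Returns immediately; delivery happens on the worker thread.
        self._queue.put(record)

    def _drain(self):
        while True:
            record = self._queue.get()
            if record is self._STOP:
                break
            self._sink(record)

    def finish(self):
        # Without this, a daemon worker may be killed with records still queued.
        self._queue.put(self._STOP)
        self._worker.join()


delivered = []
background = BackgroundLogger(delivered.append)
for step in range(100):
    background.log(step)
background.finish()
print(len(delivered))  # 100
```

The records are guaranteed to have reached the sink only after finish() returns, which mirrors the advice to call gg.finish() at the end of a run.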
Multi-process logging (same machine)
All processes that share the same GOGGLES_SOCKET path converge on a single EventBus. The first process to bind the socket becomes the host and runs the attached handlers; later processes connect as clients and forward events to it. Cross-machine logging is not supported in the built-in transport; if you need it, add a new implementation of goggles._core.transport.Transport.
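The "first process to bind becomes the host" rule can be illustrated with plain Unix domain sockets. This is a minimal sketch of the election idea only, assuming a stream socket and a hypothetical join_bus helper; it does not reproduce Goggles' transport, and a production version would also need to handle stale socket files left behind by a crashed host.

```python
import errno
import os
import socket
import tempfile


def join_bus(path: str):
    """Return ("host", sock) if we bound the path, else ("client", sock)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    except OSError as exc:
        if exc.errno != errno.EADDRINUSE:
            sock.close()
            raise
        # Someone else already owns the path: connect to it instead.
        try:
            sock.connect(path)
        except OSError:
            sock.close()
            raise
        return "client", sock
    sock.listen()
    return "host", sock


path = os.path.join(tempfile.mkdtemp(), "goggles-demo.sock")
role_a, sock_a = join_bus(path)  # first caller binds and becomes the host
role_b, sock_b = join_bus(path)  # second caller finds the path taken
print(role_a, role_b)  # host client

sock_b.close()
sock_a.close()
os.unlink(path)
```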
Reducing GC jitter in hot loops
At high logging frequency (≥1 kHz) Python's gen-2 garbage collector can cause millisecond-scale latency spikes. After you have attached handlers and finished setup, call gg.freeze() once before your hot loop:
gg.attach(...)
gg.freeze() # promote startup objects out of the GC scan set
for step in range(steps):
    logger.scalar("loss", loss, step=step)
gg.freeze() wraps gc.freeze(); the collector still runs on churn allocated after the call, but it stops rescanning the long-lived startup state.
Why this is opt-in (not automatic).
gc.freeze() is process-global, not goggles-scoped: it promotes every currently-tracked Python object, including whatever your code has built so far, into a permanent generation that the GC will skip on subsequent collections. If goggles called it from inside attach() or get_logger(), we'd be making that decision on objects you haven't finished allocating yet. The right call site is "after your setup is done, before your hot loop starts", and only you know where that line is.
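The process-global effect is visible directly through the standard gc module. The snippet below calls gc.freeze() itself rather than gg.freeze(); whether gg.freeze() also runs a collection first is not stated here, so the explicit gc.collect() is an assumption borrowed from common practice.

```python
import gc

# Build some long-lived "startup" state, as a setup phase would.
startup_state = [{"handler": [i]} for i in range(1000)]

gc.collect()  # clear existing garbage first so it is not frozen as well
gc.freeze()   # move every currently tracked object to the permanent generation
frozen = gc.get_freeze_count()
print(frozen > 0)  # True

gc.unfreeze()  # undo: the objects go back to the oldest generation
print(gc.get_freeze_count())  # 0
```

The count covers every tracked object in the process, not just the ones created by a particular library, which is the reason the call site is left to the user.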
W&B online vs offline
The W&B upload runs on a background thread that W&B's own SDK manages; Goggles' producer thread only calls wandb.log({...}), which enqueues locally and returns quickly. Online vs offline mode therefore does not change the hot-path latency your training loop sees — only where the data ends up changes.
When to set WANDB_MODE=offline:
- Airgapped or flaky-network hosts (shared clusters, HPC compute nodes without outbound Internet access).
- Benchmarking / reproducible latency measurements — removes network jitter from the picture (see examples/105_benchmark.py).
- Untrusted environments where you don't want to stream data out during the run; you review and then sync.
- Faster startup: no auth round-trip at wandb.init.
Offline runs are written to ./wandb/offline-run-<timestamp>-<id>/ and can be uploaded later with:
wandb sync wandb/ # all offline runs
wandb sync wandb/offline-run-<id> # a specific run
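Offline mode can also be selected from inside the script, since W&B reads the WANDB_MODE environment variable. The helper below is a hypothetical convenience, not part of Goggles or W&B; the assumption is that it runs before the W&B handler is attached, so that wandb.init sees the value.

```python
import os


def use_offline_wandb(env=os.environ):
    """Switch W&B to offline mode unless a mode was already chosen (hypothetical helper)."""
    # setdefault keeps an explicit WANDB_MODE set by the user or the cluster.
    env.setdefault("WANDB_MODE", "offline")
    return env["WANDB_MODE"]


# Call early, before any W&B run is initialized.
mode = use_offline_wandb()
print(mode)
```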
Adding a custom handler
[!NOTE] Ideally, you should open a PR: We would love to integrate your work!
Adding a custom handler is straightforward:
import goggles as gg
import logging
class CustomConsoleHandler(gg.ConsoleHandler):
    """A custom console handler that adds a prefix to each log message."""

    def handle(self, event: gg.Event) -> None:
        # Use a name that does not shadow the built-in `dict`
        data = event.to_dict()
        data["payload"] = f"[CUSTOM PREFIX] {data['payload']}"
        event = gg.Event.from_dict(data)
        super().handle(event)
# Register the custom handler so it can be serialized/deserialized
gg.register_handler(CustomConsoleHandler)
# In this basic example, we set up a logger that outputs to the console.
logger = gg.get_logger("examples.custom_handler")
gg.attach(
    CustomConsoleHandler(name="examples.custom.console", level=logging.INFO),
    scopes=["global"],
)
# Because the logging level is set to INFO, the debug message will not be shown.
logger.info("Hello, world!")
logger.debug("you won't see this at INFO")
gg.finish()
See also examples/06_custom_handler.py for a complete example.
Device-resident histories
For long-running GPU experiments that need efficient temporal memory management:
Why?
During development of fluid control experiments and reinforcement learning pipelines, we needed to:
- Track detailed metrics during GPU-accelerated training
- Avoid expensive device-to-host transfers
- Maintain temporal state across episodes
- Support JIT compilation for maximum performance
Features
- Pure functional and JIT-safe buffer updates
- Per-field history lengths with episodic reset support
- Batch-first convention: (B, T, *shape) for all tensors
- Zero host-device synchronization during updates
- Integrated with FlowGym's EstimatorState for temporal RL memory
Usage
from goggles.history import HistorySpec, create_history, update_history
import jax.numpy as jnp
# Define what to track over time
spec = HistorySpec.from_config({
    "states": {"length": 100, "shape": (64, 64, 2), "dtype": jnp.float32},
    "actions": {"length": 50, "shape": (8,), "dtype": jnp.float32},
    "rewards": {"length": 100, "shape": (), "dtype": jnp.float32},
})
# Create GPU-resident history buffers
history = create_history(spec, batch_size=32)
print(history["states"].shape) # (32, 100, 64, 64, 2)
# Update buffers during training (JIT-compiled)
new_state = jnp.ones((32, 64, 64, 2))
history = update_history(history, {"states": new_state})
See also examples/103_history.py for a running example.
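To make the idea of a pure, JIT-safe buffer update concrete, here is a small JAX sketch of a rolling window over the time axis. It is not the update_history implementation: the function name push and the newest-last ordering along T are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


@jax.jit
def push(buffer, new):
    """Drop the oldest time step and append `new` as the most recent one.

    buffer has shape (B, T, *shape) and new has shape (B, *shape).
    """
    # Pure function: returns a new array instead of mutating in place,
    # so it can be traced and compiled without host synchronization.
    return jnp.concatenate([buffer[:, 1:], new[:, None]], axis=1)


buffer = jnp.zeros((4, 3, 2))            # B=4, T=3, shape=(2,)
buffer = push(buffer, jnp.ones((4, 2)))  # newest entry lands at index -1
print(buffer.shape)  # (4, 3, 2)
```

Because the output shape equals the input shape, the same compiled function can be reused at every step of a training loop.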
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for detailed information on:
- Development workflow and environment setup
- Code style requirements and automated checks
- Testing standards and coverage expectations
- PR preparation and commit message conventions
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.