Simple experiment logging library

expt_logger

Simple experiment tracking for RL training with a W&B-style API.

Quick Start

Install:

uv add expt-logger
# or
pip install expt-logger

Set your API key:

export EXPT_LOGGER_API_KEY=your_api_key

Start logging:

import expt_logger

# Initialize run with config
expt_logger.init(
    name="grpo-math",
    config={"lr": 3e-6, "batch_size": 8}
)

# Get experiment URLs
print(f"View experiment: {expt_logger.experiment_url()}")
print(f"Base URL: {expt_logger.base_url()}")

# Log scalar metrics
expt_logger.log({
    "train/loss": 0.45,
    "train/kl": 0.02,
    "train/reward": 0.85
}, commit=False)
# Not committing means the step count will not increase
# and the logs will be buffered

# Log RL rollouts with rewards
expt_logger.log_rollout(
    prompt="What is 2+2?",
    messages=[{"role": "assistant", "content": "The answer is 4."}],
    rewards={"correctness": 1.0, "format": 0.9},
    mode="train",
    commit=True
)
# When commit is True (the default),
# this log and all buffered logs will be pushed
# and the step count will be incremented

expt_logger.end()

Core Features

Scalar Metrics

Log training metrics with automatic step tracking:

# Batch multiple metrics at the same step
expt_logger.log({"loss": 0.5}, commit=False)
expt_logger.log({"accuracy": 0.9}, commit=False)
expt_logger.commit()  # Commit both at step 1, then increment to step 2

# Or commit immediately
expt_logger.log({"loss": 0.4})  # Commit at step 2, increment to 3

# Use slash prefixes for train/eval modes
expt_logger.log({
    "train/loss": 0.5,
    "eval/loss": 0.6
}, step=10)

# Or set mode explicitly
expt_logger.log({"loss": 0.5}, mode="eval")

Note: Metrics default to "train" mode when no mode is specified and keys don't have slash prefixes.
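The mode rules above can be sketched in a few lines. This is an illustration only, not the library's implementation: `resolve_mode` is a hypothetical helper, and both the restriction to the "train" and "eval" prefixes and the stripping of the prefix from the metric name are assumptions.

```python
from typing import Optional, Tuple


def resolve_mode(key: str, mode: Optional[str] = None) -> Tuple[str, str]:
    """Return (mode, metric_name) for one metric key.

    Assumed precedence: a "train/" or "eval/" prefix on the key wins,
    then the explicit mode argument, then the "train" default.
    """
    prefix, sep, rest = key.partition("/")
    if sep and rest and prefix in ("train", "eval"):
        return prefix, rest
    return (mode or "train"), key


print(resolve_mode("eval/loss"))          # prefix decides the mode
print(resolve_mode("loss", mode="eval"))  # explicit mode applies to unprefixed keys
print(resolve_mode("loss"))               # falls back to "train"
```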

Rollouts (RL-specific)

Log conversation rollouts with multiple reward functions:

# Batch multiple rollouts at the same step
expt_logger.log_rollout(
    prompt="Solve: x^2 - 5x + 6 = 0",
    messages=[
        {"role": "assistant", "content": "Let me factor this..."},
        {"role": "user", "content": "Can you verify?"},
        {"role": "assistant", "content": "Sure! (x-2)(x-3) = 0..."}
    ],
    rewards={
        "correctness": 1.0,
        "format": 0.9,
        "helpfulness": 0.85
    },
    mode="train",
    commit=False
)

expt_logger.log_rollout(
    prompt="Another problem...",
    messages=[{"role": "assistant", "content": "Solution..."}],
    rewards={"correctness": 0.8},
    mode="train"
)
# commit defaults to True, so this call pushes both rollouts at the same step

# Or commit immediately
expt_logger.log_rollout(
    prompt="Yet another...",
    messages=[{"role": "assistant", "content": "Answer..."}],
    rewards={"correctness": 1.0},
    step=5,
    mode="train"
)

Flexible Prompt Format:

The prompt parameter accepts either a string or a dict with a 'content' key:

# String format (simple)
expt_logger.log_rollout(
    prompt="What is 2+2?",
    messages=[{"role": "assistant", "content": "4"}],
    rewards={"correctness": 1.0}
)

# Dict format (when prompt is part of a structured object)
expt_logger.log_rollout(
    prompt={"role": "user", "content": "What is 2+2?"},  # extracts 'content'
    messages=[{"role": "assistant", "content": "4"}],
    rewards={"correctness": 1.0}
)

  • Messages format: List of dicts with "role" and "content" keys (both values must be strings)
  • Rewards format: Dict of reward names to numeric values (no NaN or Infinity)
  • Mode: "train" or "eval" (default: "train")
  • Commit: True (default) to commit immediately, False to batch
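As a rough sketch of how the two prompt formats could be reduced to a single string, consider the helper below. `normalize_prompt` is a hypothetical name used for illustration, and raising `ValueError` is an assumption; the library itself reports invalid input through its own `ValidationError`.

```python
from typing import Union


def normalize_prompt(prompt: Union[str, dict]) -> str:
    """Return the prompt text from either accepted format."""
    if isinstance(prompt, str):
        return prompt
    # Dict format: only the 'content' value is used; other keys such as 'role' are ignored.
    if isinstance(prompt, dict) and isinstance(prompt.get("content"), str):
        return prompt["content"]
    raise ValueError("prompt must be a str or a dict with a string 'content' value")


print(normalize_prompt("What is 2+2?"))
print(normalize_prompt({"role": "user", "content": "What is 2+2?"}))
```

Both calls yield the same text, which is why the two formats are interchangeable in `log_rollout`.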

Configuration

Track hyperparameters and update them dynamically:

expt_logger.init(config={"lr": 0.001, "batch_size": 32})

# Update config during training
config = expt_logger.config()
config.lr = 0.0005              # attribute style
config["epochs"] = 100          # dict style
config.update({"model": "gpt2"}) # bulk update
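One common way to support both attribute-style and dict-style access is a small `dict` subclass. The sketch below shows that pattern under the assumption that the config object behaves like a mutable mapping; `AttrDict` is a hypothetical name, not a class exported by expt_logger.

```python
class AttrDict(dict):
    """Dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


config = AttrDict({"lr": 0.001, "batch_size": 32})
config.lr = 0.0005                 # attribute style
config["epochs"] = 100             # dict style
config.update({"model": "gpt2"})   # bulk update
print(dict(config))
```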

API Key & Server Configuration

API Key (required):

export EXPT_LOGGER_API_KEY=your_api_key

Or pass directly:

expt_logger.init(api_key="your_key")

Custom server URL (optional, for self-hosting):

export EXPT_LOGGER_BASE_URL=https://your-server.com

Or:

expt_logger.init(base_url="https://your-server.com")
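A minimal sketch of how a setting might be resolved from these two sources. The precedence shown here (explicit argument first, then environment variable, then a default) is an assumption, as is treating https://app.cgft.io as the default server; that URL appears only in this README's example output. `resolve_setting` is a hypothetical helper.

```python
import os
from typing import Optional

# Assumed default; taken from the example output in this README.
DEFAULT_BASE_URL = "https://app.cgft.io"


def resolve_setting(
    explicit: Optional[str],
    env_var: str,
    default: Optional[str] = None,
) -> Optional[str]:
    """Pick the explicit argument, else the environment variable, else the default."""
    if explicit:
        return explicit
    return os.environ.get(env_var) or default


api_key = resolve_setting(None, "EXPT_LOGGER_API_KEY")
base_url = resolve_setting(None, "EXPT_LOGGER_BASE_URL", DEFAULT_BASE_URL)
if api_key is None:
    print("No API key found: set EXPT_LOGGER_API_KEY or pass api_key to init()")
print(f"Using server: {base_url}")
```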

Accessing Experiment URLs

Get the experiment URL and base URL:

expt_logger.init(name="my-experiment")

# Get the full experiment URL to view in browser
print(expt_logger.experiment_url())
# https://app.cgft.io/experiments/ccf1f879-50a6-492b-9072-fed6effac731

# Get the base URL of the tracking server
print(expt_logger.base_url())
# https://app.cgft.io

API Reference

expt_logger.init()

init(
    name: str | None = None,
    config: dict[str, Any] | None = None,
    api_key: str | None = None,
    base_url: str | None = None
) -> Run
  • name: Experiment name (auto-generated if not provided)
  • config: Initial hyperparameters
  • api_key: API key (or set EXPT_LOGGER_API_KEY)
  • base_url: Custom server URL (or set EXPT_LOGGER_BASE_URL)

expt_logger.log()

log(
    metrics: dict[str, float],
    step: int | None = None,
    mode: str | None = None,
    commit: bool = True
)
  • metrics: Dict of metric names to values
  • step: Step number (auto-increments if not provided)
  • mode: Default mode for keys without slashes (default: "train")
  • commit: If True (default), commit immediately and increment step. If False, buffer metrics until commit.

expt_logger.log_rollout()

log_rollout(
    prompt: str | dict[str, str],
    messages: list[dict[str, str]],
    rewards: dict[str, float],
    step: int | None = None,
    mode: str = "train",
    commit: bool = True
)
  • prompt: The prompt text (str) or dict with 'content' key (content will be extracted)
  • messages: List of {"role": ..., "content": ...} dicts (both must be strings)
  • rewards: Dict of reward names to numeric values (must be valid numbers, not NaN/Inf)
  • step: Step number (must be non-negative integer if provided)
  • mode: "train" or "eval" (must be non-empty string)
  • commit: If True (default), commit immediately and increment step. If False, buffer the rollout until the next commit.

Input Validation:

  • All parameters are strictly validated
  • Invalid inputs raise ValidationError with descriptive error messages
  • Metric and reward values must be numeric (int/float) and cannot be NaN or Infinity

expt_logger.commit()

commit()

Commit all pending metrics and rollouts, then increment the step counter.
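The interaction between `commit=False`, `commit()`, and the step counter can be modelled with a small in-memory stand-in. This is a sketch of the documented behaviour, not the library's code: `StepBuffer` is a hypothetical class, and starting the counter at 1 follows the comments in the Scalar Metrics example.

```python
class StepBuffer:
    """In-memory model of buffered logging with a step counter."""

    def __init__(self):
        self.step = 1        # assumed starting step
        self.pending = {}    # metrics waiting for the next commit
        self.committed = []  # (step, metrics) pairs in commit order

    def log(self, metrics, commit=True):
        self.pending.update(metrics)
        if commit:
            self.commit()

    def commit(self):
        # Everything buffered so far is recorded at the current step,
        # then the counter moves on.
        self.committed.append((self.step, dict(self.pending)))
        self.pending.clear()
        self.step += 1


buf = StepBuffer()
buf.log({"loss": 0.5}, commit=False)
buf.log({"accuracy": 0.9}, commit=False)
buf.commit()             # both metrics are recorded at step 1
buf.log({"loss": 0.4})   # recorded at step 2
print(buf.committed)
print(buf.step)
```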

expt_logger.end()

end()

Finish the run and clean up resources.

Graceful Shutdown

The library handles cleanup on:

  • Normal exit (atexit)
  • Ctrl+C (SIGINT)
  • SIGTERM

All buffered data is flushed before exit.
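For readers curious how this kind of cleanup is typically wired up, here is a generic sketch using the standard `atexit` and `signal` modules. It is not taken from the library's source; `ShutdownFlusher` and the `flush` callback are hypothetical, and exiting with status 128 + signal number is a shell convention assumed here.

```python
import atexit
import signal
import sys
import threading


class ShutdownFlusher:
    """Runs a flush callback at most once, whichever exit path fires first."""

    def __init__(self, flush):
        self._flush = flush
        self._done = False

    def run(self):
        if not self._done:
            self._done = True
            self._flush()

    def _on_signal(self, signum, frame):
        self.run()
        sys.exit(128 + signum)

    def install(self):
        atexit.register(self.run)                       # normal exit
        signal.signal(signal.SIGINT, self._on_signal)   # Ctrl+C
        signal.signal(signal.SIGTERM, self._on_signal)  # termination request


buffered = [{"loss": 0.5}]
sent = []


def flush():
    sent.extend(buffered)
    buffered.clear()


flusher = ShutdownFlusher(flush)
# signal.signal() may only be called from the main thread.
if threading.current_thread() is threading.main_thread():
    flusher.install()

flusher.run()
flusher.run()  # second call is a no-op, so the atexit hook cannot flush twice
print(sent)
```

The once-only guard matters because a signal handler that calls `sys.exit()` also triggers the `atexit` hook.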

Input Validation

The library performs strict input validation to catch errors early and provide clear error messages:

Validated Inputs

For log():

  • Metrics dict keys must be non-empty strings
  • Metrics dict values must be numeric (int/float), not NaN or Infinity
  • Step must be non-negative integer (if provided)
  • Mode must be non-empty string (if provided)

For log_rollout():

  • Prompt can be str or dict (if dict, must have 'content' key with string value)
  • Messages must be list of dicts, each with 'role' and 'content' string keys
  • Rewards dict keys must be non-empty strings
  • Rewards dict values must be numeric (int/float), not NaN or Infinity
  • Step must be non-negative integer (if provided)
  • Mode must be non-empty string (if provided)
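The rules listed above translate fairly directly into checks like the following. This is a sketch under stated assumptions: `check_values` and `check_messages` are hypothetical helpers, the `ValidationError` class here is a local stand-in for the one exported by expt_logger, the error wording is illustrative, and rejecting `bool` values is an extra assumption.

```python
import math


class ValidationError(ValueError):
    """Local stand-in; the library exports its own ValidationError."""


def check_values(values, kind):
    """Validate a metrics or rewards dict: non-empty string keys, finite numbers."""
    if not isinstance(values, dict):
        raise ValidationError(f"{kind}s must be a dict")
    for key, value in values.items():
        if not isinstance(key, str) or not key:
            raise ValidationError(f"{kind} names must be non-empty strings, got {key!r}")
        # bool is a subclass of int; rejecting it is an assumption of this sketch.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValidationError(f"{kind} {key!r} must be int or float")
        if isinstance(value, float) and not math.isfinite(value):
            raise ValidationError(f"{kind} {key!r} must not be NaN or Infinity")


def check_messages(messages):
    """Validate a rollout's messages: a list of dicts with string 'role' and 'content'."""
    if not isinstance(messages, list):
        raise ValidationError("messages must be a list of dicts")
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            raise ValidationError(f"Message at index {i} must be a dict")
        for key in ("role", "content"):
            if key not in msg:
                raise ValidationError(f"Message at index {i} is missing required key '{key}'")
            if not isinstance(msg[key], str):
                raise ValidationError(f"Message at index {i} has a non-string '{key}'")


check_values({"train/loss": 0.45, "train/kl": 0.02}, "Metric")  # passes silently
try:
    check_values({"loss": math.nan}, "Metric")
except ValidationError as e:
    print(f"Validation failed: {e}")
```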

Error Handling

Invalid inputs raise ValidationError with specific, actionable error messages:

import math

import expt_logger
from expt_logger import ValidationError

try:
    expt_logger.log({"loss": math.nan})  # Invalid: NaN
except ValidationError as e:
    print(f"Validation failed: {e}")
    # Output: Validation failed: Metric 'loss' has invalid value: nan (NaN is not allowed)

try:
    expt_logger.log_rollout(
        prompt="Test",
        messages=[{"role": "assistant"}],  # Invalid: missing 'content'
        rewards={"score": 1.0}
    )
except ValidationError as e:
    print(f"Validation failed: {e}")
    # Output: Validation failed: Message at index 0 is missing required key 'content'

Development

For local development, see DEVELOPMENT.md.
