Library for analyzing AI agent trajectories: extract actions, summarize, embed, and visualize.

Hodoscope

Analyze AI agent trajectories: extract actions, summarize them with LLMs, embed the summaries, and create interactive visualizations to identify behavioral patterns.

Installation

pip install hodoscope

For development (editable install with tests):

pip install -e ".[dev]"

Configuration

Create a .env file in the project root:

OPENAI_API_KEY=your-openai-key
GEMINI_API_KEY=your-gemini-key

# Optional: override defaults
# SUMMARIZE_MODEL=openai/gpt-5.2
# EMBEDDING_MODEL=gemini/gemini-embedding-001
# EMBED_DIM=768
# MAX_WORKERS=10
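
As a rough illustration of how the optional overrides could be resolved, the sketch below reads the four variables from a mapping and falls back to the defaults listed above. `Settings`, `load_settings`, and `missing_api_keys` are hypothetical names used only for this example; they are not part of hodoscope's API, and the library's own config loading may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults as documented in the .env example above.
DEFAULTS = {
    "SUMMARIZE_MODEL": "openai/gpt-5.2",
    "EMBEDDING_MODEL": "gemini/gemini-embedding-001",
    "EMBED_DIM": "768",
    "MAX_WORKERS": "10",
}


@dataclass(frozen=True)
class Settings:  # hypothetical container, not a hodoscope class
    summarize_model: str
    embedding_model: str
    embed_dim: int
    max_workers: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve each setting from `env`, falling back to the documented default."""
    def get(key: str) -> str:
        # Treat unset and empty values the same way.
        return env.get(key) or DEFAULTS[key]

    return Settings(
        summarize_model=get("SUMMARIZE_MODEL"),
        embedding_model=get("EMBEDDING_MODEL"),
        embed_dim=int(get("EMBED_DIM")),
        max_workers=int(get("MAX_WORKERS")),
    )


def missing_api_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of the API keys from the .env example that are not set."""
    return [k for k in ("OPENAI_API_KEY", "GEMINI_API_KEY") if not env.get(k)]
```

Passing a plain dict instead of `os.environ` makes the helpers easy to exercise without touching the real environment.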

Quick Start

CLI

# Analyze a single .eval file
hodoscope analyze run.eval

# Analyze all .eval files in a directory
hodoscope analyze evals/

# Visualize results
hodoscope viz run.hodoscope.json

# Compare models
hodoscope viz model_a.hodoscope.json model_b.hodoscope.json --group-by model

# Show metadata
hodoscope info run.hodoscope.json

Python API

import hodoscope

# Load trajectories from .eval file
trajectories, fields = hodoscope.load_eval("run.eval", limit=5, sample=True)

# Or from a directory of trajectory JSONs
trajectories, fields = hodoscope.load_trajectory_dir("path/to/samples/")

# Summarize + embed (requires API keys)
summaries = hodoscope.process_trajectories(trajectories, summarize_model="openai/gpt-4o")

# Extract actions only (no LLM calls)
actions = hodoscope.extract_actions(trajectories[0]["messages"])

# Group and visualize in-memory summaries
grouped = hodoscope.group_summaries_from_list(summaries, group_by="score")
hodoscope.visualize_action_summaries(grouped, "plots/", methods=["tsne"])

# Or save to disk and use the file-based workflow
hodoscope.write_analysis_json("output.hodoscope.json", summaries, fields, source="run.eval")

CLI Reference

`hodoscope analyze`

Process source files (.eval, directories, Docent collections) into .hodoscope.json analysis files.

hodoscope analyze SOURCES [OPTIONS]

Options:
  --docent-id TEXT                Docent collection ID as source
  -o, --output TEXT               Output JSON path (single source only)
  --field TEXT                    KEY=VALUE metadata (repeatable)
  -l, --limit INTEGER             Limit trajectories per source
  --save-samples PATH             Save extracted trajectory JSONs to directory
  --embed-dim INTEGER             Embedding dimensionality (default: 768)
  -m, --model-name TEXT           Override auto-detected model name
  --summarize-model TEXT          LiteLLM model for summarization (default: openai/gpt-5.2)
  --embedding-model TEXT          LiteLLM model for embeddings (default: gemini/gemini-embedding-001)
  --sample / --no-sample          Randomly sample trajectories (use with --limit)
  --seed INTEGER                  Random seed for --sample reproducibility
  --resume / --no-resume          Resume from existing output (default: on)
  --reasoning-effort [low|medium|high]  Reasoning effort for summarization model
  --max-workers INTEGER           Max parallel workers for LLM calls (default: 10)

Examples:

hodoscope analyze run.eval                             # .eval → analysis JSON
hodoscope analyze *.eval                               # batch: all .eval files
hodoscope analyze evals/                               # batch: dir of .eval files
hodoscope analyze run.eval -o my_output.json           # custom output path
hodoscope analyze run.eval --field env=prod            # add custom metadata
hodoscope analyze run.eval --save-samples ./samples/   # save extracted trajectories
hodoscope analyze --docent-id COLLECTION_ID            # docent source
hodoscope analyze path/to/samples/                     # directory of trajectory JSONs
hodoscope analyze run.eval --summarize-model gemini/gemini-2.0-flash
hodoscope analyze run.eval --limit 5 --sample --seed 42
hodoscope analyze run.eval --no-resume                 # overwrite existing output
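
To make the interaction of `--limit`, `--sample`, and `--seed` concrete, here is a minimal sketch of seeded sampling in plain Python. `pick_trajectories` is a hypothetical function, and the behavior without `--sample` (keep the first N items) is an assumption; hodoscope's actual selection logic may differ.

```python
import random
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def pick_trajectories(
    items: Sequence[T],
    limit: Optional[int] = None,
    sample: bool = False,
    seed: Optional[int] = None,
) -> list[T]:
    """Select at most `limit` items, optionally at random with a fixed seed."""
    if limit is None or limit >= len(items):
        return list(items)
    if not sample:
        # Assumption: without sampling, keep the first `limit` items.
        return list(items[:limit])
    # A dedicated Random instance leaves the global RNG state untouched,
    # and the same seed always yields the same selection.
    rng = random.Random(seed)
    return rng.sample(list(items), limit)
```

Calling it twice with the same `seed` returns the same subset, which is the property that makes a sampled run reproducible.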

`hodoscope viz`

Visualize analysis JSON files with interactive plots. Groups summaries by any metadata field.

hodoscope viz SOURCES [OPTIONS]

Options:
  --group-by TEXT     Field to group by (default: model)
  --plots TEXT        Plot types: pca, tsne, umap, trimap, pacmap, dynamic, density
  --output-dir TEXT   Directory for HTML output files

Examples:

hodoscope viz output.json                              # visualize (groups by model)
hodoscope viz output.json --group-by task              # group by task
hodoscope viz output.json --group-by score             # group by score field
hodoscope viz *.json                                   # batch: all JSONs
hodoscope viz a.json b.json --group-by model           # cross-file comparison
hodoscope viz output.json --plots tsne umap            # specific plot types

`hodoscope info`

Show metadata, summary counts, and API key status for analysis JSON files.

hodoscope info output.json
hodoscope info results/

Python API Reference

The library exposes composable building blocks as first-class public functions. The CLI is a thin wrapper on top.

Loading trajectories

import hodoscope

# From .eval file (Inspect AI format)
trajectories, fields = hodoscope.load_eval("run.eval", limit=10, sample=True, seed=42)

# From directory of trajectory JSONs
trajectories, fields = hodoscope.load_trajectory_dir("path/to/samples/")

# From Docent collection
trajectories, fields = hodoscope.load_docent("COLLECTION_ID")

All loaders return `(trajectories, fields)`, where `trajectories` is a list of trajectory dicts (each with a `messages` key) and `fields` is auto-detected file-level metadata. For `.eval` files, `fields` includes `model`, `task`, `dataset_name`, `solver`, `run_id`, `accuracy`, and more.

Processing

# Full pipeline: extract actions → summarize with LLM → embed (requires API keys)
summaries = hodoscope.process_trajectories(
    trajectories,
    summarize_model="openai/gpt-4o",       # optional, defaults from env/config
    embedding_model="gemini/gemini-embedding-001",
    embed_dim=768,
)

# Extract actions only (no LLM calls, pure data transform)
actions = hodoscope.extract_actions(trajectories[0]["messages"])

Grouping and visualization

# Group in-memory summaries by any metadata field
grouped = hodoscope.group_summaries_from_list(summaries, group_by="score")

# Or group from saved analysis files
doc = hodoscope.read_analysis_json("output.hodoscope.json")
grouped = hodoscope.group_summaries([doc], group_by="model")

# Visualize
hodoscope.visualize_action_summaries(grouped, "plots/", methods=["tsne", "pca"])
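
The lookup order used when grouping (per-summary `metadata` first, then file-level `fields`) can be sketched in a few lines. `resolve_group_key` and `group_by_key` are hypothetical helpers written for this illustration. They mirror the documented resolution order but are not hodoscope's implementation, and the `"unknown"` fallback is an assumption.

```python
from collections import defaultdict
from typing import Any


def resolve_group_key(summary: dict, fields: dict, group_by: str) -> Any:
    """Look up `group_by` in per-summary metadata, then in file-level fields."""
    metadata = summary.get("metadata", {})
    if group_by in metadata:
        return metadata[group_by]
    # Assumption: fall back to a placeholder when the key is absent everywhere.
    return fields.get(group_by, "unknown")


def group_by_key(doc: dict, group_by: str) -> dict:
    """Bucket the summaries of one analysis document by the resolved key."""
    groups = defaultdict(list)
    for summary in doc.get("summaries", []):
        key = resolve_group_key(summary, doc.get("fields", {}), group_by)
        groups[key].append(summary)
    return dict(groups)
```

Because metadata wins over fields, a key such as `score` that exists at both levels is grouped by its per-summary value.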

Saving results

hodoscope.write_analysis_json(
    "output.hodoscope.json",
    summaries=summaries,
    fields=fields,
    source="run.eval",
    embedding_model="gemini/gemini-embedding-001",
    embedding_dimensionality=768,
)

Output Format

Each `hodoscope analyze` run produces a `.hodoscope.json` file:

{
  "version": 1,
  "created_at": "2026-02-10T12:00:00Z",
  "source": "path/to/run.eval",
  "fields": {
    "model": "gpt-5",
    "task": "swe_bench",
    "dataset_name": "swe_bench_verified",
    "solver": "system_message, generate",
    "accuracy": 0.8
  },
  "embedding_model": "gemini-embedding-001",
  "embedding_dimensionality": 3072,
  "summaries": [
    {
      "trajectory_id": "django__django-12345_epoch_1",
      "turn_id": 3,
      "summary": "Update assertion to match expected output",
      "action_text": "...",
      "embedding": "<base85-encoded float32 array>",
      "metadata": {
        "score": 1.0,
        "epoch": 1,
        "instance_id": "django__django-12345",
        "target": "expected output",
        "input_tokens": 620,
        "output_tokens": 20,
        "total_tokens": 640
      }
    }
  ]
}

Key concepts:

`fields`: File-level metadata auto-detected from the `.eval` header (`model`, `task`, `dataset_name`, `solver`, `run_id`, `accuracy`, etc.) plus custom `--field` values. Same for all summaries.
`metadata`: Per-trajectory metadata. All `sample.metadata` keys from `.eval` files are passed through, plus extracted keys (`score`, `epoch`, `target`, token usage, etc.). Varies per summary.
`--group-by` resolution: Checks per-summary `metadata` first, then file-level `fields`.
`embedding`: RFC 1924 base85-encoded float32 numpy array.
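
The `embedding` string can be turned back into floats with the standard library alone. This is a sketch under assumptions: it relies on Python's `base64.b85decode`, which to my understanding uses the same 85-character alphabet as RFC 1924, and it assumes the float32 values are stored little-endian, since the byte order is not stated here. `encode_embedding` and `decode_embedding` are illustrative helper names, not hodoscope functions.

```python
import base64
import struct


def encode_embedding(values: list[float]) -> str:
    """Pack floats as little-endian float32 and base85-encode them."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b85encode(raw).decode("ascii")


def decode_embedding(text: str) -> list[float]:
    """Inverse of encode_embedding: base85 string -> list of floats."""
    raw = base64.b85decode(text)
    if len(raw) % 4 != 0:
        raise ValueError("decoded payload is not a whole number of float32 values")
    # Assumption: little-endian byte order ("<"); the format does not say.
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```

With NumPy available, `numpy.frombuffer(base64.b85decode(text), dtype="<f4")` should yield the same values as an array, under the same byte-order assumption.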

Universal Trajectory Format

All trajectory sources are normalized to a canonical JSON schema before processing:

{
  "id": "unique-trajectory-id",
  "source": "eval",
  "model": "gpt-5",
  "input": "Task description...",
  "messages": [{"role": "user", "content": "..."}],
  "metadata": {
    "epoch": 1,
    "score": 1.0,
    "instance_id": "django__django-12345",
    "target": "expected output",
    "input_tokens": 620,
    "output_tokens": 20,
    "total_tokens": 640,
    "response_time": 1.23,
    "label_confidence": 0.89
  }
}
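
For callers building trajectory JSONs by hand, a light structural check against the schema above can catch mistakes early. The `check_trajectory` helper below is hypothetical. It assumes that `id` and `messages` are the required keys and that each message carries `role` and `content`; the schema is shown here only by example, so the real requirements may be looser or stricter.

```python
import json


def check_trajectory(traj: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks fine."""
    problems = []
    # Assumption: these two keys are mandatory.
    if not isinstance(traj.get("id"), str):
        problems.append("'id' must be a string")
    messages = traj.get("messages")
    if not isinstance(messages, list):
        problems.append("'messages' must be a list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append(f"messages[{i}] needs 'role' and 'content'")
    if not isinstance(traj.get("metadata", {}), dict):
        problems.append("'metadata' must be an object when present")
    return problems


# A trimmed copy of the example trajectory shown above.
example = json.loads("""
{
  "id": "unique-trajectory-id",
  "source": "eval",
  "model": "gpt-5",
  "input": "Task description...",
  "messages": [{"role": "user", "content": "..."}],
  "metadata": {"epoch": 1, "score": 1.0}
}
""")
```

Running `check_trajectory(example)` returns an empty list, while a dict with a malformed `messages` entry returns one human-readable problem per issue.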

Testing

# Unit tests (no API keys needed)
pytest tests/test_io.py tests/test_viz.py tests/test_api.py

# End-to-end tests (requires API keys)
pytest tests/test_analyze.py

Release history

0.2.4: Feb 20, 2026
0.2.3: Feb 20, 2026
0.2.2: Feb 20, 2026
0.2.1: Feb 20, 2026
0.2.0: Feb 19, 2026 (this version)

Download files

Source Distribution

hodoscope-0.2.0.tar.gz (64.7 kB)

Uploaded Feb 19, 2026 Source

Built Distribution

hodoscope-0.2.0-py3-none-any.whl (60.2 kB)

Uploaded Feb 19, 2026 Python 3

File details

Details for the file hodoscope-0.2.0.tar.gz.

File metadata

Download URL: hodoscope-0.2.0.tar.gz
Upload date: Feb 19, 2026
Size: 64.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for hodoscope-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`60ecc42bead5c04f13ff9dc0690cfb8454c9377ad79040aeea6a6294716395af`
MD5	`d5e23c080e2ec544bfbb85a0d8d2cdd0`
BLAKE2b-256	`96df2f14a7b8be0c2dd670af85f0693fc0260a191fef675e424cd41323d7b756`

File details

Details for the file hodoscope-0.2.0-py3-none-any.whl.

File metadata

Download URL: hodoscope-0.2.0-py3-none-any.whl
Upload date: Feb 19, 2026
Size: 60.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for hodoscope-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b0dc318c1e9e0b52d4180ab7b71361cfe07723034a685894f9100a0ba7af4b3c`
MD5	`18c3e675383dd8caceeb302107c04ec8`
BLAKE2b-256	`3f44ae1da8a699e302b3e75ed693f91db5ae92922a3d42fcb019233bc571f84d`
