
Library for analyzing AI agent trajectories — extract actions, summarize, embed, and visualize.


Hodoscope


Unsupervised, human-in-the-loop trajectory analysis for AI agents. Summarize, embed, and visualize thousands of agent actions to find patterns across models and configurations. Supports common evaluation formats and any LiteLLM-compatible model for summarization and embedding.

Homepage · Announcement blog

Why Hodoscope?

Running evals across multiple models and configurations produces a mountain of raw logs, but reading them one-by-one doesn't scale. Hodoscope gives you a bird's-eye view: it extracts every agent action from your eval trajectories, summarizes each one with an LLM, embeds the summaries into a shared vector space, and then projects them into interactive 2D plots. The result is a visual map where you can spot behavioral clusters, group by any metadata field, and use density overlays to see exactly where two groups of trajectories diverge or converge. No labels or pre-defined taxonomies required.

Features

  • Multiple supported formats -- Inspect AI .eval files, OpenHands JSONL trajectories, Docent collections, and raw trajectory JSONs
  • Summarization & embedding -- distill raw agent actions into concise natural-language summaries and embed them via any LLM supported by LiteLLM
  • Dimensionality reduction -- project embedded summaries into interactive 2D scatter plots with t-SNE (recommended), PCA, UMAP, TriMap, or PaCMAP
  • Density diffing and overlay -- overlay difference in kernel density estimates to visualize where trajectory distributions differ
  • Flexible grouping -- group summaries by any metadata field (--group-by model, --group-by score, --group-by task, etc.) to compare
  • Resumable processing -- interrupt and resume long analysis runs with --resume; already-processed trajectories are skipped
  • Python API -- every CLI command maps to a public function you can call directly in notebooks or scripts
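As background for the dimensionality-reduction step, here is a generic sketch of projecting high-dimensional embeddings to 2D with t-SNE via scikit-learn. This is not Hodoscope's internal code; the random vectors stand in for embedded summaries, and the parameter choices are illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for embedded summaries: 200 vectors of dimension 768.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 768))

# Project to 2D; perplexity must be smaller than the number of samples.
proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)
print(proj.shape)  # (200, 2)
```

Each row of `proj` is one point in the 2D scatter plot, so points whose summaries are semantically similar tend to land near each other.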

How It Works

source file ─→ actions ─→ summarize ─→ embed ─→ distribution diffing ─→ visualize
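The "distribution diffing" step above can be sketched generically: estimate a kernel density for each group of projected points on a shared grid and subtract. This is a minimal SciPy-based illustration, not Hodoscope's implementation; function and variable names are made up here.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_diff(points_a, points_b, grid_size=100):
    """Evaluate KDE(A) - KDE(B) on a shared 2D grid. Positive regions
    are where group A is denser, negative where group B is denser."""
    all_pts = np.vstack([points_a, points_b])
    xmin, ymin = all_pts.min(axis=0)
    xmax, ymax = all_pts.max(axis=0)
    xx, yy = np.meshgrid(np.linspace(xmin, xmax, grid_size),
                         np.linspace(ymin, ymax, grid_size))
    grid = np.vstack([xx.ravel(), yy.ravel()])
    # gaussian_kde expects data with shape (n_dims, n_points).
    kde_a = gaussian_kde(np.asarray(points_a).T)
    kde_b = gaussian_kde(np.asarray(points_b).T)
    diff = (kde_a(grid) - kde_b(grid)).reshape(grid_size, grid_size)
    return xx, yy, diff
```

Rendering `diff` as a signed heatmap under the scatter points gives the kind of overlay described above: hotspots show exactly where two trajectory distributions diverge.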


Prerequisites

  • Python 3.11+
  • By default: an OpenAI API key (for summarization) and a Gemini API key (for embedding)
  • Other LLM API keys also work, for example a single OpenRouter key (see Configuration)

Installation

pip install hodoscope

For development (editable install with tests):

pip install -e ".[dev]"

Configuration

Create a .env file in the project root. Hodoscope loads it automatically at startup.

OPENAI_API_KEY=your-openai-key
GEMINI_API_KEY=your-gemini-key

# Optional: override defaults
# ⚠️ Default summarization model (gpt-5.2) could be expensive!
# SUMMARIZE_MODEL=openai/gpt-5.2
# EMBEDDING_MODEL=gemini/gemini-embedding-001
# MAX_WORKERS=10

You can also export these variables directly in your shell instead of using a .env file.
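For example, in a bash-compatible shell:

```shell
export OPENAI_API_KEY=your-openai-key
export GEMINI_API_KEY=your-gemini-key
```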

Using OpenRouter (single API key): If you prefer to use an OpenRouter key for both summarization and embedding, set OPENROUTER_API_KEY and prefix your model names with openrouter/:

OPENROUTER_API_KEY=your-openrouter-key
SUMMARIZE_MODEL=openrouter/openai/gpt-5.2
EMBEDDING_MODEL=openrouter/gemini/gemini-embedding-001

Quick Start

# Analyze a single .eval file
hodoscope analyze run.eval

# Analyze all trajectory files in a directory
hodoscope analyze evals/

# Compare models
hodoscope viz model_*.hodoscope.json --group-by model --open

# Visualize a single result
hodoscope viz run.hodoscope.json --open

CLI Reference

hodoscope analyze

Process source files (.eval, directories, Docent collections) into .hodoscope.json analysis files.

hodoscope analyze SOURCES [OPTIONS]

Options:
  --docent-id TEXT          Docent collection ID as source
  -o, --output TEXT         Output JSON path (single source only)
  --field TEXT              KEY=VALUE metadata (repeatable)
  -l, --limit INTEGER       Limit trajectories per source
  --save-samples PATH       Save extracted trajectory JSONs to directory
  --embed-dim INTEGER       Embedding dimensionality (default: follow API default)
  -m, --model-name TEXT     Override auto-detected model name
  --summarize-model TEXT    LiteLLM model for summarization (default: openai/gpt-5.2)
  --embedding-model TEXT    LiteLLM model for embeddings (default: gemini/gemini-embedding-001)
  --sample / --no-sample    Randomly sample trajectories (use with --limit)
  --seed INTEGER            Random seed for --sample reproducibility
  --resume / --no-resume    Resume from existing output (default: on)
  --reasoning-effort [low|medium|high]
                            Reasoning effort for summarization model
  --max-workers INTEGER     Max parallel workers for LLM calls (default: 10)
  --reembed                 Re-embed existing summaries (e.g. after changing embedding model/dim)

Examples:

hodoscope analyze run.eval                             # .eval → analysis JSON
hodoscope analyze *.eval                               # batch: all .eval files
hodoscope analyze evals/                               # batch: dir of .eval files
hodoscope analyze run.eval -o my_output.hodoscope.json  # custom output path
hodoscope analyze run.eval --field env=prod            # add custom metadata
hodoscope analyze run.eval --save-samples ./samples/   # save extracted trajectories
hodoscope analyze --docent-id COLLECTION_ID            # docent source
hodoscope analyze path/to/samples/                     # directory of trajectory JSONs
hodoscope analyze run.eval --summarize-model gemini/gemini-2.0-flash
hodoscope analyze run.eval --limit 5 --sample --seed 42
hodoscope analyze run.eval --no-resume                 # overwrite existing output

hodoscope viz

Visualize analysis JSON files with interactive plots. Groups summaries by any metadata field.

hodoscope viz SOURCES [OPTIONS]

Options:
  --group-by TEXT     Field to group by (default: model)
  --proj TEXT         Projection methods: pca, tsne, umap, trimap, pacmap
                      (comma-separated or repeated; * or all for all; default: tsne)
  -o, --output TEXT   Output HTML file path (default: auto-generated timestamped name)
  --filter TEXT       KEY=VALUE metadata filter (repeatable, AND logic)
  --open              Open the generated HTML in the default browser

Examples:

hodoscope viz output.hodoscope.json                    # visualize a single analysis file (grouped by model)
hodoscope viz *.hodoscope.json --group-by score        # group by score field
hodoscope viz *.hodoscope.json --proj tsne,umap        # specific projection methods
hodoscope viz *.hodoscope.json --proj '*'              # all methods (will be slow!)
hodoscope viz *.hodoscope.json --filter score=1.0      # keep only score=1.0 summaries
hodoscope viz *.hodoscope.json --open                  # open in default browser

hodoscope sample

Sample representative summaries using density-weighted Farthest Point Sampling on 2D projections.

Note: While this command can be useful in scripts and automated pipelines, we find the interactive visualization (hodoscope viz) more intuitive and effective for human-in-the-loop exploration.

hodoscope sample SOURCES [OPTIONS]

Options:
  --group-by TEXT       Field to group by (default: model)
  -n, --samples-per-group INTEGER
                        Number of representative samples per group (default: 10)
  --proj TEXT           Projection method for FPS ranking (pca, tsne, umap, trimap, pacmap; default: tsne)
  -o, --output TEXT     JSON output file (default: paginated terminal display)
  --interleave          Interleave groups by rank (#1 from each group, then #2, etc.)
  --filter TEXT         KEY=VALUE metadata filter (repeatable, AND logic)

Examples:

hodoscope sample output.hodoscope.json                        # suggest 10 per group
hodoscope sample output.hodoscope.json --group-by score -n 5  # suggest 5 per score group
hodoscope sample output.hodoscope.json --proj pca             # use PCA projection
hodoscope sample output.hodoscope.json -o sampled.json        # write JSON output
hodoscope sample a.hodoscope.json b.hodoscope.json --interleave  # interleave groups by rank for easier comparison
hodoscope sample output.hodoscope.json --filter score=1.0     # only score=1.0 summaries
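For intuition, here is a minimal sketch of plain (unweighted) farthest point sampling on 2D points; Hodoscope's variant additionally weights selection by density, which is omitted here. The helper name is hypothetical, not part of the library's API.

```python
import numpy as np

def farthest_point_sample(points: np.ndarray, k: int, seed: int = 0) -> list[int]:
    """Greedy FPS: start from a random point, then repeatedly pick the
    point farthest from everything selected so far."""
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(points)))]
    # Each point's distance to its nearest selected point.
    dists = np.linalg.norm(points - points[selected[0]], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dists))
        selected.append(idx)
        dists = np.minimum(dists, np.linalg.norm(points - points[idx], axis=1))
    return selected
```

The greedy rule spreads samples across the projection, so each group's representatives cover its distinct behavioral clusters rather than piling up in the densest one.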

hodoscope info

Show metadata, summary counts, and API key status for analysis JSON files.

hodoscope info output.hodoscope.json
hodoscope info results/

Trajectory Format

Hodoscope first converts other trajectory sources (.eval files, Docent collections, etc.) to the canonical JSON format before processing. You can also pass trajectories directly in this format:

{
  "id": "unique-trajectory-id",
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    ...
  ],
  "metadata": {...}
}
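Writing a trajectory in this canonical format from Python is straightforward; the IDs and message contents below are purely illustrative.

```python
import json

# Build a trajectory in the canonical format shown above and save it
# as a standalone JSON file that `hodoscope analyze` can ingest.
trajectory = {
    "id": "demo-trajectory-001",
    "messages": [
        {"role": "user", "content": "Fix the failing test in utils.py"},
        {"role": "assistant", "content": "Looking at the traceback..."},
    ],
    "metadata": {"score": 1.0},
}

with open("demo-trajectory-001.json", "w") as f:
    json.dump(trajectory, f, indent=2)
```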

Output Format

hodoscope analyze produces .hodoscope.json files:

{
  "version": 1,
  "created_at": "...",
  "source": "path/to/run.eval",
  "fields": {"model": "gpt-5", "task": "swe_bench", "accuracy": 0.8, "...": "..."},
  "embedding_model": "gemini/gemini-embedding-001",
  "embedding_dimensionality": 768,
  "summaries": [
    {
      "trajectory_id": "django__django-12345_epoch_1",
      "turn_id": 3,
      "summary": "Update assertion to match expected output",
      "action_text": "...",
      "task_context": "...",
      "embedding": "<base85-encoded float32 array>",
      "metadata": {"score": 1.0, "instance_id": "django__django-12345", "...": "..."}
    },
    "..."
  ]
}

Key concepts:

  • fields: File-level metadata auto-detected from .eval header (model, task, dataset_name, solver, run_id, accuracy, etc.) plus custom --field values. Same for all summaries.
  • metadata: Per-trajectory metadata. All sample.metadata keys from .eval files are passed through, plus extracted keys (score, epoch, target, token usage, etc.). Varies per summary.
  • --group-by resolution: Checks per-summary metadata first, then file-level fields.
  • embedding: RFC 1924 base85-encoded float32 numpy array.
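Since Python's base64.b85decode uses the RFC 1924 alphabet, the embedding field can be decoded back into a vector along these lines (the helper name is ours, not part of the library's API):

```python
import base64
import numpy as np

def decode_embedding(s: str) -> np.ndarray:
    # base64.b85decode uses the RFC 1924 base85 alphabet,
    # matching the encoding described for the embedding field.
    return np.frombuffer(base64.b85decode(s), dtype=np.float32)

# Round-trip check with a small illustrative vector:
vec = np.array([0.1, -0.5, 2.0], dtype=np.float32)
encoded = base64.b85encode(vec.tobytes()).decode("ascii")
assert np.array_equal(decode_embedding(encoded), vec)
```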

Testing

# Run the full test suite
pytest

# Unit tests only (no API keys needed)
pytest tests/test_io.py tests/test_viz.py tests/test_api.py tests/test_sampling.py

# End-to-end tests (requires API keys)
pytest tests/test_analyze.py

Contributing

Contributions are welcome! We recommend opening an issue to discuss what you'd like to change before submitting a pull request.

Citation

@article{zhong2026hodoscope,
  title={Hodoscope: Unsupervised Behavior Discovery in AI Agents},
  author={Zhong, Ziqian and Saxena, Shashwat and Raghunathan, Aditi},
  year={2026},
  url={https://hodoscope.dev/blog/announcement.html}
}

License

This project is licensed under the MIT License. See LICENSE for details.
