
InspectAI <-> Weights and Biases integration


Inspect WandB

Integration between Inspect and Weights & Biases, including support for both the Models API for experiment tracking, and Weave for evaluation analysis and transcripts.


Demo Video

Check out this brief demo video for an overview of Inspect WandB

If you prefer to read, you can check out a written tutorial on the Inspect WandB docs site

You can also check out the WandB UI and navigate through some example results in our demo project, found here

Usage

Inspect WandB can be installed with:

pip install inspect-wandb

To install the optional Weave extra:

pip install "inspect-wandb[weave]"

Once Inspect WandB is installed in an environment authenticated with Weights & Biases (either by running wandb login or by setting WANDB_API_KEY), the integration is enabled by default for all future Inspect runs. The Inspect logger output will link to the Models dashboard, where you can track runs. If you have installed the weave extra, it will also link to the Weave dashboard, where you can visualise eval results.
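
The prerequisites above can be checked with a short script before launching an eval. The following is a minimal sketch and not part of Inspect WandB: the function name check_prerequisites is hypothetical, and it assumes that the package's import name is inspect_wandb and that the Weave extra provides a module named weave. It only inspects WANDB_API_KEY, so credentials saved by wandb login are not detected.

```python
import importlib.util
import os
from typing import Mapping


def check_prerequisites(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report which of the prerequisites described above appear to be met.

    Assumptions (not confirmed by the project documentation):
    - the import name of the package is `inspect_wandb`
    - the optional Weave extra provides a module named `weave`
    - only WANDB_API_KEY is inspected; credentials stored by
      `wandb login` are not detected by this sketch
    """
    return {
        # True if the (assumed) module name can be located without importing it
        "inspect_wandb_installed": importlib.util.find_spec("inspect_wandb") is not None,
        "weave_installed": importlib.util.find_spec("weave") is not None,
        # True if the API key variable is set to a non-empty value
        "api_key_in_env": bool(env.get("WANDB_API_KEY")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If api_key_in_env is reported as "no" but you have already run wandb login, the integration may still be authenticated; this sketch simply does not look at stored credentials.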

Some configuration options are available, including adjusting the wandb config, setting tags, and adjusting Weave trace naming. To dive deeper into Inspect WandB, please see the documentation at https://inspect-wandb.readthedocs.io/en/latest/

Examples

The following are some examples of the types of data that can be automatically logged to W&B when Inspect WandB is enabled:

Models

The Models integration allows you to track each Inspect eval or eval-set run as a WandB run. This can be useful for having a shared source-of-truth for which evals have been run, as well as storing exact configurations for faithful reproductions in future.

[Screenshot: Inspect evals tracked in the W&B Runs table]

[Screenshot: Reproduction information tracked in a W&B Run, including Inspect metadata]

Weave

The Weave integration traces Inspect evaluations, allowing you to track and analyse performance of different models on multiple tasks, visualise and compare result sets, and dig into individual transcripts.

[Screenshot: Table of Inspect evaluations with score summaries in Weave]

[Screenshot: Trace tree of an Inspect task, with the main solver transcript selected for a given sample]

[Screenshot: Comparison of performance on AgentHarm between Claude 4 Sonnet and GPT 4o-mini]

Contributing

Please see our contributing guidelines if you'd like to contribute to Inspect WandB.

Feedback

We welcome all feedback. The best way to get in touch and discuss the project is the #inspect_wandb channel in the Inspect Community Slack.

Project notes

This project was primarily developed by DanielPolatajko, Qi Guo, and Matan Shtepel, and was supervised by Justin Olive. It was supported through the MARS (Mentorship for Alignment Research Students) program at the Cambridge AI Safety Hub.

