Open-source SAE visualizer, based on Anthropic's published visualizer.

Project description

This codebase was designed to replicate Anthropic's sparse autoencoder visualisations, which you can see here. The codebase provides two different views: a feature-centric view (like the one in the link, where we look at one particular feature and see things like which tokens fire strongest on that feature) and a prompt-centric view (where we look at one particular prompt and see which features fire strongest on that prompt according to a variety of different metrics).

Install with pip install sae-vis. Link to PyPI page here.
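
After installing, a quick way to confirm which release you have is to query the installed distribution metadata. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of sae-vis.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by sae-vis.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("sae-vis")
if found is None:
    print("sae-vis is not installed; run: pip install sae-vis")
else:
    print(f"sae-vis version: {found}")
```

Comparing the printed version against the version history further down tells you which fixes your install includes.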

Features & Links

Important note - this repo was significantly restructured in March 2024 (we'll remove this message at the end of April). The recent changes include:

  • The ability to view multiple features on the same page (rather than just one feature at a time)
  • D3-backed visualisations (which can do things like add lines to histograms as you hover over tokens)
  • More freedom to customize exactly what the visualisation looks like (we provide full customizability, rather than just being able to change certain parameters)

Here is a link to a Google Drive folder containing 3 files:

  • User Guide, which covers the basics of how to use the repo (the core essentials haven't changed much from the previous version, but there are significantly more features)
  • Dev Guide, which we recommend for anyone who wants to understand how the repo works (and make edits to it)
  • Demo, which is a Colab notebook that gives a few examples

In the demo Colab, we show the two different types of vis which are supported by this library:

  1. Feature-centric vis, where you look at a single feature and see e.g. which sequences in a large dataset this feature fires strongest on.
  2. Prompt-centric vis, where you input a custom prompt and see which features score highest on that prompt, according to a variety of possible metrics.
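
To make the difference between the two views concrete, here is a minimal sketch of the underlying ranking idea in plain NumPy. It does not use the sae-vis API: the function names, the toy activation array, and the two metrics (`"max"` and `"mean"`) are all assumptions chosen for illustration, standing in for whatever metrics the library actually offers.

```python
import numpy as np


def top_sequences_for_feature(acts, feature_idx, k=3):
    """Feature-centric ranking (illustrative, not the sae-vis API).

    acts has shape (n_seqs, seq_len, n_features). Returns the k sequences
    on which the chosen feature fires strongest, as (seq_index, peak) pairs.
    """
    # Peak activation of this feature within each sequence.
    per_seq_peak = acts[:, :, feature_idx].max(axis=1)
    order = np.argsort(-per_seq_peak, kind="stable")[:k]
    return [(int(i), float(per_seq_peak[i])) for i in order]


def top_features_for_prompt(prompt_acts, k=3, metric="max"):
    """Prompt-centric ranking (illustrative, not the sae-vis API).

    prompt_acts has shape (seq_len, n_features). Returns the k features
    that score highest on this prompt under the chosen metric.
    """
    if metric == "max":
        scores = prompt_acts.max(axis=0)
    elif metric == "mean":
        scores = prompt_acts.mean(axis=0)
    else:
        raise ValueError(f"unknown metric: {metric!r}")
    order = np.argsort(-scores, kind="stable")[:k]
    return [(int(j), float(scores[j])) for j in order]


# Toy activations: 4 sequences, 5 tokens each, 3 features.
acts = np.zeros((4, 5, 3))
acts[2, 1, 0] = 0.9  # feature 0 fires strongly in sequence 2
acts[0, 4, 0] = 0.4  # and more weakly in sequence 0
acts[3, 0, 1] = 0.7  # feature 1 fires in sequence 3

feature_view = top_sequences_for_feature(acts, feature_idx=0, k=2)
prompt_view = top_features_for_prompt(acts[2], k=1, metric="max")
print(feature_view)
print(prompt_view)
```

The feature-centric call fixes a feature and ranks sequences from a dataset, while the prompt-centric call fixes one prompt's activations and ranks features.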

Citing this work

To cite this work, you can use this bibtex citation:

@misc{sae_vis,
    title  = {{SAE Visualizer}},
    author = {Callum McDougall},
    howpublished    = {\url{https://github.com/callummcdougall/sae_vis}},
    year   = {2024}
}

Contributing

This project uses Poetry for dependency management. After cloning the repo, install dependencies with poetry install.

This project uses Ruff for formatting and linting, Pyright for type-checking, and Pytest for tests. If you submit a PR, make sure that your code passes all checks. You can run all checks with make check-all.

Version history (recording started at 0.2.9)

  • 0.2.9 - added table for pairwise feature correlations (not just encoder-B correlations)
  • 0.2.10 - fix some anomalous characters
  • 0.2.11 - update PyPI with longer description
  • 0.2.12 - fix height parameter of config, add videos to PyPI description
  • 0.2.13 - add to dependencies, and fix SAELens section
  • 0.2.14 - fix mistake in dependencies
  • 0.2.15 - refactor to support eventual scatterplot-based feature browser, fix HTML rendering of the ’ character
  • 0.2.16 - allow disabling buffer in feature generation, fix demo notebook, fix sae-lens compatibility & type checking
  • 0.2.17 - use main branch of sae-lens
  • 0.2.18 - remove circular dependency with sae-lens
  • 0.2.19 - formatting, error-checking
  • 0.2.20 - fix bugs, remove use of batch_size in config
  • 0.2.21 - formatting
