Open-source SAE visualizer, based on Anthropic's published visualizer.

Project description

Summary

This codebase was designed to replicate Anthropic's sparse autoencoder visualisations, which you can see here. It provides two views: a feature-centric view (like the one in the link, i.e. we look at one particular feature and see things like which tokens fire most strongly on that feature) and a prompt-centric view (where we look at one particular prompt and see which features fire most strongly on that prompt, according to a variety of metrics).

Install with `pip install sae-vis`. Link to PyPI page here.
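
To confirm which release ended up in your environment, something like the following standard-library sketch may help. The helper name `installed_version` is hypothetical; the distribution name `sae-vis` is taken from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "sae-vis") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "sae-vis is not installed")
```

Since the repo was restructured in March 2024, checking the version first is a cheap way to tell which layout a given environment has.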

Features & Links

Important note - this repo was significantly restructured in March 2024 (we'll remove this message at the end of April). The recent changes include:

The ability to view multiple features on the same page (rather than just one feature at a time)
D3-backed visualisations (which can do things like add lines to histograms as you hover over tokens)
More freedom to customize exactly what the visualisation looks like (we provide full customizability, rather than just being able to change certain parameters)

Here is a link to a Google Drive folder containing 3 files:

User Guide, which covers the basics of how to use the repo (the core essentials haven't changed much from the previous version, but there are significantly more features)
Dev Guide, which we recommend for anyone who wants to understand how the repo works (and make edits to it)
Demo, which is a Colab notebook that gives a few examples

In the demo Colab, we show the two different types of vis which are supported by this library:

Feature-centric vis, where you look at a single feature and see e.g. which sequences in a large dataset this feature fires strongest on.

Prompt-centric vis, where you input a custom prompt and see which features score highest on that prompt, according to a variety of possible metrics.
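
The difference between the two views can be illustrated with a toy sketch in plain Python. This does not use the sae-vis API: the function names, the nested-list layout `acts[sequence][token][feature]`, and the choice of scoring (max activation per sequence, raw activation at one token) are all assumptions made for illustration, and the library supports other metrics.

```python
from typing import List, Tuple

# Toy activations indexed as acts[sequence][token][feature].
# In practice these would come from running an SAE over model activations.
Acts = List[List[List[float]]]


def top_sequences_for_feature(acts: Acts, feature: int, k: int) -> List[Tuple[int, float]]:
    """Feature-centric: rank sequences by the feature's max activation."""
    scored = [
        (seq_idx, max(tok[feature] for tok in seq))
        for seq_idx, seq in enumerate(acts)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def top_features_for_prompt(
    prompt_acts: List[List[float]], token: int, k: int
) -> List[Tuple[int, float]]:
    """Prompt-centric: rank features by activation at one token of a prompt."""
    scored = list(enumerate(prompt_acts[token]))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    acts = [
        [[0.0, 0.2], [0.9, 0.1]],  # sequence 0: two tokens, two features
        [[0.1, 0.8], [0.3, 0.0]],  # sequence 1
        [[0.5, 0.0], [0.0, 0.4]],  # sequence 2
    ]
    print(top_sequences_for_feature(acts, feature=0, k=2))  # [(0, 0.9), (2, 0.5)]
    print(top_features_for_prompt(acts[1], token=0, k=1))   # [(1, 0.8)]
```

The feature-centric function fixes a feature and searches over data; the prompt-centric function fixes the data and searches over features.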

Version history (recording started at `0.2.9`)

0.2.9 - added table for pairwise feature correlations (not just encoder-B correlations)
0.2.10 - fix some anomalous characters, and update PyPI with longer description

Release history

0.2.21 - Jul 12, 2024
0.2.20 - Jul 12, 2024
0.2.19 - Jun 12, 2024
0.2.18 - Apr 22, 2024
0.2.17 - Apr 21, 2024
0.2.16 - Apr 20, 2024
0.2.15 - Apr 14, 2024
0.2.14 - Apr 13, 2024
0.2.13 - Apr 13, 2024
0.2.12 - Apr 8, 2024
0.2.11 - Apr 8, 2024
0.2.10 - Apr 8, 2024 (this version)
0.2.9 - Apr 6, 2024
0.2.8 - Apr 6, 2024
0.2.7 - Apr 6, 2024
0.2.6 - Apr 5, 2024
0.2.5 - Apr 1, 2024
0.2.4 - Mar 31, 2024
0.2.3 - Mar 31, 2024
0.2.2 - Mar 30, 2024
0.2.1 - Mar 27, 2024
0.2 - Mar 27, 2024
0.0.0 - Apr 21, 2024

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

sae_vis-0.2.10-py3-none-any.whl (154.1 kB)

Uploaded Apr 8, 2024 Python 3

Hashes for sae_vis-0.2.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`635f2cb22a262df3e290b3ad1cc9908839fa808451d760ef97f4e5487df0a930`
MD5	`516df847307e4248e6dfc8c9763fff9d`
BLAKE2b-256	`ab56048dc1dd1a5324d94cf6b20a95d4aca29d4d547c7b018b56c12d28f45bb1`
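
If you download the wheel manually, the SHA256 digest above can be checked with a short standard-library script. This is a sketch under assumptions: the helper names and the local file path are hypothetical, and only the expected digest comes from the table above.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for sae_vis-0.2.10-py3-none-any.whl.
EXPECTED_SHA256 = "635f2cb22a262df3e290b3ad1cc9908839fa808451d760ef97f4e5487df0a930"


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    wheel = Path("sae_vis-0.2.10-py3-none-any.whl")  # assumed local download path
    if wheel.exists():
        print("OK" if matches_expected(wheel) else "MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```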
