Training and Analyzing Sparse Autoencoders (SAEs)

SAE Lens

SAELens exists to help researchers:

  • Train sparse autoencoders.
  • Analyse sparse autoencoders / research mechanistic interpretability.
  • Generate insights which make it easier to create safe and aligned AI systems.

SAELens inference works with any PyTorch-based model, not just TransformerLens. While we provide deep integration with TransformerLens via HookedSAETransformer, SAEs can be used with Hugging Face Transformers, NNsight, or any other framework by extracting activations and passing them to the SAE's encode() and decode() methods.
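The framework-agnostic pattern described above can be sketched as follows. This is a minimal illustration, not SAELens code: `ToySAE` is a hypothetical stand-in that only mirrors the `encode()` and `decode()` method names, assumes a plain ReLU autoencoder, and uses tiny hand-written weights. The `extracted` list stands in for activations pulled out of any model by whatever means your framework offers.

```python
from typing import List, Sequence

Vector = List[float]


class ToySAE:
    """Hypothetical stand-in for an SAE object; NOT the SAELens class.

    It only mirrors the encode()/decode() method names and assumes a
    plain ReLU autoencoder with hand-written weights.
    """

    def __init__(self, w_enc, b_enc, w_dec, b_dec):
        self.w_enc = w_enc  # shape [d_in][d_sae]
        self.b_enc = b_enc  # shape [d_sae]
        self.w_dec = w_dec  # shape [d_sae][d_in]
        self.b_dec = b_dec  # shape [d_in]

    def encode(self, x: Sequence[float]) -> Vector:
        """Map one activation vector to non-negative feature activations."""
        features = []
        for j, bias in enumerate(self.b_enc):
            pre = bias + sum(x[i] * self.w_enc[i][j] for i in range(len(x)))
            features.append(max(0.0, pre))  # ReLU
        return features

    def decode(self, features: Sequence[float]) -> Vector:
        """Map feature activations back to the model's activation space."""
        return [
            bias + sum(features[j] * self.w_dec[j][i] for j in range(len(features)))
            for i, bias in enumerate(self.b_dec)
        ]


def reconstruct(activations, sae):
    """Run encode() then decode() on each extracted activation vector."""
    return [sae.decode(sae.encode(x)) for x in activations]


# Toy weights: 2-dimensional activations, 3 SAE features (all assumed values).
toy_sae = ToySAE(
    w_enc=[[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]],
    b_enc=[0.0, 0.0, 0.0],
    w_dec=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    b_dec=[0.0, 0.0],
)

# Stand-in for activations extracted from a model by any framework.
extracted = [[2.0, -1.0], [0.5, 0.5]]
features = [toy_sae.encode(x) for x in extracted]
reconstructions = reconstruct(extracted, toy_sae)
```

With a real SAE the same two calls would operate on tensors of model activations rather than Python lists; the point of the sketch is only that nothing in the encode/decode round trip depends on how the activations were obtained.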

Please refer to the documentation for information on how to:

  • Download and analyse pre-trained sparse autoencoders.
  • Train your own sparse autoencoders.
  • Generate feature dashboards with the SAE-Vis Library.

SAE Lens is the result of many contributors working collectively to improve humanity's understanding of neural networks, many of whom are motivated by a desire to safeguard humanity from risks posed by artificial intelligence.

This library is maintained by Joseph Bloom, Curt Tigges, Anthony Duong and David Chanin.

Loading Pre-trained SAEs

Pre-trained SAEs for various models can be imported via SAE Lens. See this page for a list of all SAEs.

Migrating to SAELens v6

The v6 update is a major refactor of SAELens that changes how training code is structured. Check out the migration guide for more details.

Tutorials

Join the Slack!

Feel free to join the Open Source Mechanistic Interpretability Slack for support!

Other SAE Projects

  • dictionary-learning: An SAE training library that focuses on having hackable code.
  • Sparsify: A lean SAE training library focused on TopK SAEs.
  • Overcomplete: SAE training library focused on vision models.
  • SAE-Vis: A library for visualizing SAE features, works with SAELens.
  • SAEBench: A suite of LLM SAE benchmarks, works with SAELens.

Citation

Please cite the package as follows:

@misc{bloom2024saetrainingcodebase,
   title = {SAELens},
   author = {Bloom, Joseph and Tigges, Curt and Duong, Anthony and Chanin, David},
   year = {2024},
   howpublished = {\url{https://github.com/decoderesearch/SAELens}},
}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sae_lens-6.39.0.tar.gz (256.7 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sae_lens-6.39.0-py3-none-any.whl (290.9 kB)

Uploaded Python 3

File details

Details for the file sae_lens-6.39.0.tar.gz.

File metadata

  • Download URL: sae_lens-6.39.0.tar.gz
  • Upload date:
  • Size: 256.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sae_lens-6.39.0.tar.gz

Algorithm    Hash digest
SHA256       ac2b370f76462e564c83e66749345b2be3e7822832b94c02eea4eaa1e3da4726
MD5          52129a799486d3c61bb2a5c1850afd22
BLAKE2b-256  570281dc2fdb0bd720c9232551f2d80b324f5a25015c8f36389a07b566a6c00e

See more details on using hashes here.
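One way to use the published digests is to check a downloaded file against them before installing. The sketch below uses only Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are illustrative, and the constant is the SHA256 value listed above for the source distribution.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for sae_lens-6.39.0.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "ac2b370f76462e564c83e66749345b2be3e7822832b94c02eea4eaa1e3da4726"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

After downloading the archive, `matches_published_hash("sae_lens-6.39.0.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact file; the local path is an assumption about where the download was saved.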

Provenance

The following attestation bundles were made for sae_lens-6.39.0.tar.gz:

Publisher: build.yml on decoderesearch/SAELens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sae_lens-6.39.0-py3-none-any.whl.

File metadata

  • Download URL: sae_lens-6.39.0-py3-none-any.whl
  • Upload date:
  • Size: 290.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sae_lens-6.39.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       fbbc1a8ed4d71391bb4d91c23229c5f14dbb084785c7167cad7a7cb87331e971
MD5          ce7465d676a5008841d9d843f1c2db82
BLAKE2b-256  d97bc5f5c5f0c1548031c77ee449d525c9a44a8d40cc34ad47894ce3773ab27d

See more details on using hashes here.

Provenance

The following attestation bundles were made for sae_lens-6.39.0-py3-none-any.whl:

Publisher: build.yml on decoderesearch/SAELens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
