Steering vectors for transformer language models in PyTorch / Hugging Face
Steering Vectors
Steering vectors / representation engineering for transformer language models in PyTorch / Hugging Face
Check out our example notebook.
Full docs: https://steering-vectors.github.io/steering-vectors
About
This library provides utilities for training and applying steering vectors to language models (LMs) from Hugging Face, like GPT, LLaMa, Gemma, Mistral, Pythia, and many more!
This library is inspired by ideas and code from the following two papers. For more info on steering vectors and representation engineering, check out these works:
- Steering Llama 2 via Contrastive Activation Addition (Rimsky et al., 2023)
- Representation Engineering: A Top-Down Approach to AI Transparency (Zou et al., 2023)
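To make the idea concrete, here is a minimal sketch of the mean-difference approach to steering vectors, in the spirit of contrastive activation addition. It deliberately does not use this library's API: the toy `model`, the helper names `layer_output` and `steered_forward`, and the random "contrastive" inputs are all assumptions made for illustration. The vector is taken as the average difference between a layer's activations on paired positive and negative inputs, then added back to that layer's output through a PyTorch forward hook.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for a transformer: two linear "layers" (hypothetical, illustration only).
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))
layer = model[0]  # the layer whose output we read from and later steer


def layer_output(x: torch.Tensor) -> torch.Tensor:
    """Record the output of `layer` for input x using a forward hook."""
    captured = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: captured.append(output.detach())
    )
    try:
        model(x)
    finally:
        handle.remove()  # always detach the hook so later calls are unaffected
    return captured[0]


# Contrastive inputs: row i of `positive` is paired with row i of `negative`.
positive = torch.randn(16, 4) + 1.0
negative = torch.randn(16, 4) - 1.0

# "Training": the steering vector is the mean difference of paired activations.
steering_vector = (layer_output(positive) - layer_output(negative)).mean(dim=0)


def steered_forward(x: torch.Tensor, multiplier: float = 1.0) -> torch.Tensor:
    """Run the model while adding multiplier * steering_vector to the layer output."""
    # A forward hook that returns a value replaces the module's output.
    handle = layer.register_forward_hook(
        lambda module, inputs, output: output + multiplier * steering_vector
    )
    try:
        return model(x)
    finally:
        handle.remove()


x = torch.randn(1, 4)
with torch.no_grad():
    baseline = model(x)
    steered = steered_forward(x, multiplier=2.0)
```

With a real Hugging Face model the hooked module would typically be one of the transformer blocks, whose outputs are often tuples rather than a single tensor, so the hook would need to handle that case. The `multiplier` controls how strongly the activations are pushed along the vector; a value of zero leaves the model unchanged.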
Installation
pip install steering-vectors
Check out the full documentation for more usage info.
Contributing
Any contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have.
This project uses Ruff for code formatting and linting, MyPy for type checking, and Pytest for tests. Make sure any changes you submit pass these code checks in your PR. If you have trouble getting these to run, feel free to open a pull request regardless and we can discuss further in the PR.
License
This code is released under an MIT license.