Python activation-steering library for PyTorch and Hugging Face-style language models.
This project has been archived by its maintainers. No new releases are expected.
Python activation steering for LLMs and transformer language models.
pysteer: Python Activation Steering for LLMs
pysteer is a lightweight Python library for activation steering,
representation engineering, and inference-time model steering in PyTorch
transformer language models. It learns steering artifacts from labeled
prompt/response examples, then applies interventions to intermediate
activations without fine-tuning or modifying model weights.
The package is designed for researchers and developers working on LLM control, mechanistic interpretability, AI safety experiments, and activation engineering workflows with Hugging Face-style models.
- PyPI package: https://pypi.org/project/pysteer/
- Documentation: https://mattiapiazzalunga.github.io/pysteer/
- Source code: https://github.com/mattiapiazzalunga/pysteer
- Issues: https://github.com/mattiapiazzalunga/pysteer/issues
Why Use pysteer
- Steer LLM behavior at inference time without retraining the model.
- Compare multiple activation-steering methods behind one Executor API.
- Build prompt-routed, adaptive, or gradient-derived steering workflows.
- Extend the steering engine with custom derivation and runtime strategies.
- Keep activation hooks scoped with a context-managed runtime wrapper.
Features
- Training-time activation extraction from selected transformer layers.
- Built-in steering methods: CMD, CPCA, ACTS-CMD, ACTS-CPCA, MBS-CMD, Angular Steering, Adaptive Activation Steering, COLD-Kernel, and COLD-Steer.
- A registry-based extension layer for adding new derivation/runtime methods
without editing Executor.
- A context-managed runtime wrapper that keeps steering hooks scoped to the calls where they are intended.
- Sphinx documentation with autodoc, Napoleon docstrings, API reference pages,
and an open target.
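The scoped-hook idea in the feature list can be pictured with a small standard-library sketch. This is an illustration of the general pattern only, not pysteer's implementation: ToyLayer, scoped_hook, and shift are hypothetical names, and a real wrapper would presumably attach hooks to PyTorch modules rather than to a toy class.

```python
from contextlib import contextmanager


class ToyLayer:
    """Stand-in for a transformer block (hypothetical, not a pysteer class)."""

    def __init__(self):
        self.hooks = []  # callables applied to the layer output, in order

    def forward(self, activation):
        for hook in self.hooks:
            activation = hook(activation)
        return activation


@contextmanager
def scoped_hook(layer, hook):
    """Attach `hook` to `layer` and always detach it on exit."""
    layer.hooks.append(hook)
    try:
        yield layer
    finally:
        layer.hooks.remove(hook)  # runs even if the body raises


def shift(activation):
    """Toy intervention: add a constant to every coordinate."""
    return [x + 0.5 for x in activation]


layer = ToyLayer()

with scoped_hook(layer, shift):
    steered = layer.forward([1.0, 2.0])  # hook is active inside the block

unsteered = layer.forward([1.0, 2.0])    # hook was removed on exit
```

The try/finally is the point of the pattern: the hook is detached even when generation fails partway, so later calls on the same model are not silently steered.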
Use Cases
- LLM activation steering and behavior control from labeled examples.
- Representation engineering experiments on residual stream activations.
- Mechanistic interpretability prototypes that compare steering directions.
- Inference-time intervention workflows where model weights should stay frozen.
- Custom activation-engineering methods for PyTorch transformer models.
Installation
Install from PyPI:
python -m pip install pysteer
Install from a local checkout for development:
python -m pip install -e ".[dev,docs]"
Install only the runtime dependencies when working from source without an editable install:
python -m pip install -r REQUIREMENTS.txt
Install documentation dependencies only when building the docs:
python -m pip install -r docs/requirements.txt
Minimal Example
The core entry point is pysteer.Executor. Training data uses prompt,
response, and reference columns, where reference identifies the desired
or positive response class for contrastive steering methods.
import pandas as pd
from pysteer import Executor

# `model`, `tokenizer`, and `inputs` are assumed to exist already, for example
# a loaded Hugging Face causal LM, its tokenizer, and a tokenized prompt.
train_df = pd.DataFrame(
    [
        {"prompt": "Question", "response": "Helpful answer", "reference": 1},
        {"prompt": "Question", "response": "Unhelpful answer", "reference": 0},
    ]
)

executor = Executor(
    model=model,
    tokenizer=tokenizer,
    train_df=train_df,
    method="cmd",
    layers_to_extract=[12, 16, 20],
    alpha=0.5,
)

# Hooks are attached only inside the `with` block.
wrapper = executor.representation_extractor()
with wrapper as steered_model:
    output = steered_model.generate(**inputs, max_new_tokens=64)
Built-in methods expect prompt, response, and reference columns.
Routed methods add their own grouping columns, such as task_id, mbs_layer,
or ACT grouping identifiers.
Training rows are validated before hooks are attached. reference must contain
only 0 and 1, and every contrastive training scope needs at least one
positive and one negative row. For standard methods the scope is the full
dataframe; ACTS validates each integer-like task_id; MBS-CMD validates each
selected mbs_layer; ACT validates each normalized ACT group.
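The validation rules above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's validator: validate_rows and scope_key are hypothetical names, rows are plain dicts rather than a dataframe, and the check for integer-like task_id values is left out.

```python
from collections import defaultdict


def validate_rows(rows, scope_key=None):
    """Check that labels are 0/1 and that each scope has both classes.

    `rows` is a list of dicts with a "reference" key. `scope_key` names an
    optional grouping column (for example "task_id"); None treats the whole
    list as a single scope.
    """
    labels_by_scope = defaultdict(set)
    for i, row in enumerate(rows):
        label = row["reference"]
        if label not in (0, 1):
            raise ValueError(f"row {i}: reference must be 0 or 1, got {label!r}")
        scope = row[scope_key] if scope_key is not None else "<all rows>"
        labels_by_scope[scope].add(label)
    for scope, labels in labels_by_scope.items():
        if labels != {0, 1}:
            raise ValueError(
                f"scope {scope!r} needs at least one positive and one negative row"
            )


rows = [
    {"prompt": "Q", "response": "Helpful", "reference": 1, "task_id": 0},
    {"prompt": "Q", "response": "Unhelpful", "reference": 0, "task_id": 0},
    {"prompt": "Q2", "response": "Helpful", "reference": 1, "task_id": 1},
]

validate_rows(rows)  # passes: the full list contains both classes

try:
    validate_rows(rows, scope_key="task_id")
    scoped_error = None
except ValueError as exc:
    scoped_error = str(exc)  # task_id 1 has no negative row
```

The same rows pass as one global scope and fail once grouped, which is why routed methods need balanced examples inside every group and not only overall.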
Supported Steering Methods
pysteer ships with a default registry of activation-steering methods:
- cmd: Contrastive Mean Difference steering vectors.
- cpca: Contrastive PCA steering directions.
- acts_cmd: ACTS prompt-routed CMD steering.
- acts_cpca: ACTS prompt-routed CPCA steering.
- mbs_cmd: layer-balanced CMD steering.
- angular: Angular Steering with plane rotations.
- act: Adaptive Activation Steering with prompt clustering and probes.
- cold_kernel: gradient-derived COLD-Kernel steering directions.
- cold_steer: inference-efficient COLD-Steer alias.
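As a rough picture of the simplest method in the list, a contrastive mean difference direction is commonly computed as the mean activation of positive examples minus the mean activation of negative examples, then added to an activation with a scale factor. The sketch below uses plain lists and hypothetical helper names (mean_vector, cmd_direction, apply_additive); whether pysteer normalizes the direction or applies alpha in exactly this way is an assumption, not something the README states.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cmd_direction(positive, negative):
    """Mean of positive activations minus mean of negative activations."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def apply_additive(activation, direction, alpha):
    """Shift one activation vector by alpha times the direction."""
    return [a + alpha * d for a, d in zip(activation, direction)]


# Toy 2-dimensional "activations" for one layer.
positive = [[1.0, 2.0], [3.0, 4.0]]  # rows with reference == 1
negative = [[0.0, 1.0], [2.0, 1.0]]  # rows with reference == 0

direction = cmd_direction(positive, negative)               # [1.0, 2.0]
steered = apply_additive([0.0, 0.0], direction, alpha=0.5)  # [0.5, 1.0]
```

In a real model the vectors would be hidden states taken from the selected layers, and the shift would be applied inside a forward hook.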
Architecture
The library separates steering into four concerns:
- Derivation: how an artifact is learned from activations.
- Artifact: the vector, plane, routing table, probe, or richer object produced.
- Site: where the artifact reads or writes model state.
- Runtime policy: when and how the intervention is applied.
The steering_engine package contains the extension API:
- domain.py defines declarative data structures such as ActivationSite, InterventionSpec, SteeringArtifact, and SteeringMethodSpec.
- components.py defines protocols for readers, derivers, runtime strategies, schedules, controllers, and compilers.
- registry.py provides SteeringMethodRegistry and MethodDefinition.
- defaults.py registers the built-in methods.
See docs/activation_steering_architecture.md for the design rationale and
taxonomy.
Extending Methods
Register a new method with a vector factory and a runtime strategy builder:
from steering_engine import MethodDefinition, SteeringMethodRegistry
from steering_engine.domain import DerivationFamily, InterventionKind
from steering_engine.domain import RuntimeFamily, SteeringMethodSpec

registry = SteeringMethodRegistry()
registry.register(
    MethodDefinition(
        spec=SteeringMethodSpec(
            method_id="my_method",
            label="My Method",
            derivation_family=DerivationFamily.CUSTOM,
            runtime_family=RuntimeFamily.STATIC,
            intervention_kind=InterventionKind.ADD,
        ),
        vector_factory=lambda ctx: MyVectorDeriver(...),
        strategy_builder=lambda deriver, ctx: MyRuntimeStrategy(...),
    )
)
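To show why a method is registered as a pair of factories, here is a toy registry that resolves a method id into a deriver and a runtime strategy. It is a sketch of the general registry pattern only: ToyMethodDefinition, ToyRegistry, and build are hypothetical names, the context is a plain dict, and none of this is taken from steering_engine's actual internals.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ToyMethodDefinition:
    """Hypothetical stand-in for a method definition."""

    method_id: str
    vector_factory: Callable[[Dict[str, Any]], Any]
    strategy_builder: Callable[[Any, Dict[str, Any]], Any]


class ToyRegistry:
    """Maps method ids to definitions and builds their components on demand."""

    def __init__(self):
        self._methods: Dict[str, ToyMethodDefinition] = {}

    def register(self, definition):
        if definition.method_id in self._methods:
            raise ValueError(f"duplicate method id: {definition.method_id}")
        self._methods[definition.method_id] = definition

    def build(self, method_id, ctx):
        """Create the deriver first, then hand it to the strategy builder."""
        definition = self._methods[method_id]
        deriver = definition.vector_factory(ctx)
        strategy = definition.strategy_builder(deriver, ctx)
        return deriver, strategy


registry = ToyRegistry()
registry.register(
    ToyMethodDefinition(
        method_id="my_method",
        vector_factory=lambda ctx: {"layers": ctx["layers"]},
        strategy_builder=lambda deriver, ctx: ("add", deriver["layers"], ctx["alpha"]),
    )
)

deriver, strategy = registry.build("my_method", {"layers": [12], "alpha": 0.5})
```

Because the factories are only called inside build, registering a method costs nothing until it is selected, and the caller never needs to change to learn about a new method.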
Documentation
Build the Sphinx HTML documentation:
make -C docs html
Build and open it in your default browser:
make -C docs open
On Windows without make:
docs\make.bat html
docs\make.bat open
The generated site is written to docs/_build/html/index.html.
Contributing
See CONTRIBUTING.md for development setup, local checks, and the preferred
extension path for new steering methods. Security reports should follow
SECURITY.md.
Evaluation Data
pysteer focuses on the generic steering engine and expects callers to provide
their own training dataframes for application-specific evaluations.
License
This project is licensed under the Mozilla Public License 2.0. See
LICENSE.txt.