Steerling: An interpretable causal diffusion language model with concept steering

Project description

Steerling

An interpretable causal diffusion language model.

Steerling-8B combines masked diffusion language modeling with concept decomposition, enabling:

  • Generation: Non-autoregressive text generation via confidence-based unmasking
  • Attribution: Decompose predictions into known concept contributions
  • Steering: Intervene on concept activations to control generation
  • Embeddings: Extract hidden, composed, known, or unknown representations

Quick Start

pip install steerling

from steerling import SteerlingGenerator, GenerationConfig

generator = SteerlingGenerator.from_pretrained("guidelabs/steerling-8b")

text = generator.generate(
    "The key to understanding neural networks is",
    GenerationConfig(max_new_tokens=100, seed=42),
)
print(text)
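Generation is non-autoregressive: every position starts masked and is filled in over several steps in order of model confidence. The loop can be sketched in a few lines of numpy; the "model" below is a random stand-in, not Steerling's actual predictor, and the step schedule is illustrative:

```python
import numpy as np

MASK = -1  # sentinel for a masked position

def toy_model(tokens, rng):
    """Stand-in predictor: returns (predicted_token, confidence) per position."""
    n = len(tokens)
    preds = rng.integers(0, 100, size=n)  # fake token predictions
    conf = rng.random(n)                  # fake confidence scores
    return preds, conf

def unmask_by_confidence(length=8, steps=4, seed=0):
    rng = np.random.default_rng(seed)
    tokens = np.full(length, MASK)
    per_step = length // steps
    for _ in range(steps):
        preds, conf = toy_model(tokens, rng)
        conf = np.where(tokens == MASK, conf, -np.inf)  # only masked slots compete
        # commit the most confident predictions this step
        for pos in np.argsort(conf)[::-1][:per_step]:
            tokens[pos] = preds[pos]
    return tokens

out = unmask_by_confidence()
assert (out != MASK).all()  # every position is unmasked after all steps
```

The real model scores candidates with its own confidence measure and unmasks within block-causal order; only the overall iterate-and-commit shape is shown here.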

Model Details

Property          Value
Parameters        ~8B
Architecture      CausalDiffusionLM + Interpretable Concept Head
Context Length    4096
Vocabulary        100,281 (cl100k_base + specials)
Known Concepts    33,732
Unknown Concepts  101,196
GQA               32 heads, 4 KV heads
Precision         bfloat16
License           Apache 2.0

Architecture

Steerling uses block-causal attention (bidirectional within 64-token blocks, causal across blocks) with masked diffusion training. At inference, tokens are generated by iteratively unmasking positions in order of model confidence. The interpretable concept heads decompose transformer hidden states h into:

h → known_features + unk_hat + epsilon = composed
composed → lm_head → logits
  • known_features: Weighted sum of top-k learned concept embeddings
  • unk_hat: Residual features captured by a factorized unknown head
  • epsilon: Small correction term for reconstruction fidelity
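The decomposition above can be illustrated with a small numpy sketch. All dimensions, the top-k selection rule, and the low-rank "unknown head" here are toy stand-ins; the real model learns its concept embeddings and heads:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_known, k = 16, 64, 4                      # toy sizes, not the real 33,732 concepts

h = rng.standard_normal(d)                     # transformer hidden state
concepts = rng.standard_normal((n_known, d))   # known concept embeddings

# known_features: weighted sum of the top-k most activated concepts
acts = concepts @ h
topk = np.argsort(np.abs(acts))[-k:]
known_features = acts[topk] @ concepts[topk]

# unk_hat: a factorized (low-rank) stand-in for the unknown head
U = rng.standard_normal((d, 8))
V = rng.standard_normal((8, d))
unk_hat = (h - known_features) @ U @ V

# epsilon closes the reconstruction exactly in this sketch
epsilon = h - known_features - unk_hat
composed = known_features + unk_hat + epsilon
assert np.allclose(composed, h)                # composed then feeds the lm_head
```

Because attribution and steering operate on the known_features term, intervening on one concept's activation changes composed, and therefore the logits, in an interpretable way.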

Installation

# From PyPI
pip install steerling

# From source
git clone https://github.com/guidelabs/steerling.git
cd steerling
pip install -e ".[dev]"

# With evaluation support
pip install -e ".[all]"

FAQ

  • Where can I read more about the details of this architecture?
    You can read more about the architecture in these blog posts: Scaling Interpretable Models with 8B Parameters and Causal Diffusion Language Models. We will be releasing a more detailed technical report in a few months.

  • This is a base model, what about an instruction-tuned model?
    Stay tuned.

  • Is training code available?
    This release is inference-only. Training code is not included. If you're interested in training or fine-tuning, please reach out to Guide Labs.

  • What dataset did you train on?
We trained on an augmented version of the Nemotron-CC-HQ dataset, for a total of about 1.3 trillion tokens.

  • What is block-causal attention?
Standard causal attention only lets each token attend to previous tokens. Block-causal attention groups tokens into fixed-size blocks (e.g., 64 tokens) and allows bidirectional attention within each block, while maintaining causal ordering across blocks. This gives the model local bidirectional context while preserving the ability to generate sequentially. Refer to the post Causal Diffusion Language Models for more details.
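For a concrete picture, here is one way such a mask can be built (the block size and boolean-mask layout are illustrative, not Steerling's internal representation):

```python
import numpy as np

def block_causal_mask(seq_len: int, block_size: int = 64) -> np.ndarray:
    """True where position i may attend to position j: bidirectional
    inside a block, causal (current and earlier blocks only) across blocks."""
    blocks = np.arange(seq_len) // block_size
    # i attends to j iff j's block does not come after i's block
    return blocks[:, None] >= blocks[None, :]

m = block_causal_mask(6, block_size=2)
# positions 0 and 1 share a block, so 0 can attend forward to 1
assert m[0, 1] and not m[0, 2]
```

With block_size=1 this reduces to standard causal attention; with block_size=seq_len it becomes fully bidirectional.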

  • What are "known" and "unknown" concepts?
    The model decomposes its internal representations into two parts:

    • Known concepts (33,732): learned, supervised features that correspond to human-interpretable patterns.
    • Unknown concepts (101,196): capture the signal in the hidden representations that the known concepts don't explain.
    • Together they reconstruct the full hidden state up to a small error: hidden ≈ known_features + unknown_features + epsilon.
  • How do I find concept IDs for steering?
    The concept metadata is in concepts/complete_concept_info.csv (shipped with the HuggingFace model). Each row maps a concept ID to its description. Use positive values to amplify a concept and negative values to suppress it:

    config = GenerationConfig(steer_known={concept_id: 2.0})   # amplify
    config = GenerationConfig(steer_known={concept_id: -1.0})  # suppress
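One way to locate IDs is to search the metadata CSV directly. Note that the column names below (concept_id, description) are assumptions about the file layout; check the actual header row first:

```python
import csv

def find_concepts(path: str, keyword: str):
    """Return (concept_id, description) rows whose description mentions keyword.
    Column names 'concept_id' and 'description' are assumed, not verified."""
    hits = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if keyword.lower() in row["description"].lower():
                hits.append((row["concept_id"], row["description"]))
    return hits

# e.g. find_concepts("concepts/complete_concept_info.csv", "sentiment")
```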
    
  • What GPU do I need?
    Steerling-8B in bfloat16 requires approximately 18GB VRAM. It fits on a single H100, A100 (40GB or 80GB), A6000 (48GB), or RTX 4090 (24GB). It does not fit on consumer GPUs with 16GB or less.

  • Can I fine-tune this model?
Yes, although fine-tuning code is not included in this package. Steerling is an inference-only release; if there is sufficient demand, we will support fine-tuning in a future release.

  • What tokenizer does Steerling-8B use?
    Steerling uses OpenAI's cl100k_base tokenizer (via tiktoken) with 4 additional special tokens: <|pad|>, <|bos|>, <|endofchunk|>, and <|mask|>, for a total vocabulary of 100,281 tokens.

  • Can I use this with the Hugging Face transformers library?
Not directly. Steerling uses a custom architecture (block-causal attention, concept heads) that isn't in the transformers library. Use the steerling package instead; it provides SteerlingGenerator.from_pretrained() with a similar interface.

  • How do I get training data attributions?
This release is a lightweight version of the pipeline, so it does not directly support training data attribution. We have provided notebooks for concept and feature attributions. If you're interested in training data attribution support, please reach out to Guide Labs.

License

The Steerling source code is released under the Apache License 2.0.

The model weights are provided for research and evaluation purposes. The weights were trained on datasets with varying license terms, including Nemotron-CC-HQ and Dolmino Mix. Some training data includes synthetic content generated by third-party models with their own license terms. We are currently reviewing the implications of these upstream licenses for downstream use of the model weights. Please check back for updates on the weight licensing terms.

For questions about commercial use of the model weights, contact us at info@guidelabs.ai
