InterpreLens: A Lens for Interpreting Large Language Models Based on the Transformer Architecture.

inteprelens

inteprelens is the extracted core package for transformer architecture analysis. It keeps the Causal Head Gating (CHG) workflow and adds tracing utilities for named transformer stages and internal logits.

Overview

Use inteprelens when you want to:

- train CHG masks over attention heads
- inspect necessary, sufficient, and facilitating heads
- trace intermediate transformer states such as attention output projection inputs, block outputs, final norm states, and per-layer logits
- score final-token logits, log-probabilities, and probabilities for calibration-style analysis
- run gradient-based attribution through facilitating head circuits
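
As a rough illustration of what a mask over attention heads does, the sketch below scales each head's output by a gate between 0 and 1 before the heads are concatenated. It is a toy in plain Python, not inteprelens code: `gate_heads`, the sigmoid parameterization of the gates, and the toy values are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate_heads(head_outputs, gate_logits):
    """Scale each head's output vector by a gate in (0, 1)."""
    if len(head_outputs) != len(gate_logits):
        raise ValueError("need exactly one gate per head")
    gates = [sigmoid(g) for g in gate_logits]
    gated = [
        [gate * value for value in head]
        for gate, head in zip(gates, head_outputs)
    ]
    return gated, gates


# Three toy heads, each producing a 2-dimensional output for one token.
head_outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# A large positive logit keeps a head, a large negative one suppresses it,
# and 0.0 corresponds to a gate of exactly 0.5.
gate_logits = [10.0, -10.0, 0.0]

gated, gates = gate_heads(head_outputs, gate_logits)
# Concatenate the gated heads, as happens before the attention output projection.
concatenated = [value for head in gated for value in head]

print([round(g, 3) for g in gates])
print([round(v, 3) for v in concatenated])
```

In a real fit the gate logits would be learned parameters rather than fixed numbers; the toy only shows how a gate near 0 removes a head's contribution while a gate near 1 leaves it intact.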

This repo is the reusable core package only. It does not include downstream task pipelines or visualization workflows.

Installation

Install the runtime package:

pip install inteprelens

For local development:

uv sync --group dev
uv run pytest -q

For build and publish checks:

uv sync --group publish
uv run --group publish python -m build
uv run --group publish twine check dist/*

If you use a gated Hugging Face model such as meta-llama/Llama-3.2-1B, make sure HF_TOKEN is available in your environment or .env.
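
A quick pre-flight check can confirm that the token is visible before a long download starts. The sketch below uses only the standard library; `read_env_file` and `find_hf_token` are hypothetical helper names, and the simple `KEY=VALUE` parsing is an assumption about the `.env` layout, not a description of how inteprelens itself reads it.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file, skipping blanks and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_hf_token(env_file=".env", environ=None):
    """Return HF_TOKEN from the environment, falling back to the .env file."""
    environ = os.environ if environ is None else environ
    return environ.get("HF_TOKEN") or read_env_file(env_file).get("HF_TOKEN")


if __name__ == "__main__":
    token = find_hf_token()
    print("HF_TOKEN found" if token else "HF_TOKEN not found in environment or .env")
```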

Quick Start

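from inteprelens import LensAPI

# Load the model and tokenizer; gated models need HF_TOKEN (see Installation).
analyzer = LensAPI.from_pretrained("meta-llama/Llama-3.2-1B")

# Fit CHG masks on one prompt/target pair.
# Each count is set to 1 to keep this example quick.
results = analyzer.fit(
    texts=["What is the capital of France?"],
    targets=["Paris"],
    num_masks=1,
    num_updates=1,
    num_reg_updates=1,
    batch_size=1,
    verbose=False,
)

# Print a summary of the fit, then the first rows of the necessary-heads result.
print(results.summary())
print(results.necessary_heads().head())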

Usage Examples

CHG analysis

from inteprelens import LensAPI

analyzer = LensAPI.from_pretrained("meta-llama/Llama-3.2-1B")

results = analyzer.fit(
    texts=[
        "The capital of France is",
        "2 + 2 equals",
    ],
    targets=[
        "Paris",
        "4",
    ],
    num_masks=1,
    num_updates=1,
    num_reg_updates=1,
    batch_size=1,
    verbose=False,
)

necessary = results.necessary_heads()
taxonomy = results.head_taxonomy()

print(necessary.head())
print(taxonomy.head())

Trace transformer stages and logits

from inteprelens import LensAPI

analyzer = LensAPI.from_pretrained("meta-llama/Llama-3.2-1B")

trace = analyzer.trace(
    texts="Paris is the capital of",
    layers=[0],
    sites=["attn_o_proj_pre", "final_norm", "logits"],
)

print(trace.get("attn_o_proj_pre", 0).shape)
print(trace.get("logits", 0).shape)
print(trace.final_logits.shape)

trace.final_logits contains the model's final output logits for the traced batch.
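
For intuition about per-layer logits, one common way to read logits out of an intermediate state is the "logit lens": apply the final norm to a block output, then project it with the unembedding matrix. The toy below shows that readout in plain Python. Whether the `logits` site in inteprelens is computed exactly this way is an assumption, and `rms_norm`, `unembed`, `layer_logits`, and all numbers are hypothetical.

```python
import math


def rms_norm(hidden, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square of the vector, then scale per dimension."""
    mean_square = sum(h * h for h in hidden) / len(hidden)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [h * scale * w for h, w in zip(hidden, weight)]


def unembed(hidden, unembedding):
    """Project a hidden state onto the vocabulary: one dot product per vocabulary row."""
    return [sum(h * u for h, u in zip(hidden, row)) for row in unembedding]


def layer_logits(block_output, norm_weight, unembedding):
    """Logit-lens style readout: final norm followed by the unembedding matrix."""
    return unembed(rms_norm(block_output, norm_weight), unembedding)


# Toy model: hidden size 2, vocabulary of 3 tokens.
norm_weight = [1.0, 1.0]
unembedding = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
# Hypothetical block outputs for the last token at two layers.
block_outputs = {0: [0.5, 0.1], 1: [0.2, 2.0]}

for layer, hidden in block_outputs.items():
    logits = layer_logits(hidden, norm_weight, unembedding)
    top_token = max(range(len(logits)), key=logits.__getitem__)
    print(layer, top_token, [round(x, 3) for x in logits])
```

Comparing the top token across layers in this way is what makes per-layer logits useful: it shows at which depth the model's prediction starts to resemble its final output.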

Score final-token logits, log-probabilities, and probabilities

from inteprelens import LensAPI

analyzer = LensAPI.from_pretrained("meta-llama/Llama-3.2-1B")

scores = analyzer.score(
    texts=[
        "The capital of France is",
        "2 + 2 equals",
    ],
    temperature=1.0,
)

print(scores.logits.shape)
print(scores.log_probs.shape)
print(scores.probs.shape)

final_logits = scores.final_token_logits()
final_log_probs = scores.final_token_log_probs()
final_probs = scores.final_token_probs()

print(final_logits.shape)
print(final_log_probs.shape)
print(final_probs.shape)

Use logits as the canonical calibration output, derive log_probs for stable token scoring, and use probs when you need confidence-style metrics.
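
The relationship between the three outputs can be sketched in plain Python. This is a minimal illustration, assuming the usual convention that temperature divides the logits before the softmax; how `analyzer.score` applies `temperature` internally is not confirmed here, and `log_softmax` is a hypothetical helper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax using the log-sum-exp trick for stability."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtracting the peak keeps exp() from overflowing on large logits.
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_norm for x in scaled]


logits = [2.0, 1.0, 0.0]

# Log-probabilities come from the logits; probabilities come from exponentiating them.
log_probs = log_softmax(logits, temperature=1.0)
probs = [math.exp(lp) for lp in log_probs]

# A temperature above 1 flattens the distribution, lowering the top probability.
soft = [math.exp(lp) for lp in log_softmax(logits, temperature=2.0)]

print([round(p, 4) for p in probs])
print([round(p, 4) for p in soft])
```

Working in log space is what makes `log_probs` the stable choice for token scoring: summing log-probabilities over a sequence avoids the underflow that multiplying many small probabilities would cause.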

Gradient-based attribution through facilitating heads

from inteprelens import CausalCircuitAttribution, LensAPI

analyzer = LensAPI.from_pretrained("meta-llama/Llama-3.2-1B")

results = analyzer.fit(
    texts=["The capital of France is"],
    targets=["Paris"],
    num_masks=1,
    num_updates=1,
    num_reg_updates=1,
    batch_size=1,
    verbose=False,
)

facilitating_mask = results.get_facilitating_mask()
attribution = CausalCircuitAttribution(analyzer.model, analyzer.tokenizer)

sentence_scores = attribution.compute_sentence_importance(
    document="Paris is the capital of France. It is one of Europe's largest cities.",
    sentences=[
        "Paris is the capital of France.",
        "It is one of Europe's largest cities.",
    ],
    summary="Paris is the capital of France.",
    facilitating_mask=facilitating_mask,
)

print(sentence_scores)

Public API

- LensAPI: high-level interface for CHG fitting, token scoring, and named-site tracing
- TransformerTracer: lower-level tracer for direct transformer-site collection
- CausalCircuitAttribution: gradient attribution through facilitating CHG heads
- CHGDataset: helper for building CHG-ready datasets from prompt/target pairs

Acknowledgements

inteprelens builds on and adapts code and ideas from the Causal Head Gating project. The extracted core package keeps CHG support and extends it with transformer-stage tracing and internal-logit inspection for architecture analysis workflows.

License

This project is released under the MIT License. See LICENSE.

Project details

These details have been verified by PyPI

Maintainers

Aisuko

Release history

0.2.1: Apr 5, 2026
0.1.0: Mar 9, 2026 (this version)

Download files

Source Distribution

inteprelens-0.1.0.tar.gz (53.9 kB)

Uploaded Mar 9, 2026 Source

Built Distribution

inteprelens-0.1.0-py3-none-any.whl (59.7 kB)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file inteprelens-0.1.0.tar.gz.

File metadata

Download URL: inteprelens-0.1.0.tar.gz
Upload date: Mar 9, 2026
Size: 53.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for inteprelens-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`590b0b34f7846130f3737e7a8826ac130bfa9e8e7168aedf80ebe7dfe2391399`
MD5	`cd7be8ec6f16d51678fb2aabc2c0ef7e`
BLAKE2b-256	`3606a777c48588e568728d43dde05842c9dfeed619b9c08264f80ca9c15eb29e`

Provenance

The following attestation bundles were made for inteprelens-0.1.0.tar.gz:

Publisher: publish.yml on Aisuko/inteprelens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: inteprelens-0.1.0.tar.gz
- Subject digest: 590b0b34f7846130f3737e7a8826ac130bfa9e8e7168aedf80ebe7dfe2391399
- Sigstore transparency entry: 1065751160
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: Aisuko/inteprelens@b8a802257d2bcd17c690da7121de32a11f806043
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Aisuko
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b8a802257d2bcd17c690da7121de32a11f806043
- Trigger Event: release

File details

Details for the file inteprelens-0.1.0-py3-none-any.whl.

File metadata

Download URL: inteprelens-0.1.0-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 59.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for inteprelens-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fb41bee98cf3aac6c7e2033ce6954d4bc48712ae9c78246c9c9d18ae9a40b4db`
MD5	`650cbd8bb2211ad4f4c896f18c65b14c`
BLAKE2b-256	`daed511453e2c0df8e8344af91484a550ebbdd7ad12c47512a13d7e7854c2dd2`

Provenance

The following attestation bundles were made for inteprelens-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Aisuko/inteprelens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: inteprelens-0.1.0-py3-none-any.whl
- Subject digest: fb41bee98cf3aac6c7e2033ce6954d4bc48712ae9c78246c9c9d18ae9a40b4db
- Sigstore transparency entry: 1065751164
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: Aisuko/inteprelens@b8a802257d2bcd17c690da7121de32a11f806043
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Aisuko
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b8a802257d2bcd17c690da7121de32a11f806043
- Trigger Event: release
