XLens

A Library for Mechanistic Interpretability of Generative Language Models using JAX. Inspired by TransformerLens.

Overview

XLens is a JAX library for mechanistic interpretability of Transformer language models. Mechanistic interpretability aims to reverse-engineer the algorithms a model has learned during training, so that researchers and practitioners can understand how generative language models work internally.

Features

⚠️ Please Note: Some features are currently in development and may not yet be fully functional. We appreciate your understanding as we work to improve and stabilize the library.

  • Support for Hooked Modules: Interact with and modify internal model components seamlessly.
  • Model Alignment with Hugging Face: Outputs from XLens are consistent with Hugging Face's implementation, making it easier to integrate and compare results.
  • Caching Mechanism: Cache any internal activation for further analysis or manipulation during model inference.
  • Full Type Annotations: Comprehensive type annotations with generics and jaxtyping for better code completion and type checking.
  • Intuitive API: Designed with ease of use in mind, facilitating quick experimentation and exploration.
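The hook and cache features above can be pictured with a small, library-independent sketch. Nothing here is XLens's actual implementation or API: `run_with_hooks`, the layer names, and the toy layers are hypothetical, and NumPy stands in for JAX so the snippet runs without a model.

```python
from typing import Callable, Dict, List, Optional, Tuple

import numpy as np

Array = np.ndarray
Hook = Callable[[Array], Array]


def run_with_hooks(
    x: Array,
    layers: List[Tuple[str, Callable[[Array], Array]]],
    hooks: Optional[Dict[str, Hook]] = None,
) -> Tuple[Array, Dict[str, Array]]:
    """Apply named layers in order, letting hooks edit and cache each output.

    Hypothetical helper for illustration only; not part of XLens.
    """
    hooks = hooks or {}
    cache: Dict[str, Array] = {}
    for name, layer in layers:
        x = layer(x)
        if name in hooks:
            x = hooks[name](x)  # a hook may replace the activation
        cache[name] = x  # store the (possibly edited) activation
    return x, cache


# Two toy "layers" standing in for model components (hypothetical names).
layers = [
    ("double", lambda a: a * 2.0),
    ("shift", lambda a: a + 1.0),
]

x = np.array([1.0, 2.0, 3.0])

# Plain run: every intermediate activation is cached.
out, cache = run_with_hooks(x, layers)

# Intervention: zero out the output of "double" and observe the downstream effect.
ablated, _ = run_with_hooks(x, layers, hooks={"double": np.zeros_like})
```

Comparing `out` with `ablated` shows how much the final result depends on the "double" step. Interpretability experiments use the same pattern of caching an internal value, intervening on it, and measuring the change in the output.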

Installation

XLens can be installed via pip:

pip install xlens

Examples

Here are some basic examples to get you started with XLens.

Capturing Activations

from xlens import HookedTransformer
from transformers import AutoTokenizer

# Load a pre-trained model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model = HookedTransformer.from_pretrained("meta-llama/Llama-3.2-1B")

# Tokenize the prompt as NumPy arrays, then run the model and cache
# the attention output of the first block
inputs = tokenizer("Hello, world!", return_tensors="np")
logits, cache = model.run_with_cache(**inputs, hook_names=["blocks.0.hook_attn_out"])
print(cache["blocks.0.hook_attn_out"].shape) # (1, 5, 2048): (batch, position, d_model)
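Once an activation is cached, it can be analysed like any other array. The following is a minimal sketch, assuming the cached value converts to a NumPy array of shape (batch, position, d_model) as in the example above. A random stand-in array replaces the real cache entry so the snippet runs without downloading the model.

```python
import numpy as np

# Stand-in for cache["blocks.0.hook_attn_out"]; the shape matches the example above.
rng = np.random.default_rng(0)
attn_out = rng.standard_normal((1, 5, 2048)).astype(np.float32)

# L2 norm of the attention output at each token position.
token_norms = np.linalg.norm(attn_out, axis=-1)  # shape (1, 5)

# Position with the largest norm in the first (and only) sequence.
largest = int(np.argmax(token_norms[0]))
print(token_norms.shape, largest)
```

With a real cache entry, something like `np.asarray(cache["blocks.0.hook_attn_out"])` would presumably take the place of the random array. Whether that conversion is needed depends on the array type XLens returns, which is not stated here.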

Supported Models

XLens currently supports the following models:

Feel free to open an issue or pull request if you would like to see support for additional models.
