Neural Network Easy Extraction - Simple LLM activation extraction in one line

nnez

Neural Network Easy Extraction (nnez) is a lightweight Python package for extracting activation patterns from transformer language models with just a few lines of code. The package caches your embeddings: once an embedding has been computed for a piece of text, later calls load it from the cache instead of recomputing it.
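
To make the caching idea concrete, here is a minimal sketch of a disk cache keyed on the text, layers, and model name. It is an illustration only and not nnez's actual implementation: `cache_key`, `cached_embedding`, and `fake_compute` are hypothetical names, and the JSON-on-disk format is an assumption made so the example runs with the standard library alone.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(text, layers, model_name):
    """Hash everything that determines the embedding into a stable file name."""
    payload = json.dumps(
        {"text": text, "layers": list(layers), "model": model_name},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_embedding(text, layers, model_name, compute_fn, cache_dir):
    """Return a stored embedding if one exists; otherwise compute and store it."""
    path = Path(cache_dir) / f"{cache_key(text, layers, model_name)}.json"
    if path.exists():
        return json.loads(path.read_text())
    embedding = compute_fn(text, layers, model_name)
    path.write_text(json.dumps(embedding))
    return embedding


calls = []


def fake_compute(text, layers, model_name):
    """Hypothetical stand-in for a slow model forward pass."""
    calls.append(text)
    return [[float(len(text) + layer)] * 4 for layer in layers]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_embedding("Hello world!", [5, 10], "gpt2", fake_compute, tmp)
    second = cached_embedding("Hello world!", [5, 10], "gpt2", fake_compute, tmp)

# The second call is served from disk, so the compute function ran only once.
print(first == second, len(calls))  # True 1
```

Including the model name and layer list in the key keeps embeddings from different models or layers from colliding, which any cache of this kind would need to handle in some way.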

This is designed to be maximally simple and was originally meant to help in cases where you just want to create an embedding based on a text, which is a common use case in cognitive neuroscience research. This package is built on top of NNsight.

Below, I describe how to use the package. See also examples/quickstart.py. The example below uses "gpt2", which is quick to run but produces low-quality embeddings. You may want to try a more recent model such as "meta-llama/Llama-3.2-3B".

📦 Installation

pip install nnez

🎮 Quick Start

Extract Activations from any Transformer LLM

from nnez import get_activity_from_text

# Extract activations from specific layers of GPT-2
text = "The capital of France is Paris."
layers = [5, 10]  # Extract from layers 5 and 10

activations = get_activity_from_text(
    text=text,
    layers_list=layers,
    model_name="gpt2"  # Any Hugging Face model works (e.g., "meta-llama/Llama-3.2-3B")
)

print(activations.shape)  # (2, 768) - 2 layers, 768 dimensions each

Single Layer Extraction

# Extract from a single layer (returns 1D array)
act = get_activity_from_text("Hello world!", 11)  # Layer 11 only
print(act.shape)  # (768,) - Single layer, flattened

Batch Processing

import numpy as np

texts = ["First text", "Second text", "Third text"]
all_activations = []

for text in texts:
    act = get_activity_from_text(text, [0, 6, 11])
    all_activations.append(act)

# Stack into 3D array: (num_texts, num_layers, hidden_size)
batch_activations = np.stack(all_activations)
print(batch_activations.shape)  # (3, 3, 768)
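
Once the activations are stacked, a common next step is to compare texts at a given layer. The sketch below computes a pairwise cosine similarity matrix with plain NumPy. It is not part of nnez: `cosine_similarity_matrix` is a hypothetical helper, and random data stands in for real activations so the example runs without downloading a model.

```python
import numpy as np

# Stand-in for the stacked output above: 3 texts, 3 layers, 768 dimensions.
rng = np.random.default_rng(0)
batch_activations = rng.standard_normal((3, 3, 768))


def cosine_similarity_matrix(batch, layer_index):
    """Pairwise cosine similarity between texts at one position on the layer axis."""
    vectors = batch[:, layer_index, :]          # (num_texts, hidden_size)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms                      # scale each row to unit length
    return unit @ unit.T                        # (num_texts, num_texts)


similarity = cosine_similarity_matrix(batch_activations, layer_index=2)
print(similarity.shape)  # (3, 3)
```

Note that `layer_index` is the position within the requested layer list, so with layers [0, 6, 11] an index of 2 selects layer 11.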

Grammar Utilities

The package includes some grammar utilities leveraging the inflect library. These are helpful if you want to do an analysis like that in the associated LLM-RSA paper (Bogdan et al., under review).

from nnez.grammar import get_article, pluralize, quantify

# Smart article detection
get_article("hour")       # "an" (silent h)
get_article("university") # "a"  (y-sound)
get_article("FBI")        # "an" (eff-bee-eye)

# Pluralization
pluralize("child")       # "children"
pluralize("analysis")    # "analyses"
pluralize("octopus")     # "octopuses"

# Quantification
quantify(0, "cat")        # "no cats"
quantify(1, "child")      # "1 child"
quantify(3, "child")      # "3 children"

📊 Output Shape Reference

Model         HuggingFace Name                  Hidden Size  Output Shape (3 layers)
GPT-2         gpt2                              768          (3, 768)
Llama 3.2 3B  meta-llama/Llama-3.2-3B           3072         (3, 3072)
Llama 3.1 8B  meta-llama/Llama-3.1-8B           4096         (3, 4096)
Qwen 2.5 3B   Qwen/Qwen2.5-3B                   2048         (3, 2048)
Qwen 2.5 7B   Qwen/Qwen2.5-7B                   3584         (3, 3584)
Mistral 7B    mistralai/Mistral-7B-v0.1         4096         (3, 4096)
Gemma 2 2B    google/gemma-2-2b                 2304         (3, 2304)
Phi-3 Mini    microsoft/Phi-3-mini-4k-instruct  3072         (3, 3072)
BERT Base     bert-base-uncased                 768          (3, 768)
BERT Large    bert-large-uncased                1024         (3, 1024)
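
The shape rule behind this table can be written down as a small helper, which is handy for preallocating arrays or sanity-checking results. This is a sketch based only on the shapes documented above: `HIDDEN_SIZES` and `expected_shape` are hypothetical names, not part of nnez, and only a few of the listed models are included.

```python
# Hidden sizes copied from the reference table above (subset).
HIDDEN_SIZES = {
    "gpt2": 768,
    "meta-llama/Llama-3.2-3B": 3072,
    "Qwen/Qwen2.5-7B": 3584,
    "bert-large-uncased": 1024,
}


def expected_shape(model_name, layers):
    """Shape documented for the given layers: 1D for an int, 2D for a list."""
    hidden = HIDDEN_SIZES[model_name]
    if isinstance(layers, int):
        return (hidden,)              # single layer, flattened
    return (len(layers), hidden)      # one row per requested layer


print(expected_shape("gpt2", [5, 10]))  # (2, 768)
print(expected_shape("gpt2", 11))       # (768,)
```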
