Tracing the memory of neural nets with data attribution
Bergson
This library enables you to trace the memory of deep neural nets with gradient-based data attribution techniques. We currently focus on TrackStar, as described in Scalable Influence and Fact Tracing for Large Language Model Pretraining by Chang et al. (2024), although we plan to add support for other methods inspired by influence functions in the near future.
We view attribution as a counterfactual question: If we "unlearned" this training sample, how would the model's behavior change? This formulation ties attribution to some notion of what it means to "unlearn" a training sample. Here we focus on a very simple notion of unlearning: taking a gradient ascent step on the loss with respect to the training sample. To mimic the behavior of popular optimizers, we precondition the gradient using Adam or Adafactor-style estimates of the second moments of the gradient.
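Concretely, the attribution score between a query and a training example reduces to a dot product of their gradients, each preconditioned by second-moment estimates. Here is a minimal NumPy sketch of that idea; the names and shapes are illustrative, not Bergson's API:

```python
import numpy as np

def preconditioned_score(g_query, g_train, v, eps=1e-8):
    """Dot product of two gradients after elementwise preconditioning
    by Adam-style second-moment estimates v."""
    p_query = g_query / (np.sqrt(v) + eps)
    p_train = g_train / (np.sqrt(v) + eps)
    return float(p_query @ p_train)

rng = np.random.default_rng(0)
g_q = rng.standard_normal(8)   # gradient of the query loss
g_t = rng.standard_normal(8)   # gradient of one training example
v = rng.random(8) + 0.1        # running estimate of squared gradients
score = preconditioned_score(g_q, g_t, v)
```

A large positive score suggests that an unlearning step on the training example would increase the query loss, i.e. the example supported the queried behavior.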
Announcements
September 2025
- Saving per-head gradients: https://github.com/EleutherAI/bergson/pull/40
- Eigendecompositions of preconditioners: https://github.com/EleutherAI/bergson/pull/34
- Dr. GRPO-based loss gradients: https://github.com/EleutherAI/bergson/pull/35
- Choosing between summing and averaging losses across tokens: https://github.com/EleutherAI/bergson/pull/36
- Saving the order in which training data is seen when using the gradient collector callback for HF's Trainer/SFTTrainer: https://github.com/EleutherAI/bergson/pull/40
  - Saving training gradients adds a ~17% wall-clock overhead
- Improved static index build ETA accuracy: https://github.com/EleutherAI/bergson/pull/41
- Several small quality of life improvements for querying indexes: https://github.com/EleutherAI/bergson/pull/38
Installation
We're not yet on PyPI, but you can git clone the repo and install it as a package using pip:
git clone https://github.com/EleutherAI/bergson.git
cd bergson
pip install .
Usage
The first step is to build an index of gradients for each training sample. You can do this from the command line, using bergson as a CLI tool:
bergson <output_path> --model <model_name> --dataset <dataset_name>
This will create a directory at <output_path> containing the gradients for each training sample in the specified dataset. The --model and --dataset arguments should be compatible with the Hugging Face transformers library. By default it assumes that the dataset has a text column, but you can specify other columns using --prompt_column and optionally --completion_column. The --help flag will show you all available options.
You can also use the library programmatically to build the index. The collect_gradients function is just a bit lower level than the CLI tool, and allows you to specify the model and dataset directly as arguments. The result is a Hugging Face dataset containing a handful of new columns, including gradients, which holds the gradients for each training sample. You can then use this dataset to compute attributions.
At the lowest level of abstraction, the GradientCollector context manager lets you efficiently collect gradients for each individual example in a batch during a backward pass, while simultaneously projecting the gradients to a random lower-dimensional space to save memory. If you use Adafactor normalization, which is the default, this is done in a compute-efficient way that avoids materializing the full gradient for each example before projecting it. There are two main ways to use GradientCollector:
- Using a `closure` argument, which enables you to make use of the per-example gradients immediately after they are computed, during the backward pass. If you're computing summary statistics or other per-example metrics, this is the most efficient way to do it.
- Without a `closure` argument, in which case the gradients are collected and returned as a dictionary mapping module names to batches of gradients. This is the simplest and most flexible approach, but is a bit more memory-intensive.
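The random projection that GradientCollector applies can be sketched in plain NumPy; the shapes below are made up for illustration and this is not the library's internal implementation:

```python
import numpy as np

rng = np.random.default_rng(42)
batch, d_full, d_proj = 4, 1024, 32

# Per-example gradients for one module, flattened: (batch, d_full)
grads = rng.standard_normal((batch, d_full))

# Fixed random projection matrix, scaled to roughly preserve norms
P = rng.standard_normal((d_full, d_proj)) / np.sqrt(d_proj)

# Project each example's gradient into the lower-dimensional space
projected = grads @ P
```

By the Johnson-Lindenstrauss lemma, such random projections approximately preserve inner products, so attribution scores computed on projected gradients approximate the full-gradient scores at a fraction of the storage cost.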
Training Gradients
Gradient collection during training is supported via an integration with HuggingFace's Trainer and SFTTrainer classes. Training gradients are saved in the original order of their corresponding dataset items, and when the track_order flag is set, the training steps associated with each item are saved separately.
from transformers import Trainer

from bergson import GradientCollectorCallback, prepare_for_gradient_collection

# model, training_args, and dataset are defined elsewhere
callback = GradientCollectorCallback(
path="runs/example",
track_order=True,
use_optimizer_state=False,
)
trainer = Trainer(
model=model,
args=training_args,
train_dataset=dataset,
eval_dataset=dataset,
callbacks=[callback],
)
trainer = prepare_for_gradient_collection(trainer)
trainer.train()
Attention Head Gradients
By default Bergson collects gradients for named parameter matrices, but gradients for individual attention heads within a named matrix can be collected too. To collect head gradients, add a head_cfgs dictionary to the training callback or static index config.
from bergson import HeadConfig, collect_gradients
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("RonenEldan/TinyStories-1M", trust_remote_code=True, use_safetensors=True)
# data and processor are prepared elsewhere
collect_gradients(
model=model,
data=data,
processor=processor,
path="runs/example_with_heads",
head_cfgs={
# Head configuration for the TinyStories-1M transformer
"h.0.attn.attention.out_proj": HeadConfig(num_heads=16, head_size=4, head_dim=2),
},
)
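To illustrate what a head decomposition buys you, here is a toy NumPy sketch of splitting an out-projection weight gradient into per-head blocks along its input dimension. The shapes match the TinyStories-1M config above (hidden size 64 = 16 heads of size 4); this is only an illustration, not Bergson's internal layout:

```python
import numpy as np

num_heads, head_size, hidden = 16, 4, 64  # hidden = num_heads * head_size

# Gradient of an out-projection weight matrix: (hidden, hidden)
grad = np.arange(hidden * hidden, dtype=float).reshape(hidden, hidden)

# Group the input dimension into contiguous per-head blocks,
# yielding one (hidden, head_size) gradient slice per attention head
per_head = grad.reshape(hidden, num_heads, head_size).transpose(1, 0, 2)
```

Each slice can then be projected and indexed independently, so attributions can be traced to individual heads rather than whole weight matrices.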
GRPO
Where a reward signal is available, we compute gradients using a weighted advantage estimate based on Dr. GRPO:
bergson <output_path> --model <model_name> --dataset <dataset_name> --reward_column <reward_column_name>
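As a rough illustration of the weighting, Dr. GRPO computes advantages by centering rewards within a group of rollouts, omitting the division by the group's standard deviation that vanilla GRPO performs. This is a simplified sketch, not Bergson's exact implementation:

```python
def drgrpo_advantages(rewards):
    """Center rewards within a group of rollouts. Unlike vanilla GRPO,
    Dr. GRPO does not divide by the group's standard deviation."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

# Four rollouts for the same prompt, with scalar rewards
advantages = drgrpo_advantages([1.0, 0.0, 0.5, 0.5])
```

The resulting advantages weight each rollout's loss gradient, so above-average rollouts contribute positively and below-average ones negatively.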
Queries
We provide an Attributor for queries, which supports unit-normalized gradients and KNN search out of the box.
from bergson import Attributor, FaissConfig
attr = Attributor(args.index, device="cuda")  # args.index: path to the saved gradient index
...
query_tokens = tokenizer(query, return_tensors="pt").to("cuda:0")["input_ids"]
# Query the index
with attr.trace(model.base_model, 5) as result:
model(query_tokens, labels=query_tokens).loss.backward()
model.zero_grad()
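With unit-normalized gradients, the query reduces to a cosine-similarity nearest-neighbor search, which can be sketched in toy NumPy (not the Attributor implementation):

```python
import numpy as np

def top_k(index_grads, query_grad, k):
    """Cosine-similarity search over a matrix of projected gradients."""
    index_norm = index_grads / np.linalg.norm(index_grads, axis=1, keepdims=True)
    query_norm = query_grad / np.linalg.norm(query_grad)
    scores = index_norm @ query_norm
    top = np.argsort(-scores)[:k]       # indices of the k highest scores
    return top, scores[top]

rng = np.random.default_rng(0)
index = rng.standard_normal((100, 16))  # 100 training examples, 16-dim projections
query = index[7] + 0.01 * rng.standard_normal(16)  # nearly identical to example 7
ids, scores = top_k(index, query, k=5)
```

An exact search like this scans every row; the FAISS config below trades a little recall for much faster approximate search on large, on-disk indexes.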
To efficiently query on-disk indexes, perform ANN searches, and use many other scalability features, add a FAISS config:
attr = Attributor(args.index, device="cuda", faiss_cfg=FaissConfig("IVF1,SQfp16", mmap_index=True))
with attr.trace(model.base_model, 5) as result:
model(query_tokens, labels=query_tokens).loss.backward()
model.zero_grad()
Development
pip install -e ".[dev]"
pytest
We use conventional commits for releases.