A library for doing research on developmental interpretability
DevInterp
A Python Library for Developmental Interpretability Research
DevInterp is a Python library for conducting research on developmental interpretability, a novel AI safety research agenda rooted in Singular Learning Theory (SLT). DevInterp proposes tools for detecting, locating, and ultimately controlling the development of structure over training.
Read more about developmental interpretability.
:warning: This library is still in early development. Don't expect things to work on a first attempt. We are actively working on improving the library and adding new features.
Installation
To install devinterp, simply run pip install devinterp. (Note: This has PyTorch as a dependency.)
Minimal Example
from devinterp.slt.sampler import sample, LLCEstimator
from devinterp.optim import SGLD
from devinterp.utils import default_nbeta
# Assumes a PyTorch model is assigned to `model` and a DataLoader to `trainloader`
llc_estimator = LLCEstimator(..., nbeta=default_nbeta(trainloader))
sample(model, trainloader, ..., callbacks=[llc_estimator])
llc_mean = llc_estimator.get_results()["llc/mean"]
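To build intuition for the number the estimator reports, the following is a minimal sketch that does not use DevInterp at all. It assumes the commonly used estimator form, nbeta * (E[L(w)] - L(w*)), where the expectation is taken over samples from a tempered posterior localized around the trained parameters w*. Everything here (the toy quadratic loss, `estimate_llc`, `gamma`, `step_size`, and the other parameter names) is hypothetical and chosen for illustration; the gradient is exact, so this is plain Langevin dynamics rather than minibatch SGLD.

```python
import math
import random


def loss(w):
    """Toy quadratic loss with its minimum at the origin (a regular model)."""
    return 0.5 * sum(x * x for x in w)


def grad_loss(w):
    """Gradient of the toy quadratic loss."""
    return list(w)


def estimate_llc(w_star, nbeta, gamma=1.0, step_size=1e-3,
                 num_chains=4, num_draws=5000, num_burnin=500, seed=0):
    """Estimate nbeta * (E[L(w)] - L(w_star)) with Langevin sampling.

    Chains target the density proportional to
    exp(-nbeta * L(w) - gamma / 2 * |w - w_star|^2).
    All argument names are illustrative, not DevInterp's API.
    """
    rng = random.Random(seed)
    init_loss = loss(w_star)
    noise_scale = math.sqrt(step_size)
    chain_means = []
    for _ in range(num_chains):
        w = list(w_star)  # every chain starts at the trained parameters
        total = 0.0
        for step in range(num_burnin + num_draws):
            g = grad_loss(w)
            # Drift towards low loss and towards w_star, plus Gaussian noise.
            w = [
                x - 0.5 * step_size * (nbeta * gx + gamma * (x - x0))
                + noise_scale * rng.gauss(0.0, 1.0)
                for x, gx, x0 in zip(w, g, w_star)
            ]
            if step >= num_burnin:
                total += loss(w)
        chain_means.append(total / num_draws)
    mean_loss = sum(chain_means) / num_chains
    return nbeta * (mean_loss - init_loss)


llc_hat = estimate_llc(w_star=[0.0] * 4, nbeta=100.0)
print(f"estimated LLC: {llc_hat:.2f} (theory for this toy model: d/2 = 2.0)")
```

For a regular model with d parameters the learning coefficient is d/2, so the estimate should land close to 2 here, up to sampling noise and discretization bias from the finite step size. Real networks are singular and far more sensitive to the step size, localization strength, and nbeta.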
Advanced Usage
To see DevInterp in action, check out our example notebooks. For more advanced usage, see the Diagnostics notebook; for a quick guide on picking hyperparameters, see the Grokking Demo or the Calibration notebook.
Documentation can be found here.
For papers that either inspired or used the DevInterp package, click here.
Known Issues
- LLC estimation is currently more of an art than a science. It will take some time and pain to get it to work reliably.
If you run into issues not mentioned here, please first check the GitHub issues, then ask in the DevInterp Discord, and only then open a new GitHub issue.
Contributing
See CONTRIBUTING.md for guidelines on how to contribute.
Credits & Citations
This package was created by Timaeus. The main contributors to this package are Stan van Wingerden, Jesse Hoogland, George Wang, and William Zhou. Zach Furman, Matthew Farrugia-Roberts, Rohan Hitchcock, and Edmund Lau also made valuable contributions or provided useful advice.
If this package was useful in your work, please cite it as:
@misc{devinterpcode,
title = {DevInterp},
author = {van Wingerden, Stan and Hoogland, Jesse and Wang, George and Zhou, William},
year = {2024},
howpublished = {\url{https://github.com/timaeus-research/devinterp}},
}
Optional Dependencies
DevInterp offers additional visualization functionalities that are not included in the base installation. To enable these features, install the package with the vis extra:
pip install devinterp[vis]
This will install plotly, which is required for the visualization utilities provided in vis_utils.py.
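As a small convenience, a script can check whether plotly is present before relying on the visualization utilities. The sketch below uses only the standard library; the helper name `has_module` is hypothetical and not part of DevInterp.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if has_module("plotly"):
    print("plotly found: the visualization utilities should be usable")
else:
    print("plotly missing: run `pip install devinterp[vis]` to enable them")
```

Note that some shells (zsh, for example) treat square brackets specially, so the install command may need quoting there: pip install "devinterp[vis]".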
Download files
File details
Details for the file devinterp-1.3.2.tar.gz.
File metadata
- Download URL: devinterp-1.3.2.tar.gz
- Upload date:
- Size: 51.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c42b1ea079ac9219f7d849b79cf029663fa3ea48817c1b5001ce326e53deae6d |
| MD5 | 121a2c7782dddd82a2210bc87e4cfdd1 |
| BLAKE2b-256 | 04f581671da0aa92963ff74bbde3f3fea5d8fd1f2acdf82edecd804ffb1339d6 |
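A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch: the helper names are hypothetical, and the local path assumes the source distribution was saved to the current directory under its original file name.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Assumed local download location (hypothetical); digest copied from the table.
sdist_path = "devinterp-1.3.2.tar.gz"
sdist_sha256 = "c42b1ea079ac9219f7d849b79cf029663fa3ea48817c1b5001ce326e53deae6d"

if os.path.exists(sdist_path):
    ok = matches_published_digest(sdist_path, sdist_sha256)
    print("digest matches" if ok else "digest MISMATCH: do not install this file")
else:
    print(f"{sdist_path} not found; download it first")
```

pip can enforce the same check at install time through hash-checking mode (`--require-hashes` with pinned hashes in a requirements file).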
Provenance
The following attestation bundles were made for devinterp-1.3.2.tar.gz:
Publisher: publish.yml on timaeus-research/devinterp
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: devinterp-1.3.2.tar.gz
  - Subject digest: c42b1ea079ac9219f7d849b79cf029663fa3ea48817c1b5001ce326e53deae6d
- Sigstore transparency entry: 165296786
- Sigstore integration time:
- Permalink: timaeus-research/devinterp@a646b9dc8110d2b51da0e9babf0b99996c6626f4
- Branch / Tag: refs/tags/v1.3.2
- Owner: https://github.com/timaeus-research
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a646b9dc8110d2b51da0e9babf0b99996c6626f4
- Trigger Event: release
File details
Details for the file devinterp-1.3.2-py3-none-any.whl.
File metadata
- Download URL: devinterp-1.3.2-py3-none-any.whl
- Upload date:
- Size: 63.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c17bf117cf33c00d983d384ae5ae6b7c595eb04da91a2ec7eb61db96c5eef67 |
| MD5 | a3a96a1bdf781ec0a6a2306bfa873dd4 |
| BLAKE2b-256 | e556ba95c52ff28a4062814d88dea423f3c1b25ff345e8d36cecab6753037c2b |
Provenance
The following attestation bundles were made for devinterp-1.3.2-py3-none-any.whl:
Publisher: publish.yml on timaeus-research/devinterp
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: devinterp-1.3.2-py3-none-any.whl
  - Subject digest: 9c17bf117cf33c00d983d384ae5ae6b7c595eb04da91a2ec7eb61db96c5eef67
- Sigstore transparency entry: 165296787
- Sigstore integration time:
- Permalink: timaeus-research/devinterp@a646b9dc8110d2b51da0e9babf0b99996c6626f4
- Branch / Tag: refs/tags/v1.3.2
- Owner: https://github.com/timaeus-research
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@a646b9dc8110d2b51da0e9babf0b99996c6626f4
- Trigger Event: release