Tools for applying circuits-style interpretability techniques to RL agents.
CircRL
A small library of mechanistic interpretability tools, primarily focused on interpreting RL policies (though most of the tools are general).
The library has three main components:
- Hooks: Simple, safe wrappers around PyTorch forward hooks that make it easy to cache activations, patch them, and run arbitrary hook functions.
- Probing: Tools for training linear probes on model activations (or any other data), including sparse probes.
- Rollouts: Tools for running rollouts and collecting various kinds of data through a unified interface.
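To make the caching and patching ideas concrete, here is a minimal sketch that uses only plain PyTorch forward hooks. It does not use CircRL's API: the toy `policy` network and the `cache_hook` and `zero_patch_hook` names are hypothetical and exist only for illustration. It shows the underlying mechanism that a hooks wrapper would manage, including removing each hook after use so it cannot leak into later forward passes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy policy network: 4-dim observation -> 2 action logits.
policy = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
obs = torch.randn(3, 4)  # batch of 3 observations
hidden_layer = policy[1]  # hook the ReLU output

cache = {}


def cache_hook(module, inputs, output):
    # Caching: store a detached copy of the activation; returning None
    # leaves the forward pass unchanged.
    cache["hidden"] = output.detach().clone()


def zero_patch_hook(module, inputs, output):
    # Patching: a value returned from a forward hook replaces the
    # module's output for the rest of the forward pass.
    return torch.zeros_like(output)


# Run once with the caching hook, always removing it afterwards.
handle = hidden_layer.register_forward_hook(cache_hook)
try:
    with torch.no_grad():
        clean_logits = policy(obs)
finally:
    handle.remove()

# Run again with the hidden activations zero-ablated.
handle = hidden_layer.register_forward_hook(zero_patch_hook)
try:
    with torch.no_grad():
        patched_logits = policy(obs)
finally:
    handle.remove()

print("cached activation shape:", tuple(cache["hidden"].shape))
print("max logit change from ablation:",
      (clean_logits - patched_logits).abs().max().item())
```

The `try`/`finally` pattern is the part that a "safe" wrapper would typically automate: a hook that is never removed silently alters every later forward pass.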
Installation
CircRL is available on PyPI, and can be installed with pip:
pip install circrl
Usage
A detailed, self-contained demo of CircRL is available in the CircRL demo notebook.
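The demo notebook remains the reference for CircRL's actual API. As a library-independent illustration of what a sparse linear probe does, the sketch below fits an L1-regularized logistic regression with proximal gradient descent in NumPy. Everything in it is an assumption made for the example: the synthetic `acts` array stands in for cached model activations, and `fit_sparse_probe`, `soft_threshold`, and the hyperparameter values are hypothetical names and choices, not part of CircRL.

```python
import numpy as np


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrinks weights toward zero
    # and sets small ones exactly to zero.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_sparse_probe(x, y, l1=0.05, lr=0.5, steps=500):
    """L1-regularized logistic regression via proximal gradient descent."""
    n, d = x.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probabilities
        grad_w = x.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = soft_threshold(w - lr * grad_w, lr * l1)
        b -= lr * grad_b  # the bias is not penalized
    return w, b


# Synthetic stand-in for cached activations: 500 samples, 64 features,
# with the label depending on only three of them.
rng = np.random.default_rng(0)
n_samples, n_features = 500, 64
acts = rng.normal(size=(n_samples, n_features))
informative = [3, 17, 42]
logits = acts[:, informative] @ np.array([2.0, -2.0, 1.5])
labels = (logits > 0).astype(float)

# Train on the first 400 samples, evaluate on the remaining 100.
w, b = fit_sparse_probe(acts[:400], labels[:400])
preds = (acts[400:] @ w + b > 0).astype(float)
accuracy = float(np.mean(preds == labels[400:]))
nonzero = np.flatnonzero(w)

print("held-out accuracy:", accuracy)
print("features with nonzero probe weight:", nonzero.tolist())
```

The L1 penalty is what makes the probe sparse: features whose gradient never exceeds the penalty strength keep a weight of exactly zero, so the surviving nonzero weights point at the features the probe actually relies on.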
License
CircRL is licensed under the MIT license.
Citation
If you use CircRL in your research, please cite it as:
@misc{circrl,
author = {MacDiarmid, Monte},
title = {CircRL},
year = {2023},
url = {https://github.com/montemac/circrl}
}
File details
Details for the file circrl-1.0.0.tar.gz.
File metadata
- Download URL: circrl-1.0.0.tar.gz
- Upload date:
- Size: 149.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5ffef4e966564983683092697ecdc2240ce2e546e7f994aa9e2860d94df26ae4 |
| MD5 | 817f02e048c3506f663116a0db393826 |
| BLAKE2b-256 | 88b39e000ded2ed6ddebdc55ce47f6738a713818fcbb871d232d42c239da92a8 |
File details
Details for the file circrl-1.0.0-py3-none-any.whl.
File metadata
- Download URL: circrl-1.0.0-py3-none-any.whl
- Upload date:
- Size: 10.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32e1cb40580cf2c3e0f48994325cc50bdecbe1a92f4c7aee3519f43712cd18d3 |
| MD5 | 111c983c77309937a5cad26996b95f6c |
| BLAKE2b-256 | b4074f32be8d33799c0af8c993df316ee8cb710ee8638cf6b6c5bc6801d7517b |