MI-Chef: an audit suite for mechanistic interpretability - testing attribution graph faithfulness by serving corpora to cross-layer transcoders. v0.0.1 is an early-development release for an active research project; the audit suite ships with the paper.
MI-Chef
An audit suite for mechanistic interpretability: testing attribution graph faithfulness by serving corpora to cross-layer transcoders.
michef v0.0.1 is an early-development release that reserves the package name for an active research project. The full audit suite ships alongside the accompanying paper. If you landed here early: the API below is the roadmap, not yet the product.
Why
Attribution graphs — the causal stories produced by circuit tracing — are never computed on a model directly. They are computed through a replacement model (a cross-layer transcoder, CLT) trained on a corpus the researcher chooses. Whether the choice of corpus changes the story has never been measured. MI-Chef measures it, and packages the instruments so you can audit your own interpreters before trusting their testimony.
Roadmap (ships with the paper)
- michef.audit (the product): circuit stability score, four-level agreement metrics (feature / subspace / graph / narrative), seed and paraphrase noise floors, Procrustes gauge controls, anti-phantom validation battery.
- michef.pantry: loaders for the corpus-controlled CLT grid (HuggingFace).
- michef.serve: corpus-to-CLT recipes (thin wrapper over CLT-Forge; consumes, never reimplements).
- michef.taste: side-by-side attribution-graph comparison dashboards.
Integrates with circuit-tracer and CLT-Forge checkpoints.
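To make the "graph-level agreement" and "seed noise floor" ideas from the roadmap concrete, here is a minimal sketch in plain Python. It is not the michef API (which has not shipped) and not necessarily the metric the paper uses: it assumes an attribution graph can be reduced to a set of (source, target) edges, swaps in Jaccard distance as a stand-in agreement measure, and treats the noise floor as the mean distance between runs that differ only by training seed. Every function name and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|; two empty graphs count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def seed_noise_floor(seed_graphs: list) -> float:
    """Mean pairwise distance between graphs that differ only by seed."""
    pairs = list(combinations(seed_graphs, 2))
    if not pairs:
        raise ValueError("need at least two seed replicates")
    return mean(jaccard_distance(a, b) for a, b in pairs)


# Toy edge sets (hypothetical): three seeds on one corpus, one run on another.
seed_runs = [
    {("f1", "f2"), ("f2", "out")},
    {("f1", "f2"), ("f2", "out"), ("f3", "out")},
    {("f1", "f2"), ("f2", "out")},
]
other_corpus = {("f1", "f4"), ("f4", "out")}

floor = seed_noise_floor(seed_runs)
cross = mean(jaccard_distance(g, other_corpus) for g in seed_runs)

# A corpus effect is only interesting if it exceeds seed-to-seed variation.
exceeds_floor = cross > floor
```

The design point is the comparison on the last line: a cross-corpus disagreement is reported relative to the disagreement that seeds alone produce, rather than as a raw number.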
Status
Active research project targeting v0.1 with the paper release. Watch this space.
License
MIT
Download files
File details
Details for the file michef-0.0.1.tar.gz.
File metadata
- Download URL: michef-0.0.1.tar.gz
- Upload date:
- Size: 2.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09b526f7347cb413e73dcd14a51192e0b472ac6ceff8fb7f90c5a166d46a3e27 |
| MD5 | c6c0229d53cb78f66d054dc2937c35b9 |
| BLAKE2b-256 | 3b3869ec63a7814f28d43a01f10199fe092e091bf1a7a0a29218d02ee531cb6b |
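The digests above can be checked against a downloaded copy of michef-0.0.1.tar.gz. The following is a minimal sketch using only Python's standard `hashlib`; the helper names (`stream_digest`, `verify_file`) and the local file path are illustrative assumptions, and BLAKE2b-256 is computed as BLAKE2b with a 32-byte digest.

```python
import hashlib

# Digests published above for michef-0.0.1.tar.gz.
PUBLISHED = {
    "sha256": "09b526f7347cb413e73dcd14a51192e0b472ac6ceff8fb7f90c5a166d46a3e27",
    "blake2b-256": "3b3869ec63a7814f28d43a01f10199fe092e091bf1a7a0a29218d02ee531cb6b",
}


def _new_hasher(algorithm: str):
    """Create a hashlib object for one of the supported algorithm names."""
    if algorithm == "sha256":
        return hashlib.sha256()
    if algorithm == "blake2b-256":
        return hashlib.blake2b(digest_size=32)
    raise ValueError(f"unsupported algorithm: {algorithm}")


def stream_digest(stream, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks and return the lowercase hex digest."""
    hasher = _new_hasher(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: str, expected: str, algorithm: str = "sha256") -> bool:
    """Return True if the file at `path` hashes to `expected`."""
    with open(path, "rb") as f:
        return stream_digest(f, algorithm) == expected.lower()
```

Assuming the archive was saved to the current directory, `verify_file("michef-0.0.1.tar.gz", PUBLISHED["sha256"])` should return `True` for an intact download.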
File details
Details for the file michef-0.0.1-py3-none-any.whl.
File metadata
- Download URL: michef-0.0.1-py3-none-any.whl
- Upload date:
- Size: 2.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 09650dc4c9282e08093a796bc1302790e26e3232ea6c25cf4cef78443f742e2d |
| MD5 | 4868207dbd8fc63685c676c6a76de51f |
| BLAKE2b-256 | b2bdebc1aa2c69de7934088ec0f1d548eb95b99f15ccc61a42f0af488ae4732d |