Kiji Inspector
Kiji Inspector: Mechanistic Interpretability for AI Agent Tool Selection
Status
This project is under heavy active development. We are planning to release a stable version of the framework in the coming weeks.
In the meantime, join our Slack Community.
Learn more about our approach and early results:
What This Project Does
This project trains Sparse Autoencoders (SAEs) on the internal activations of an AI agent to understand why it selects specific tools. Given a user request like "Search our docs for API limits," the agent must choose between tools (e.g., internal_search vs web_search). We extract the model's hidden representations at the moment of that decision, decompose them into interpretable features using a JumpReLU SAE, and validate the resulting explanations through automated fuzzing and causal ablation experiments.
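To make the decomposition step concrete, here is a minimal pure-Python sketch of a JumpReLU SAE forward pass. It is not the kiji_inspector implementation: the class name ToyJumpReLUSAE, the random untrained weights, and the single shared threshold value are assumptions made for illustration. The only part taken as given is the JumpReLU rule itself, which passes a pre-activation through unchanged when it exceeds a per-feature threshold and zeroes it otherwise.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def jump_relu(pre_acts, thresholds):
    """JumpReLU: keep a pre-activation only if it exceeds its threshold."""
    return [z if z > t else 0.0 for z, t in zip(pre_acts, thresholds)]


class ToyJumpReLUSAE:
    """Hypothetical toy SAE with random, untrained weights (illustration only)."""

    def __init__(self, d_model, d_features, threshold=0.1, seed=0):
        rng = random.Random(seed)
        self.w_enc = [[rng.gauss(0, 0.5) for _ in range(d_model)]
                      for _ in range(d_features)]
        self.b_enc = [0.0] * d_features
        self.w_dec = [[rng.gauss(0, 0.5) for _ in range(d_features)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model
        # A real JumpReLU SAE learns one threshold per feature.
        self.thresholds = [threshold] * d_features

    def encode(self, activation):
        pre = [z + b for z, b in zip(matvec(self.w_enc, activation), self.b_enc)]
        return jump_relu(pre, self.thresholds)

    def decode(self, features):
        return [x + b for x, b in zip(matvec(self.w_dec, features), self.b_dec)]


sae = ToyJumpReLUSAE(d_model=4, d_features=8)
features = sae.encode([0.5, -1.0, 0.25, 2.0])   # sparse feature vector
reconstruction = sae.decode(features)           # back to model dimension
```

Because the weights here are untrained, the reconstruction is not meaningful. Training adjusts the weights and thresholds so that a small number of active features reconstruct the original activation.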
The key insight: train the SAE on raw activations (not difference vectors), then use contrastive pairs post-hoc to identify which learned features correspond to specific tool-selection decisions. This preserves the SAE's general feature dictionary while enabling targeted analysis of decision-relevant features.
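One plausible way to do the post-hoc step is sketched below: encode both sides of each contrastive pair with the already-trained SAE, then rank features by the mean activation difference across pairs. The scoring rule, the function name rank_contrastive_features, and the toy numbers are assumptions for illustration; the project's actual selection criterion is not described here.

```python
def rank_contrastive_features(pairs, top_k=3):
    """Rank SAE features by mean activation difference across contrastive pairs.

    pairs: list of (features_a, features_b), where each element is a list of
    SAE feature activations for a prompt that led to tool A or tool B.
    Returns (feature_index, mean_difference) tuples, largest magnitude first.
    """
    n_features = len(pairs[0][0])
    mean_diff = [0.0] * n_features
    for feats_a, feats_b in pairs:
        for i in range(n_features):
            mean_diff[i] += (feats_a[i] - feats_b[i]) / len(pairs)
    ranked = sorted(range(n_features), key=lambda i: abs(mean_diff[i]), reverse=True)
    return [(i, mean_diff[i]) for i in ranked[:top_k]]


# Toy data: feature 1 fires for tool A, feature 3 fires for tool B,
# and feature 2 fires equally for both (so it is not decision-relevant).
pairs = [
    ([0.0, 2.0, 0.5, 0.0], [0.0, 0.0, 0.5, 1.5]),
    ([0.0, 1.0, 0.5, 0.0], [0.0, 0.0, 0.5, 0.5]),
]
top = rank_contrastive_features(pairs, top_k=2)
```

A positive difference marks a feature associated with the first tool in each pair, a negative one a feature associated with the second. The SAE itself is unchanged by this analysis, which is why its general feature dictionary is preserved.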
Install
For loading and running pretrained SAEs:
pip install kiji-inspector
For the full extraction, training, and analysis workflow:
pip install 'kiji-inspector[train]'
kiji-inspector[full] is also available as an alias for the same full stack.
Quick Start
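from kiji_inspector import SAE

# Load the pretrained SAE for layer 20 of the base model, plus its feature descriptions.
sae, feature_descriptions = SAE.from_pretrained(
    base_model="nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16",
    layer=20,
)

# `activations` is assumed to hold hidden states already extracted from that layer.
features = sae.encode(activations)
reconstruction = sae.decode(features)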
Training and data-generation entrypoints live under the package namespace:
python -m kiji_inspector.generate_pairs 1300
python -m kiji_inspector.pipeline --layers 10 20 30
Local vLLM patches
For local experiments that require the custom vllm extraction changes, rebuild the environment and apply the patch set from the repository root:
uv sync --no-cache --refresh --extra full --group dev
./patches/apply-patch.sh
The apply script installs every *.patch file under patches in lexical order:
01_allow_extract_hidden_states.patch
02_support_nemotron_models.patch
03_support_gemma3_models.patch
Additional workflow details live in patches/README_PATCH.md.
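Lexical ordering is plain string sorting, so the zero-padded numeric prefixes are what fix the application order. The Python snippet below only illustrates that ordering; it is not the apply script, and the unpadded names in the second example are hypothetical.

```python
# Patch names listed out of order on purpose.
patches = [
    "03_support_gemma3_models.patch",
    "01_allow_extract_hidden_states.patch",
    "02_support_nemotron_models.patch",
]

# Plain string sort: the zero-padded prefixes give the intended order.
apply_order = sorted(patches)

# Hypothetical unpadded names: "10_" sorts before "2_" because the
# comparison is character by character, not numeric.
unpadded_order = sorted(["2_second.patch", "10_tenth.patch"])
```

A new patch that must run after the existing ones would therefore need a prefix such as 04_ rather than 4_.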
🤝 Contributing
We welcome contributions! Whether you're fixing a bug, improving documentation, or proposing a new feature, your help is appreciated.
Ways to Contribute
- Report Bugs - Open an issue with steps to reproduce
- Improve Docs - Documentation PRs are always welcome
- Submit Features - Open an issue to discuss your idea before submitting a PR
- Share Feedback - Start a discussion
Community
- Slack - Join our community to ask questions and connect with other contributors
- Contributors - See CONTRIBUTORS.md for the list of people who have contributed
📄 License
Copyright (c) 2026 Dataiku SAS
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
File details
Details for the file kiji_inspector-0.5.0rc0.tar.gz.
File metadata
- Download URL: kiji_inspector-0.5.0rc0.tar.gz
- Upload date:
- Size: 102.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6cf2e8c0be7be91301cc5fdcbf7b687755d38ffff9bc6ae64fa13d9d0b17a128 |
| MD5 | 71352d2ac4de87a775f17a1054e5b7af |
| BLAKE2b-256 | 290e74351db9dad489a74d14f3cca8ad1277af2613792aa438cb574f82ab80f0 |
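A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the helper names sha256_of_file and verify_download are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib

# SHA256 digest published for kiji_inspector-0.5.0rc0.tar.gz.
EXPECTED_SHA256 = "6cf2e8c0be7be91301cc5fdcbf7b687755d38ffff9bc6ae64fa13d9d0b17a128"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected


# Example call, assuming the archive sits in the current directory:
# verify_download("kiji_inspector-0.5.0rc0.tar.gz")
```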
Provenance
The following attestation bundles were made for kiji_inspector-0.5.0rc0.tar.gz:
Publisher: publish-kiji-inspector.yml on dataiku/kiji-inspector
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kiji_inspector-0.5.0rc0.tar.gz
- Subject digest: 6cf2e8c0be7be91301cc5fdcbf7b687755d38ffff9bc6ae64fa13d9d0b17a128
- Sigstore transparency entry: 1550355466
- Sigstore integration time:
- Permalink: dataiku/kiji-inspector@279e4c6b3d26b2db0b87a053a192cc90f893b520
- Branch / Tag: refs/heads/feat/new-vllm-gemma4
- Owner: https://github.com/dataiku
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-kiji-inspector.yml@279e4c6b3d26b2db0b87a053a192cc90f893b520
- Trigger Event: workflow_dispatch
File details
Details for the file kiji_inspector-0.5.0rc0-py3-none-any.whl.
File metadata
- Download URL: kiji_inspector-0.5.0rc0-py3-none-any.whl
- Upload date:
- Size: 114.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 19dcbcdfc6810e8ab72c139c4334d622e609ef69dc6e2cb239f7ac29cdbf0730 |
| MD5 | 7a67af3515abfb5b4a4fc779f5bd60f4 |
| BLAKE2b-256 | 08529a570c8648fee5a54afcea20231329df74c9edf318dea97a336d2ca77111 |
Provenance
The following attestation bundles were made for kiji_inspector-0.5.0rc0-py3-none-any.whl:
Publisher: publish-kiji-inspector.yml on dataiku/kiji-inspector
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kiji_inspector-0.5.0rc0-py3-none-any.whl
- Subject digest: 19dcbcdfc6810e8ab72c139c4334d622e609ef69dc6e2cb239f7ac29cdbf0730
- Sigstore transparency entry: 1550355814
- Sigstore integration time:
- Permalink: dataiku/kiji-inspector@279e4c6b3d26b2db0b87a053a192cc90f893b520
- Branch / Tag: refs/heads/feat/new-vllm-gemma4
- Owner: https://github.com/dataiku
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-kiji-inspector.yml@279e4c6b3d26b2db0b87a053a192cc90f893b520
- Trigger Event: workflow_dispatch