Vision-Language Model Interpretability Analysis - One Token at a Time
OTaT: One Token at a Time
A toolkit for interpretability analysis of vision-language models (VLMs), focused on analyzing attention patterns in models such as LLaVA and Qwen-VL.
Installation
From PyPI
pip install otat
From GitHub
pip install git+https://github.com/varungupta31/otat_api.git
Local Development
git clone https://github.com/varungupta31/otat_api.git
cd otat_api
pip install -e .
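After any of the installation routes above, it can be useful to confirm that the package is visible to the current Python environment. The snippet below is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of otat, and the distribution name `otat` is taken from the `pip install otat` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "otat") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("otat is not installed in this environment")
    else:
        print(f"otat version: {found}")
```

This checks package metadata only; it does not import the package or load any model.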
Quick Start
from interpretability.api.wrapper import InterpretabilityAnalyzer
# Initialize analyzer for LLaVA OneVision 0.5B
# (the initializations in this section are alternatives; use only one)
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-0.5b-ov-hf",
    device='cuda:1',  # or 'auto', or whichever device you wish to load the model on
)
# Initialize analyzer for LLaVA OneVision 7B
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-7b-ov-hf",
    device='auto',
)
# Initialize analyzer for Qwen2.5-VL 3B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-3B-Instruct",
    device='auto',
)
# Initialize analyzer for Qwen2.5-VL 7B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-7B-Instruct",
    device='auto',
)
# Run analysis
result = analyzer.analyze(
    image_path="path/to/image.jpg",
    task_text="What is in this image?",
    instruction="Answer briefly.",
    blocking_mode="none",
    num_tokens=25,
)
print(result['output_tokens'])
print(result['series'])  # Attention patterns
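The Quick Start prints two keys of the result, but the exact structure of the returned object is not documented here. The sketch below shows one way to get an overview of whatever `analyze` returns before relying on specific fields. It assumes only that the result behaves like a dictionary, as the `result['output_tokens']` access suggests; `describe_result` is a hypothetical helper, not part of otat, and the sample data at the bottom is made up for illustration.

```python
from typing import Any, Dict, Mapping


def describe_result(result: Mapping[str, Any]) -> Dict[str, str]:
    """Summarize each entry of a result mapping by type and size.

    Hypothetical helper: it makes no assumption about what the values
    contain, only that `result` is dict-like.
    """
    summary: Dict[str, str] = {}
    for key, value in result.items():
        description = type(value).__name__
        if hasattr(value, "shape"):
            # Array-like values (e.g. NumPy arrays or tensors) expose a shape.
            description += f" shape={tuple(value.shape)}"
        elif hasattr(value, "__len__"):
            description += f" len={len(value)}"
        summary[key] = description
    return summary


if __name__ == "__main__":
    # Made-up stand-in for a real result; the real structure may differ.
    fake_result = {"output_tokens": ["A", "cat"], "series": {"example": [0.1, 0.2]}}
    for key, description in describe_result(fake_result).items():
        print(f"{key}: {description}")
```

With a real analyzer, `describe_result(result)` could be called directly on the object returned by `analyzer.analyze(...)`.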
Features
- 🔍 Attention pattern analysis for VLMs
- 🎯 Support for LLaVA, Qwen-VL, and Qwen2-LLM
- 🚫 Attention blocking experiments
- 📊 Token-level attention aggregation
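To make the idea of token-level attention aggregation concrete, the sketch below sums each generated token's attention weights over named groups of input positions (for example, image tokens versus text tokens). This is a generic illustration and not the toolkit's implementation: the function names, the group names, and the toy attention weights are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def aggregate_attention(
    weights: Sequence[float], groups: Dict[str, range]
) -> Dict[str, float]:
    """Sum one generated token's attention weights over each named span."""
    return {name: sum(weights[i] for i in span) for name, span in groups.items()}


def aggregate_per_token(
    rows: Sequence[Sequence[float]], groups: Dict[str, range]
) -> Dict[str, List[float]]:
    """Build one series per group, with one value per generated token."""
    series: Dict[str, List[float]] = {name: [] for name in groups}
    for weights in rows:
        for name, total in aggregate_attention(weights, groups).items():
            series[name].append(total)
    return series


if __name__ == "__main__":
    # Toy data: 2 generated tokens attending over 4 input positions.
    # Positions 0-1 are treated as image tokens, 2-3 as text tokens (assumed).
    rows = [[0.5, 0.25, 0.25, 0.0], [0.0, 0.25, 0.25, 0.5]]
    groups = {"image": range(0, 2), "text": range(2, 4)}
    print(aggregate_per_token(rows, groups))
```

In a real model, attention is also spread over layers and heads, so a reduction over those dimensions (for example, a mean) would be needed before this kind of per-group summation.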