Vision-Language Model Interpretability Analysis - One Token at a Time

OTaT: One Token at a Time

A toolkit for analyzing attention patterns in vision-language models such as LLaVA and Qwen-VL.

Installation

From PyPI

pip install otat

From GitHub

pip install git+https://github.com/varungupta31/otat_api.git

Local Development

git clone https://github.com/varungupta31/otat_api.git
cd otat_api
pip install -e .
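
After installing, it can be useful to confirm that the package is visible to Python. The sketch below assumes, judging only from the Quick Start import, that the distribution `otat` installs a top-level module named `interpretability`. The helper name `describe_install` is hypothetical and is not part of the package.

```python
import importlib.util
from importlib import metadata


def describe_install(dist_name="otat", module_name="interpretability"):
    """Return (installed_version_or_None, module_is_importable)."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec returns None for a missing top-level module instead of raising.
    importable = importlib.util.find_spec(module_name) is not None
    return version, importable


if __name__ == "__main__":
    version, importable = describe_install()
    print(f"otat version: {version}, interpretability importable: {importable}")
```

If the version is reported but the module is not importable, the assumed module name is probably wrong for the installed release.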

Quick Start

from interpretability.api.wrapper import InterpretabilityAnalyzer

# Choose one of the analyzers below; each assignment replaces the previous one.
# Initialize analyzer for LLaVA OneVision 0.5B
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-0.5b-ov-hf",
    device='cuda:1'  # or 'auto', or any other device you want to load the model on
)

# Initialize analyzer for LLaVA OneVision 7B
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-7b-ov-hf",
    device='auto'
)

# Initialize analyzer for Qwen2.5-VL 3B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-3B-Instruct",
    device='auto'
)

# Initialize analyzer for Qwen2.5-VL 7B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-7B-Instruct",
    device='auto'
)

# Run analysis
result = analyzer.analyze(
    image_path="path/to/image.jpg",
    task_text="What is in this image?",
    instruction="Answer briefly.",
    blocking_mode="none",
    num_tokens=25
)

print(result['output_tokens'])
print(result['series'])  # Attention patterns
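
The Quick Start prints `result['output_tokens']` and `result['series']` but does not describe their structure. One cautious way to explore the returned dictionary is to walk it and report types, lengths, and shapes. The helper `summarize` and the stand-in data below are hypothetical; the real layout of `series` may differ.

```python
def summarize(value, depth=0, max_depth=3):
    """Return a list of lines describing a nested result structure."""
    indent = "  " * depth
    if isinstance(value, dict):
        if depth >= max_depth:
            return [f"{indent}dict with {len(value)} keys"]
        lines = []
        for key, item in value.items():
            lines.append(f"{indent}{key!r}:")
            lines.extend(summarize(item, depth + 1, max_depth))
        return lines
    if isinstance(value, (list, tuple)):
        kind = type(value).__name__
        first = type(value[0]).__name__ if value else "n/a"
        return [f"{indent}{kind} of length {len(value)} (first item: {first})"]
    shape = getattr(value, "shape", None)  # tensors and arrays expose .shape
    if shape is not None:
        return [f"{indent}{type(value).__name__} with shape {tuple(shape)}"]
    return [f"{indent}{type(value).__name__}: {value!r}"]


# Stand-in data; a real call would pass the dict returned by analyzer.analyze().
fake_result = {"output_tokens": ["A", " cat"], "series": {"image": [0.7, 0.4]}}
print("\n".join(summarize(fake_result)))
```

Passing the actual `result` in place of `fake_result` shows which keys are present before any plotting or further processing is attempted.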

Features

  • 🔍 Attention pattern analysis for VLMs
  • 🎯 Support for LLaVA, Qwen-VL, and Qwen2-LLM
  • 🚫 Attention blocking experiments
  • 📊 Token-level attention aggregation
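
To make the blocking and aggregation features concrete, here is a toy sketch in plain Python. It is not the package's implementation: a real experiment would intervene inside the model's attention layers, and the `blocking_mode` values other than `"none"` are not documented here. The function names, the `[layer][head][key_position]` layout, and the span boundaries are all assumptions chosen for illustration.

```python
import math


def softmax_with_blocking(scores, blocked=()):
    """Softmax over key positions, with blocked positions forced to zero weight."""
    blocked = set(blocked)
    masked = [-math.inf if i in blocked else s for i, s in enumerate(scores)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("cannot block every key position")
    exps = [math.exp(s - peak) for s in masked]  # exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_by_span(attn, spans):
    """Average attention mass per named span over all layers and heads.

    attn:  nested list indexed as [layer][head][key_position]
    spans: dict mapping a name to a (start, end) half-open range of positions
    """
    rows = [head for layer in attn for head in layer]
    return {
        name: sum(sum(row[start:end]) for row in rows) / len(rows)
        for name, (start, end) in spans.items()
    }


# Toy example: one layer, two heads, five key positions.
# Positions 0-2 play the role of image tokens, 3-4 of text tokens.
raw_scores = [[[2.0, 1.0, 0.5, 0.1, 0.3], [0.2, 0.2, 0.2, 1.5, 1.0]]]
spans = {"image": (0, 3), "text": (3, 5)}

baseline = [[softmax_with_blocking(h) for h in layer] for layer in raw_scores]
blocked = [
    [softmax_with_blocking(h, blocked=range(0, 3)) for h in layer]
    for layer in raw_scores
]

print(aggregate_by_span(baseline, spans))
print(aggregate_by_span(blocked, spans))
```

Blocking the image positions moves all attention mass onto the text span, which is the kind of before-and-after comparison a blocking experiment is meant to expose.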

Download files

Source Distribution

otat-0.1.2.tar.gz (21.9 kB)

Uploaded Source

Built Distribution

otat-0.1.2-py3-none-any.whl (33.4 kB)

Uploaded Python 3

File details

Details for the file otat-0.1.2.tar.gz.

File metadata

  • Download URL: otat-0.1.2.tar.gz
  • Upload date:
  • Size: 21.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for otat-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       57a15f3e6ab57e3e96385f01ef0b875078187293f97d4ff2ca3e0033ddc8722a
MD5          2f50951c3b300b302dfecebde3afb0ff
BLAKE2b-256  b65889dd3c1b7dcc3505a30d852b5116150ea9bddadb4a07c3d6fda9fba56ac0

File details

Details for the file otat-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: otat-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 33.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.0

File hashes

Hashes for otat-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       aa422295069c753329e125054a8de9861e6db1d20511cd3cbe945749a147ae09
MD5          84f8aee5044b77d12cb3ef4a4a956ef6
BLAKE2b-256  7e54d60031c51099bae299fa73aaee3d56bee140e2cbf416f2ef1a4482e3e0df
