Vision-Language Model Interpretability Analysis - One Token at a Time

OTaT: One Token at a Time

Vision-Language Model Interpretability Analysis toolkit for analyzing attention patterns in models like LLaVA and Qwen-VL.

Installation

From PyPI

pip install otat

From GitHub

pip install git+https://github.com/varungupta31/otat_api.git

Local Development

git clone https://github.com/varungupta31/otat_api.git
cd otat_api
pip install -e .
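
The package is installed as otat but imported as interpretability (see Quick Start), which is easy to trip over. The sketch below is one way to confirm both after installation. check_install is a hypothetical helper, not part of OTaT; it uses only the Python standard library.

```python
import importlib.util
from importlib import metadata


def check_install(dist_name="otat", module_name="interpretability"):
    """Report the installed distribution version and whether the module is importable.

    Hypothetical helper for illustration; the default names are taken from
    the install command and the Quick Start import on this page.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    # find_spec returns None when a top-level module cannot be found
    importable = importlib.util.find_spec(module_name) is not None
    return {"version": version, "importable": importable}


if __name__ == "__main__":
    print(check_install())
```

If version is None or importable is False, the install most likely went into a different Python environment than the one running the script.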

Quick Start

from interpretability.api.wrapper import InterpretabilityAnalyzer

# The four initializations below are alternatives; keep only the one you need.
# Initialize analyzer for LLaVA OneVision 0.5B
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-0.5b-ov-hf",
    device='cuda:1'  # or 'auto', or any other device you want to load the model on
)

# Initialize analyzer for LLaVA OneVision 7B
analyzer = InterpretabilityAnalyzer(
    model_type="llava_onevision",
    model_id="llava-hf/llava-onevision-qwen2-7b-ov-hf",
    device='auto'
)

# Initialize analyzer for Qwen2.5-VL 3B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-3B-Instruct",
    device='auto'
)

# Initialize analyzer for Qwen2.5-VL 7B
analyzer = InterpretabilityAnalyzer(
    model_type="qwen_25_vl",
    model_id="Qwen/Qwen2.5-VL-7B-Instruct",
    device='auto'
)

# Run analysis
result = analyzer.analyze(
    image_path="path/to/image.jpg",
    task_text="What is in this image?",
    instruction="Answer briefly.",
    blocking_mode="none",
    num_tokens=25
)

print(result['output_tokens'])
print(result['series'])  # Attention patterns
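
This page does not describe the layout of the returned dictionary beyond the 'output_tokens' and 'series' keys, so printing it whole can be noisy. A minimal sketch for inspecting it without assuming its structure follows. describe_result is a hypothetical helper, not part of OTaT, and the sample dictionary is made-up stand-in data.

```python
def describe_result(result):
    """Summarize each entry of a result dict as a type name, plus length for containers.

    Hypothetical helper for illustration; it makes no assumption about what
    analyze() stores under each key.
    """
    summary = {}
    for key, value in result.items():
        if isinstance(value, (list, tuple, dict, str)):
            summary[key] = f"{type(value).__name__}(len={len(value)})"
        else:
            summary[key] = type(value).__name__
    return summary


if __name__ == "__main__":
    # Stand-in data; replace with the dict returned by analyzer.analyze(...)
    sample = {"output_tokens": ["A", "cat"], "series": {"layer_0": [0.1, 0.9]}}
    for key, description in describe_result(sample).items():
        print(f"{key}: {description}")
```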

Features

  • 🔍 Attention pattern analysis for VLMs
  • 🎯 Support for LLaVA, Qwen-VL, and Qwen2-LLM
  • 🚫 Attention blocking experiments
  • 📊 Token-level attention aggregation
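
To make the last two features concrete, here is a toy sketch of the general ideas, not OTaT's implementation. It assumes attention blocking means masking selected key positions before the softmax, and that aggregation means summing one generated token's attention weights over named groups of input positions. All function names, scores, and position groups are hypothetical.

```python
import math


def blocked_softmax(scores, blocked=frozenset()):
    """Softmax over attention scores, with positions in `blocked` forced to zero weight."""
    masked = [float("-inf") if i in blocked else s for i, s in enumerate(scores)]
    top = max(masked)
    if top == float("-inf"):
        raise ValueError("at least one position must remain unblocked")
    exps = [math.exp(s - top) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_by_group(weights, groups):
    """Sum attention weight over each named group of input positions."""
    return {name: sum(weights[i] for i in positions) for name, positions in groups.items()}


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5, 0.5]                  # made-up scores for one generated token
    groups = {"image": [0, 1], "text": [2, 3]}     # made-up position groups
    print(aggregate_by_group(blocked_softmax(scores), groups))
    print(aggregate_by_group(blocked_softmax(scores, blocked={0, 1}), groups))
```

Comparing the two printed dictionaries shows how blocking the image positions shifts all attention mass to the text positions, which is the kind of contrast a blocking experiment looks for.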
