Calculate the number of tokens used for images in VLMs

Vision Token Calculator

A Python tool for calculating the number of tokens generated when processing images with Vision Language Models (VLMs).

Features

Calculate image tokens for VLMs
Support both existing images and dummy images
Support remote images via URL (http/https)
Simple command line interface (CLI)

Installation

Option 1: PyPI (recommended)

pip install vt-calc

Option 2: From source (editable install for development; run from the repository root)

pip install -e .

Usage

Using the vt-calc command

After installing with either option above, you can use the vt-calc command directly:

# Single image
vt-calc --image path/to/your/image.jpg

# Image from URL
vt-calc --image https://example.com/image.jpg

# Directory (batch processing)
vt-calc --image path/to/your/images_dir

# Dummy image with specific dimensions (Width x Height)
vt-calc --size 1920 1080

# Choose a short model name (default: qwen2.5-vl)
vt-calc --image path/to/your/image.jpg -m qwen2.5-vl

# Calculate tokens for a video file
vt-calc --video path/to/video.mp4 -m qwen2.5-vl

# Specify frame sampling rate (FPS)
vt-calc --video video.mp4 --fps 2.0

# Limit maximum number of frames
vt-calc --video video.mp4 --max-frames 100

# Show help
vt-calc --help
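
For scripted use, the same commands can be assembled from Python. The sketch below is only an illustration: build_vt_calc_args and run_vt_calc are hypothetical helpers, not part of vt-calc. It uses only the flags shown above, assumes one input source per call, and assumes the vt-calc executable is on PATH.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def build_vt_calc_args(
    image: Optional[str] = None,
    size: Optional[Tuple[int, int]] = None,
    model: str = "qwen2.5-vl",
) -> List[str]:
    """Assemble an argv list for vt-calc (hypothetical helper, not part of the package)."""
    # Assumption: exactly one input source (image path/URL or dummy size) per call.
    if (image is None) == (size is None):
        raise ValueError("pass exactly one of image or size")
    args = ["vt-calc"]
    if image is not None:
        args += ["--image", image]
    else:
        width, height = size  # --size takes WIDTH then HEIGHT
        args += ["--size", str(width), str(height)]
    args += ["-m", model]
    return args


def run_vt_calc(args: List[str]) -> str:
    """Run vt-calc and return its report text; raises if the command is missing or fails."""
    if shutil.which(args[0]) is None:
        raise FileNotFoundError("vt-calc not found on PATH; install it with pip first")
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling run_vt_calc(build_vt_calc_args(size=(1920, 1080))) would then return the printed report as a string, which the caller has to parse itself.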

CLI options

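-i, --image: Path to an image file, a directory of images, or an image URL (http/https)
-s, --size WIDTH HEIGHT: Create a dummy image of the given size
-m, --model-name: Short model name to use (default: qwen2.5-vl)
--video: Path to a video file
--fps: Frame sampling rate (frames per second) for video input
--max-frames: Maximum number of frames to sample from a video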

Supported input formats for directory processing: .jpg, .jpeg, .png, .webp (case-insensitive).
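
The extension rule above can be sketched in a few lines of Python. This is an illustrative filter, not vt-calc's own code: find_images is a hypothetical name, and the sketch only looks at files directly inside the directory, since the README does not say whether subdirectories are scanned.

```python
from pathlib import Path

# Extensions listed above for directory processing.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(directory):
    """Return supported image files directly inside `directory`, sorted by path."""
    return sorted(
        p
        for p in Path(directory).iterdir()
        # Lower-casing the suffix makes the match case-insensitive (.JPG, .WebP, ...).
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```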

Example output (single image)

Using dummy image: 1024 x 768
                        ╔══════════════════════════════╗
                        ║ VISION TOKEN ANALYSIS REPORT ║
                        ╚══════════════════════════════╝
╭───────────────────────────────── MODEL INFO ─────────────────────────────────╮
│                                                                              │
│   Model Name                qwen2.5-vl                                       │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────── IMAGE INFO ─────────────────────────────────╮
│                                                                              │
│   Image Source              Dummy image                                      │
│   Original Size (H x W)     1024 x 768                                       │
│   Resized Size (H x W)      1036 x 756                                       │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────── PATCH INFO ─────────────────────────────────╮
│                                                                              │
│   Patch Size (ViT)          14                                               │
│   Grid Size (H x W)         74 x 54                                          │
│   Number of Patches         3996                                             │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
╭───────────────────────────────── TOKEN INFO ─────────────────────────────────╮
│                                                                              │
│   Image Token               999                                              │
│   (<|image_pad|>)                                                            │
│   Image Start Token         1                                                │
│   (<|vision_start|>)                                                         │
│   Image End Token           1                                                │
│   (<|vision_end|>)                                                           │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
╭──────────────────────────────── TOKEN FORMAT ────────────────────────────────╮
│               <|vision_start|><|image_pad|>*999<|vision_end|>                │
╰──────────────────────────────────────────────────────────────────────────────╯
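
The figures in this report are consistent with simple arithmetic, sketched below. It is a reconstruction from the numbers shown, not vt-calc's implementation: the 2 x 2 patch merge and the round-to-nearest-multiple-of-28 rule are assumptions that happen to reproduce the report, and the sketch ignores any minimum or maximum pixel limits the real preprocessing may apply. estimate_image_tokens is a hypothetical name.

```python
PATCH_SIZE = 14  # "Patch Size (ViT)" in the report
MERGE_SIZE = 2   # assumption: 2 x 2 patches are merged into one image token


def estimate_image_tokens(height, width):
    """Estimate <|image_pad|> tokens, ignoring any min/max pixel limits."""
    factor = PATCH_SIZE * MERGE_SIZE  # each side is rounded to a multiple of 28
    resized_h = max(factor, round(height / factor) * factor)
    resized_w = max(factor, round(width / factor) * factor)
    grid_h, grid_w = resized_h // PATCH_SIZE, resized_w // PATCH_SIZE
    patches = grid_h * grid_w
    return {
        "resized": (resized_h, resized_w),
        "grid": (grid_h, grid_w),
        "patches": patches,
        "image_tokens": patches // MERGE_SIZE**2,
    }


report = estimate_image_tokens(1024, 768)
print(report)
# {'resized': (1036, 756), 'grid': (74, 54), 'patches': 3996, 'image_tokens': 999}
```

Adding the start and end tokens gives 999 + 2 = 1001 tokens in total for this image.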

Example output (multiple images)

Processing directory: test_images/
Found 8 images to process...

[1/8] Processing: test_1_640x480.jpg ✓ (393 tokens)
[2/8] Processing: test_2_800x600.jpg ✓ (611 tokens)
[3/8] Processing: test_3_1024x768.jpg ✓ (1001 tokens)
[4/8] Processing: test_4_1280x720.jpg ✓ (1198 tokens)
[5/8] Processing: test_5_1920x1080.jpg ✓ (2693 tokens)
[6/8] Processing: test_6_512x512.jpg ✓ (326 tokens)
[7/8] Processing: test_7_256x256.jpg ✓ (83 tokens)
[8/8] Processing: test_8_2048x1536.jpg ✓ (4017 tokens)

       BATCH ANALYSIS REPORT
╭────────────────────────┬────────────╮
│ Model                  │ qwen2.5-vl │
│ Total Images Processed │ 8          │
│ Average Vision Tokens  │ 1290.2     │
│ Minimum Vision Tokens  │ 83         │
│ Maximum Vision Tokens  │ 4017       │
│ Standard Deviation     │ 1370.5     │
╰────────────────────────┴────────────╯
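
The summary rows can be reproduced from the per-image counts with Python's statistics module. This is a sketch for checking the numbers, not the tool's own code; the reported standard deviation matches the sample formula (n - 1 denominator), which is what statistics.stdev computes.

```python
import statistics

# Per-image totals copied from the batch log above.
token_counts = [393, 611, 1001, 1198, 2693, 326, 83, 4017]

summary = {
    "average": statistics.mean(token_counts),
    "minimum": min(token_counts),
    "maximum": max(token_counts),
    # Sample standard deviation (n - 1 denominator).
    "std_dev": statistics.stdev(token_counts),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

The population formula (statistics.pstdev) gives roughly 1282.0 for the same data, so it does not match the report.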

Supported Models

Model	Option
Qwen2-VL	qwen2-vl
Qwen2.5-VL	qwen2.5-vl
Qwen3-VL	qwen3-vl
InternVL3	internvl3
LLaVA	llava

License

This project is licensed under the MIT License; see the LICENSE file for details.

Project details

Release history

0.0.4 (Jan 11, 2026)
0.0.3 (Jan 7, 2026), this version
0.0.2 (Jun 18, 2025)
0.0.1 (Jun 18, 2025)

Download files

Source Distribution

vt_calc-0.0.3.tar.gz (22.6 kB view details)

Uploaded Jan 7, 2026 Source

Built Distribution

vt_calc-0.0.3-py3-none-any.whl (22.4 kB view details)

Uploaded Jan 7, 2026 Python 3

File details

Details for the file vt_calc-0.0.3.tar.gz.

File metadata

Download URL: vt_calc-0.0.3.tar.gz
Upload date: Jan 7, 2026
Size: 22.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for vt_calc-0.0.3.tar.gz
Algorithm	Hash digest
SHA256	`c94825cf021e668d0d97b7a6d9a9082ab43896b7561347954388663a984f9a6e`
MD5	`4e30b185d0951b230feeb2cb84e33bff`
BLAKE2b-256	`b351b38d95bedd66c9d94f5d5f95e115c835962d013b49e791545aabb00c1fe4`

File details

Details for the file vt_calc-0.0.3-py3-none-any.whl.

File metadata

Download URL: vt_calc-0.0.3-py3-none-any.whl
Upload date: Jan 7, 2026
Size: 22.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for vt_calc-0.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a7ac7d9239c5bd0f93127d7856adaea7e11c3b1cba2127d637c7453932952d68`
MD5	`57c3756b83dd39bf1086e4eda0c08c1a`
BLAKE2b-256	`f5207fb3a0d08e64a8335039d1ceea76795f8e47f93b8aa340addc35e53e9393`
