A training-free compression paradigm that dynamically adjusts per-frame token retention ratios based on the LLM's keyframe prior


Less Is More, but Where?
Dynamic Token Compression via LLM-Guided Keyframe Prior

This repository is the official PyTorch implementation of DyToK.

📚 TABLE OF CONTENTS

  1. Motivation
  2. Method
  3. News
  4. TODO
  5. Installation
  6. Quick Start
  7. Reproducing Results
  8. Development
  9. Acknowledgements
  10. Citation

🎯 Motivation

Unveiling the keyframe prior in VLLMs. We visualize, for each frame, the attention from the final text token to the visual tokens, averaged across all layers. The top-8 frames by attention score are shown chronologically, with ground-truth (GT) keyframes highlighted in red. We observe that even when the model answers incorrectly, its attention still pinpoints the relevant frames, revealing a strong task-dependent keyframe prior.
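The analysis above can be sketched in a few lines. The following NumPy toy example uses a synthetic attention tensor; the token layout, shapes, and variable names are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

# Synthetic attention: [num_layers, num_heads, seq_len, seq_len].
# Assume the sequence is [visual tokens of F frames, then text tokens],
# with T visual tokens per frame -- an illustrative layout only.
rng = np.random.default_rng(0)
num_layers, num_heads, F, T, num_text = 4, 8, 16, 10, 5
seq_len = F * T + num_text
attn = rng.random((num_layers, num_heads, seq_len, seq_len))

# Attention from the final text token to every visual token,
# averaged over all layers and heads.
last_to_visual = attn[:, :, -1, : F * T].mean(axis=(0, 1))  # shape (F*T,)

# Per-frame score: total attention mass over that frame's tokens.
frame_scores = last_to_visual.reshape(F, T).sum(axis=1)     # shape (F,)

# Top-8 frames by score, reported in chronological order.
top8 = np.sort(np.argsort(frame_scores)[-8:])
print(top8)
```

With real model attention in place of the random tensor, `top8` is the kind of ranking visualized in the figure.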

🌈 Method

Illustration of DyToK. We adaptively compress video tokens through two synergistic components:

  1. Temporal Importance Estimation, which leverages cross-modal attention from a lightweight assistant model to identify keyframes;
  2. Dynamic Frame-Level Compression, which proportionally allocates per-frame token budgets to preserve salient content.
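The second component can be illustrated with a small sketch. `allocate_budget` below is a hypothetical helper that splits a total token budget across frames in proportion to their importance scores, with a small per-frame floor; DyToK's actual allocation rule may differ:

```python
import numpy as np

def allocate_budget(frame_scores, total_tokens, min_per_frame=1):
    """Split a total token budget across frames in proportion to
    importance scores, guaranteeing a floor per frame.
    Hypothetical helper -- not the repository's implementation."""
    scores = np.asarray(frame_scores, dtype=float)
    F = len(scores)
    remaining = total_tokens - min_per_frame * F
    weights = scores / scores.sum()
    alloc = min_per_frame + np.floor(weights * remaining).astype(int)
    # Hand the rounding leftovers to the highest-scoring frames.
    leftover = total_tokens - alloc.sum()
    for idx in np.argsort(scores)[::-1][:leftover]:
        alloc[idx] += 1
    return alloc

budget = allocate_budget([0.1, 0.6, 0.3], total_tokens=20)
print(budget, budget.sum())
```

A frame attracting 60% of the importance mass ends up with roughly 60% of the budget, while low-importance frames still keep at least `min_per_frame` tokens.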

🎉 News

  • [2025.12.06] Released code for integrating DyToK with encoder feature-based pruning methods.
  • [2025.09.18] Our paper has been accepted at NeurIPS 2025.

🔥 TODO

  • Initialize Project.
  • Release code for integrating DyToK with LLM attention-based pruning methods.
  • Add support for Qwen3-VL.

📦 Installation

DyToK's code is extremely concise and works out of the box. Just install and go!

1. Quick Install

Install the latest stable version directly from PyPI:

pip install dytok

2. Development Install

Clone the repository and install in editable mode:

git clone https://github.com/yu-lin-li/DyToK.git
cd DyToK
pip install -e .

🚀 Quick Start

Integrating DyToK takes just two lines of code:

from dytok import visionzip
visionzip(model, dytok=True, use_tiny=True, tiny_model=tiny_model)

Try it out with our demo script using LLaVA-OneVision:

python playground/llavaov_infer.py

📊 Reproducing Results

All experiments in the paper are based on LMMs-Eval. Follow these steps to reproduce our results.

1. Setup Environment

# Create virtual environment
conda create -n dytok python=3.10
conda activate dytok

# Install base models (e.g., LLaVA-OneVision)
pip install git+https://github.com/LLaVA-VL/LLaVA-NeXT.git

# Install DyToK
git clone https://github.com/yu-lin-li/DyToK.git
cd DyToK
pip install -e .

# Install LMMs-Eval streamlined for DyToK
cd eval
pip install -e .
pip install flash-attn==2.6.3  # optional

💡 Note: Our eval/ directory contains a minimal, DyToK-focused version of LMMs-Eval. For full functionality, install the official LMMs-Eval separately and integrate DyToK as described in Development.

2. Evaluation

Reproduce DyToK-enhanced VisionZip results on LLaVA-OneVision:

bash eval/scripts/dytok_visionzip_tiny_32_ov.sh

๐Ÿ› ๏ธ Development

1. Repository Structure

.
├── assets/
├── dytok/                    # Core DyToK logic
│   └── visionzip/            # DyToK-enhanced VisionZip
├── eval/
│   ├── lmms_eval/            # Evaluation toolkit
│   │   └── models/           # DyToK-integrated models
│   └── scripts/              # Evaluation scripts
├── playground/               # Demo inference scripts
│   └── llavaov_infer.py
├── pyproject.toml
└── README.md

2. Adapt DyToK to Your Own Method

DyToK is designed as a plug-and-play module. To integrate it into your token compression method:

  • Look for code blocks explicitly annotated to isolate DyToK-specific logic from the base method, as shown below:
# ! ———— DyToK Begin ————
...
# ! ———— DyToK End ————
  • Migrate the enclosed logic into your method.

✅ Pro Tip: Use the Better Comments extension in VS Code to highlight DyToK annotations in red!
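For illustration, here is a hypothetical top-k token pruning function with the DyToK-specific logic wrapped in the annotation markers; the names and the re-weighting rule are made up for this sketch and are not the repository's implementation:

```python
import numpy as np

def prune_tokens(token_scores_per_frame, keep_ratio=0.5, frame_importance=None):
    """Keep the highest-scoring tokens in each frame.

    Without DyToK, every frame keeps the same fraction `keep_ratio`;
    with DyToK, that fraction is re-weighted per frame by importance.
    Purely illustrative code.
    """
    F = len(token_scores_per_frame)
    ratios = np.full(F, keep_ratio)

    # ! ———— DyToK Begin ————
    if frame_importance is not None:
        w = np.asarray(frame_importance, dtype=float)
        # Rescale so the average retention stays near `keep_ratio`,
        # but important frames keep more tokens than unimportant ones.
        ratios = np.clip(keep_ratio * w / w.mean(), 0.0, 1.0)
    # ! ———— DyToK End ————

    kept = []
    for scores, r in zip(token_scores_per_frame, ratios):
        scores = np.asarray(scores)
        k = max(1, int(round(r * len(scores))))
        kept.append(np.sort(np.argsort(scores)[-k:]))
    return kept

frames = [[0.9, 0.1, 0.5, 0.3], [0.2, 0.8, 0.4, 0.6]]
uniform = prune_tokens(frames, keep_ratio=0.5)
dynamic = prune_tokens(frames, keep_ratio=0.5, frame_importance=[3.0, 1.0])
```

Migrating DyToK then amounts to copying the block between the markers into your own pruning routine and feeding it the frame importance scores.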

3. Integrate with Your Own LMMs-Eval

To add DyToK support to your local LMMs-Eval:

cp eval/lmms_eval/models/*.py <YOUR_LMMS_EVAL_PATH>/models/

Then register the model in <YOUR_LMMS_EVAL_PATH>/models/__init__.py:

# Add the DyToK model entry to AVAILABLE_MODELS
AVAILABLE_MODELS = {
    # existing models ...
    "llava_onevision_dytok": "Llava_OneVision_DyToK"
}

โค๏ธ Acknowledgements

Our work builds upon the codebase of VisionZip, DyCoke, FastV, LLaVA-NeXT, Qwen2.5-VL, and LMMs-Eval. We sincerely thank the authors for their remarkable contributions.

📜 Citation

If you find DyToK useful in your research, please cite our paper:

@article{li2025less,
  title={Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior},
  author={Li, Yulin and Gui, Haokun and Fan, Ziyang and Wang, Junjie and Kang, Bin and Chen, Bin and Tian, Zhuotao},
  journal={arXiv preprint arXiv:2025},
  year={2025}
}
