FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models
Jintao Tong1,
Wenwei Jin2,
Pengda Qin2,
Anqi Li3,
Yixiong Zou1✉,
Yuhong Li2✉,
Yuhua Li1,
Ruixuan Li1
1School of Computer Science and Technology, Huazhong University of Science and Technology
2Xiaohongshu Inc., 3Institute of Information Science, Beijing Jiaotong University
🔥 News
- Checkpoints for LLaVA-1.5-7B-FlowCut (retain 128 tokens/192 tokens) will be released soon!
- Code will be released soon!
- 2025.05.26: We release our latest work FlowCut, a plug-and-play, training-free token reduction method that seamlessly integrates into various VLMs for efficient training and inference.
💡 Highlights
TLDR: To address the inefficiency caused by excessive visual tokens in LVLMs, we propose a unified, bottom-up perspective based on information flow, which reveals that redundancy emerges dynamically. Building on this view, we introduce FlowCut, which aligns pruning decisions with the model's inherent behavior and outperforms all existing approaches.
🛠 Preparation
Our code is easy to use.
- Clone the LLaVA repository.
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
- Install the LLaVA environment.
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip
pip install -e .
pip install flash-attn --no-build-isolation
- For regular use, install the package from PyPI:
pip install flowcut
For development, clone the repository and install it in editable mode:
git clone https://github.com/TungChintao/flowcut
cd flowcut
pip install -e .
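To confirm that one of the two installs above is active in the current environment, a quick check like the following may help. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of FlowCut.

```python
from importlib import metadata


def installed_version(dist_name: str = "flowcut"):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("flowcut is not installed in this environment")
    else:
        print(f"flowcut {version} is installed")
```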
The files are organized as follows:
├── LLaVA-main
    ├── flowcut
    ├── llava
    ├── playground
    ├── script
🚀 Quick Start
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model
from flowcut import flowcut
model_path = "liuhaotian/llava-v1.5-7b"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)
## FlowCut retains 64 visual tokens
model = flowcut(model, target_num=64)
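To get a feel for what `target_num` means in terms of sequence length, the sketch below computes the share of visual tokens kept. It assumes the 576 visual tokens per image that LLaVA-1.5 produces with its CLIP ViT-L/14 encoder at 336 px; `retention_ratio` is a hypothetical helper introduced here, not part of the FlowCut API.

```python
# Assumption: LLaVA-1.5 encodes each image into 576 visual tokens
# (a 24 x 24 patch grid from CLIP ViT-L/14 at 336 px).
LLAVA_15_VISUAL_TOKENS = 576


def retention_ratio(target_num: int, total: int = LLAVA_15_VISUAL_TOKENS) -> float:
    """Fraction of visual tokens kept when pruning down to target_num."""
    if not 0 < target_num <= total:
        raise ValueError(f"target_num must be in 1..{total}, got {target_num}")
    return target_num / total


if __name__ == "__main__":
    # 64 is the Quick Start setting; 128 and 192 match the announced checkpoints.
    for target in (64, 128, 192):
        kept = retention_ratio(target)
        print(f"target_num={target}: keeps {kept:.1%}, prunes {1 - kept:.1%}")
```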
📖 Evaluation
The evaluation code follows the structure of LLaVA or Lmms-Eval. After loading the model, simply add two lines as shown below:
## Load LLaVA Model (code from llava.eval.model_vqa_loader)
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, args.model_base, model_name)
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
Script templates (please follow the detailed instructions in LLaVA-Evaluation).
bash scripts/v1_5/eval/[Benchmark].sh
Examples:
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/pope.sh
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/textvqa.sh
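To run several benchmarks in sequence, the commands above can be assembled programmatically. The following is a minimal sketch, assuming the scripts live under `scripts/v1_5/eval/` as in the examples and that it is launched from the LLaVA repository root. The helper names are hypothetical, and the script only prints the commands unless `--run` is passed.

```python
import os
import subprocess
import sys

BENCHMARKS = ("mme", "pope", "textvqa")  # the three example benchmarks


def eval_command(benchmark: str) -> list:
    """Build the argv for one benchmark script, following the template."""
    return ["bash", f"scripts/v1_5/eval/{benchmark}.sh"]


def eval_env(gpu: str = "0") -> dict:
    """Copy the current environment and pin the run to the given GPU id(s)."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = gpu
    return env


if __name__ == "__main__":
    dry_run = "--run" not in sys.argv[1:]
    for name in BENCHMARKS:
        cmd = eval_command(name)
        print("CUDA_VISIBLE_DEVICES=0", " ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing benchmark.
            subprocess.run(cmd, env=eval_env("0"), check=True)
```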
🎯 Training
The training code follows the structure of LLaVA. After loading the model, simply add two lines as shown below:
## Load LLaVA Model (code from llava.train)
# (model-loading code goes here)
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
## training
trainer = LLaVATrainer(model=model,
                       tokenizer=tokenizer,
                       args=training_args,
                       **data_module)
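Because the same two lines appear in both evaluation and training, it may be convenient to wrap them in one helper that validates `target_num` first. This is only a sketch: `apply_flowcut` and its `patch_fn` parameter are hypothetical names introduced here, and the only FlowCut call assumed is `flowcut(model, target_num=...)` as used in this README.

```python
def apply_flowcut(model, target_num: int = 64, patch_fn=None):
    """Validate target_num, then patch the model with FlowCut.

    patch_fn is a hypothetical hook for swapping in a stand-in during tests;
    by default the real `flowcut` function is imported lazily.
    """
    if not isinstance(target_num, int) or target_num <= 0:
        raise ValueError(f"target_num must be a positive integer, got {target_num!r}")
    if patch_fn is None:
        # Requires the flowcut package to be installed.
        from flowcut import flowcut as patch_fn
    return patch_fn(model, target_num=target_num)
```

In a training script, `model = apply_flowcut(model, target_num=64)` would take the place of the two FlowCut lines before the trainer is constructed.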
🔑 License
- This project is released under the Apache 2.0 license.
📌 Citation
- If you find this project useful in your research, please consider citing:
@article{tong2025flowcut,
title={FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models},
author={Jintao Tong and Wenwei Jin and Pengda Qin and Anqi Li and Yixiong Zou and Yuhong Li and Yuhua Li and Ruixuan Li},
journal={arXiv preprint arXiv:2505.19536},
year={2025}
}
👍 Acknowledgment