FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models

Jintao Tong1, Wenwei Jin2, Pengda Qin2, Anqi Li3, Yixiong Zou1✉,
Yuhong Li2✉, Yuhua Li1, Ruixuan Li1

1School of Computer Science and Technology, Huazhong University of Science and Technology
2Xiaohongshu Inc., 3Institute of Information Science, Beijing Jiaotong University

🔥 News

  • 2025.05.28 Code is now available, and FlowCut can be installed from PyPI with pip install flowcut.
  • 2025.05.26 We release our latest work FlowCut, a plug-and-play, training-free token reduction method that seamlessly integrates into various VLMs for efficient training and inference.

💡 Highlights

TLDR: To address the inefficiency caused by excessive visual tokens in LVLMs, we propose a unified, bottom-up perspective based on information flow, which reveals that redundancy emerges dynamically. Building on this view, we introduce FlowCut, which aligns pruning decisions with the model's inherent behavior and outperforms all existing approaches.

🛠 Preparation

Our code is easy to use.

  1. Clone the LLaVA repository.
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
  2. Set up the LLaVA environment.
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip  
pip install -e .
pip install flash-attn --no-build-isolation
  3. For regular use, install the package from PyPI:
pip install flowcut

For development, you can install the package by cloning the repository and running the following command:

git clone https://github.com/TungChintao/flowcut
cd flowcut
pip install -e .
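
After either installation route, it can help to confirm that both packages are visible in the active environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical, and the distribution name "llava" is an assumption about how the LLaVA repository registers itself when installed with pip install -e .

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "llava" is an assumed distribution name; "flowcut" is the PyPI name used above.
    for name in ("llava", "flowcut"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", re-run the corresponding install step inside the llava conda environment.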

The files are organized as follows:

├── LLaVA-main
    ├── flowcut
    ├── llava
    ├── playground
    └── scripts

🚀 Quick Start

from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model
from flowcut import flowcut
model_path = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)
## FlowCut retains 64 visual tokens
model = flowcut(model, target_num=64)
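
To put target_num=64 in context, the arithmetic below estimates what fraction of visual tokens is kept. It is a rough sketch, not part of FlowCut: the helper names are hypothetical, and the defaults assume the LLaVA-1.5 vision encoder (336-pixel input, 14-pixel patches), which yields 576 patch tokens per image.

```python
def visual_token_count(image_size: int = 336, patch_size: int = 14) -> int:
    """Number of patch tokens for a square image split into square patches."""
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2


def retention_ratio(target_num: int, total: int) -> float:
    """Fraction of visual tokens kept when pruning `total` tokens down to `target_num`."""
    return target_num / total


total = visual_token_count()           # 24 * 24 = 576 under the assumed defaults
ratio = retention_ratio(64, total)     # 64 / 576
print(f"{total} visual tokens, keeping 64 retains {ratio:.1%}")
```

Other encoders or input resolutions produce a different total, so the ratio should be recomputed for the model actually in use.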

📖 Evaluation

The evaluation code follows the structure of LLaVA or Lmms-Eval. After loading the model, simply add two lines as shown below:

## Load LLaVA Model (code from llava.eval.model_vqa_loader)
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, args.model_base, model_name)
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
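
When comparing several token budgets, hard-coding target_num in the evaluation script becomes inconvenient. One option is to expose it as a command-line argument, sketched below. The --flowcut-target-num flag and the maybe_apply helper are hypothetical (neither LLaVA nor FlowCut defines them); the only FlowCut call assumed is flowcut(model, target_num=...) as shown above, passed in as reducer so the sketch runs without the package.

```python
import argparse
from typing import Any, Callable, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-path", type=str, default="liuhaotian/llava-v1.5-7b")
    # Hypothetical flag: leave it unset to evaluate the unmodified model.
    parser.add_argument("--flowcut-target-num", type=int, default=None)
    return parser


def maybe_apply(model: Any, target_num: Optional[int], reducer: Callable[..., Any]) -> Any:
    """Wrap `model` with `reducer` only when a token budget was requested."""
    if target_num is None:
        return model
    if target_num <= 0:
        raise ValueError("target_num must be a positive integer")
    return reducer(model, target_num=target_num)


# Intended use inside an evaluation script (requires LLaVA and FlowCut):
#   from flowcut import flowcut
#   args = build_parser().parse_args()
#   model = maybe_apply(model, args.flowcut_target_num, flowcut)
```

Passing the reducer as an argument keeps the helper testable and lets the same script run a baseline when the flag is omitted.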

Script templates (please follow the detailed instructions in LLaVA-Evaluation):

bash scripts/v1_5/eval/[Benchmark].sh

Examples:

CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/mme.sh
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/pope.sh
CUDA_VISIBLE_DEVICES=0 bash scripts/v1_5/eval/textvqa.sh
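
The same commands can be driven from Python when several benchmarks need to run back to back. This is a minimal sketch with hypothetical helper names; it assumes it is launched from the LLaVA repository root and that the scripts listed above exist at those paths.

```python
import os
import subprocess
from typing import Dict, Iterable, List


def eval_command(benchmark: str) -> List[str]:
    """Build the shell command for one benchmark script."""
    return ["bash", f"scripts/v1_5/eval/{benchmark}.sh"]


def eval_env(gpu_id: int = 0) -> Dict[str, str]:
    """Copy the current environment and pin the visible GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_benchmarks(benchmarks: Iterable[str], gpu_id: int = 0) -> None:
    """Run each benchmark in turn, stopping at the first failing script."""
    for benchmark in benchmarks:
        subprocess.run(eval_command(benchmark), env=eval_env(gpu_id), check=True)


# Example (from the LLaVA repository root):
#   run_benchmarks(["mme", "pope", "textvqa"], gpu_id=0)
```

With check=True, a failing script raises subprocess.CalledProcessError instead of silently continuing to the next benchmark.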

🎯 Training

The training code follows the structure of LLaVA. After loading the model, simply add two lines as shown below:

## Load LLaVA Model (code from llava.train)
# ... model-loading code ...
## add FlowCut
from flowcut import flowcut
model = flowcut(model, target_num=64)
## training
trainer = LLaVATrainer(model=model,
                tokenizer=tokenizer,
                args=training_args,
                **data_module)

🔑 License

📌 Citation

  • If you find this project useful in your research, please consider citing:
@article{tong2025flowcut,
  title={FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models}, 
  author={Jintao Tong and Wenwei Jin and Pengda Qin and Anqi Li and Yixiong Zou and Yuhong Li and Yuhua Li and Ruixuan Li},
  journal={arXiv preprint arXiv:2505.19536},
  year={2025}
}

👍 Acknowledgment

  • This work is built upon LLaVA, Qwen VL, and Video-LLaVA. We thank them for their excellent open-source contributions.

  • We also thank FastV, SparseVLM, VisionZip and others for their contributions, which have provided valuable insights.

