Skip to main content

Add your description here

Project description

Efficient Vision Foundation Models for High-Resolution Generation and Perception

PWC

News

  • (🔥 New) [2025/09/05] We will no longer maintain this codebase. All future updates and announcements will be made on DC-Gen.
  • (🔥 New) [2025/01/24] We released DC-AE-SANA-1.1: doc.
  • (🔥 New) [2025/01/23] DC-AE and SANA are accepted by ICLR 2025.
  • (🔥 New) [2025/01/14] We released DC-AE+USiT models: model, training. Using the default training settings and sampling strategy, DC-AE+USiT-2B achieves 1.72 FID on ImageNet 512x512, surpassing the SOTA diffusion model EDM2-XXL and SOTA auto-regressive image generative models (MAGVIT-v2 and MAR-L).

Content

[ICLR 2025] Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models [paper] [readme] [poster]

Deep Compression Autoencoder (DC-AE) is a new family of high-spatial compression autoencoders with a spatial compression ratio of up to 128 while maintaining reconstruction quality. It accelerates all latent diffusion models regardless of the diffusion model architecture.

Demo

demo

Figure 1: We address the reconstruction accuracy drop of high spatial-compression autoencoders.

demo

Figure 2: DC-AE speeds up latent diffusion models.

Figure 3: DC-AE enables efficient text-to-image generation on the laptop: SANA.

[CVPR 2024 eLVM Workshop] EfficientViT-SAM: Accelerated Segment Anything Model Without Accuracy Loss [paper] [online demo] [readme]

EfficientViT-SAM is a new family of accelerated segment anything models by replacing SAM's heavy image encoder with EfficientViT. It delivers a 48.9x measured TensorRT speedup on A100 GPU over SAM-ViT-H without sacrificing accuracy.

[ICCV 2023] EfficientViT-Classification [paper] [readme]

Efficient image classification models with EfficientViT backbones.

[ICCV 2023] EfficientViT-Segmentation [paper] [readme]

Efficient semantic segmantation models with EfficientViT backbones.

demo

EfficientViT-GazeSAM [readme]

Gaze-prompted image segmentation models capable of running in real time with TensorRT on an NVIDIA RTX 4070.

GazeSAM demo

Getting Started

conda create -n efficientvit python=3.10
conda activate efficientvit
pip install -U -r requirements.txt

Third-Party Implementation/Integration

Contact

Han Cai

Reference

If EfficientViT or EfficientViT-SAM or DC-AE is useful or relevant to your research, please kindly recognize our contributions by citing our paper:

@inproceedings{cai2023efficientvit,
  title={Efficientvit: Lightweight multi-scale attention for high-resolution dense prediction},
  author={Cai, Han and Li, Junyan and Hu, Muyan and Gan, Chuang and Han, Song},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={17302--17313},
  year={2023}
}
@article{zhang2024efficientvit,
  title={EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss},
  author={Zhang, Zhuoyang and Cai, Han and Han, Song},
  journal={arXiv preprint arXiv:2402.05008},
  year={2024}
}
@article{chen2024deep,
  title={Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models},
  author={Chen, Junyu and Cai, Han and Chen, Junsong and Xie, Enze and Yang, Shang and Tang, Haotian and Li, Muyang and Lu, Yao and Han, Song},
  journal={arXiv preprint arXiv:2410.10733},
  year={2024}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

visaion5efficientvit-0.2.0.tar.gz (157.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

visaion5efficientvit-0.2.0-py3-none-any.whl (195.0 kB view details)

Uploaded Python 3

File details

Details for the file visaion5efficientvit-0.2.0.tar.gz.

File metadata

  • Download URL: visaion5efficientvit-0.2.0.tar.gz
  • Upload date:
  • Size: 157.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for visaion5efficientvit-0.2.0.tar.gz
Algorithm Hash digest
SHA256 f900919b52dee646bb7bc701502ae4dabe4853383a683d6af42d3707632f802e
MD5 ff8569f8e137d3c25fa7acd4bc620e03
BLAKE2b-256 304d35e0f9ac9f9ebc081441e3cdc4f35fc7cc87d37ef40781dc59331f981f7b

See more details on using hashes here.

File details

Details for the file visaion5efficientvit-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: visaion5efficientvit-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 195.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.0 {"installer":{"name":"uv","version":"0.10.0","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for visaion5efficientvit-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 81a3a19a2e2d30c50b9809445c00c0a277ef48a0b861442e9ddb9b6e227d4834
MD5 27871d5d71b9041a7fe47cfe361a17fa
BLAKE2b-256 6f38aebfee156aae9ca0b8114670dc85c795776b4856282236a12dfbc2fbf9c1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page