Adapter+ implementation for vision transformer backbones from timm
Adapters Strike Back
This is the official repository of our paper:
Adapters Strike Back
Jan-Martin O. Steitz
and Stefan Roth
CVPR, 2024
Abstract: Adapters provide an efficient and lightweight mechanism for adapting trained transformer models to a variety of different tasks. However, they have often been found to be outperformed by other adaptation mechanisms, including low-rank adaptation. In this paper, we provide an in-depth study of adapters, their internal structure, as well as various implementation choices. We uncover pitfalls for using adapters and suggest a concrete, improved adapter architecture, called Adapter+, that not only outperforms previous adapter implementations but surpasses a number of other, more complex adaptation mechanisms in several challenging settings. Despite this, our suggested adapter is highly robust and, unlike previous work, requires little to no manual intervention when addressing a novel scenario. Adapter+ reaches state-of-the-art average accuracy on the VTAB benchmark, even without a per-task hyperparameter optimization.
Training and evaluation
Install requirements
conda env create -f environment.yml
conda activate adapter_plus
Dataset preparation
For dataset preparation of the VTAB and FGVC benchmarks, please follow VPT. Our configuration expects the VTAB and FGVC dataset folders to reside under datasets/vtab and datasets/fgvc, respectively.
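Before starting a long run, it can help to confirm that both folders are where the configuration expects them. The following is a minimal sketch, assuming the script is run from the repository root; the helper name missing_dataset_dirs is hypothetical and not part of this repository.

```python
from pathlib import Path

# Folder names as stated in this README; the root directory is an assumption.
EXPECTED_DIRS = ("datasets/vtab", "datasets/fgvc")


def missing_dataset_dirs(root="."):
    """Return the expected dataset folders that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```

The check only looks at the top-level folders; it does not validate the contents produced by the VPT preparation steps.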
Training
For training, you can select one of the preconfigured experiments from conf/experiments to train on the complete VTAB or FGVC benchmarks, or define your own and run, for example:
python train.py +experiment=vtab/adapter_plus_dim1-32
Results aggregation
To aggregate the results of the VTAB and FGVC benchmarks, you can use the Jupyter notebook get_results.ipynb.
Evaluation of checkpoints
To evaluate checkpoints from previous training runs, use eval.py in combination with the experiment configuration you want to re-evaluate, e.g.:
python eval.py +experiment=vtab/adapter_plus_dim1-32
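If you want to train and then re-evaluate the same experiment from a script, the two commands above can be chained. This is a rough sketch that only reuses the +experiment= override shown in this README; the helper names build_command and train_then_eval are hypothetical, and dry_run defaults to True so nothing is executed unless you opt in.

```python
import shlex
import subprocess
import sys


def build_command(script, experiment):
    """Build the command line for train.py or eval.py with the experiment override."""
    return [sys.executable, script, f"+experiment={experiment}"]


def train_then_eval(experiment, dry_run=True):
    """Print (and optionally run) training followed by evaluation."""
    commands = [build_command(s, experiment) for s in ("train.py", "eval.py")]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sequence if training fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    train_then_eval("vtab/adapter_plus_dim1-32")
```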
Disclaimer
Training and evaluation for the paper were done with timm 0.6.7, PyTorch 1.12, and Python 3.9. For the best possible usability of our pip module, we have updated the code to the latest versions of these packages. As such, the numbers on VTAB may vary slightly (we measured up to +/- 0.2 p.p. in accuracy). However, the global average accuracy across all VTAB subgroups remains unchanged.
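When comparing your numbers against the paper, it may be useful to record which versions are installed in your environment. The snippet below is a small sketch using the standard library's importlib.metadata; the helper name installed_versions is hypothetical, and the reference versions are the ones stated in the disclaimer.

```python
import sys
from importlib import metadata

# Versions used for the paper, as stated in the disclaimer.
PAPER_VERSIONS = {"timm": "0.6.7", "torch": "1.12"}


def installed_versions(packages=PAPER_VERSIONS):
    """Map each package name to its installed version, or None if not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"python: {py} (paper: 3.9)")
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'} (paper: {PAPER_VERSIONS[name]})")
```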
Use Adapter+ in your own project
To use our adapter implementation in your own project, simply install the pip module:
pip install adapter-plus
Besides various adapter configurations, our module also supports LoRA (without merging the weight matrices for inference) and VPT-deep (without checkpointing). Please refer to the configurations in the repository's conf directory for details.
Our pip module patches the _create_vision_transformer function of the timm library to support adapters. All vision transformers built with Block or ResPostBlock block functions are supported.
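For readers unfamiliar with this kind of patching, the general idea can be illustrated with a toy example. The sketch below is not the adapter_plus implementation and does not touch timm; it uses a stand-in namespace (fake_lib) and a hypothetical factory (_create_model) to show how a wrapper can accept extra keyword arguments such as adapter and adapter_config before delegating to the original function.

```python
from types import SimpleNamespace


# Stand-in for a library's private factory function (hypothetical).
def _create_model(name, **kwargs):
    return {"name": name, "kwargs": kwargs}


# Stand-in for the library module that exposes the factory.
fake_lib = SimpleNamespace(_create_model=_create_model)


def patch_factory(lib):
    """Replace lib._create_model with a wrapper that understands adapter arguments."""
    original = lib._create_model

    def patched(name, adapter=False, adapter_config=None, **kwargs):
        # Strip the adapter arguments, then delegate to the original factory.
        model = original(name, **kwargs)
        if adapter:
            # A real implementation would insert adapter modules here.
            model["adapter_config"] = adapter_config
        return model

    lib._create_model = patched
    return original  # returned so the patch can be undone


original = patch_factory(fake_lib)
model = fake_lib._create_model(
    "toy_vit", adapter=True, adapter_config={"dim": 8}, num_classes=101
)
print(model)
```

Calls that do not pass adapter=True behave as before, which is why existing code that uses the factory keeps working after the patch.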
You can create an adapter-enabled vision transformer model as shown below:
import timm
import adapter_plus
from omegaconf import OmegaConf

# create config for Adapter+
# change bottleneck dim as required
adapterplus_conf = OmegaConf.create(
    """
config: post
act_layer: true
norm_layer: false
bias: true
init: houlsby
scaling: channel
dim: 8
attn_adapter: false
dropout: 0
drop_path: 0.1
"""
)

# create pre-trained timm ViT model
# with adapter=True and adapter_config
model = timm.create_model(
    "vit_base_patch16_224.orig_in21k",
    adapter=True,
    pretrained=True,
    drop_path_rate=0.1,
    num_classes=101,
    adapter_config=adapterplus_conf,
)

# only require gradients for
# adapters and classifier head
model.requires_grad_(False)
model.head.requires_grad_(True)
for m in model.modules():
    if isinstance(m, adapter_plus.Adapter):
        m.requires_grad_(True)
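After freezing the backbone, a quick sanity check is to count how many parameters still require gradients. The helper below is a minimal sketch; the names count_parameters and summarize are hypothetical and not part of adapter_plus. It only relies on the standard torch.nn.Module interface (parameters(), numel(), requires_grad).

```python
def count_parameters(model):
    """Return (trainable, total) parameter counts for a torch.nn.Module-like object."""
    trainable, total = 0, 0
    for p in model.parameters():
        n = p.numel()
        total += n
        if p.requires_grad:
            trainable += n
    return trainable, total


def summarize(model):
    """Format the trainable share of parameters as a short string."""
    trainable, total = count_parameters(model)
    share = 100.0 * trainable / total if total else 0.0
    return f"trainable: {trainable:,} / {total:,} ({share:.2f}%)"


# Usage with the model created above:
# print(summarize(model))
```

If the trainable count is unexpectedly large, some backbone parameters are probably still set to require gradients.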
Acknowledgements
This work has been funded by the LOEWE initiative (Hesse, Germany) within the emergenCITY center.
Citation
@inproceedings{Steitz:2024:ASB,
author = {Steitz, Jan-Martin O. and Roth, Stefan},
title = {Adapters Strike Back},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {23449--23459}
}