Vision Transformers implemented using KAN layers
Vision-KAN
We are experimenting with replacing the MLP in the Vision Transformer with KAN. Progress may be slow for a long time because of GPU resource constraints. Any new developments will be posted here.
To install this package
pip install VisionKAN
Minimal Example
from VisionKAN import create_model, train_one_epoch, evaluate

# Build a DeiT-tiny style model whose MLP blocks are replaced by KAN layers.
KAN_model = create_model(
    model_name='deit_tiny_patch16_224_KAN',
    pretrained=False,
    hdim_kan=192,          # KAN hidden dimension
    num_classes=100,       # e.g. CIFAR-100
    drop_rate=0.0,
    drop_path_rate=0.05,
    img_size=224,
    batch_size=144
)
Dataset | MLP hidden dim | model | date | epoch | top1 | top5 | Checkpoint |
---|---|---|---|---|---|---|---|
ImageNet 1k | 768 | DeiT-tiny(baseline) | - | 300 | 72.2 | 91.1 | |
CIFAR-100 | 192 | DeiT-tiny(baseline) | 2024.5.25 | 300(stop) | 84.94 | 96.53 | Checkpoint |
CIFAR-100 | 384 | DeiT-small(baseline) | 2024.5.25 | 300(stop) | 86.49 | 96.17 | Checkpoint |
CIFAR-100 | 768 | DeiT-base(baseline) | 2024.5.25 | 300(stop) | 86.54 | 96.16 | Checkpoint |
Dataset | KAN hidden dim | model | date | epoch | top1 | top5 | Checkpoint |
---|---|---|---|---|---|---|---|
ImageNet 1k | 20 | Vision-KAN | 2024.5.16 | 37(stop) | 36.34 | 61.48 | - |
ImageNet 1k | 192 | Vision-KAN | 2024.5.25 | 346(stop) | 64.87 | 86.14 | Checkpoint |
ImageNet 1k | 768 | Vision-KAN | 2024.5.29 | 71(training) | 59.39 | 82.49 | - |
CIFAR-100 | 192 | Vision-KAN | 2024.5.25 | 300(stop) | 73.17 | 93.307 | Checkpoint |
CIFAR-100 | 384 | Vision-KAN | 2024.5.25 | 300(stop) | 78.69 | 94.73 | Checkpoint |
CIFAR-100 | 768 | Vision-KAN | 2024.5.29 | 300(stop) | 79.82 | 95.42 | Checkpoint |
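
To make the comparison between the two tables easier to read, the following minimal sketch computes the CIFAR-100 top-1 gap between the DeiT baselines and Vision-KAN. The numbers are copied from the tables above. Pairing rows by the hidden-dimension column is an assumption made for illustration, and the names `baseline_top1`, `kan_top1`, and `top1_gap` are hypothetical.

```python
# Top-1 accuracy (%) on CIFAR-100, copied from the two tables above,
# keyed by the value in the hidden-dimension column.
baseline_top1 = {192: 84.94, 384: 86.49, 768: 86.54}  # DeiT with MLP
kan_top1 = {192: 73.17, 384: 78.69, 768: 79.82}       # Vision-KAN


def top1_gap(hidden_dim):
    """Return baseline minus Vision-KAN top-1 accuracy, in percentage points."""
    return round(baseline_top1[hidden_dim] - kan_top1[hidden_dim], 2)


for dim in sorted(baseline_top1):
    print(f"hidden dim {dim}: gap = {top1_gap(dim):.2f} points")
```

Under this pairing the gap narrows as the hidden dimension grows (11.77, 7.80, and 6.72 points), although the baseline rows also differ in backbone size, so the trend should be read with care.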
News
5.7.2024
We released our current Vision KAN code. We used efficient KAN as a drop-in replacement for the MLP layer in the Transformer block and are pre-training the Tiny model on ImageNet 1k. Subsequent results will be added to the table.
5.14.2024
The model has started to converge. We use [192, 20, 192] as the input, hidden, and output dimensions, and we reshape the input so that it matches the dimensions KAN expects.
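The reshape mentioned here can be pictured as flattening the batch and token axes, applying the layer to each token vector, and restoring the original layout. The sketch below is not the project's code: plain Python lists stand in for tensors so that it runs without dependencies, `toy_layer` is a hypothetical stand-in for a KAN layer, and the 197-token sequence length is an assumption for a 224x224 image with 16x16 patches plus a class token.

```python
def flatten_tokens(x):
    """Reshape (batch, tokens, channels) nested lists to (batch*tokens, channels)."""
    return [token for sample in x for token in sample]


def unflatten_tokens(rows, batch, tokens):
    """Inverse of flatten_tokens: regroup rows into (batch, tokens, channels)."""
    return [rows[b * tokens:(b + 1) * tokens] for b in range(batch)]


def toy_layer(rows, out_dim):
    """Hypothetical stand-in for a layer that only accepts 2-D input.

    Each row is mapped to out_dim copies of its mean value."""
    return [[sum(row) / len(row)] * out_dim for row in rows]


def token_wise(x, hidden_dim=20, out_dim=192):
    """Apply a [in_dim, hidden_dim, out_dim] stack to every token independently."""
    batch, tokens = len(x), len(x[0])
    rows = flatten_tokens(x)            # (batch*tokens, in_dim)
    rows = toy_layer(rows, hidden_dim)  # (batch*tokens, hidden_dim)
    rows = toy_layer(rows, out_dim)     # (batch*tokens, out_dim)
    return unflatten_tokens(rows, batch, tokens)


# Two images, 197 tokens each (assumed), 192 channels per token.
x = [[[1.0] * 192 for _ in range(197)] for _ in range(2)]
y = token_wise(x)
print(len(y), len(y[0]), len(y[0][0]))  # 2 197 192
```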
5.15.2024
We switched from efficient KAN to faster KAN, which speeds up training by up to 2x. We also changed the base model from DeiT III to DeiT, so that we can use pre-trained weights for every layer except the KAN layers.
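One way to reuse pre-trained weights for everything except the KAN layers is to keep only the checkpoint entries whose name and shape match a parameter in the new model. The sketch below illustrates that idea with plain dictionaries of shapes. It is an assumption about the general approach, not the project's loading code, and every parameter name in it is hypothetical.

```python
def select_pretrained(checkpoint_shapes, model_shapes):
    """Split checkpoint keys into loadable and skipped ones.

    A key is loadable when the model has a parameter with the same
    name and the same shape."""
    loadable, skipped = [], []
    for name, shape in checkpoint_shapes.items():
        if model_shapes.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


# Hypothetical parameter names and shapes, for illustration only.
checkpoint = {
    "blocks.0.attn.qkv.weight": (576, 192),
    "blocks.0.mlp.fc1.weight": (768, 192),
}
model = {
    "blocks.0.attn.qkv.weight": (576, 192),
    "blocks.0.kan.weight": (192, 192),
}

loadable, skipped = select_pretrained(checkpoint, model)
print(loadable)  # ['blocks.0.attn.qkv.weight']
print(skipped)   # ['blocks.0.mlp.fc1.weight']
```

In PyTorch, a filtered state dictionary of this kind is typically passed to `load_state_dict` with `strict=False`, so that parameters missing from the checkpoint keep their fresh initialization.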
5.16.2024
Convergence seems to be hitting a bottleneck. My guess is that a KAN hidden dimension of 20 is too small, so I will raise it to 192 if the model does not converge after a few more epochs.
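To get a rough feel for how much capacity the hidden dimension controls, the sketch below counts learnable values for a [192, 20, 192] and a [192, 192, 192] KAN stack next to a two-layer MLP. It assumes that every input-output edge of a KAN layer carries a fixed number of coefficients. The value `COEFFS_PER_EDGE = 8` is a hypothetical placeholder, not a setting taken from this project, and the MLP hidden dimension of 768 is the one listed for the DeiT-tiny ImageNet baseline in the table.

```python
def mlp_params(in_dim, hidden_dim, out_dim):
    """Weights plus biases of a two-layer MLP."""
    return (in_dim * hidden_dim + hidden_dim) + (hidden_dim * out_dim + out_dim)


def kan_params(dims, coeffs_per_edge):
    """Learnable coefficients of a KAN with the given layer widths,
    assuming each input-output edge carries coeffs_per_edge values."""
    return sum(a * b * coeffs_per_edge for a, b in zip(dims, dims[1:]))


COEFFS_PER_EDGE = 8  # hypothetical; depends on the KAN implementation and its grid

print(kan_params([192, 20, 192], COEFFS_PER_EDGE))   # 61440
print(kan_params([192, 192, 192], COEFFS_PER_EDGE))  # 589824
print(mlp_params(192, 768, 192))                     # 295872
```

Under these assumptions, the hidden dimension of 20 gives the KAN stack far fewer learnable values than the MLP it replaces, while 192 gives it more.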
5.22.2024
Fixed timm version dependency bugs and removed extraneous code.
5.24.2024
The loss is decreasing more slowly, and the model appears to be approaching its final result.
5.25.2024
The model with a KAN hidden dimension of 192 is close to convergence. Next we will try a larger KAN hidden dimension, matching that of the MLP. We have released the best checkpoint of VisionKAN with a hidden dimension of 192.
Architecture
We used DeiT as the baseline for Vision KAN development. Thanks to Meta and MIT for the amazing work!
If you use our work, please cite:
@misc{VisionKAN2024,
author = {Ziwen Chen and Gundavarapu and WU DI},
title = {Vision-KAN: Exploring the Possibility of KAN Replacing MLP in Vision Transformer},
year = {2024},
howpublished = {\url{https://github.com/chenziwenhaoshuai/Vision-KAN.git}},
}