Vision Transformers implemented using KAN layers

Vision-KAN

We are exploring whether KAN layers can replace the MLP in the Vision Transformer. Progress may be slow because of GPU resource constraints; any new developments will be posted here.

To install this package

pip install VisionKAN

Minimal Example

from VisionKAN import create_model, train_one_epoch, evaluate

KAN_model = create_model(
    model_name='deit_tiny_patch16_224_KAN',
    pretrained=False,
    hdim_kan=192,
    num_classes=100,
    drop_rate=0.0,
    drop_path_rate=0.05,
    img_size=224,
    batch_size=144
)
| Dataset | MLP hidden dim | Model | Date | Epoch | Top-1 | Top-5 | Checkpoint |
|---|---|---|---|---|---|---|---|
| ImageNet 1k | 768 | DeiT-tiny (baseline) | - | 300 | 72.2 | 91.1 | |
| CIFAR-100 | 192 | DeiT-tiny (baseline) | 2024.5.25 | 300 (stop) | 84.94 | 96.53 | Checkpoint |
| CIFAR-100 | 384 | DeiT-small (baseline) | 2024.5.25 | 300 (stop) | 86.49 | 96.17 | Checkpoint |
| CIFAR-100 | 768 | DeiT-base (baseline) | 2024.5.25 | 300 (stop) | 86.54 | 96.16 | Checkpoint |

| Dataset | KAN hidden dim | Model | Date | Epoch | Top-1 | Top-5 | Checkpoint |
|---|---|---|---|---|---|---|---|
| ImageNet 1k | 20 | Vision-KAN | 2024.5.16 | 37 (stop) | 36.34 | 61.48 | - |
| ImageNet 1k | 192 | Vision-KAN | 2024.5.25 | 346 (stop) | 64.87 | 86.14 | Checkpoint |
| ImageNet 1k | 768 | Vision-KAN | 2024.5.29 | 71 (training) | 59.39 | 82.49 | - |
| CIFAR-100 | 192 | Vision-KAN | 2024.5.25 | 300 (stop) | 73.17 | 93.307 | Checkpoint |
| CIFAR-100 | 384 | Vision-KAN | 2024.5.25 | 300 (stop) | 78.69 | 94.73 | Checkpoint |
| CIFAR-100 | 768 | Vision-KAN | 2024.5.29 | 300 (stop) | 79.82 | 95.42 | Checkpoint |
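
To make the CIFAR-100 comparison easier to read, the following sketch computes the top-1 gap between the baseline and Vision-KAN rows. The numbers are copied from the results above. Pairing rows by hidden dimension is an assumption made only for illustration, since the results do not state whether the paired models share a backbone. The function name `top1_gap` is hypothetical.

```python
# Top-1 accuracy (%) on CIFAR-100, copied from the results above and
# keyed by hidden dimension. Pairing by hidden dimension is an assumption.
baseline_top1 = {192: 84.94, 384: 86.49, 768: 86.54}  # DeiT tiny / small / base
kan_top1 = {192: 73.17, 384: 78.69, 768: 79.82}       # Vision-KAN


def top1_gap(baseline, kan):
    """Return {hidden_dim: baseline - kan}, rounded to two decimals."""
    return {dim: round(baseline[dim] - kan[dim], 2) for dim in sorted(baseline)}


gaps = top1_gap(baseline_top1, kan_top1)
for dim, gap in gaps.items():
    print(f"hidden dim {dim}: Vision-KAN trails the baseline by {gap:.2f} points")
```

Under this pairing the gap shrinks as the hidden dimension grows, although Vision-KAN stays below the baseline in every row.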

News

5.7.2024

We released our current Vision KAN code. We used efficient KAN to replace the MLP layer in the Transformer block and are pre-training the Tiny model on ImageNet 1k. Subsequent results will be added to the table.

5.14.2024

The model has started to converge. We use [192, 20, 192] as the input, hidden, and output dimensions, and we reshape the input so that it fits the dimensions KAN processes.
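
The reshape mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the KAN layer expects a 2-D input of shape (rows, features), so the (batch, tokens, channels) tensor is flattened to (batch * tokens, channels) and restored afterwards. `toy_block` is a hypothetical stand-in that only mimics the [192, 20, 192] shapes, and plain Python lists are used so the sketch runs without extra dependencies.

```python
import math
from typing import Callable, List

Row = List[float]


def toy_block(rows: List[Row], hidden: int = 20) -> List[Row]:
    """Hypothetical stand-in for a [C, hidden, C] KAN block (not a real KAN)."""
    out = []
    for row in rows:
        h = [math.tanh(v) for v in row[:hidden]]               # C -> hidden
        out.append([h[i % len(h)] for i in range(len(row))])   # hidden -> C
    return out


def apply_per_token(tokens: List[List[Row]],
                    layer: Callable[[List[Row]], List[Row]]) -> List[List[Row]]:
    """Flatten (B, N, C) to (B * N, C), apply `layer`, and restore (B, N, C)."""
    batch, num_tokens = len(tokens), len(tokens[0])
    flat = [row for sample in tokens for row in sample]   # shape (B * N, C)
    out = layer(flat)                                     # still B * N rows
    return [out[b * num_tokens:(b + 1) * num_tokens] for b in range(batch)]


# Small demo: B=2 images, N=3 tokens each, C=192 channels.
B, N, C = 2, 3, 192
tokens = [[[0.01 * (b + n + c) for c in range(C)] for n in range(N)]
          for b in range(B)]
result = apply_per_token(tokens, toy_block)
print(len(result), len(result[0]), len(result[0][0]))
```

With PyTorch tensors the same round trip would be `x.reshape(B * N, C)` before the layer and `.reshape(B, N, C)` after it.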

5.15.2024

We switched from efficient KAN to faster KAN, which speeds up training by up to 2x, and changed the base model from DeiT III to DeiT so that we can use pre-trained weights for most layers, excluding the KAN layers.
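
One way to reuse pre-trained weights for everything except the KAN layers is to copy only the parameters whose name and shape match. The sketch below shows that selection on plain dictionaries of name to shape. The parameter names and shapes are illustrative assumptions, not taken from this project, and `split_by_pretrained` is a hypothetical helper.

```python
def split_by_pretrained(model_shapes, pretrained_shapes):
    """Split model parameter names into (reusable, fresh).

    A parameter is reusable when the pre-trained checkpoint has an entry
    with the same name and the same shape; everything else (for example
    new KAN layers) keeps its fresh initialization.
    """
    reusable, fresh = [], []
    for name, shape in model_shapes.items():
        if pretrained_shapes.get(name) == shape:
            reusable.append(name)
        else:
            fresh.append(name)
    return reusable, fresh


# Illustrative names and shapes only (assumed, not read from a checkpoint).
pretrained = {
    "patch_embed.proj.weight": (192, 3, 16, 16),
    "blocks.0.attn.qkv.weight": (576, 192),
    "blocks.0.mlp.fc1.weight": (768, 192),   # MLP that the KAN replaces
    "head.weight": (1000, 192),              # 1000-class head
}
model = {
    "patch_embed.proj.weight": (192, 3, 16, 16),
    "blocks.0.attn.qkv.weight": (576, 192),
    "blocks.0.kan.weight": (192, 192),       # hypothetical KAN parameter
    "head.weight": (100, 192),               # 100-class head
}

reusable, fresh = split_by_pretrained(model, pretrained)
print("copy from checkpoint:", reusable)
print("train from scratch:", fresh)
```

In PyTorch, the reusable entries could then be loaded with `model.load_state_dict(filtered_state, strict=False)`, which tolerates the missing KAN keys.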

5.16.2024

Convergence seems to be hitting a bottleneck. I suspect that a KAN hidden dimension of 20 is too small, so I will raise it to 192 if the model does not converge after a few more rounds of training.

5.22.2024

Fixed timm version dependency bugs and removed extraneous code.

5.24.2024

The loss is decreasing more slowly, and the model appears to be approaching its final result.

5.25.2024

The model with a hidden dimension of 192 is close to convergence. Next we will try a larger KAN hidden dimension, the same as that of the MLP. We have released the best checkpoint of VisionKAN with a hidden dimension of 192.

Architecture

We used DeiT as the baseline for Vision KAN development. Thanks to Meta and MIT for the amazing work!

If you use our work, please cite:

@misc{VisionKAN2024,
  author = {Ziwen Chen and Gundavarapu and WU DI},
  title = {Vision-KAN: Exploring the Possibility of KAN Replacing MLP in Vision Transformer},
  year = {2024},
  howpublished = {\url{https://github.com/chenziwenhaoshuai/Vision-KAN.git}},
}
