Pretrained Keras 3 vision models
Keras Models 🚀
📖 Introduction
Keras Models (kmodels) is a collection of pretrained computer-vision models built entirely with Keras 3. It covers classification, object detection (DETR, RT-DETR, RT-DETRv2, RF-DETR, D-FINE), segmentation (SAM, SAM2, SAM3, SegFormer, DeepLabV3, EoMT), vision-language modeling (CLIP, SigLIP, SigLIP2), and more, including hybrid architectures like MaxViT alongside traditional CNNs and pure transformers. The library ships custom layers and backbone support, and most backbones are available in multiple pretrained-weight variants, such as in1k, in21k, fb_dist_in1k, ms_in22k, fb_in22k_ft_in1k, ns_jft_in1k, aa_in1k, cvnets_in1k, augreg_in21k_ft_in1k, and augreg_in21k.
⚡ Installation
From PyPI (recommended)
pip install -U kmodels
From Source
pip install -U git+https://github.com/IMvision12/keras-models
📑 Documentation
| Topic | Description |
|---|---|
| Backbone Models | Classification backbones (ViT, ResNet, Swin, ConvNeXt, EfficientNet, and more) with usage examples and model listing |
Segmentation
| Model | Description |
|---|---|
| SAM | Segment Anything Model — promptable segmentation with points, boxes, or masks (ViT-B/L/H) |
| SAM2 | Segment Anything Model 2 — next generation of promptable visual segmentation (Hiera Tiny/Small/Base+/Large) |
| SAM3 | Segment Anything Model 3 — open-vocabulary detection + segmentation with CLIP text encoder (ViT-L/14). Weights require Meta SAM License acceptance on HuggingFace |
| SegFormer | Transformer-based semantic segmentation with MLP decoder, Cityscapes & ADE20K weights |
| DeepLabV3 | Atrous convolution-based semantic segmentation |
| EoMT | Encoder-only Mask Transformer for panoptic segmentation |
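DeepLabV3's core ingredient is atrous (dilated) convolution: the kernel taps are spaced `rate` samples apart, so the receptive field grows without adding parameters or downsampling. A minimal 1-D sketch of the idea in plain Python (illustrative only, not the kmodels API):

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous (dilated) convolution with 'valid' padding.

    With rate > 1 the kernel taps are spread out, so a 3-tap kernel
    covers a window of 1 + (len(kernel) - 1) * rate input samples.
    """
    span = 1 + (len(kernel) - 1) * rate  # effective receptive field
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for j, w in enumerate(kernel):
            acc += w * signal[start + j * rate]
        out.append(acc)
    return out

signal = [0, 0, 0, 1, 0, 0, 0]
kernel = [1, 1, 1]

# rate=1 behaves like an ordinary convolution (3-sample window)
print(atrous_conv1d(signal, kernel, rate=1))  # [0.0, 1.0, 1.0, 1.0, 0.0]
# rate=2 sees a 5-sample window with the same 3 parameters
print(atrous_conv1d(signal, kernel, rate=2))  # [0.0, 1.0, 0.0]
```

DeepLabV3 applies the 2-D analogue at several rates in parallel (atrous spatial pyramid pooling) to capture context at multiple scales.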
Object Detection
| Model | Description |
|---|---|
| DETR | End-to-end object detection with Transformers (ResNet-50/101 backbones) |
| RT-DETR | Real-time DETR with ResNet-vd backbone and hybrid encoder (ResNet-18/34/50/101 variants) |
| RT-DETRv2 | RT-DETR v2 with selective multi-scale deformable attention and learnable per-level sampling scale (ResNet-18/34/50/101 variants) |
| RF-DETR | Real-time detection transformer (Nano, Small, Medium, Base, Large variants) |
| D-FINE | Fine-grained distribution refinement detector with HGNetV2 backbone (Nano/Small/Medium/Large/XLarge) |
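What makes DETR-family models "end-to-end" is their set-based loss: a fixed set of predictions is matched one-to-one against the ground-truth boxes by minimizing a pairwise cost, so no anchors or NMS post-processing are needed. A toy brute-force version of that matching step in plain Python (the real models use the Hungarian algorithm, e.g. `scipy.optimize.linear_sum_assignment`; the costs below are made up):

```python
from itertools import permutations

def match_predictions(cost):
    """Brute-force bipartite matching: cost[i][j] is the cost of
    assigning prediction i to ground-truth box j. Returns the
    minimal-cost assignment (one prediction index per ground truth).
    Fine for toy sizes; DETR uses the Hungarian algorithm instead.
    """
    n_pred, n_gt = len(cost), len(cost[0])
    best, best_assign = float("inf"), None
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[perm[j]][j] for j in range(n_gt))
        if total < best:
            best, best_assign = total, perm
    return best_assign, best

# 3 predictions vs 2 ground-truth boxes (illustrative costs)
cost = [
    [0.9, 0.1],   # prediction 0 fits GT 1 well
    [0.2, 0.8],   # prediction 1 fits GT 0 well
    [0.7, 0.6],   # prediction 2 fits neither; it is trained as "no object"
]
assign, total = match_predictions(cost)
print(assign)  # (1, 0): pred 1 -> GT 0, pred 0 -> GT 1, total cost ~0.3
```

Unmatched predictions are supervised toward a "no object" class, which is how the model learns to emit exactly one box per object.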
Feature Extraction
| Model | Description |
|---|---|
| DINO | Self-supervised ViT-S/B and ResNet-50 backbones trained with self-distillation |
| DINOv2 | Improved self-supervised ViT-S/B/L backbones with LayerScale, trained on LVD-142M |
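DINO trains without labels via self-distillation: a student network is trained to match the centered, sharply-tempered output distribution of a momentum teacher across different crops of the same image. The loss is a cross-entropy between two softmax distributions, sketched here in plain Python with toy logits (illustrative, not the kmodels API):

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dino_loss(teacher_logits, student_logits, center,
              teacher_temp=0.04, student_temp=0.1):
    """Cross-entropy between the centered, sharpened teacher
    distribution and the student distribution (one crop pair)."""
    t = softmax([x - c for x, c in zip(teacher_logits, center)], teacher_temp)
    s = softmax(student_logits, student_temp)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

center = [0.0, 0.0, 0.0]           # running mean of teacher outputs
teacher = [2.0, 0.5, 0.1]
aligned = dino_loss(teacher, [2.0, 0.5, 0.1], center)
misaligned = dino_loss(teacher, [0.1, 0.5, 2.0], center)
print(aligned < misaligned)        # matching crops give a lower loss
```

The centering term and the asymmetric temperatures are what prevent the student and teacher from collapsing to a constant output.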
Vision-Language Models
| Model | Description |
|---|---|
| CLIP | Contrastive Language-Image Pre-training for zero-shot classification |
| SigLIP | Sigmoid loss-based language-image pre-training with multilingual support |
| SigLIP2 | Next-gen SigLIP with improved semantic understanding and 256K vocabulary |
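CLIP-style zero-shot classification embeds the image and a set of text prompts ("a photo of a {class}") into a shared space, then ranks classes by cosine similarity. The scoring step, sketched in plain Python with made-up 3-d embeddings standing in for real model outputs (illustrative, not the kmodels API):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def zero_shot_classify(image_emb, text_embs, labels):
    """Rank labels by cosine similarity between the image embedding
    and each text-prompt embedding, CLIP-style."""
    scores = {lbl: cosine(image_emb, emb)
              for lbl, emb in zip(labels, text_embs)}
    return max(scores, key=scores.get), scores

# Toy embeddings; a real pipeline gets these from the image/text encoders
image_emb = [0.9, 0.1, 0.2]
text_embs = [[1.0, 0.0, 0.1],   # "a photo of a cat"
             [0.0, 1.0, 0.0],   # "a photo of a dog"
             [0.1, 0.2, 1.0]]   # "a photo of a car"
label, scores = zero_shot_classify(image_emb, text_embs,
                                   ["cat", "dog", "car"])
print(label)  # cat
```

SigLIP replaces the softmax-over-batch contrastive objective with an independent sigmoid loss per image-text pair, but the inference-time scoring above is the same.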
📑 Models
Backbones
Object Detection
| 🏷️ Model Name | 📜 Reference Paper | 📦 Source of Weights |
|---|---|---|
| D-FINE | D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement | transformers |
| DETR | End-to-End Object Detection with Transformers | transformers |
| RT-DETR | DETRs Beat YOLOs on Real-time Object Detection | transformers |
| RT-DETRv2 | RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformers | transformers |
| RF-DETR | RF-DETR: Neural Architecture Search for Real-Time Detection Transformers | rfdetr |
Segmentation
| 🏷️ Model Name | 📜 Reference Paper | 📦 Source of Weights |
|---|---|---|
| DeepLabV3 | Rethinking Atrous Convolution for Semantic Image Segmentation | torchvision |
| EoMT | Your ViT is Secretly an Image Segmentation Model | transformers |
| SAM | Segment Anything | transformers |
| SAM2 | SAM 2: Segment Anything in Images and Videos | transformers |
| SAM3 | SAM 3 | transformers (gated) |
| SegFormer | SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers | transformers |
Feature Extraction
| 🏷️ Model Name | 📜 Reference Paper | 📦 Source of Weights |
|---|---|---|
| DINO | Emerging Properties in Self-Supervised Vision Transformers | torch.hub |
| DINOv2 | DINOv2: Learning Robust Visual Features without Supervision | transformers |
Vision-Language-Models (VLMs)
| 🏷️ Model Name | 📜 Reference Paper | 📦 Source of Weights |
|---|---|---|
| CLIP | Learning Transferable Visual Models From Natural Language Supervision | transformers |
| SigLIP | Sigmoid Loss for Language Image Pre-Training | transformers |
| SigLIP2 | SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | transformers |
📜 License
This project leverages timm and transformers for converting pretrained weights from PyTorch to Keras. For licensing details, please refer to the respective repositories.
- 🔖 kmodels Code: This repository is licensed under the Apache 2.0 License.
🌟 Credits
- The Keras team for their powerful and user-friendly deep learning framework
- The Transformers library for its robust tools for loading and adapting pretrained models
- The pytorch-image-models (timm) project for pioneering many computer vision model implementations
- All contributors to the original papers and architectures implemented in this library
Citing
BibTeX
@misc{gc2025kmodels,
author = {Gitesh Chawda},
title = {Keras Models},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/IMvision12/keras-models}}
}