Implementations of popular vision models in JAX
Equimo: Modern Vision Models in JAX/Equinox
WARNING: This is a research library implementing recent computer vision models. The implementations are based on paper descriptions and may not be exact replicas of the original implementations. Use with caution in production environments.
Equimo (Equinox Image Models) provides JAX/Equinox implementations of recent computer vision models, currently focusing on (but not limited to) transformer and state-space architectures.
Features
- Pure JAX/Equinox implementations
- Focus on recent architectures (2023-2024 papers)
- Modular design for easy experimentation
- Extensive documentation and type hints
Installation
From PyPI
pip install equimo
From Source
git clone https://github.com/clementpoiret/equimo.git
cd equimo
pip install -e .
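After either install route, it can be useful to confirm that the package is visible to the current interpreter. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of Equimo.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# `installed_version` is an illustrative helper, not an Equimo function.
found = installed_version("equimo")
print(f"equimo version: {found}" if found else "equimo is not installed")
```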
Implemented Models
| Model | Paper | Year | Status |
|---|---|---|---|
| FasterViT | FasterViT: Fast Vision Transformers with Hierarchical Attention | 2023 | ✅ |
| Castling-ViT | Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference | 2023 | Partial* |
| MLLA | Mamba-like Linear Attention | 2024 | ✅ |
| PartialFormer | Efficient Vision Transformers with Partial Attention | 2024 | ✅ |
| SHViT | SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | 2024 | ✅ |
| VSSD | VSSD: Vision Mamba with Non-Causal State Space Duality | 2024 | ✅ |
*: Only the Linear Angular Attention module is implemented. It is straightforward to build a ViT around it, but doing so may require an additional __call__ kwarg to control the sparse_reg bool.
Basic Usage
import jax
import equimo.models as em

# Create a model (e.g. `faster_vit_0_224`)
key = jax.random.PRNGKey(0)
# Use independent keys for initialization, input data, and the forward pass
model_key, data_key, call_key = jax.random.split(key, 3)
model = em.FasterViT(
    img_size=224,
    in_channels=3,
    dim=64,
    in_dim=64,
    depths=[2, 3, 6, 5],
    num_heads=[2, 4, 8, 16],
    hat=[False, False, True, False],
    window_size=[7, 7, 7, 7],
    ct_size=2,
    key=model_key,
)

# Generate a random input: a single unbatched image of shape (channels, height, width)
x = jax.random.normal(data_key, (3, 224, 224))

# Run inference
output = model(x, enable_dropout=False, key=call_key)
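The example above passes a single unbatched image of shape (3, 224, 224). A common way to process a batch with such a per-image callable is `jax.vmap`. The sketch below shows the pattern with a stand-in function rather than a real Equimo model, so it runs without the library; `stand_in_model` and `batched_forward` are hypothetical names, and the call signature `model(x, enable_dropout=..., key=...)` is simply copied from the usage above.

```python
import jax
import jax.numpy as jnp


def stand_in_model(x, enable_dropout=False, key=None):
    # Hypothetical placeholder for an Equimo model: takes one (C, H, W) image
    # and returns one value per channel. It ignores dropout and the key.
    return jnp.mean(x, axis=(1, 2))


def batched_forward(model, images, key):
    # Split one key per image so stochastic layers would not share randomness.
    keys = jax.random.split(key, images.shape[0])
    # vmap maps the single-image call over the leading batch axis.
    return jax.vmap(lambda x, k: model(x, enable_dropout=False, key=k))(images, keys)


key = jax.random.PRNGKey(0)
data_key, call_key = jax.random.split(key)
images = jax.random.normal(data_key, (4, 3, 224, 224))  # batch of 4 images
features = batched_forward(stand_in_model, images, call_key)
print(features.shape)
```

With a real model, `stand_in_model` would be replaced by the model instance. Whether every Equimo model accepts exactly these keyword arguments is an assumption to check against the model's `__call__` signature.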
Saving and Loading Models
Equimo provides utilities for saving models locally and loading pre-trained models from the official repository.
Saving Models Locally
from pathlib import Path
from equimo.io import save_model

# `model`, `model_config`, and `torch_hub_cfg` must already be defined

# Save model with compression (creates .tar.lz4 file)
save_model(
    Path("path/to/save/model"),
    model,  # can be any model you created using Equimo
    model_config,
    torch_hub_cfg,  # can be an empty list; it mainly keeps track of where the weights come from
    compression=True,
)

# Save model without compression (creates directory)
save_model(
    Path("path/to/save/model"),
    model,
    model_config,
    torch_hub_cfg,
    compression=False,
)
Loading Models
from pathlib import Path
from equimo.io import load_model

# Load a pre-trained model from the official repository
model = load_model(cls="vit", identifier="dinov2_vits14_reg")

# Load a local model (compressed)
model = load_model(cls="vit", path=Path("path/to/model.tar.lz4"))

# Load a local model (uncompressed directory)
model = load_model(cls="vit", path=Path("path/to/model/"))
List of pretrained models
Currently, only DINOv2 models have been ported, so the following identifiers are available:
- dinov2_vitb14.tar.lz4
- dinov2_vitb14_reg.tar.lz4
- dinov2_vitl14.tar.lz4
- dinov2_vitl14_reg.tar.lz4
- dinov2_vits14.tar.lz4
- dinov2_vits14_reg.tar.lz4
(giant version coming soon)
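The entries above carry a `.tar.lz4` suffix, while the loading example passes `identifier="dinov2_vits14_reg"` without it. Assuming the identifier is just the archive name with that suffix removed (an inference from those two places, not a documented rule), a small helper can normalize and validate names before calling `load_model`. The names `archive_to_identifier` and `check_identifier` are hypothetical and not part of Equimo.

```python
ARCHIVE_SUFFIX = ".tar.lz4"

# Archive names as listed in this README.
ARCHIVES = [
    "dinov2_vitb14.tar.lz4",
    "dinov2_vitb14_reg.tar.lz4",
    "dinov2_vitl14.tar.lz4",
    "dinov2_vitl14_reg.tar.lz4",
    "dinov2_vits14.tar.lz4",
    "dinov2_vits14_reg.tar.lz4",
]


def archive_to_identifier(name):
    """Strip the archive suffix if present (assumed identifier convention)."""
    if name.endswith(ARCHIVE_SUFFIX):
        return name[: -len(ARCHIVE_SUFFIX)]
    return name


IDENTIFIERS = [archive_to_identifier(name) for name in ARCHIVES]


def check_identifier(name):
    """Return the normalized identifier, or raise ValueError if it is not listed."""
    identifier = archive_to_identifier(name)
    if identifier not in IDENTIFIERS:
        raise ValueError(f"unknown identifier {identifier!r}; expected one of {IDENTIFIERS}")
    return identifier


print(check_identifier("dinov2_vits14_reg.tar.lz4"))
```

The returned string could then be passed as the `identifier` argument shown in the loading example.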
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citation
If you use Equimo in your research, please cite:
@software{equimo2024,
  author = {Clément POIRET},
  title = {Equimo: Modern Vision Models in JAX/Equinox},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/clementpoiret/equimo}
}