Vision LLMs with LoRA fine-tuning.
Plimai: Vision LLMs with Efficient LoRA Fine-Tuning
Plimai is a modular, research-friendly framework for building and fine-tuning Vision Large Language Models (LLMs) with efficient Low-Rank Adaptation (LoRA) support. It is designed for:
- Researchers exploring new vision transformer architectures or fine-tuning strategies
- Practitioners who want to adapt large vision models to custom datasets with limited compute
- Developers looking for a clean, extensible codebase for vision-language AI
Plimai provides a plug-and-play interface for LoRA, making it easy to experiment with parameter-efficient fine-tuning. The codebase is modular, so you can swap out or extend components like patch embedding, attention, or MLP heads.
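To make "parameter-efficient" concrete, the sketch below compares the number of weights in a dense projection with the number in a rank-r LoRA update of the same layer. The layer size (768) and rank (8) are illustrative assumptions, not values taken from Plimai's defaults.

```python
def full_linear_params(d_in: int, d_out: int) -> int:
    """Weights in a dense d_in -> d_out projection (bias ignored)."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in a rank-`rank` LoRA update: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Illustrative values only (assumed, not Plimai defaults).
d_in, d_out, rank = 768, 768, 8
full = full_linear_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

Under these assumptions the adapter trains roughly 2% of the weights of the layer it wraps, which is why LoRA suits limited-compute settings.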
🏗️ Architecture Overview
Plimai is built around a modular Vision Transformer (ViT) backbone, with LoRA adapters injected into attention and MLP layers for efficient fine-tuning. The main components are:
graph TD
A[Input Image] --> B[Patch Embedding]
B --> C[+CLS Token & Positional Encoding]
C --> D[Transformer Encoder]
D --> E[LayerNorm]
E --> F[MLP Head]
F --> G["Output (e.g., class logits)"]
subgraph LoRA Adapters
D
end
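The stages in the diagram can be traced as tensor shapes. This is a minimal sketch assuming a standard ViT layout in which the head reads the CLS token. The helper `vit_token_shapes` is hypothetical, and the patch size of 16 and embedding width of 768 are illustrative values, not confirmed Plimai defaults.

```python
def vit_token_shapes(img_size: int, patch_size: int, embed_dim: int, batch: int = 1) -> dict:
    """Return the tensor shape after each stage of a standard ViT pipeline."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    num_patches = (img_size // patch_size) ** 2
    return {
        "patch_embedding": (batch, num_patches, embed_dim),
        "with_cls_token": (batch, num_patches + 1, embed_dim),
        "encoder_output": (batch, num_patches + 1, embed_dim),
        # Assumption: the MLP head consumes only the CLS token representation.
        "cls_representation": (batch, embed_dim),
    }


shapes = vit_token_shapes(img_size=224, patch_size=16, embed_dim=768, batch=2)
for stage, shape in shapes.items():
    print(f"{stage:>20}: {shape}")
```

With a 224x224 input and 16x16 patches, the image becomes 196 patch tokens, and the CLS token brings the sequence length to 197.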
Main Modules
- PatchEmbedding: Splits the image into patches and projects them into embedding space.
- TransformerEncoder: Stack of transformer layers, each with multi-head self-attention and MLP blocks. LoRA adapters can be injected here.
- LoRALinear: Low-rank adapters for efficient fine-tuning. The base weights stay frozen and only the small adapter matrices are updated.
- MLPHead: Final classification or regression head.
- Config & Utils: Easy configuration and preprocessing utilities.
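To illustrate the idea behind a LoRA-adapted linear layer, here is a minimal PyTorch sketch. It is not Plimai's `LoRALinear` implementation: the class name `LoRALinearSketch`, its constructor arguments (`r`, `alpha`), and the initialisation scheme are assumptions that follow the common LoRA formulation.

```python
import math

import torch
import torch.nn as nn


class LoRALinearSketch(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative only)."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # Freeze the base weights; only the adapter matrices are trained.
        for p in self.base.parameters():
            p.requires_grad = False
        # A starts random and B starts at zero, so the adapter is a no-op at init.
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + b + (alpha / r) * B A x
        update = (x @ self.lora_A.T) @ self.lora_B.T
        return self.base(x) + self.scaling * update


layer = LoRALinearSketch(768, 768, r=8)
x = torch.randn(2, 197, 768)
y = layer(x)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(y.shape, trainable, total)
```

Because `lora_B` is initialised to zero, the wrapped layer reproduces the base layer's output exactly before any training step, so inserting adapters does not disturb a pretrained model.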
📦 Installation
pip install Plimai
Or, for the latest version from source:
git clone https://github.com/plim-ai/plim.git
cd plim
pip install .
🧑‍💻 Quick Start
import torch
from Plimai.models.vision_transformer import VisionTransformer
from Plimai.utils.config import default_config
# Dummy image batch: batch_size=2, channels=3, height=224, width=224
x = torch.randn(2, 3, 224, 224)
model = VisionTransformer(
    img_size=default_config['img_size'],
    patch_size=default_config['patch_size'],
    in_chans=default_config['in_chans'],
    num_classes=default_config['num_classes'],
    embed_dim=default_config['embed_dim'],
    depth=default_config['depth'],
    num_heads=default_config['num_heads'],
    mlp_ratio=default_config['mlp_ratio'],
    lora_config=default_config['lora'],
)
out = model(x)
print('Output shape:', out.shape)
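A fine-tuning loop would typically train only the adapter weights. The sketch below shows one way to do that in plain PyTorch. The helper `freeze_all_but` and the convention that adapter parameter names contain `"lora"` are assumptions, and `ToyModel` is a stand-in so the example runs without Plimai installed.

```python
import torch
import torch.nn as nn


def freeze_all_but(model: nn.Module, keyword: str = "lora") -> int:
    """Leave only parameters whose name contains `keyword` trainable; return their count."""
    trainable = 0
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name
        if param.requires_grad:
            trainable += param.numel()
    return trainable


class ToyModel(nn.Module):
    """Stand-in for a backbone with an adapter; not a Plimai class."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.lora_down = nn.Linear(16, 4, bias=False)
        self.lora_up = nn.Linear(4, 16, bias=False)
        self.head = nn.Linear(16, 3)

    def forward(self, x):
        h = self.backbone(x) + self.lora_up(self.lora_down(x))
        return self.head(h)


model = ToyModel()
n_trainable = freeze_all_but(model, keyword="lora")
optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-3)

# One optimisation step on random data.
inputs, labels = torch.randn(8, 16), torch.randint(0, 3, (8,))
loss = nn.functional.cross_entropy(model(inputs), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print("trainable parameters:", n_trainable)
```

In practice the classification head is often left trainable as well. If Plimai's adapter parameters are named differently, the `keyword` filter would need to change accordingly.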
📚 Documentation
🧩 Module Breakdown
| Module | Description |
|---|---|
| PatchEmbedding | Converts images to patch embeddings for transformer input |
| TransformerEncoder | Stack of transformer layers with optional LoRA adapters |
| LoRALinear | Low-rank adapters for parameter-efficient fine-tuning |
| MLPHead | Output head for classification or regression |
| data.py | Preprocessing and augmentation utilities |
| config.py | Centralized configuration for model/training hyperparameters |
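As a rough picture of what a centralized configuration might contain, the sketch below builds a dictionary with the keys used in the Quick Start and runs a few consistency checks. Only the key names come from the Quick Start. Every value, the structure of the `lora` sub-dictionary, and the `check_config` helper are hypothetical.

```python
# Hypothetical values; only the top-level key names come from the Quick Start.
example_config = {
    "img_size": 224,
    "patch_size": 16,
    "in_chans": 3,
    "num_classes": 1000,
    "embed_dim": 768,
    "depth": 12,
    "num_heads": 12,
    "mlp_ratio": 4.0,
    # The structure of the LoRA sub-config is an assumption.
    "lora": {"r": 8, "alpha": 16, "dropout": 0.0},
}


def check_config(cfg: dict) -> list:
    """Return a list of human-readable problems; empty means the config looks consistent."""
    problems = []
    if cfg["img_size"] % cfg["patch_size"] != 0:
        problems.append("img_size must be divisible by patch_size")
    if cfg["embed_dim"] % cfg["num_heads"] != 0:
        problems.append("embed_dim must be divisible by num_heads")
    if cfg["lora"]["r"] <= 0:
        problems.append("LoRA rank must be positive")
    return problems


print(check_config(example_config) or "config looks consistent")
```

Checks like these catch shape mismatches before a model is built, for example an embedding width that cannot be split evenly across attention heads.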
🧪 Running Tests
pytest tests/
🤝 Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Open issues for bugs or feature requests
- Submit pull requests for improvements
- Star ⭐ the repo if you find it useful!
📄 License
This project is licensed under the MIT License. See the LICENSE file for details.
🌟 Acknowledgements
Directory Structure
Plimai/
    models/
        vision_transformer.py
        lora.py
    components/
        patch_embedding.py
        attention.py
        mlp.py
    utils/
        data.py
        config.py
    example.py
📁 Project Folders
- memory/: For memory-related data, cache, or persistent state used by the application or agents.
- telemetry/: For logging, analytics, or telemetry data collection and storage.
- sync/: For synchronization logic, checkpoints, or data exchange between distributed components.
- filesystem/: For file management utilities, storage, or virtual file system logic.
- docs/: For documentation, API reference, and tutorials.
- eval/: For evaluation scripts, benchmarks, or experiment results.
See the rest of this README for more details on the codebase and usage.