Vision LLMs with LoRA fine-tuning.

Plimai: Vision LLMs with Efficient LoRA Fine-Tuning


Plimai is a modular, research-friendly framework for building and fine-tuning Vision Large Language Models (LLMs) with efficient Low-Rank Adaptation (LoRA) support. It is designed for:

  • Researchers exploring new vision transformer architectures or fine-tuning strategies
  • Practitioners who want to adapt large vision models to custom datasets with limited compute
  • Developers looking for a clean, extensible codebase for vision-language AI

Plimai provides a plug-and-play interface for LoRA, making it easy to experiment with parameter-efficient fine-tuning. The codebase is modular, so you can swap out or extend components like patch embedding, attention, or MLP heads.


🏗️ Architecture Overview

Plimai is built around a modular Vision Transformer (ViT) backbone, with LoRA adapters injected into attention and MLP layers for efficient fine-tuning. The main components are:

graph TD
    A[Input Image] --> B[Patch Embedding]
    B --> C[+CLS Token & Positional Encoding]
    C --> D[Transformer Encoder]
    D --> E[LayerNorm]
    E --> F[MLP Head]
    F --> G["Output (e.g., class logits)"]
    subgraph LoRA Adapters
        D
    end
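
The first two stages of the diagram determine how many tokens the Transformer Encoder processes. As a rough sketch, assuming standard non-overlapping ViT patching on a square image (the function name and the example values below are illustrative and are not taken from Plimai's code or defaults):

```python
def vit_sequence_length(img_size: int, patch_size: int, add_cls_token: bool = True) -> int:
    """Number of tokens the encoder sees for a square image (hypothetical helper)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    patches_per_side = img_size // patch_size
    num_patches = patches_per_side ** 2
    # The CLS token, if used, adds one extra position to the sequence.
    return num_patches + (1 if add_cls_token else 0)


# Assumed example values: a 224x224 image with 16x16 patches.
print(vit_sequence_length(224, 16))  # 14 * 14 patches + 1 CLS token = 197
```

Attention cost grows with the square of this sequence length, so a smaller patch size raises compute quickly.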

Main Modules

  • PatchEmbedding: Splits the image into patches and projects them into embedding space.
  • TransformerEncoder: Stack of transformer layers, each with multi-head self-attention and MLP blocks. LoRA adapters can be injected here.
  • LoRALinear: Low-rank adapters for efficient fine-tuning; only a small number of parameters are updated.
  • MLPHead: Final classification or regression head.
  • Config & Utils: Easy configuration and preprocessing utilities.
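
To make "a small number of parameters" concrete, the sketch below counts parameters for one linear layer under the standard LoRA formulation, where the dense weight stays frozen and only two low-rank factors are trained. The helper name, the layer size, and the rank are assumptions for illustration; Plimai's LoRALinear may differ in its details (for example, bias handling).

```python
def lora_param_counts(in_features: int, out_features: int, rank: int) -> tuple[int, int]:
    """Return (frozen_weight_params, trainable_lora_params) for one linear layer.

    Standard LoRA keeps the dense weight W (out x in) frozen and trains
    A (rank x in) and B (out x rank), so the learned update is B @ A.
    Biases are ignored in this count.
    """
    frozen = in_features * out_features
    trainable = rank * (in_features + out_features)
    return frozen, trainable


# Assumed example: a 768 -> 768 projection with rank 8.
frozen, trainable = lora_param_counts(768, 768, rank=8)
print(frozen, trainable, f"{trainable / frozen:.2%}")  # 589824 12288 2.08%
```

In this example the adapter trains about 2% of the layer's weight count, and the fraction scales linearly with the chosen rank.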

📦 Installation

pip install Plimai

Or, for the latest version from source:

git clone https://github.com/plim-ai/plim.git
cd plim
pip install .

🧑‍💻 Quick Start

import torch
from Plimai.models.vision_transformer import VisionTransformer
from Plimai.utils.config import default_config

# Dummy image batch: batch_size=2, channels=3, height=224, width=224
x = torch.randn(2, 3, 224, 224)
model = VisionTransformer(
    img_size=default_config['img_size'],
    patch_size=default_config['patch_size'],
    in_chans=default_config['in_chans'],
    num_classes=default_config['num_classes'],
    embed_dim=default_config['embed_dim'],
    depth=default_config['depth'],
    num_heads=default_config['num_heads'],
    mlp_ratio=default_config['mlp_ratio'],
    lora_config=default_config['lora'],
)
out = model(x)
print('Output shape:', out.shape)
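
For fine-tuning, a common next step is to freeze everything except the LoRA weights. The helper below is a minimal sketch, not part of Plimai: it assumes the adapter parameters have "lora" somewhere in their names, which this README does not confirm, and it works with any object that exposes `named_parameters()` in the style of `torch.nn.Module`.

```python
def mark_only_lora_trainable(model, keyword: str = "lora") -> int:
    """Freeze every parameter whose name lacks `keyword`; return the trainable count.

    Hypothetical helper: the "lora" naming convention is an assumption and
    should be checked against the model's actual parameter names.
    """
    trainable = 0
    for name, param in model.named_parameters():
        # Only parameters that look like LoRA weights keep their gradients.
        param.requires_grad = keyword in name.lower()
        if param.requires_grad:
            trainable += param.numel()
    return trainable
```

Calling `mark_only_lora_trainable(model)` on the model built above would return the number of parameters left trainable. If it returns 0, the naming assumption does not hold. In practice the classification head is often kept trainable as well, which would need an extra condition.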

📚 Documentation


🧩 Module Breakdown

Module              Description
PatchEmbedding      Converts images to patch embeddings for transformer input
TransformerEncoder  Stack of transformer layers with optional LoRA adapters
LoRALinear          Low-rank adapters for parameter-efficient fine-tuning
MLPHead             Output head for classification or regression
data.py             Preprocessing and augmentation utilities
config.py           Centralized configuration for model/training hyperparameters
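
When editing hyperparameters, it can help to check the shape constraints that vision transformers generally rely on before building a model. The following is a sketch under assumptions: the key names mirror those used in the Quick Start, but `validate_vit_config` is a hypothetical helper and the values are illustrative, not Plimai's actual defaults.

```python
def validate_vit_config(cfg: dict) -> None:
    """Raise ValueError if the config violates common ViT shape constraints."""
    if cfg["img_size"] % cfg["patch_size"] != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    if cfg["embed_dim"] % cfg["num_heads"] != 0:
        # Each attention head gets embed_dim // num_heads channels.
        raise ValueError("embed_dim must be a multiple of num_heads")
    if cfg["depth"] < 1:
        raise ValueError("depth must be at least 1")


# Illustrative values only; Plimai's default_config may differ.
example_config = {
    "img_size": 224, "patch_size": 16, "in_chans": 3, "num_classes": 1000,
    "embed_dim": 768, "depth": 12, "num_heads": 12, "mlp_ratio": 4.0,
}
validate_vit_config(example_config)  # passes silently for these values
```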

🧪 Running Tests

pytest tests/

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

  • Open issues for bugs or feature requests
  • Submit pull requests for improvements
  • Star ⭐ the repo if you find it useful!

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


🌟 Acknowledgements

Directory Structure

Plimai/
  models/
    vision_transformer.py
    lora.py
  components/
    patch_embedding.py
    attention.py
    mlp.py
  utils/
    data.py
    config.py
  example.py

📁 Project Folders

  • memory/: For memory-related data, cache, or persistent state used by the application or agents.
  • telemetry/: For logging, analytics, or telemetry data collection and storage.
  • sync/: For synchronization logic, checkpoints, or data exchange between distributed components.
  • filesystem/: For file management utilities, storage, or virtual file system logic.
  • docs/: For documentation, API reference, and tutorials.
  • eval/: For evaluation scripts, benchmarks, or experiment results.

See the rest of this README for more details on the codebase and usage.


Download files

Source Distribution

plimai-0.1.5.1.tar.gz (10.2 kB view details)

Uploaded Source

Built Distribution

plimai-0.1.5.1-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file plimai-0.1.5.1.tar.gz.

File metadata

  • Download URL: plimai-0.1.5.1.tar.gz
  • Upload date:
  • Size: 10.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for plimai-0.1.5.1.tar.gz
Algorithm    Hash digest
SHA256       070d909d31ee4ef5606ed0615fb111f220de5e5f06ae33169a693922efb750f3
MD5          0c036c7b238c7b7993cdc926047a5cea
BLAKE2b-256  3e5b0280e409fc4e71f1b74265301d3ce144c2aa8c678d361f0b6a3156db2f56
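
A downloaded file can be checked against a published digest with Python's standard library. This is a minimal sketch: `sha256_matches` is a hypothetical helper name, and the file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, `sha256_matches("plimai-0.1.5.1.tar.gz", "070d909d31ee4ef5606ed0615fb111f220de5e5f06ae33169a693922efb750f3")` should return True for an unmodified copy of the source distribution.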

File details

Details for the file plimai-0.1.5.1-py3-none-any.whl.

File metadata

  • Download URL: plimai-0.1.5.1-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for plimai-0.1.5.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       c76773cd3a62ffaee71b051e637b06839b86ebd389095f5495a1882c6f831453
MD5          62abc5074cc4d0fe9dfba96f79697b0f
BLAKE2b-256  45a7db1a25ba87d0e24d01300fb3002a24a04249591fe4740aba9e1f2266b5c5
