Efficient LoRA fine-tuning for vision LLMs, with an advanced CLI and a model zoo
Fine-tune Vision LLMs with ease
Train LLaVA, Qwen-VL, and other vision models in minutes.
The simplest way to create custom multimodal AI.
⚡ Quick Start
1-Click Install (Recommended)
The fastest way to get started. Installs Langvision in an isolated environment.
curl -fsSL https://raw.githubusercontent.com/langtrain-ai/langvision/main/scripts/install.sh | bash
Or install with pip:
pip install langvision
Fine-tune a vision model in 3 lines:
from langvision import LoRATrainer
trainer = LoRATrainer(model_name="llava-hf/llava-1.5-7b-hf")
trainer.train_from_file("image_data.jsonl")
Your custom vision model is ready.
✨ Features
| Feature | Description |
|---|---|
| 🖼️ Multimodal Training | Train on images + text together. Perfect for VQA, image captioning, and visual reasoning. |
| 🎯 Smart Defaults | Optimized configurations for each model architecture. Just point and train. |
| 💾 Efficient Memory | LoRA + 4-bit quantization = Train 13B vision models on a single 24GB GPU. |
| 🔧 Battle-Tested | Production-ready code used by teams building real-world vision applications. |
| 🌐 All Major Models | LLaVA, Qwen-VL, CogVLM, InternVL, and more. Full compatibility. |
| ☁️ Deploy Anywhere | Export to GGUF, ONNX, or deploy directly to Langtrain Cloud. |
🤖 Supported Models
| Model | Parameters | Memory Required |
|---|---|---|
| LLaVA 1.5 | 7B, 13B | 8GB, 16GB |
| Qwen-VL | 7B | 8GB |
| CogVLM | 17B | 24GB |
| InternVL | 6B, 26B | 8GB, 32GB |
| Phi-3 Vision | 4.2B | 6GB |
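To see why 4-bit quantization makes these sizes fit on a single GPU, here is a rough back-of-the-envelope sketch. It counts weight storage only. Activations, the LoRA adapter, optimizer state, and framework overhead all add to this, so treat the result as a lower bound and not as a reproduction of the table above. The helper name `quantized_weight_gb` is made up for this illustration and is not part of Langvision.

```python
def quantized_weight_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for model weights alone, in decimal gigabytes.

    Hypothetical helper for illustration; ignores activations,
    adapter weights, optimizer state, and runtime overhead.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    for label, params in [("7B", 7e9), ("13B", 13e9)]:
        fp16 = quantized_weight_gb(params, 16)
        int4 = quantized_weight_gb(params, 4)
        print(f"{label}: fp16 ~{fp16:.1f} GB, 4-bit ~{int4:.1f} GB")
```

Under these assumptions, 13B parameters take about 26 GB at 16 bits but about 6.5 GB at 4 bits, which leaves headroom on a 24GB card for everything else training needs.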
📖 Full Example
from langvision import LoRATrainer
from langvision.config import TrainingConfig, LoRAConfig

# Configure training
config = TrainingConfig(
    num_epochs=3,
    batch_size=2,
    learning_rate=2e-4,
    lora=LoRAConfig(rank=16, alpha=32),
)

# Initialize trainer
trainer = LoRATrainer(
    model_name="llava-hf/llava-1.5-7b-hf",
    output_dir="./my-vision-model",
    config=config,
)

# Train on image-text data
trainer.train_from_file("training_data.jsonl")
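For a feel of what `rank=16, alpha=32` means, the sketch below applies the standard LoRA arithmetic: a frozen `d_in x d_out` weight gets two small trainable matrices with `rank * (d_in + d_out)` parameters in total, and the update is scaled by `alpha / rank`. Whether Langvision applies exactly this scaling internally is an assumption, and the 4096 hidden size and both helper names are chosen only for illustration.

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix.

    LoRA factors the update as B @ A, with A of shape (rank, d_in)
    and B of shape (d_out, rank). Hypothetical helper for illustration.
    """
    return rank * (d_in + d_out)


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


if __name__ == "__main__":
    d = 4096  # assumed hidden size, for illustration only
    added = lora_added_params(d, d, rank=16)
    full = d * d
    print(f"trainable: {added} of {full} ({100 * added / full:.2f}%)")
    print(f"scaling: {lora_scaling(alpha=32, rank=16)}")
```

With these numbers, each adapted matrix trains well under 1% of the parameters it wraps, which is where most of the memory savings during fine-tuning come from.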
📝 Data Format
{"image": "path/to/image1.jpg", "conversations": [{"from": "human", "value": "What's in this image?"}, {"from": "assistant", "value": "A cat sitting on a couch."}]}
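Each line of the training file is one JSON object like the one above. A quick check before training can catch malformed lines early. The sketch below uses only the standard library and assumes the schema is exactly what the example shows: an `image` string plus a non-empty `conversations` list whose turns carry `from` (`human` or `assistant`) and `value`. The function name `validate_record` is hypothetical and not part of Langvision, and the library may accept more fields or roles than this.

```python
import json

ALLOWED_ROLES = {"human", "assistant"}  # assumed from the example record


def validate_record(line: str) -> dict:
    """Parse one JSONL line and check it matches the example schema."""
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("each line must be a JSON object")
    if not isinstance(record.get("image"), str):
        raise ValueError("'image' must be a string path")
    turns = record.get("conversations")
    if not isinstance(turns, list) or not turns:
        raise ValueError("'conversations' must be a non-empty list")
    for turn in turns:
        if not isinstance(turn, dict):
            raise ValueError("each conversation turn must be an object")
        if turn.get("from") not in ALLOWED_ROLES:
            raise ValueError(f"unexpected role: {turn.get('from')!r}")
        if not isinstance(turn.get("value"), str):
            raise ValueError("'value' must be a string")
    return record


if __name__ == "__main__":
    sample = json.dumps({
        "image": "path/to/image1.jpg",
        "conversations": [
            {"from": "human", "value": "What's in this image?"},
            {"from": "assistant", "value": "A cat sitting on a couch."},
        ],
    })
    print(validate_record(sample)["image"])
```

Running this over every line of `training_data.jsonl` before calling `train_from_file` is optional, but it turns a mid-run failure into an immediate, readable error.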
🤝 Community
Built with ❤️ by Langtrain AI
Making vision AI accessible to everyone.
Download files
File details
Details for the file langvision-0.1.49.tar.gz.
File metadata
- Download URL: langvision-0.1.49.tar.gz
- Upload date:
- Size: 123.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25b313f7b9b2e5fd9d35e36f1dc9d0feefb8bef54d7db45460173a530e959a5f |
| MD5 | f07ae12bff2256c39312186a26d496ae |
| BLAKE2b-256 | ddf14f996bafbfcacd8b1370ad4e0c534e6ca7cbc346e2eb385a5fc7bb4b1924 |
File details
Details for the file langvision-0.1.49-py3-none-any.whl.
File metadata
- Download URL: langvision-0.1.49-py3-none-any.whl
- Upload date:
- Size: 154.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | adc0fc60c34f5ab624b3eb08dfc582b73629a3109089a39235ca6452d6a2896d |
| MD5 | 47e2fe2983bc50eb92e0901e63baae8c |
| BLAKE2b-256 | 5de290233aa6ed74a0ccb82572dae248d6d541d49dd1012f8a636232d621e046 |