Efficient LoRA Fine-Tuning for Vision LLMs with advanced CLI and model zoo

Langvision: LoRA Fine-Tuning for Vision LLMs

Fine-tune Vision LLMs (LLaVA, Qwen-VL) in minutes

What You'll Need

# Quick system check
python --version 

# Check GPU support (Optional but recommended)
python -c "import torch; print('GPU ready!' if torch.cuda.is_available() else 'CPU mode - still works!')"

Install Langvision

# Step 1: Create a clean environment (recommended)
python -m venv langtrain-env
source langtrain-env/bin/activate  # Windows: langtrain-env\Scripts\activate

# Step 2: Install LangVision
pip install langvision

# Step 3: Verify it worked
python -c "import langvision; print('✅ LangVision installed!')"
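If the import in Step 3 fails partway, it can help to check what is installed without importing anything. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of Langvision:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the two packages the steps above rely on.
    for name in ("langvision", "torch"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```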

Train Your First Model

from langvision import LoRATrainer

# Step 1: Define your training data (Images + QA)
training_data = [
    {
        "image": "./images/cat.jpg", 
        "question": "What is in this image?", 
        "answer": "A cute tabby cat sitting on a rug."
    },
    {
        "image": "./images/dog.jpg", 
        "question": "Describe the animal.", 
        "answer": "A golden retriever playing with a ball."
    }
]

# Step 2: Create the trainer
# Configures Vision Encoder + LLM Adapter automatically
trainer = LoRATrainer(
    model_name="llava-v1.6-7b",  # Works with LLaVA, Qwen-VL, BLIP-2 etc.
    output_dir="./my_vision_model",
)

# Step 3: Train!
trainer.train(training_data)

# Step 4: Test your model
model = trainer.load_model()
response = model.chat("./images/cat.jpg", "What do you see?")
print(f"AI: {response}")
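Before starting a training run, it may be worth checking that every record has the three keys used above and that each image path exists. The helper below is a sketch under that assumption; `find_problems` and `REQUIRED_KEYS` are hypothetical names, not part of the Langvision API:

```python
from pathlib import Path

REQUIRED_KEYS = ("image", "question", "answer")  # keys used in the example above


def find_problems(records, check_files=True):
    """Return a list of human-readable problems; an empty list means the data looks fine."""
    problems = []
    for i, record in enumerate(records):
        # Every required key must be present and hold a non-empty string.
        for key in REQUIRED_KEYS:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{key}'")
        # Optionally confirm that the image path points at a real file.
        image = record.get("image")
        if check_files and isinstance(image, str) and image.strip() and not Path(image).is_file():
            problems.append(f"record {i}: image not found: {image}")
    return problems
```

Calling `find_problems(training_data)` before `trainer.train(training_data)` surfaces missing fields and broken paths early, rather than partway through a run.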

Use Your Trained Model

from langvision import ChatModel

# Load your trained model
model = ChatModel.load("./my_vision_model")

# Analyze images
print(model.chat("image1.jpg", "Describe this scene."))
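To run the same question over a folder of images, the `model.chat(image_path, question)` call shown above can be wrapped in a small loop. This is a sketch only: `describe_folder` and the suffix list are assumptions, and `model` is any object exposing that `chat` method.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extend as needed


def describe_folder(model, folder, question="Describe this scene."):
    """Ask `question` about every image in `folder`; returns {filename: response}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # model.chat(image_path, question) is the call used in the example above
            results[path.name] = model.chat(str(path), question)
    return results
```

For example, `describe_folder(model, "./images")` returns one response per image, keyed by file name.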

Using Your Own Data

from langvision import LoRATrainer

trainer = LoRATrainer(
    model_name="llava-v1.6-7b",
    output_dir="./custom_vlm",
)

# Method 1: Load from Hugging Face datasets
trainer.train_from_hub("your_username/your_vqa_dataset")
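For data kept in a local file rather than on the Hub, one option is to load it into the same list-of-dicts shape used in "Train Your First Model". The sketch below assumes a JSON Lines file with `image`, `question`, and `answer` fields; the file layout and the `load_jsonl_records` helper are assumptions, not a documented Langvision format.

```python
import json
from pathlib import Path


def load_jsonl_records(path, image_root="."):
    """Read one JSON object per line and return image/question/answer dicts."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row = json.loads(line)
            try:
                records.append({
                    # resolve image paths relative to image_root
                    "image": str(Path(image_root) / row["image"]),
                    "question": row["question"],
                    "answer": row["answer"],
                })
            except KeyError as missing:
                raise ValueError(f"{path}:{line_number}: missing field {missing}") from None
    return records
```

The result can then be passed to `trainer.train(...)` in the same way as the inline `training_data` list, assuming the trainer accepts that list format for any data source.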

Next Steps

Train with QLoRA: use QLoRATrainer to fine-tune LLaVA-7B on consumer GPUs (under 12 GB of VRAM).
Explore the Model Zoo: run `langvision model-zoo list` to see supported models (LLaVA, Qwen, CogVLM, etc.).
Read the docs: see langtrain.xyz/docs.
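A rough back-of-the-envelope calculation shows why a 7B model can plausibly fit in under 12 GB with QLoRA. The sketch assumes 4-bit quantized base weights (the usual QLoRA setup; whether QLoRATrainer uses exactly this is an assumption) and counts weights only, ignoring the vision encoder, quantization metadata, activations, LoRA parameters, and optimizer state.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params_7b = 7e9  # nominal parameter count for a "7B" model
print(f"16-bit weights: {weight_memory_gb(params_7b, 16):.1f} GB")  # 14.0 GB
print(f" 4-bit weights: {weight_memory_gb(params_7b, 4):.1f} GB")   # 3.5 GB
```

At 16 bits the weights alone exceed 12 GB, while at 4 bits they leave several gigabytes of headroom for everything the estimate leaves out.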

Architecture Overview

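Langvision pairs a Vision Transformer (ViT) encoder with a large language model (LLM) and fine-tunes the combination with LoRA. In the flow below, the vision encoder stays frozen, a projector maps its output into the LLM's input space, and the LoRA adapters are applied to the LLM.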

flowchart TD
    A(["Input Image"]) --> B(["Vision Encoder (Frozen)"])
    B --> C(["Projector"])
    C --> D(["LLM (LoRA Adapted)"])
    D --> E(["Text Output"])
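To give a sense of how small the LoRA-adapted part is, the sketch below counts parameters for a single linear layer. LoRA keeps the original weight W frozen and trains two low-rank matrices B and A, so the layer computes W x + scale * B (A x). The dimensions and rank are illustrative values, not Langvision defaults.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer.

    The frozen weight W has shape (d_out, d_in). LoRA adds B (d_out, rank)
    and A (rank, d_in), and the layer computes W x + scale * B (A x).
    """
    frozen = d_out * d_in
    trainable = rank * (d_out + d_in)
    return frozen, trainable


frozen, trainable = lora_param_counts(d_out=4096, d_in=4096, rank=16)
print(f"frozen: {frozen:,}  trainable: {trainable:,}  ratio: {trainable / frozen:.2%}")
# frozen: 16,777,216  trainable: 131,072  ratio: 0.78%
```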

Contributing

Contributions are welcome! See CONTRIBUTING.md.

License

MIT License. See LICENSE.

Citation

@software{langvision2025,
  author = {Pritesh Raj},
  title = {Langvision: Efficient LoRA Fine-Tuning for Vision LLMs},
  url = {https://github.com/langtrain-ai/langvision},
  year = {2025}
}

Release history

Version	Release date
0.1.58	May 17, 2026
0.1.57	Feb 24, 2026
0.1.56	Feb 18, 2026
0.1.55	Feb 18, 2026
0.1.54	Feb 18, 2026
0.1.53	Feb 16, 2026
0.1.52	Feb 16, 2026
0.1.51	Jan 10, 2026
0.1.50	Jan 10, 2026
0.1.49	Jan 10, 2026
0.1.48	Jan 10, 2026
0.1.47	Jan 10, 2026
0.1.46	Jan 10, 2026
0.1.45	Jan 10, 2026
0.1.44	Jan 10, 2026
0.1.43	Jan 10, 2026
0.1.42	Jan 4, 2026
0.1.41	Jan 4, 2026
0.1.40	Jan 4, 2026
0.1.39	Jan 4, 2026
0.1.38	Jan 4, 2026 (this version)
0.1.37	Jan 4, 2026
0.1.0	Sep 22, 2025
0.0.2	Jul 3, 2025
0.0.1	Jul 3, 2025

Download files

Source Distribution

langvision-0.1.38.tar.gz (119.4 kB)

Uploaded Jan 4, 2026 Source

Built Distribution

langvision-0.1.38-py3-none-any.whl (149.6 kB)

Uploaded Jan 4, 2026 Python 3

File details

Details for the file langvision-0.1.38.tar.gz.

File metadata

Download URL: langvision-0.1.38.tar.gz
Upload date: Jan 4, 2026
Size: 119.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for langvision-0.1.38.tar.gz
Algorithm	Hash digest
SHA256	`3a97b7774823c47da20c12237332626615994a6ec438d3e5cc5d347d0fdaf268`
MD5	`0a533a4ef83bf49188164877746330fc`
BLAKE2b-256	`6789214c46c76d501e0bf1a3284d0c7003edd9faadc8dcb9deba4f843c342501`
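
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; the helper names are hypothetical, and the local path is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True if the file's SHA256 equals the published digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, `matches_published_hash("langvision-0.1.38.tar.gz", "3a97b7774823c47da20c12237332626615994a6ec438d3e5cc5d347d0fdaf268")` should return True for an intact download.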

File details

Details for the file langvision-0.1.38-py3-none-any.whl.

File metadata

Download URL: langvision-0.1.38-py3-none-any.whl
Upload date: Jan 4, 2026
Size: 149.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for langvision-0.1.38-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2efdc8d42318807859b34590a0de5e2cac01b74356c428c1633389145736bbc5`
MD5	`2aa52f65aae4e0283dddf3aa57c092b3`
BLAKE2b-256	`d61e83168d85e62345aeec00f0ee385fdf48b362699babd24e95436504293d93`
