# Vllama: Vision Models Made Easy

A CLI tool to run local vision models like stabilityai/sd-turbo.
Vllama is a comprehensive CLI tool that simplifies working with vision models and machine learning workflows. Whether you're preprocessing datasets, training models with AutoML, or generating images with state-of-the-art diffusion models, Vllama makes it easy - locally or on cloud GPUs.
## Key Features

- **Autonomous Data Preprocessing**: Intelligent data cleaning, encoding, scaling, and feature selection
- **AutoML Training**: Train and compare multiple ML models automatically with hyperparameter tuning
- **Vision Model Inference**: Generate images using pre-trained diffusion models (Stable Diffusion, SD-Turbo)
- **Cloud GPU Integration**: Seamlessly offload computation to Kaggle GPUs
- **Rich Visualizations**: Automatic generation of insights, correlations, and performance metrics
- **Smart Output Management**: Organized folder structure with logs, models, and visualizations
## Installation

### 1. Clone the Repository

```bash
git clone https://github.com/ManvithGopu13/Vllama.git
cd Vllama
```

### 2. Install Dependencies

```bash
pip install -r requirements.txt
```

### 3. Install the Vllama CLI

```bash
pip install -e .
```

Now you can use `vllama` from anywhere in your terminal.
## Quick Start Guide

### Data Preprocessing & Model Training Workflow

#### Step 1: Preprocess Your Dataset

Clean and prepare your data for machine learning:

```bash
vllama data --path dataset.csv --target price --test_size 0.2 --output_dir ./outputs
```
What it does:
- Automatically detects column types (numerical/categorical)
- Handles missing values intelligently (KNN imputation, median/mode filling)
- Removes duplicates and handles outliers
- Encodes categorical variables (label encoding, one-hot encoding, frequency encoding)
- Scales features using RobustScaler
- Performs feature selection (removes zero-variance and highly correlated features)
- Generates visualizations (missing values heatmap, correlation matrix, etc.)
- Splits data into train/test sets
- Saves the processed data as `train_data.csv` and `test_data.csv`
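A few of the steps above can be sketched with pandas and scikit-learn. This is an illustrative reconstruction, not Vllama's actual pipeline (which also does KNN imputation, outlier handling, and feature selection); the `preprocess` helper and the toy columns are assumptions made for the example.

```python
# Illustrative sketch of median/mode imputation, frequency encoding,
# robust scaling, and the train/test split described above.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler

def preprocess(df: pd.DataFrame, target: str, test_size: float = 0.2):
    df = df.drop_duplicates().copy()
    for col in df.columns:
        if col == target:
            continue
        if df[col].dtype == object:
            # Mode imputation, then frequency encoding for categoricals
            df[col] = df[col].fillna(df[col].mode().iloc[0])
            df[col] = df[col].map(df[col].value_counts())
        else:
            # Median imputation for numericals
            df[col] = df[col].fillna(df[col].median())
    features = [c for c in df.columns if c != target]
    df[features] = RobustScaler().fit_transform(df[features])
    return train_test_split(df, test_size=test_size, random_state=42)

train_df, test_df = preprocess(
    pd.DataFrame({"size": [50, 60, None, 80, 90],
                  "city": ["A", "B", "A", None, "A"],
                  "price": [1, 2, 3, 4, 5]}),
    target="price",
)
print(len(train_df), len(test_df))  # 4 1
```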
Parameters:

- `--path`: Path to your dataset (supports CSV, Excel, JSON, Parquet)
- `--target`: Target column name (auto-detected if not specified)
- `--test_size` or `-t`: Test set proportion (default: 0.2)
- `--output_dir` or `-o`: Output directory (default: current directory)
Output Structure:

```text
output_folder_YYYYMMDD_HHMMSS/
├── train_data.csv
├── test_data.csv
├── processed_full_data.csv
├── preprocessing_log.json
├── preprocessing_log.txt
├── summary_report.json
├── transformation_metadata.json
└── visualizations/
    ├── 01_missing_initial.png
    ├── 02_dtypes.png
    ├── 03_corr_processed.png
    ├── 04_target_processed.png
    └── 05_mi.png
```
#### Step 2: Train Models with AutoML

Automatically train and compare multiple ML models:

```bash
vllama train --path ./outputs/output_folder_YYYYMMDD_HHMMSS --target price
```
What it does:
- Auto-detects task type (classification or regression)
- Trains multiple models with hyperparameter tuning:
  - Classification: Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost, SVM, KNN, MLP, Naive Bayes
  - Regression: Random Forest, XGBoost, LightGBM, CatBoost, SVR, KNN, MLP
- Uses RandomizedSearchCV for efficient hyperparameter optimization
- Evaluates models on test set with comprehensive metrics
- Generates visualizations (confusion matrices, ROC curves, prediction plots)
- Saves all models and creates a leaderboard
- Identifies and saves the best performing model
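The tuning step is based on scikit-learn's `RandomizedSearchCV`, mentioned above. A minimal sketch on a toy dataset follows; the model choice and parameter grid here are illustrative, not Vllama's actual search spaces.

```python
# Minimal RandomizedSearchCV example: sample parameter combinations,
# cross-validate each, keep the best.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100], "max_depth": [3, None]},
    n_iter=4,      # sample 4 parameter combinations
    cv=3,          # 3-fold cross-validation per combination
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```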
Parameters:

- `--path` or `-p`: Path to the folder containing `train_data.csv` and `test_data.csv`
- `--target` or `-t`: Target column name
Output Structure:

```text
results/
├── model_summary.csv    # Leaderboard of all models
├── best_model.pkl       # Best performing model
├── best_model.txt       # Best model details
├── report.html          # HTML report with all results
└── per_model/
    ├── RandomForest/
    │   ├── RandomForest_best_model.pkl
    │   ├── RandomForest_tuning_results.csv
    │   ├── RandomForest_confusion_matrix.png
    │   └── RandomForest_roc_curve.png
    └── XGBoost/
        └── ...
```
### Vision Model Inference Workflow

#### Step 1: Show Available Models

```bash
vllama show models
```

Lists all supported vision models with descriptions.

#### Step 2: Install a Model (Optional)

Pre-download model weights to the cache:

```bash
vllama install stabilityai/sd-turbo
```

#### Step 3: Generate Images Locally

Single prompt mode:

```bash
vllama run stabilityai/sd-turbo --prompt "A serene mountain landscape at sunset" --output_dir ./images
```

Interactive mode:

```bash
vllama run stabilityai/sd-turbo
```

Then enter prompts interactively. Type `exit` or `quit` to stop.
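The interactive mode boils down to a read-generate loop that stops on `exit` or `quit`. A sketch follows; the helper names (`interactive_loop`, `read_prompt`, `generate`) are hypothetical, not Vllama's internals.

```python
# Read prompts until the user types "exit" or "quit", generating one image per prompt.
def interactive_loop(read_prompt, generate):
    while True:
        prompt = read_prompt().strip()
        if prompt.lower() in ("exit", "quit"):
            break
        generate(prompt)

# Simulated session: two prompts, then "quit"
prompts = iter(["A serene lake", "A futuristic city", "quit"])
generated = []
interactive_loop(lambda: next(prompts), generated.append)
print(generated)  # ['A serene lake', 'A futuristic city']
```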
Parameters:

- `model`: Model name (e.g., `stabilityai/sd-turbo`)
- `--prompt` or `-p`: Text prompt for image generation
- `--output_dir` or `-o`: Directory to save generated images (default: current directory)
- `--service` or `-s`: Offload to a cloud service (e.g., `kaggle`)
Features:
- Automatic GPU/CPU detection
- Low VRAM optimization (for GPUs with ≤3 GB VRAM)
- Memory-efficient attention (xformers)
- Attention slicing and VAE tiling for better performance
#### Step 4: Generate Images on a Kaggle GPU

```bash
vllama run stabilityai/sd-turbo --service kaggle --prompt "A cyberpunk city at night"
```
What it does:
- Creates a Kaggle kernel with GPU enabled
- Installs dependencies automatically
- Runs the model on Kaggle's GPU
- Downloads the generated image to your local machine
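Pushing a GPU kernel through the official Kaggle API requires a `kernel-metadata.json`. A plausible shape is sketched below; the `id`, `title`, and `code_file` values are placeholders, not necessarily what Vllama generates.

```python
import json

# Plausible kernel-metadata.json for the official Kaggle API.
metadata = {
    "id": "myusername/vllama-run",   # <kaggle-username>/<kernel-slug>
    "title": "vllama-run",
    "code_file": "run_model.py",     # script that loads the model and generates the image
    "language": "python",
    "kernel_type": "script",
    "is_private": "true",
    "enable_gpu": "true",            # request a GPU-backed session
    "enable_internet": "true",       # needed to download model weights
}
print(json.dumps(metadata, indent=2))
```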
## Complete Command Reference

### Data & ML Commands

#### vllama data

Autonomous data preprocessing and cleaning.

```bash
vllama data --path <dataset> --target <column> [--test_size <float>] [--output_dir <dir>]
```

Examples:

```bash
# Basic usage with auto-detected target
vllama data --path sales_data.csv

# Specify target column and test size
vllama data --path housing.csv --target price --test_size 0.25

# Custom output directory
vllama data --path data.csv --target label -t 0.3 -o ./processed_data
```
#### vllama train

AutoML model training with hyperparameter tuning.

```bash
vllama train --path <data_folder> --target <column>
```

Examples:

```bash
# Train on preprocessed data
vllama train --path ./output_folder_20231124_143022 --target SalePrice

# Short form
vllama train -p ./data -t label
```
### Vision Model Commands

#### vllama show models

List all supported vision models.

```bash
vllama show models
```

#### vllama install

Download and cache a model.

```bash
vllama install <model_name>
```

Example:

```bash
vllama install stabilityai/sd-turbo
```

#### vllama run

Run a vision model for image generation.

```bash
vllama run <model_name> [--prompt <text>] [--service <service>] [--output_dir <dir>]
```

Examples:

```bash
# Single prompt
vllama run stabilityai/sd-turbo --prompt "A beautiful sunset"

# Interactive mode
vllama run stabilityai/sd-turbo

# Run on a Kaggle GPU
vllama run stabilityai/sd-turbo --service kaggle --prompt "A dragon flying"

# Custom output directory
vllama run stabilityai/sd-turbo -p "A forest" -o ./my_images
```
#### vllama post

Send a prompt to an already-running model session.

```bash
vllama post <prompt> [--output_dir <dir>]
```

Example:

```bash
vllama post "A magical castle" --output_dir ./outputs
```

#### vllama stop

Stop the currently running model session.

```bash
vllama stop
```
### Cloud Integration Commands

#### vllama login

Authenticate with a cloud GPU service.

```bash
vllama login --service <service> [--username <user>] [--key <api_key>]
```

Examples:

```bash
# Log in to Kaggle with credentials
vllama login --service kaggle --username myusername --key abc123xyz

# Use existing Kaggle credentials from ~/.kaggle/kaggle.json
vllama login --service kaggle
```
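The credential file the Kaggle client reads is a small JSON document. The sketch below shows its shape; a temporary directory stands in for `~/.kaggle`, and the credential values are placeholders.

```python
import json
import os
import tempfile

# Shape of ~/.kaggle/kaggle.json as the Kaggle API expects it.
creds = {"username": "myusername", "key": "abc123xyz"}

cred_path = os.path.join(tempfile.mkdtemp(), "kaggle.json")
with open(cred_path, "w") as f:
    json.dump(creds, f)
os.chmod(cred_path, 0o600)  # the Kaggle client warns if the file is world-readable

with open(cred_path) as f:
    print(json.load(f)["username"])  # myusername
```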
#### vllama init gpu

Initialize a GPU session on a cloud service.

```bash
vllama init gpu --service <service>
```

Example:

```bash
vllama init gpu --service kaggle
```

#### vllama logout

Remove cloud service credentials.

```bash
vllama logout
```
## Common Workflows

### Workflow 1: Complete ML Pipeline

```bash
# 1. Preprocess data
vllama data --path raw_data.csv --target price

# 2. Train models (use the output folder from step 1)
vllama train --path ./output_folder_20231124_143022 --target price

# 3. Review results in the results/ folder
```

### Workflow 2: Local Image Generation

```bash
# 1. Install the model (optional, first time only)
vllama install stabilityai/sd-turbo

# 2. Generate images interactively
vllama run stabilityai/sd-turbo
# Prompt> A serene lake with mountains
# Prompt> A futuristic city
# Prompt> exit
```

### Workflow 3: Cloud GPU Image Generation

```bash
# 1. Log in to Kaggle
vllama login --service kaggle --username myuser --key myapikey

# 2. Generate an image on a Kaggle GPU
vllama run stabilityai/sd-turbo --service kaggle --prompt "A magical forest"
# The image is downloaded automatically
```
## Understanding Outputs

### Data Preprocessing Outputs

Logs:

- `preprocessing_log.json`: Detailed JSON log of all preprocessing steps
- `preprocessing_log.txt`: Human-readable text log
- `summary_report.json`: Summary statistics and metadata
Data Files:

- `train_data.csv`: Training dataset (80% by default)
- `test_data.csv`: Testing dataset (20% by default)
- `processed_full_data.csv`: Complete processed dataset
- `transformation_metadata.json`: Encoder and scaler metadata for future use
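The point of `transformation_metadata.json` is that saved mappings let you encode unseen data consistently later. The JSON schema below is hypothetical (the real file's layout may differ); it only illustrates the idea.

```python
import json

# Hypothetical shape for transformation_metadata.json: per-column encoder
# name plus the fitted mapping, reusable on new data.
meta_json = json.dumps({
    "city": {"encoder": "frequency", "mapping": {"A": 3, "B": 1}}
})

meta = json.loads(meta_json)
new_rows = ["B", "A", "A"]
encoded = [meta["city"]["mapping"][v] for v in new_rows]
print(encoded)  # [1, 3, 3]
```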
Visualizations:
- Missing values heatmap
- Data types distribution
- Correlation matrix (top 20 features)
- Target distribution
- Mutual information scores
### Model Training Outputs

Model Files:

- `best_model.pkl`: Best performing model (can be loaded with joblib)
- `model_summary.csv`: Comparison of all trained models
- `report.html`: Interactive HTML report
Per-Model Outputs:

- `{model}_best_model.pkl`: Saved model
- `{model}_tuning_results.csv`: Hyperparameter search results
- `{model}_confusion_matrix.png`: Confusion matrix (classification)
- `{model}_roc_curve.png`: ROC curve (binary classification)
- `{model}_pred_vs_true.png`: Predicted-vs-true scatter plot (regression)
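Reloading a saved `best_model.pkl` works with `joblib.load`, as noted above. The sketch trains a toy model in place of Vllama's output, then round-trips it through a temporary file.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in for results/best_model.pkl: train a toy model, save it, reload it.
X, y = make_classification(n_samples=100, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

path = os.path.join(tempfile.mkdtemp(), "best_model.pkl")
joblib.dump(model, path)

loaded = joblib.load(path)
# The reloaded model predicts identically to the original.
print((loaded.predict(X) == model.predict(X)).all())
```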
### Vision Model Outputs

Generated images are saved as:

```text
vllama_output_{timestamp}.png    # Local generation
vllama_kaggle_{timestamp}.png    # Kaggle generation
```
## Advanced Configuration

### Environment Variables

Create a `.env` file for configuration:

```ini
# Kaggle API Credentials
KAGGLE_USERNAME=your_username
KAGGLE_KEY=your_api_key

# Model Cache Directory (optional)
HF_HOME=/path/to/cache
```
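Loading such a file into the environment can be sketched with the standard library alone; Vllama itself may well use a package like python-dotenv instead, so treat this as illustrative.

```python
import os

# Parse .env-style lines: skip blanks and comments, split on the first "=".
env_text = """\
# Kaggle API credentials
KAGGLE_USERNAME=your_username
KAGGLE_KEY=your_api_key
"""

for line in env_text.splitlines():
    line = line.strip()
    if line and not line.startswith("#") and "=" in line:
        key, value = line.split("=", 1)
        os.environ.setdefault(key.strip(), value.strip())

print(os.environ["KAGGLE_USERNAME"])
```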
### GPU Optimization

Vllama automatically optimizes for your GPU:

- High VRAM (>3 GB): Uses float16, full resolution (512x512), more inference steps
- Low VRAM (≤3 GB): Uses float32, reduced steps, memory-efficient attention
- CPU: Falls back to CPU inference (slower, but works)
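The thresholds above can be written as a small decision function. The function itself is hypothetical; Vllama's real selection logic may differ in detail.

```python
# Map detected VRAM (in GB, or None for no GPU) to inference settings,
# mirroring the documented high-VRAM / low-VRAM / CPU paths.
def inference_settings(vram_gb):
    if vram_gb is None:
        return {"device": "cpu", "dtype": "float32"}
    if vram_gb > 3:
        return {"device": "cuda", "dtype": "float16", "resolution": 512}
    return {"device": "cuda", "dtype": "float32", "attention_slicing": True}

print(inference_settings(8)["dtype"], inference_settings(2)["dtype"])  # float16 float32
```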
## Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

## License

This project is licensed under the GNU General Public License v3.0.
## Troubleshooting

### Common Issues

**Issue: "Kaggle API credentials not found"**

```bash
# Solution: set up Kaggle credentials
vllama login --service kaggle --username YOUR_USERNAME --key YOUR_API_KEY
```

**Issue: "CUDA out of memory"**

```bash
# Solution: the tool handles low VRAM automatically, but you can also:
# 1. Close other GPU applications
# 2. Use CPU mode (automatic fallback)
# 3. Use a Kaggle GPU instead
vllama run <model_name> --service kaggle --prompt "your prompt"
```

**Issue: "Target column not found"**

```bash
# Solution: specify the target column explicitly
vllama data --path data.csv --target your_column_name
```
## Support

- Documentation: GitHub Repository
- Issues: GitHub Issues
- Email: manvithgopu1394@gmail.com

## Acknowledgments

Made with ❤️ by Gopu Manvith