LocalLab: Run language models locally or in Google Colab with a friendly API
Project description
🚀 LocalLab: Run AI Models Easily
LocalLab lets you run any Hugging Face model locally or on Google Colab with minimal setup. It automatically exposes an API through ngrok, enabling seamless integration into applications from any location. Designed for simplicity, LocalLab makes advanced AI accessible regardless of technical expertise. With built-in model management, performance optimizations, and system monitoring, it provides efficient and reliable AI operations for developers, researchers, and enthusiasts alike.
What Problem Does LocalLab Solve?
- Local Inference: Run advanced language models without relying on expensive cloud services.
- Optimized Performance: Utilize state-of-the-art techniques like quantization, attention slicing, and CPU offloading for maximum efficiency.
- Seamless Deployment: Easily switch between local deployment and Google Colab, leveraging ngrok for public accessibility.
- Effective Resource Management: Automatically monitor and manage CPU, RAM, and GPU usage to ensure smooth operation.
System Requirements
Minimum Requirements
| Component | Local Deployment | Google Colab |
|---|---|---|
| RAM | 4GB | Free tier (12GB) |
| CPU | 2 cores | 2 cores |
| Python | 3.8+ | 3.8+ |
| Storage | 2GB free | - |
| GPU | Optional | Available in free tier |
Recommended Requirements
| Component | Local Deployment | Google Colab |
|---|---|---|
| RAM | 8GB+ | Pro tier (24GB) |
| CPU | 4+ cores | Pro tier (4 cores) |
| Python | 3.9+ | 3.9+ |
| Storage | 5GB+ free | - |
| GPU | CUDA-compatible | Pro tier GPU |
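Before installing locally, the tables above can be checked programmatically. The sketch below is an illustrative, stdlib-only helper (the function name `meets_minimum_requirements` is mine, not part of LocalLab); the thresholds mirror the minimum-requirements table, and the RAM check only works on POSIX systems that expose `SC_PHYS_PAGES`.

```python
import os
import shutil
import sys

def meets_minimum_requirements(path: str = ".") -> dict:
    """Check this machine against the minimum local-deployment specs above."""
    checks = {
        "python_3_8_plus": sys.version_info >= (3, 8),   # Python 3.8+
        "cpu_2_cores": (os.cpu_count() or 0) >= 2,       # 2 CPU cores
        "storage_2gb_free": shutil.disk_usage(path).free >= 2 * 1024**3,
    }
    # Total RAM via sysconf is POSIX-only; skip the check elsewhere
    if hasattr(os, "sysconf") and "SC_PHYS_PAGES" in os.sysconf_names:
        total_ram = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
        checks["ram_4gb"] = total_ram >= 4 * 1024**3     # 4 GB RAM
    return checks

print(meets_minimum_requirements())
```

GPU detection is deliberately omitted here, since it depends on CUDA tooling rather than the standard library.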
Key Features
- Multiple Model Support: Pre-configured models along with the ability to load custom ones on demand.
- Advanced Optimizations: Support for FP16, INT8, and INT4 quantization, Flash Attention, and attention slicing.
- Comprehensive Logging System: Colorized console output with server status tracking, request monitoring, and performance metrics.
- Robust Resource Monitoring: Real-time insights into system performance and resource usage.
- Flexible Client Libraries: Comprehensive clients available for both Python and Node.js.
- Google Colab Friendly: Dedicated workflow for deploying via Google Colab with public URL access.
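To illustrate how these features are toggled, the snippet below prepares the environment before starting the server. `HUGGINGFACE_MODEL` and `LOCALLAB_ENABLE_QUANTIZATION` are the variables used in the Colab example later on this page; `LOCALLAB_QUANTIZATION_TYPE` is a hypothetical name shown only to indicate where an FP16/INT8/INT4 choice would go — check the LocalLab docs for the actual setting.

```python
import os

# Pick any Hugging Face model id; phi-2 is the example used elsewhere on this page
os.environ["HUGGINGFACE_MODEL"] = "microsoft/phi-2"

# Enable model optimizations (variable name taken from the Colab example below)
os.environ["LOCALLAB_ENABLE_QUANTIZATION"] = "true"

# Hypothetical knob for the quantization precision (fp16 / int8 / int4);
# consult the documentation for the real setting name
os.environ["LOCALLAB_QUANTIZATION_TYPE"] = "int8"

# With the environment prepared, launch as usual:
# from locallab import start_server
# start_server()
```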
Architecture Overview
Below is a high-level diagram of LocalLab's architecture.
```mermaid
graph TD
    A["User"] --> B["LocalLab Client (Python/Node.js)"]
    B --> C["LocalLab Server"]
    C --> D["Model Manager"]
    D --> E["Hugging Face Models"]
    C --> F["Optimizations"]
    C --> G["Resource Monitoring"]
```
Google Colab Workflow
```mermaid
sequenceDiagram
    participant U as "User (Colab)"
    participant S as "LocalLab Server"
    participant N as "Ngrok Tunnel"
    U->>S: Run start_server(use_ngrok=True)
    S->>N: Establish public tunnel
    N->>U: Return public URL
    U->>S: Connect via public URL
```
Documentation & Usage Guides
For full documentation and detailed guides, please visit our documentation page.
- Getting Started Guide
- Python Client
- Node.js Client
- Client Comparison
- Google Colab Guide
- API Reference
Get Started
1. Installation:

   ```bash
   pip install locallab
   ```

2. Starting the Server Locally:

   ```python
   from locallab import start_server

   start_server()
   ```

3. Starting the Server on Google Colab:

   ```python
   !pip install locallab

   # Set up your ngrok auth token (REQUIRED for public access)
   # Get your free token from: https://dashboard.ngrok.com/get-started/your-authtoken
   import os
   os.environ["NGROK_AUTH_TOKEN"] = "your_token_here"

   # Optional: configure model and optimizations
   os.environ["HUGGINGFACE_MODEL"] = "microsoft/phi-2"  # Choose your preferred model
   os.environ["LOCALLAB_ENABLE_QUANTIZATION"] = "true"  # Enable model optimizations

   # Start the server with ngrok for public access
   from locallab import start_server
   start_server(use_ngrok=True)  # Creates a public URL accessible from anywhere
   ```

4. Connecting your Client:

   ```python
   from locallab.client import LocalLabClient

   # Use the ngrok URL displayed in the output above
   client = LocalLabClient("https://xxxx-xxx-xxx-xxx.ngrok.io")

   # Test the connection
   response = client.generate("Hello, how are you?")
   print(response)
   ```
Join the Community
- Report issues on our GitHub Issues.
- Participate in discussions on our Community Forum.
- Learn how to contribute by reading our Contributing Guidelines.
LocalLab is designed to bring the power of advanced language models directly to your workspace—efficiently, flexibly, and affordably. Give it a try and revolutionize your AI projects!
Project details
Release history
Download files
File details
Details for the file locallab-0.3.7.tar.gz.
File metadata
- Download URL: locallab-0.3.7.tar.gz
- Upload date:
- Size: 40.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 28536c43073d316f46f9252fac6bf47593f6424f7231b2cd0fc0111c05d1641d |
| MD5 | 7975ffc308ff1588eaf63eefe2f90062 |
| BLAKE2b-256 | 453188a7e41a1eebea9c2438f54c0a42b9807f738dd7f5fad5a195c5ccfd4070 |
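The published digests above can be used to verify a downloaded archive before installing. A minimal stdlib sketch (the helper name `sha256_of` is mine):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the SHA256 published above for locallab-0.3.7.tar.gz
expected = "28536c43073d316f46f9252fac6bf47593f6424f7231b2cd0fc0111c05d1641d"
# print(sha256_of("locallab-0.3.7.tar.gz") == expected)
```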
File details
Details for the file locallab-0.3.7-py3-none-any.whl.
File metadata
- Download URL: locallab-0.3.7-py3-none-any.whl
- Upload date:
- Size: 41.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4f0d5166cbdb581e8414e2dd4bb2bd41414411454a861bd283abd15a3dc35dba |
| MD5 | 45deb240ed66b423e761d286b95eb77b |
| BLAKE2b-256 | 4c74fec9bc82e0f119d27a3251c02f4a3e0d5413e09c9be224c4fc2a3ca6ca53 |