
LocalLab: Run language models locally or in Google Colab with a friendly API

Project description

🚀 LocalLab: Run AI Models Easily


LocalLab empowers users to run any Hugging Face AI model locally or on Google Colab with minimal setup required. It automatically configures an API using ngrok, enabling seamless integration into applications from any location. Designed for simplicity, LocalLab makes advanced AI accessible to all, regardless of technical expertise. With built-in model management, performance optimizations, and system monitoring, it ensures efficient and reliable AI operations for developers, researchers, and enthusiasts alike.

What's New in v0.4.9

  • 🔧 Fixed Configuration System: The locallab config command now properly saves settings that are respected when running locallab start
  • 📋 Configuration Display: The CLI now shows your current configuration before prompting for changes
  • ⏩ Skip Unnecessary Prompts: The CLI only prompts for settings that aren't already configured
  • ✅ Clear Feedback: After saving configuration, the CLI shows what was saved and how to use it

What Problem Does LocalLab Solve?

  • Local Inference: Run advanced language models without relying on expensive cloud services.
  • Optimized Performance: Utilize state-of-the-art techniques like quantization, attention slicing, and CPU offloading for maximum efficiency.
  • Seamless Deployment: Easily switch between local deployment and Google Colab, leveraging ngrok for public accessibility.
  • Effective Resource Management: Automatically monitor and manage CPU, RAM, and GPU usage to ensure smooth operation.

System Requirements

Minimum Requirements

| Component | Local Deployment | Google Colab           |
|-----------|------------------|------------------------|
| RAM       | 4GB              | Free tier (12GB)       |
| CPU       | 2 cores          | 2 cores                |
| Python    | 3.8+             | 3.8+                   |
| Storage   | 2GB free         | -                      |
| GPU       | Optional         | Available in free tier |
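
The table above can be turned into a quick pre-flight check. The sketch below is illustrative only; `meets_minimum` is our own helper, not part of LocalLab, and it checks just the Python version and CPU-core thresholds from the minimum-requirements table:

```python
import os
import sys

def meets_minimum(python_version=sys.version_info, cpu_cores=os.cpu_count() or 1):
    """Return True if this machine satisfies the minimum requirements table
    (Python 3.8+ and at least 2 CPU cores). RAM/GPU checks are omitted
    because they need third-party libraries."""
    return tuple(python_version[:2]) >= (3, 8) and cpu_cores >= 2

print(meets_minimum())
```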

Recommended Requirements

| Component | Local Deployment | Google Colab        |
|-----------|------------------|---------------------|
| RAM       | 8GB+             | Pro tier (24GB)     |
| CPU       | 4+ cores         | Pro tier (4 cores)  |
| Python    | 3.9+             | 3.9+                |
| Storage   | 5GB+ free        | -                   |
| GPU       | CUDA-compatible  | Pro tier GPU        |

Key Features

  • Interactive CLI: Configure and run your server with an intuitive command-line interface that adapts to your environment.
  • Multiple Model Support: Pre-configured models along with the ability to load custom ones on demand.
  • Advanced Optimizations: Support for FP16, INT8, and INT4 quantization, Flash Attention, and attention slicing.
  • Comprehensive Logging System: Colorized console output with server status tracking, request monitoring, and performance metrics.
  • Robust Resource Monitoring: Real-time insights into system performance and resource usage.
  • Flexible Client Libraries: Comprehensive clients available for both Python and Node.js.
  • Google Colab Friendly: Dedicated workflow for deploying via Google Colab with public URL access.
  • Persistent Configuration: Save your settings for future use with the new configuration system.

Visual Overview

Below is a high-level diagram of LocalLab's architecture.

graph TD
    A["User"] --> B["LocalLab Client (Python/Node.js)"]
    B --> C["LocalLab Server"]
    C --> D["Model Manager"]
    D --> E["Hugging Face Models"]
    C --> F["Optimizations"]
    C --> G["Resource Monitoring"]

Google Colab Workflow

sequenceDiagram
    participant U as "User (Colab)"
    participant S as "LocalLab Server"
    participant N as "Ngrok Tunnel"
    U->>S: Run start_server(ngrok=True)
    S->>N: Establish public tunnel
    N->>U: Return public URL
    U->>S: Connect via public URL
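
The last step of the handshake above can be mimicked client-side with a simple readiness poll. This is a generic sketch, not a LocalLab API: `check` stands in for whatever health probe you use (in practice it might wrap an HTTP request to the ngrok URL):

```python
import time

def wait_until_ready(check, retries=5, delay=1.0):
    """Call `check()` up to `retries` times, sleeping `delay` seconds
    between attempts; return True as soon as it succeeds."""
    for _ in range(retries):
        if check():
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_ready(lambda: ping(public_url))` with your own `ping` function would block until the tunnel answers or the retries run out.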

Documentation & Usage Guides

For full documentation and detailed guides, please visit our documentation page.

Get Started

  1. Installation:

    pip install locallab
    
  2. Using the CLI (New!):

    # Start the server with interactive configuration
    locallab start
    
    # Start with specific options
    locallab start --model microsoft/phi-2 --quantize --quantize-type int8
    
    # Run the configuration wizard without starting the server
    locallab config
    
    # Display system information
    locallab info
    
  3. Starting the Server Programmatically:

    from locallab import start_server
    start_server()
    
  4. Starting the Server on Google Colab:

    !pip install locallab
    
    # Set up your ngrok auth token (REQUIRED for public access)
    # Get your free token from: https://dashboard.ngrok.com/get-started/your-authtoken
    import os
    os.environ["NGROK_AUTH_TOKEN"] = "your_token_here"
    
    # Optional: Configure model and optimizations
    os.environ["HUGGINGFACE_MODEL"] = "microsoft/phi-2"  # Choose your preferred model
    os.environ["LOCALLAB_ENABLE_QUANTIZATION"] = "true"  # Enable model optimizations
    
    # Start the server with ngrok for public access
    from locallab import start_server
    start_server(use_ngrok=True)  # Creates a public URL accessible from anywhere
    
  5. Connecting your Client:

    from locallab.client import LocalLabClient
    
    # Use the ngrok URL displayed in the output above
    client = LocalLabClient("https://xxxx-xxx-xxx-xxx.ngrok.io")
    
    # Test the connection
    response = client.generate("Hello, how are you?")
    print(response)
    
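The environment variables set in step 4 follow the usual `os.environ` pattern. The sketch below shows how such settings could be read with fallbacks; the variable names come from the example above, but the defaults here are illustrative, not LocalLab's actual defaults:

```python
import os

# Read the model name and quantization flag set in the Colab example above.
# Defaults are placeholders for illustration only.
model = os.environ.get("HUGGINGFACE_MODEL", "microsoft/phi-2")
quantize = os.environ.get("LOCALLAB_ENABLE_QUANTIZATION", "false").lower() == "true"
print(model, quantize)
```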

CLI Features (New in v0.4.8!)

LocalLab now includes a powerful command-line interface with the following features:

  • Interactive Configuration: Guided setup for all server settings
  • Environment Detection: Smart defaults based on your system
  • Persistent Settings: Configuration stored in ~/.locallab/config.json
  • System Information: Detailed insights about your hardware
  • Performance Optimizations: Easy configuration of quantization and other optimizations
  • Google Colab Integration: Automatic detection and configuration for Colab environments
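
Since settings persist in `~/.locallab/config.json`, that file might look something like the fragment below. The exact keys are an assumption inferred from the CLI flags shown in this README, not a documented schema:

```json
{
  "model": "microsoft/phi-2",
  "port": 8080,
  "quantize": true,
  "quantize_type": "int8",
  "attention_slicing": true
}
```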

Example CLI usage:

# Start with interactive prompts
locallab start

# Configure with specific options
locallab start --model microsoft/phi-2 --port 8080 --quantize --attention-slicing

# Run configuration wizard
locallab config

# Check system resources
locallab info

Join the Community


LocalLab is designed to bring the power of advanced language models directly to your workspace—efficiently, flexibly, and affordably. Give it a try and revolutionize your AI projects!

Project details


Release history

This version

0.4.9

Download files

Download the file for your platform.

Source Distribution

locallab-0.4.9.tar.gz (50.3 kB)

Uploaded Source

Built Distribution


locallab-0.4.9-py3-none-any.whl (52.7 kB)

Uploaded Python 3

File details

Details for the file locallab-0.4.9.tar.gz.

File metadata

  • Download URL: locallab-0.4.9.tar.gz
  • Upload date:
  • Size: 50.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.21

File hashes

Hashes for locallab-0.4.9.tar.gz
| Algorithm   | Hash digest |
|-------------|-------------|
| SHA256      | 274197172e94263875462809e080f226ae78c514967b6270956085cfde18e17a |
| MD5         | 864ffb7c8971412f42d08072a6931386 |
| BLAKE2b-256 | a3da968537b84241696a9520af8334164bf0e578fa823edb4d4643962bc96426 |


File details

Details for the file locallab-0.4.9-py3-none-any.whl.

File metadata

  • Download URL: locallab-0.4.9-py3-none-any.whl
  • Upload date:
  • Size: 52.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.21

File hashes

Hashes for locallab-0.4.9-py3-none-any.whl
| Algorithm   | Hash digest |
|-------------|-------------|
| SHA256      | a7ce5215a3e0c643bf555ef3b9a199d32f9343d508cb9856dfe6139a6cee39d2 |
| MD5         | 4f290f3ccd305bee59586cd4e8835b36 |
| BLAKE2b-256 | 05811c531ec7fc5993ef3c6e95d662a817440bb35b2030ae90e829751da9b5f6 |

