
Modular expertise framework for transformer models: add independent LoRA-based expert lobes with automatic hybrid routing.


🧠 MultiLobe

Modular expertise framework for transformer models.

MultiLobe lets you attach independent LoRA-based expertise lobes to any frozen HuggingFace causal language model. Each lobe is a lightweight specialist trained on domain-specific data, and a hybrid two-stage router automatically picks the right expert at inference time.

Why? Traditional fine-tuning suffers from catastrophic forgetting: teaching a model law can make it forget medicine. MultiLobe addresses this architecturally: the base model is always frozen, and each domain lives in its own isolated LoRA adapter.


✨ Features

  • 🔒 Frozen base model: the original weights are never modified
  • 🧩 Isolated lobes: each expertise area is a separate LoRA adapter; training one never affects another
  • 🚀 Hybrid routing: fast embedding-based first pass plus a log-probability fallback
  • 💾 Save & load: persist the full multi-expert setup and restore it later
  • 🎯 Manual override: force a specific lobe when you know which expert to use
  • 📊 Transparent decisions: get routing metadata (which lobe, confidence, scores)

📦 Installation

pip install multilobe

Or install from source:

git clone https://github.com/multilobe/multilobe.git
cd multilobe
pip install -e .

Requirements

  • Python ≥ 3.9
  • PyTorch ≥ 2.0
  • A CUDA GPU is strongly recommended for training

🚀 Quick Start

from multilobe import MultiLobeModel

# 1. Load a base model (frozen automatically)
model = MultiLobeModel.from_pretrained("google/gemma-2b")

# 2. Add expertise lobes
model.add_lobe(
    name="medical",
    dataset="path/to/medical.jsonl",
    epochs=3,
    lora_r=16,
    lora_alpha=32,
)

model.add_lobe(
    name="legal",
    dataset="path/to/legal.jsonl",
    epochs=3,
)

# 3. Save the full setup
model.save("./my_multilobe_model")

# 4. Load it back later
model = MultiLobeModel.load("./my_multilobe_model")

# 5. Generate (routing is automatic)
output = model.generate("What do these symptoms mean?")
print(output)

🎯 API Reference

MultiLobeModel.from_pretrained(model_name, **kwargs)

Downloads a HuggingFace causal-LM, freezes all parameters, and initialises the routing system.

| Parameter | Default | Description |
|---|---|---|
| model_name | (required) | HuggingFace model ID (e.g. "google/gemma-2b") |
| device | auto | torch.device override |
| embedding_model | "all-MiniLM-L6-v2" | Sentence-transformer used by the embedding router |
| embedding_threshold | 0.75 | Cosine similarity threshold |
| logprob_max_tokens | 30 | Tokens evaluated by the log-prob fallback |

model.add_lobe(name, dataset, epochs, **kwargs)

Creates, trains, and registers a new LoRA-based expertise lobe.

| Parameter | Default | Description |
|---|---|---|
| name | (required) | Unique lobe identifier |
| dataset | (required) | Path to JSONL file ({"input": ..., "output": ...}) |
| epochs | 3 | Training epochs |
| lora_r | 16 | LoRA rank |
| lora_alpha | 32 | LoRA alpha scaling |
| batch_size | 4 | Training batch size |
| learning_rate | 2e-4 | AdamW peak learning rate |
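
To give a feel for what lora_r and lora_alpha control, here is a small arithmetic sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix W gets a low-rank update scaled by lora_alpha / lora_r; whether MultiLobe applies exactly this scaling is an assumption, and the 2048 x 2048 layer size and both helper names are hypothetical, chosen only for illustration.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


def lora_scale(lora_alpha: float, r: int) -> float:
    """Multiplier applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


# Hypothetical 2048 x 2048 projection, using the documented defaults
# lora_r=16 and lora_alpha=32.
d_in = d_out = 2048
full = d_in * d_out
lora = lora_param_count(d_in, d_out, r=16)

print(f"full matrix: {full:,} params")
print(f"LoRA (r=16): {lora:,} params ({lora / full:.2%} of full)")
print(f"update scale: {lora_scale(32, 16)}")
```

Under these assumptions, raising lora_r adds capacity (and trainable parameters) to a lobe, while lora_alpha sets how strongly the adapter's update is weighted relative to the frozen weights.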

model.generate(input_text, **kwargs)

Generate a response with automatic or manual lobe selection.

# Automatic routing
output = model.generate("What are the symptoms of diabetes?")

# Manual lobe selection
output = model.generate("...", lobe="medical")

# With routing metadata
output, meta = model.generate("...", return_metadata=True)
print(meta["selected_lobe"])      # "medical"
print(meta["router_confidence"])  # 0.89
print(meta["router_type"])        # "embedding" or "logprob"

model.save(path) / MultiLobeModel.load(path)

Persists the full multi-lobe setup. Only the LoRA weights and metadata are saved; the base model is re-downloaded from the Hub on load.


🔄 How Routing Works

MultiLobe uses a two-stage hybrid router:

Input Query
    │
    ▼
┌──────────────────────┐
│  Stage 1: Embedding  │  Fast cosine similarity
│  (all-MiniLM-L6-v2)  │  against lobe centroids
└──────────┬───────────┘
           │
    confidence ≥ 0.75?
    ┌──────┴──────┐
    │ YES         │ NO
    ▼             ▼
 Use lobe    ┌───────────────────┐
             │  Stage 2: LogProb │  Evaluate input
             │  (fallback)       │  under each adapter
             └────────┬──────────┘
                      │
                      ▼
                  Use best lobe
  1. Embedding Router: encodes the query with a sentence-transformer and computes the cosine similarity with each lobe's representation vector (the mean embedding of its training data). If the best score reaches the threshold (0.75 by default), the query is routed to that lobe immediately.

  2. Log-Prob Router: if the embedding router is not confident, each lobe's adapter is activated in turn and the mean log-probability of the input tokens is computed. The lobe with the highest mean log-probability wins.
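
The two stages can be sketched in plain Python. This is an illustration of the control flow only, not MultiLobe's implementation: the function names (cosine, centroid, route), the toy 3-dimensional vectors, and the hard-coded log-probabilities are all hypothetical stand-ins for real sentence-transformer embeddings and real adapter evaluations.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Vector]) -> List[float]:
    """Mean embedding of a lobe's training examples."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def route(
    query_vec: Vector,
    centroids: Dict[str, Vector],
    mean_logprob: Callable[[str], float],
    threshold: float = 0.75,
) -> Tuple[str, float, str]:
    """Return (lobe name, score, router type) for one query."""
    # Stage 1: compare the query embedding with every lobe centroid.
    scores = {name: cosine(query_vec, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, scores[best], "embedding"

    # Stage 2: fall back to scoring the input under each lobe's adapter.
    logprobs = {name: mean_logprob(name) for name in centroids}
    best = max(logprobs, key=logprobs.get)
    return best, logprobs[best], "logprob"


# Toy centroids built from made-up "training" embeddings.
centroids = {
    "medical": centroid([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]),
    "legal": centroid([[0.0, 1.0, 0.0], [0.2, 0.8, 0.0]]),
}

# Stand-in for "activate the adapter and average the token log-probs".
fake_logprobs = {"medical": -2.1, "legal": -1.4}

clear = route([1.0, 0.1, 0.0], centroids, fake_logprobs.__getitem__)
ambiguous = route([0.5, 0.5, 1.0], centroids, fake_logprobs.__getitem__)
print(clear[0], clear[2])
print(ambiguous[0], ambiguous[2])
```

The first query sits close to the medical centroid, so stage 1 decides on its own; the second is far from both centroids, so the sketch falls through to the log-probability comparison. The fallback is the more expensive path, since it needs one forward pass per lobe.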


๐Ÿ“ Dataset Format

Each JSONL file should contain one JSON object per line:

{"input": "What are the symptoms of hypertension?", "output": "Hypertension symptoms include headaches, shortness of breath, and nosebleeds..."}
{"input": "How is diabetes diagnosed?", "output": "Diabetes is typically diagnosed through blood tests such as HbA1c..."}

๐Ÿ—๏ธ Architecture

multilobe/
โ”œโ”€โ”€ __init__.py          # Public API exports
โ”œโ”€โ”€ model.py             # MultiLobeModel โ€” main orchestrator
โ”œโ”€โ”€ lobe.py              # Lobe & LobeMetadata classes
โ”œโ”€โ”€ trainer.py           # LoRA training loop
โ”œโ”€โ”€ utils.py             # Dataset loading, tokenisation helpers
โ””โ”€โ”€ router/
    โ”œโ”€โ”€ __init__.py
    โ”œโ”€โ”€ base.py          # BaseRouter ABC & RoutingResult
    โ”œโ”€โ”€ embedding.py     # Stage 1 โ€” cosine similarity router
    โ””โ”€โ”€ logprob.py       # Stage 2 โ€” log-probability fallback

📜 License

MIT


๐Ÿค Contributing

Contributions are welcome! Please open an issue or submit a pull request.
