Modular expertise framework for transformer models: add independent LoRA-based expert lobes with automatic hybrid routing.
MultiLobe
Modular expertise framework for transformer models.
MultiLobe lets you attach independent LoRA-based expertise lobes to any frozen HuggingFace causal language model. Each lobe is a lightweight specialist trained on domain-specific data, and a hybrid two-stage router automatically picks the right expert at inference time.
Why? Traditional fine-tuning suffers from catastrophic forgetting: teaching a model law can make it forget medicine. MultiLobe addresses this architecturally: the base model is always frozen, and each domain lives in its own isolated LoRA adapter.
Features
- Frozen base model: the original weights are never modified
- Isolated lobes: each expertise area is a separate LoRA adapter; training one never affects another
- Hybrid routing: a fast embedding-based first pass plus a log-probability fallback
- Save & load: persist the full multi-expert setup and restore it later
- Manual override: force a specific lobe when you know which expert to use
- Transparent decisions: get routing metadata (which lobe, confidence, scores)
Installation
pip install multilobe
Or install from source:
git clone https://github.com/multilobe/multilobe.git
cd multilobe
pip install -e .
Requirements
- Python ≥ 3.9
- PyTorch ≥ 2.0
- A CUDA GPU is strongly recommended for training
Quick Start
from multilobe import MultiLobeModel
# 1. Load a base model (frozen automatically)
model = MultiLobeModel.from_pretrained("google/gemma-2b")
# 2. Add expertise lobes
model.add_lobe(
    name="medical",
    dataset="path/to/medical.jsonl",
    epochs=3,
    lora_r=16,
    lora_alpha=32,
)
model.add_lobe(
    name="legal",
    dataset="path/to/legal.jsonl",
    epochs=3,
)
# 3. Save the full setup
model.save("./my_multilobe_model")
# 4. Load it back later
model = MultiLobeModel.load("./my_multilobe_model")
# 5. Generate: routing is automatic
output = model.generate("What do these symptoms mean?")
print(output)
API Reference
MultiLobeModel.from_pretrained(model_name, **kwargs)
Downloads a HuggingFace causal-LM, freezes all parameters, and initialises the routing system.
| Parameter | Default | Description |
|---|---|---|
| model_name | (required) | HuggingFace model ID (e.g. "google/gemma-2b") |
| device | auto | torch.device override |
| embedding_model | "all-MiniLM-L6-v2" | Sentence-transformer for the embedding router |
| embedding_threshold | 0.75 | Cosine similarity threshold |
| logprob_max_tokens | 30 | Tokens evaluated by the log-prob fallback |
model.add_lobe(name, dataset, epochs, **kwargs)
Creates, trains, and registers a new LoRA-based expertise lobe.
| Parameter | Default | Description |
|---|---|---|
| name | (required) | Unique lobe identifier |
| dataset | (required) | Path to JSONL file ({"input": ..., "output": ...}) |
| epochs | 3 | Training epochs |
| lora_r | 16 | LoRA rank |
| lora_alpha | 32 | LoRA alpha scaling |
| batch_size | 4 | Training batch size |
| learning_rate | 2e-4 | AdamW peak learning rate |
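To illustrate what lora_r and lora_alpha control, here is a minimal pure-Python sketch of the standard LoRA update, in which a frozen weight matrix W is augmented by a low-rank product B·A scaled by lora_alpha / lora_r. This is the general LoRA formulation, not MultiLobe's internal code; the helper names (matvec, lora_forward) and the toy matrices are assumptions made for illustration.

```python
# Illustrative only: standard LoRA forward pass on a toy 2x2 layer.
# Helper names and values are assumptions, not part of MultiLobe.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, lora_r, lora_alpha):
    """Return W @ x + (lora_alpha / lora_r) * B @ (A @ x).

    W is the frozen base weight (d_out x d_in), A is (lora_r x d_in),
    B is (d_out x lora_r). Only A and B would be trained.
    """
    scale = lora_alpha / lora_r
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, low_rank)]


W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (identity here)
A = [[1.0, 1.0]]               # rank-1 down-projection (lora_r = 1)
B_zero = [[0.0], [0.0]]        # in standard LoRA, B starts at zero
B_trained = [[0.5], [-0.5]]    # made-up values standing in for a trained B
x = [2.0, 3.0]

untrained = lora_forward(W, A, B_zero, x, lora_r=1, lora_alpha=2)
trained = lora_forward(W, A, B_trained, x, lora_r=1, lora_alpha=2)
print(untrained)
print(trained)
```

With B at zero the layer reproduces the frozen base output exactly, which is why a freshly added adapter does not change the model until it is trained. W itself is never written to, so each adapter's effect stays separate from the others.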
model.generate(input_text, **kwargs)
Generate a response with automatic or manual lobe selection.
# Automatic routing
output = model.generate("What are the symptoms of diabetes?")
# Manual lobe selection
output = model.generate("...", lobe="medical")
# With routing metadata
output, meta = model.generate("...", return_metadata=True)
print(meta["selected_lobe"]) # "medical"
print(meta["router_confidence"]) # 0.89
print(meta["router_type"]) # "embedding" or "logprob"
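As a small usage sketch, the metadata can be turned into a log line that makes fallback decisions easy to spot. The helper name describe_routing is hypothetical, and the sample dictionary is hand-written to mirror the three fields shown above rather than taken from a real run.

```python
# Illustrative helper; `describe_routing` is a hypothetical name.
# The dict keys mirror the metadata fields shown above.

def describe_routing(meta):
    """Build a one-line summary of a routing decision."""
    lobe = meta["selected_lobe"]
    confidence = meta["router_confidence"]
    stage = meta["router_type"]
    note = " (fallback stage)" if stage == "logprob" else ""
    return f"lobe={lobe} confidence={confidence:.2f} via {stage}{note}"


# Hand-written sample standing in for the `meta` returned by generate().
sample_meta = {
    "selected_lobe": "medical",
    "router_confidence": 0.89,
    "router_type": "embedding",
}
summary = describe_routing(sample_meta)
print(summary)
```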
model.save(path) / MultiLobeModel.load(path)
Persists the full multi-lobe setup. Only the LoRA weights and metadata are saved; the base model is re-downloaded from the Hub on load.
How Routing Works
MultiLobe uses a two-stage hybrid router:
Input Query
     │
     ▼
┌──────────────────────┐
│ Stage 1: Embedding   │  Fast cosine similarity
│ (all-MiniLM-L6-v2)   │  against lobe centroids
└──────────┬───────────┘
           │
   confidence ≥ 0.75?
    ┌──────┴──────┐
    │ YES         │ NO
    ▼             ▼
 Use lobe   ┌──────────────────┐
            │ Stage 2: LogProb │  Evaluate input
            │ (fallback)       │  under each adapter
            └────────┬─────────┘
                     │
                     ▼
               Use best lobe
- Embedding Router: encodes the query with a sentence-transformer and computes its cosine similarity with each lobe's representation vector (the mean embedding of that lobe's training data). If the best score reaches the threshold (0.75 by default), the query is routed to that lobe immediately.
- Log-Prob Router: if the embedding router is not confident, each lobe's adapter is activated in turn and the mean log-probability of the input tokens is computed under it. The lobe with the highest mean log-probability wins.
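The two stages above can be sketched in a few lines of plain Python. This is a toy illustration of the decision logic only, not MultiLobe's implementation: the function names (cosine, centroid, route), the 3-dimensional vectors, and the stubbed log-probabilities are all assumptions. In the real library the vectors would come from the sentence-transformer and the log-probabilities from running the base model with each adapter active.

```python
import math

# Illustrative sketch of a two-stage router. All names here are
# hypothetical and the vectors/scores are toy values.

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Mean of a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def route(query_vec, centroids, logprob_fn, threshold=0.75):
    """Pick a lobe: embedding stage first, log-prob fallback second.

    centroids maps lobe name -> centroid vector.
    logprob_fn(lobe) returns the mean log-probability of the query
    tokens under that lobe's adapter (stubbed in this sketch).
    """
    scores = {name: cosine(query_vec, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, "embedding", scores[best]
    logprobs = {name: logprob_fn(name) for name in centroids}
    best = max(logprobs, key=logprobs.get)
    return best, "logprob", logprobs[best]


centroids = {
    "medical": centroid([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]),
    "legal": centroid([[0.0, 1.0, 0.0], [0.2, 0.8, 0.0]]),
}
stub_logprobs = {"medical": -2.1, "legal": -1.4}  # made-up fallback scores

confident = route([1.0, 0.1, 0.0], centroids, stub_logprobs.get)
ambiguous = route([1.0, 1.0, 1.0], centroids, stub_logprobs.get)
print(confident[:2])
print(ambiguous[:2])
```

The first query sits close to the medical centroid, so stage 1 decides on its own. The second is roughly equidistant from both centroids and falls below the threshold, so the stubbed log-probabilities decide and the less negative score (legal) wins.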
Dataset Format
Each JSONL file should contain one JSON object per line:
{"input": "What are the symptoms of hypertension?", "output": "Hypertension symptoms include headaches, shortness of breath, and nosebleeds..."}
{"input": "How is diabetes diagnosed?", "output": "Diabetes is typically diagnosed through blood tests such as HbA1c..."}
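Before training a lobe it can help to check that every line of the file parses and carries both keys. The following is a minimal standard-library sketch; the helper names (write_jsonl, validate_jsonl) are hypothetical and not part of MultiLobe, and skipping blank lines is a choice made for this sketch rather than documented library behaviour.

```python
import json
import os
import tempfile

# Illustrative helpers; the function names are hypothetical.
# The required keys follow the format described above.

REQUIRED_KEYS = ("input", "output")


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def validate_jsonl(path):
    """Return a list of (line_number, problem) for malformed lines."""
    problems = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped in this sketch
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((number, "invalid JSON"))
                continue
            if not isinstance(record, dict):
                problems.append((number, "not a JSON object"))
                continue
            missing = [key for key in REQUIRED_KEYS if key not in record]
            if missing:
                problems.append((number, "missing " + ", ".join(missing)))
    return problems


records = [
    {"input": "How is diabetes diagnosed?",
     "output": "Diabetes is typically diagnosed through blood tests such as HbA1c..."},
    {"input": "A question with no answer"},  # deliberately incomplete
]
path = os.path.join(tempfile.mkdtemp(), "medical.jsonl")
write_jsonl(records, path)
problems = validate_jsonl(path)
print(problems)
```

An empty list means every non-blank line is a JSON object with both keys; here the second record is reported because it lacks "output".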
Architecture
multilobe/
├── __init__.py          # Public API exports
├── model.py             # MultiLobeModel: main orchestrator
├── lobe.py              # Lobe & LobeMetadata classes
├── trainer.py           # LoRA training loop
├── utils.py             # Dataset loading, tokenisation helpers
└── router/
    ├── __init__.py
    ├── base.py          # BaseRouter ABC & RoutingResult
    ├── embedding.py     # Stage 1: cosine similarity router
    └── logprob.py       # Stage 2: log-probability fallback
License
MIT
Contributing
Contributions are welcome! Please open an issue or submit a pull request.