
ConcAdptr

Concocting Adapters — Brew multiple LoRA adapters into Mixture-of-Experts systems.

Train → Concoct → Serve.

Python 3.9+ · License: Apache 2.0


ConcAdptr takes independently trained LoRA adapters and concocts them into MoE-style expert systems with learned routing. Model-agnostic. Privacy-preserving. Built for production.

The Problem

You fine-tune a base model with LoRA for your product. Then each customer or user-group needs their own specialization — but they can't share their data. You end up with multiple LoRA adapters trained in isolation. How do you combine them into something smarter than any individual adapter?

The Solution

ConcAdptr takes your independently trained LoRA adapters and concocts them into experts within a Mixture-of-Experts system. A lightweight router learns which expert(s) to activate for each input — without needing access to any customer's original training data.

Base Model ─┬─ LoRA Adapter A (medical)    ──┐
            ├─ LoRA Adapter B (legal)      ──┼── Router ── Concocted Output
            └─ LoRA Adapter C (finance)    ──┘
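The routing idea is easy to see in miniature: a router scores each expert for a given input, and the experts' contributions are blended by those scores. A toy pure-Python sketch (scalar "expert outputs" stand in for per-layer LoRA deltas; this illustrates the concept, not ConcAdptr's actual API):

```python
import math

def softmax(logits):
    # Numerically stable softmax over router logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical router scores for experts [medical, legal, finance]
# on a medical-looking input
logits = [2.0, 0.5, -1.0]
weights = softmax(logits)

# Blend each expert's contribution (scalars stand in for LoRA deltas)
expert_outputs = [0.9, 0.1, -0.2]
blended = sum(w * y for w, y in zip(weights, expert_outputs))
```

The router is the only trainable piece; the base model and every adapter stay frozen.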

Key Features

  • Model-agnostic — Works with any HuggingFace transformer (Qwen, LLaMA, Mistral, Gemma, etc.)
  • Privacy-preserving — Customer data never leaves their environment; only adapters travel
  • 3 routing strategies — Soft merging (MoLoRA), Top-K sparse routing (MixLoRA), X-LoRA learned scaling
  • Static merging fallback — Linear, TIES, and DARE merging when routing overhead is undesirable
  • HuggingFace Hub integration — Push/pull adapters and full models to/from the Hub
  • Full pipeline — Train adapters → Concoct with router → Serve — one library
  • Production-ready — FastAPI serving, adapter registry, compatibility validation
  • Consumer GPU friendly — 4-bit quantization, runs on 16GB VRAM

Installation

pip install concadptr

With optional dependencies:

pip install concadptr[training]   # + bitsandbytes, trl
pip install concadptr[serving]    # + fastapi, uvicorn
pip install concadptr[hub]        # + huggingface_hub
pip install concadptr[all]        # everything

Quick Start

1. Define your configuration

from concadptr import ConcAdptrConfig

config = ConcAdptrConfig(
    base_model="Qwen/Qwen2.5-7B-Instruct",
    adapters={
        "medical": "./adapters/medical_invoices",
        "legal": "./adapters/legal_contracts",
        "finance": "./adapters/financial_reports",
    },
    routing_strategy="xlora",  # or "soft_merging", "top_k"
    quantization="4bit",
)

Or load from YAML:

config = ConcAdptrConfig.from_yaml("config.yaml")
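A config.yaml for the call above might look like the following. The field names mirror the keyword arguments of `ConcAdptrConfig` shown above; treat the exact YAML schema as an assumption and check examples/config.yaml in the repository:

```yaml
base_model: Qwen/Qwen2.5-7B-Instruct
adapters:
  medical: ./adapters/medical_invoices
  legal: ./adapters/legal_contracts
  finance: ./adapters/financial_reports
routing_strategy: xlora   # or soft_merging / top_k
quantization: 4bit
```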

2. Build the concocted model

from concadptr import ConcAdptrModel

model = ConcAdptrModel.from_config(config)

3. Train the router

from concadptr import ConcAdptrTrainer
from datasets import load_dataset, concatenate_datasets

# Mix of domain samples — not customer data
router_dataset = concatenate_datasets([
    load_dataset("medical_qa", split="train[:500]"),
    load_dataset("legal_docs", split="train[:500]"),
    load_dataset("finance_qa", split="train[:500]"),
])

trainer = ConcAdptrTrainer(
    model=model,
    train_dataset=router_dataset,
    learning_rate=1e-4,
    num_epochs=3,
    batch_size=4,
)

trainer.train()
model.save_pretrained("./concocted_model")

4. Analyze routing patterns

from concadptr.utils import print_routing_summary

model.router.enable_history(True)
# Run some inference...
stats = model.router.get_routing_stats()
print_routing_summary(stats, expert_names=["medical", "legal", "finance"])

5. Serve

from concadptr.serving import serve

serve("./concocted_model", host="0.0.0.0", port=8000)

Then query the server:

curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Analyze this medical invoice...", "max_tokens": 256}'

Routing Strategies

| Strategy       | How it works                                     | Best for                                |
|----------------|--------------------------------------------------|-----------------------------------------|
| `soft_merging` | Weighted average of ALL experts per token        | Few experts (2-8), overlapping domains  |
| `top_k`        | Activates only the top-k experts per token       | Many experts (8+), distinct domains     |
| `xlora`        | Layer-wise learned scaling over frozen adapters  | Independent adapters, privacy-critical  |
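The difference between dense and sparse routing is easiest to see in code. A minimal top-k gate (an illustrative sketch, not ConcAdptr's implementation): keep only the k highest-scoring experts and renormalize their weights, so the remaining experts are never computed.

```python
import math

def top_k_gate(logits, k=2):
    # Pick the k highest-scoring experts, softmax only over those,
    # and return {expert_index: weight}; all other experts stay inactive.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

# 4 experts, only the best 2 are activated
gate = top_k_gate([2.0, 1.0, -3.0, 0.5], k=2)
```

With soft merging, by contrast, every expert receives a nonzero weight on every token.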

Static Merging (No Router)

When routing overhead is undesirable, merge adapters statically into a single PEFT adapter:

from concadptr import merge_adapters

# Linear weighted average
output = merge_adapters(
    adapters={"medical": "./adapters/medical", "legal": "./adapters/legal"},
    output_path="./merged",
    method="linear",       # "linear", "ties", "dare", "dare_ties"
    weights=[0.6, 0.4],
)

# TIES — reduces interference between adapters
output = merge_adapters(adapters=..., output_path="./merged", method="ties", trim_fraction=0.2)

# DARE — stochastic drop + rescale before merging
output = merge_adapters(adapters=..., output_path="./merged", method="dare", density=0.7)

Or via the registry:

registry.merge(["medical", "legal"], output_path="./merged", method="ties")

The output is a standard PEFT adapter directory — usable with PeftModel.from_pretrained().
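To illustrate what DARE's `density` parameter does, here is a toy drop-and-rescale over a flat list of delta weights (a sketch of the idea, not the library's code): each delta is kept with probability `density`, and survivors are rescaled by `1/density` so the merged delta stays unbiased in expectation.

```python
import random

def dare_drop_and_rescale(delta, density=0.7, seed=0):
    # Keep each delta weight with probability `density`,
    # rescale survivors by 1/density so E[output] == delta.
    rng = random.Random(seed)
    return [w / density if rng.random() < density else 0.0 for w in delta]

delta = [1.0] * 1000
sparse = dare_drop_and_rescale(delta, density=0.7)
kept = sum(1 for w in sparse if w != 0.0)
```

Lower densities drop more of each adapter, which tends to reduce interference between adapters at the cost of discarding more of each one's signal.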

HuggingFace Hub

# Push a full concocted model
model.push_to_hub("username/my-concocted-model", token="hf_...")

# Load it back
model = ConcAdptrModel.from_hub("username/my-concocted-model")

# Push/pull individual adapters
registry.push_adapter_to_hub("medical", repo_id="username/medical-adapter")
registry.load_adapter_from_hub("username/medical-adapter", name="medical")

Architecture

┌──────────────────────────────────────────────┐
│                 ConcAdptrModel               │
│                                              │
│  ┌──────────┐  ┌───────────────────────────┐ │
│  │   Base   │  │     Adapter Registry      │ │
│  │  Model   │  │  ┌─────┐ ┌─────┐ ┌─────┐ │ │
│  │ (frozen) │  │  │LoRA │ │LoRA │ │LoRA │ │ │
│  │          │  │  │  A  │ │  B  │ │  C  │ │ │
│  │          │  │  │froze│ │froze│ │froze│ │ │
│  └──────────┘  │  └──┬──┘ └──┬──┘ └──┬──┘ │ │
│                └─────┼───────┼───────┼─────┘ │
│                      │       │       │       │
│                ┌─────▼───────▼───────▼─────┐ │
│                │         Router            │ │
│                │       (trainable)         │ │
│                └─────────────┬─────────────┘ │
│                              │               │
│                    ┌─────────▼─────────┐     │
│                    │ Concocted Output  │     │
│                    └───────────────────┘     │
└──────────────────────────────────────────────┘

The Multi-Customer Use Case

ConcAdptr was designed for a specific real-world pattern:

  1. You fine-tune a base model on your general training data → your product's foundation model
  2. Each customer fine-tunes on their private data (on-premise) → produces a LoRA adapter
  3. The adapter (50-200MB, no raw data) is transferred back to you
  4. ConcAdptr concocts all customer adapters into a MoE system with learned routing

Customer data never leaves their environment. The router learns which expert(s) to activate without seeing the original training data. This is federated expertise — cross-customer knowledge transfer without data sharing.

Project Structure

concadptr/
├── concadptr/
│   ├── __init__.py          # Public API
│   ├── config.py            # Configuration classes (ConcAdptrConfig, MergeConfig, ...)
│   ├── model.py             # ConcAdptrModel (core)
│   ├── trainer.py           # ConcAdptrTrainer (router training)
│   ├── router/
│   │   ├── base.py          # BaseRouter ABC
│   │   ├── soft_merging.py  # Dense/soft routing (MoLoRA)
│   │   ├── top_k.py         # Sparse top-k routing (MixLoRA)
│   │   └── xlora.py         # Learned scaling (X-LoRA)
│   ├── adapters/
│   │   └── __init__.py      # AdapterRegistry
│   ├── merging/
│   │   ├── __init__.py      # merge_adapters() functional API
│   │   ├── base.py          # AdapterMerger ABC
│   │   ├── linear.py        # Weighted average
│   │   ├── ties.py          # TIES (Trim, Elect Sign, Merge)
│   │   ├── dare.py          # DARE (Drop And REscale)
│   │   └── utils.py         # Weight loading utilities
│   ├── serving/
│   │   └── server.py        # FastAPI inference server
│   └── utils/
│       └── visualization.py # Routing analysis tools
├── tests/
├── examples/
│   └── config.yaml
├── pyproject.toml
└── README.md

Development

git clone https://github.com/irfanalii/concadptr.git
cd concadptr
pip install -e ".[dev]"
pytest

Roadmap

  • Core library architecture
  • 3 routing strategies (soft, top-k, X-LoRA)
  • Per-layer routing hooks (2-pass forward with LoRA delta weighting)
  • Adapter registry with compatibility validation
  • Router training pipeline
  • FastAPI serving
  • Routing visualization and analysis
  • Static merging — Linear, TIES, DARE, DARE+TIES
  • HuggingFace Hub push/pull (models and adapters)
  • Hook per-layer routing into the generation loop
  • vLLM integration for high-throughput serving
  • Benchmarking suite across model families (Qwen2.5, LLaMA 3.1, Mistral)
  • Adapter version metadata and progressive merging pipeline
  • Federated LoRA training (FedAvg on adapter weights)

License

Apache 2.0 — see LICENSE for details.

Author

Irfan Ali · GitHub · HuggingFace


ConcAdptr — because the best models are concocted, not just trained.
