A model aggregator service for multiple LLM backends.
LLM Aggregator
LLM Aggregator keeps a live list of every model exposed by your local OpenAI-compatible servers.
Features
- Polls models from configured LLM provider servers (/v1/models).
- Enriches model information with a helper LLM.
- Optionally passes model information fetched from external websites to the helper LLM.
- Ships with a minimal UI showing providers, models, and host RAM.
- The built-in UI can easily be replaced.
Web Interface
The built-in UI shows a single table plus a small RAM widget, so you immediately see what is running:
| Model | Base URL | Types | Family | Context | Quant | Params | Summary |
|---|---|---|---|---|---|---|---|
| llama3.1:8b | http://10.7.2.100:11434/v1 | llm | Llama 3.1 | 8K | Q4_K_M | 8B | General chat tuned for balance |
| qwen2.5:14b | http://10.7.2.100:8080/v1 | llm,embed | Qwen 2.5 | 32K | Q5_0 | 14B | Multilingual reasoning focused |
Columns:
- Model – identifier reported by the provider.
- Base URL – where the model is served.
- Types – capabilities (LLM, VLM, embedder, etc.).
- Family – base architecture inferred by the helper LLM.
- Context – approximate context window in tokens.
- Quant – quantization hinted by the model name or docs.
- Params – estimated parameter count.
- Summary – one-line description generated by the helper LLM.
Installation
Prerequisites
- Python 3.10 or higher
- LLM servers (Ollama, llama.cpp, nexa, etc.) with OpenAI-compatible APIs
Install from PyPI
pip install llm-aggregator
Usage
Set the LLM_AGGREGATOR_CONFIG environment variable to point at your config.yaml and the service will
load it on startup.
Starting the Service
export LLM_AGGREGATOR_CONFIG=/path/to/config.yaml
llm-aggregator
Or run directly:
export LLM_AGGREGATOR_CONFIG=/path/to/config.yaml
python -m llm_aggregator
By default, the web interface will be available at http://localhost:8888.
Configuration
All runtime behavior is controlled through the YAML file pointed to by the LLM_AGGREGATOR_CONFIG environment variable.
Use config.yaml as a reference template.
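Before launching, it can help to confirm that the environment variable points at a real file. The following is a small pre-flight sketch for your own scripts, not code taken from the service; `resolve_config_path` is a hypothetical helper, and how the service itself reacts to a missing variable is not documented here.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def resolve_config_path(environ: Mapping[str, str] = os.environ) -> Path:
    """Return the config path named by LLM_AGGREGATOR_CONFIG, or raise if it is unusable."""
    raw = environ.get("LLM_AGGREGATOR_CONFIG")
    if not raw:
        raise RuntimeError("LLM_AGGREGATOR_CONFIG is not set")
    path = Path(raw)
    if not path.is_file():
        raise FileNotFoundError(f"config file not found: {path}")
    return path
```

Calling `resolve_config_path()` with no arguments checks the current process environment.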
UI modes
Use static_enabled and custom_static_path to set one of three modes:
- static_enabled: true (default) serves the built-in UI.
- static_enabled: true and custom_static_path: /path/to/dir serves your files instead of the built-in UI.
- static_enabled: false serves no UI at all. Provide your own UI using the REST endpoints.
Configuration Options
- host / port – Where the FastAPI server and static frontend bind.
- log_level – Logging verbosity (DEBUG, INFO, WARNING, ERROR, CRITICAL). Defaults to INFO if omitted.
- log_format – Optional logging format string. When omitted, the service leaves the existing logging configuration untouched.
- logger_overrides – Map of logger names to the logging level to apply to each (e.g., httpx: WARNING).
- brain – Settings for the enrichment LLM:
  - base_url – HTTP endpoint of the enrichment provider.
  - id – Model identifier passed to the provider.
  - api_key – Optional API key.
  - max_batch_size – Number of models to enrich at once (defaults to 1).
- providers – Map of provider name to an OpenAI-compatible backend to query:
  - base_url – Public URL returned via the REST API.
  - internal_base_url – Optional internal URL used for server-to-server calls; defaults to base_url when omitted.
  - api_key – Optional API key for that provider.
  - files_size_gatherer – Optional block to report on-disk model size:
    - path – Script or executable invoked as <path> <base_path> <full_model_name>.
    - base_path – Filesystem root passed to the script.
    - timeout_seconds – Optional per-provider timeout (default: 15s).
- model_info_sources – Optional external websites from which model information is fetched for enrichment. Each entry requires a human-readable name (shown to the LLM) and a url_template that contains {model_id}.
- time – Background scheduling knobs (all in seconds):
  - fetch_models_interval
  - fetch_models_timeout
  - enrich_models_timeout
  - enrich_idle_sleep
  - website_markdown_cache_ttl – TTL for cached markdown scraped from external sources.
- ui – Optional static UI:
  - static_enabled – true: the static web frontend is served at /index.html and assets at /static.
  - custom_static_path – Optional directory to replace the bundled UI; must contain a readable index.html and asset files.
- brain_prompts – LLM instructions kept separate so the block can live at the end of the YAML:
  - system – System message injected ahead of every enrichment request.
  - user – Main user instruction describing the enrichment JSON contract.
  - model_info_prefix_template – Optional prefix template applied to fetched markdown snippets; receives {model_id} and {provider_label} placeholders.
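Two of the options above describe simple substitution rules: internal_base_url falls back to base_url, and url_template and model_info_prefix_template receive placeholders. The sketch below shows one way those rules could be expressed. All function names are hypothetical, the example URL is a placeholder, and whether the service URL-encodes the model id is not stated here, so the code substitutes it verbatim.

```python
def effective_internal_url(provider: dict) -> str:
    """Use internal_base_url when present, otherwise fall back to base_url."""
    return provider.get("internal_base_url") or provider["base_url"]


def render_source_url(url_template: str, model_id: str) -> str:
    """Fill the {model_id} placeholder of a model_info_sources entry."""
    if "{model_id}" not in url_template:
        raise ValueError("url_template must contain {model_id}")
    # Assumption: the id is inserted as-is, without URL encoding.
    return url_template.replace("{model_id}", model_id)


def render_prefix(prefix_template: str, model_id: str, provider_label: str) -> str:
    """Fill the two placeholders accepted by model_info_prefix_template."""
    return prefix_template.format(model_id=model_id, provider_label=provider_label)
```

For instance, `render_source_url("https://example.com/models/{model_id}", "qwen2.5:14b")` produces a URL ending in `/models/qwen2.5:14b`.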
REST API
- GET /v1/models – OpenAI ListModelsResponse plus a meta object on each data item with the enriched metadata. Example:

      {
        "object": "list",
        "data": [
          {
            "id": "llama3.1:8b",
            "object": "model",
            "created": 1,
            "owned_by": "ollama",
            "meta": {
              "base_url": "http://127.0.0.1:11434/v1",
              "types": ["llm"],
              "model_family": "Llama 3.1",
              "context_size": "8K",
              "quant": "Q4_K_M",
              "param": "8B",
              "size": 481406976,
              "summary": "General chat tuned for balance"
            }
          }
        ]
      }

- GET /api/stats – Returns an array of recent RAM usage percentages sampled for the Chart.js widget in the UI. Example: [57.5, 57.6, 57.6]
- POST /api/clear – Empty request; clears the model cache and restarts model information collection.