Open and modular framework for building observable LLM gateways with OpenAI-compatible APIs and pluggable providers.
Llimona
Llimona is an open and modular Python framework for building production-ready LLM gateways. It provides OpenAI-compatible APIs, provider-aware routing, and an extensible plugin model for integrating multiple backends behind a single interface.
By keeping providers as addons, Llimona stays lightweight at its core while enabling deployments to include only the integrations, policies, and observability components they actually need.
Key Features
- OpenAI-compatible service interfaces (currently Responses and Models).
- Provider routing using the provider_name/model_name naming convention.
- Addon-based extensibility through Python entry points (llimona.addon).
- Typed YAML configuration with Pydantic validation.
- Request Context propagation with actor/origin metadata, constraints, and sub-context trees.
- Sensor support for metrics such as request counters and elapsed time, making request execution observable.
Requirements
- Python >= 3.14
- uv (recommended)
Installation
Install dependencies for local development
uv sync
Install the core package
uv pip install .
Install an addon package
uv pip install ./addons/llimona_azure_openai
Quick Start
1) Create an app config
Example (example_config/app.yaml):
provider_addons:
  - azure_openai
provider_loaders:
  - type: autodiscovery_dirs
    src: !path .
2) Create a provider directory with provider.yaml
Example (example_config/azure_1/provider.yaml):
type: azure_openai
name: azure_1
display_name: Azure Example 1
owner_id: 444444-222-333-222 # Not used, just for future purposes
base_url: !envvar AZURE_OPENAI_1_BASE_URL
credentials:
  api_key: !envvar AZURE_OPENAI_1_API_KEY
services:
  - type: openai_responses
  - type: openai_models
models:
  - name: gpt-4o-mini
    allowed_services:
      - openai_responses
3) Run a request
uv run llimona app --config-file example_config/app.yaml openai responses create azure_1/gpt-4o-mini "Hello" --stream
4) Observe sensor metrics
After the request completes, Llimona prints sensor values that make execution observable:
Sensor value: elapsed_time=0.606314 (Elapsed time of the request.)
Sensor value: request_count=1 (Number of requests being processed for the sensor request_count.)
Sensor value: request_per_unit_of_time=1 (Number of requests in the last 0:01:00.)
Sensor value: request_per_window_of_time=1 (Number of requests until the next reset.)
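If these lines need to be consumed by a script, they can be parsed with a regular expression. This is a sketch based only on the output format shown above; parse_sensor_lines is a hypothetical helper, and it assumes every sensor value is numeric and that the line format is stable, neither of which is guaranteed by the project.

```python
import re

# Matches lines such as:
#   Sensor value: elapsed_time=0.606314 (Elapsed time of the request.)
SENSOR_LINE = re.compile(
    r"^Sensor value: (?P<name>\w+)=(?P<value>\S+) \((?P<description>.*)\)$"
)


def parse_sensor_lines(output: str) -> dict[str, float]:
    """Collect sensor values by name, ignoring lines that do not match."""
    values: dict[str, float] = {}
    for line in output.splitlines():
        match = SENSOR_LINE.match(line.strip())
        if match:
            # Assumption: values are numeric, as in the sample output.
            values[match["name"]] = float(match["value"])
    return values


sample = """\
Sensor value: elapsed_time=0.606314 (Elapsed time of the request.)
Sensor value: request_count=1 (Number of requests being processed for the sensor request_count.)
"""
print(parse_sensor_lines(sample))
```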
CLI Usage
Top-level help
llimona --help
List discovered addons
llimona addons
Run commands with an app config
llimona app --config-file <path-to-app.yaml> <command>
Providers
# list all providers
llimona app --config-file <cfg> providers
# inspect one provider
llimona app --config-file <cfg> providers <provider_name>
# list models in one provider
llimona app --config-file <cfg> providers <provider_name> models
OpenAI-compatible interface commands
# create a response
llimona app --config-file <cfg> openai responses create <provider>/<model> "Prompt"
# streaming response
llimona app --config-file <cfg> openai responses create <provider>/<model> "Prompt" --stream
# list models (global or filtered by provider)
llimona app --config-file <cfg> openai models list
llimona app --config-file <cfg> openai models list <provider_name>
Configuration Overview
The app configuration supports these top-level fields:
- provider_addons: provider addons to register.
- provider_loader_addons: provider-loader addons to register.
- sensor_addons: sensor addons to register.
- id_builder: optional ID builder configuration.
- provider_loaders: loader definitions.
Built-in provider loader:
- autodiscovery_dirs: scans child directories under src, reads provider.yaml, and optionally merges definitions from models/*.yaml, services/*.yaml, and sensors/*.yaml.
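The directory scan described for autodiscovery_dirs can be pictured roughly as follows. This is an illustrative sketch, not the loader's actual code: discover_provider_dirs is a hypothetical function, it only locates files without parsing or merging them, and keying the result by directory name is an assumption.

```python
from pathlib import Path


def _yaml_files(directory: Path) -> list[Path]:
    """Return the *.yaml files in `directory`, or [] if it does not exist."""
    return sorted(directory.glob("*.yaml")) if directory.is_dir() else []


def discover_provider_dirs(src: Path) -> dict[str, dict[str, list[Path]]]:
    """Find child directories of `src` that contain a provider.yaml."""
    found: dict[str, dict[str, list[Path]]] = {}
    for child in sorted(p for p in src.iterdir() if p.is_dir()):
        provider_file = child / "provider.yaml"
        if not provider_file.is_file():
            continue  # not a provider directory, skip it
        found[child.name] = {
            "provider": [provider_file],
            # Optional fragments that would be merged into the provider.
            "models": _yaml_files(child / "models"),
            "services": _yaml_files(child / "services"),
            "sensors": _yaml_files(child / "sensors"),
        }
    return found
```

Calling discover_provider_dirs(Path("example_config")) on the Quick Start layout would return one entry for azure_1 with empty models, services, and sensors lists, since that example keeps everything in provider.yaml.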
Architecture Summary
Llimona receives OpenAI-compatible requests, decomposes model IDs, routes to the appropriate provider, and maps provider-specific responses back to interface models.
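The decomposition and routing step can be sketched in a few lines. The names split_model_id and route, and the plain dictionary used as a provider registry, are hypothetical stand-ins for illustration; splitting on the first slash is also an assumption about how model names containing slashes would be handled.

```python
from typing import Callable


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider_name/model_name' at the first slash."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider_name/model_name', got {model_id!r}")
    return provider, model


def route(model_id: str, providers: dict[str, Callable[[str], str]]) -> str:
    """Dispatch to the handler registered for the provider part of the ID."""
    provider_name, model_name = split_model_id(model_id)
    try:
        handler = providers[provider_name]
    except KeyError:
        raise LookupError(f"unknown provider: {provider_name!r}") from None
    return handler(model_name)


# Hypothetical registry with a single provider from the Quick Start.
providers = {"azure_1": lambda model: f"azure_1 handles {model}"}
print(route("azure_1/gpt-4o-mini", providers))
```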
Every call flows through a Context object, which can carry:
- action metadata (provider, service, service_action, model)
- actor and origin information
- conversation metadata
- constraints
- collected sensor values
Routing strategies can create sub-contexts, enabling per-branch observability and post-execution failure inspection.
Sensors make the platform observable by exposing execution metrics across the full request context tree.
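The metrics shown in the Quick Start (elapsed time, requests in the last minute) could be computed along the following lines. Both classes are hypothetical sketches with invented names; the real sensor interface, and how sensors are attached to a context, are documented by the project and may differ.

```python
import time
from collections import deque


class ElapsedTimeSensor:
    """Measures wall-clock time spent inside a `with` block."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.elapsed = 0.0

    def __enter__(self):
        self._start = self._clock()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = self._clock() - self._start
        return False  # never swallow exceptions


class RequestRateSensor:
    """Counts requests recorded within the last `window_seconds`."""

    def __init__(self, window_seconds: float = 60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps: deque[float] = deque()

    def record(self) -> None:
        self._timestamps.append(self._clock())

    def value(self) -> int:
        # Drop timestamps that have fallen out of the rolling window.
        cutoff = self._clock() - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


rate = RequestRateSensor(window_seconds=60.0)
with ElapsedTimeSensor() as timer:
    rate.record()  # stands in for handling one request
print(f"elapsed_time={timer.elapsed:.6f} request_per_unit_of_time={rate.value()}")
```

The injectable clock argument is a design choice that keeps the sensors testable without sleeping.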
For full technical details, see docs/arch.md.
Addons in This Repository
- addons/llimona_azure_openai: Azure OpenAI provider addon.
- addons/llimona_smart_provider: smart/virtual provider routing addon.
Development
Install development tools
uv sync --group dev
Run tests
poe test
Lint and format
poe fix
Branching and Versioning
The repository follows a GitFlow-like model with:
- main as the default integration branch
- feat/*, fix/*, and chore/* working branches
- squash-merge pull requests
- SemVer/PEP 440 release semantics
See the branching model document for the complete policy.
Security Notes
- Do not commit real API keys or secrets in provider files.
- Inject credentials at runtime through your deployment environment.
License
This project is licensed under the GNU AFFERO GENERAL PUBLIC LICENSE. See LICENSE for details.