Llimona

Open and modular framework for building observable LLM gateways with OpenAI-compatible APIs and pluggable providers.

Llimona is an open and modular Python framework for building production-ready LLM gateways. It provides OpenAI-compatible APIs, provider-aware routing, and an extensible plugin model for integrating multiple backends behind a single interface.

By keeping providers as addons, Llimona stays lightweight at its core while enabling deployments to include only the integrations, policies, and observability components they actually need.

Key Features

  • OpenAI-compatible service interfaces (currently Responses and Models).
  • Provider routing using the provider_name/model_name naming convention.
  • Addon-based extensibility through Python entry points (llimona.addon).
  • Typed YAML configuration with Pydantic validation.
  • Request Context propagation with actor/origin metadata, constraints, and sub-context trees.
  • Sensor support for metrics such as request counters and elapsed time, making request execution observable.
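For illustration, the provider_name/model_name routing convention can be sketched as a small helper. The name split_model_id is hypothetical and not part of Llimona's API:

```python
# Hypothetical helper illustrating the provider_name/model_name convention;
# Llimona's actual routing code may differ.
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split '<provider>/<model>' into its provider and model parts."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {model_id!r}")
    return provider, model

print(split_model_id("azure_1/gpt-4o-mini"))  # ('azure_1', 'gpt-4o-mini')
```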

Architecture

Architecture documentation

Requirements

  • Python >= 3.14
  • uv (recommended)

Installation

Install dependencies for local development

uv sync

Install the core package

uv pip install .

Install an addon package

uv pip install ./addons/llimona_azure_openai

Quick Start

1) Create an app config

Example (example_config/app.yaml):

provider_addons:
  - azure_openai
provider_loaders:
  - type: autodiscovery_dirs
    src: !path .

2) Create a provider directory with provider.yaml

Example (example_config/azure_1/provider.yaml):

type: azure_openai
name: azure_1
display_name: Azure Example 1
owner_id: 444444-222-333-222 # Not currently used; reserved for future use
base_url: !envvar AZURE_OPENAI_1_BASE_URL
credentials:
  api_key: !envvar AZURE_OPENAI_1_API_KEY
services:
- type: openai_responses
- type: openai_models
models:
- name: gpt-4o-mini
  allowed_services:
  - openai_responses

3) Run a request

uv run llimona app --config-file example_config/app.yaml openai responses create azure_1/gpt-4o-mini "Hello" --stream

4) Observe sensor metrics

After the request completes, Llimona prints sensor values that make execution observable:

Sensor value: elapsed_time=0.606314 (Elapsed time of the request.)
Sensor value: request_count=1 (Number of requests being processed for the sensor request_count.)
Sensor value: request_per_unit_of_time=1 (Number of requests in the last 0:01:00.)
Sensor value: request_per_window_of_time=1 (Number of requests until the next reset.)

CLI Usage

Top-level help

llimona --help

List discovered addons

llimona addons

Run commands with an app config

llimona app --config-file <path-to-app.yaml> <command>

Providers

# list all providers
llimona app --config-file <cfg> providers

# inspect one provider
llimona app --config-file <cfg> providers <provider_name>

# list models in one provider
llimona app --config-file <cfg> providers <provider_name> models

OpenAI-compatible interface commands

# create a response
llimona app --config-file <cfg> openai responses create <provider>/<model> "Prompt"

# streaming response
llimona app --config-file <cfg> openai responses create <provider>/<model> "Prompt" --stream

# list models (global or filtered by provider)
llimona app --config-file <cfg> openai models list
llimona app --config-file <cfg> openai models list <provider_name>

Configuration Overview

The app configuration supports these top-level fields:

  • provider_addons: provider addons to register.
  • provider_loader_addons: provider-loader addons to register.
  • sensor_addons: sensor addons to register.
  • id_builder: optional ID builder configuration.
  • provider_loaders: loader definitions.
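Putting these fields together, a fuller app config might look like the following sketch. The empty lists and the ./providers path are placeholders, and the optional id_builder field is omitted:

```yaml
# Illustrative app.yaml combining the top-level fields; only azure_openai
# and autodiscovery_dirs appear elsewhere in this document.
provider_addons:
  - azure_openai
provider_loader_addons: []   # placeholder: no extra loader addons
sensor_addons: []            # placeholder: no extra sensor addons
provider_loaders:
  - type: autodiscovery_dirs
    src: !path ./providers   # placeholder directory of provider configs
```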

Built-in provider loader:

  • autodiscovery_dirs: scans child directories under src, reads provider.yaml, and optionally merges definitions from models/*.yaml, services/*.yaml, and sensors/*.yaml.

Architecture Summary

Llimona receives OpenAI-compatible requests, decomposes model IDs, routes to the appropriate provider, and maps provider-specific responses back to interface models.

Every call flows through a Context object, which can carry:

  • action metadata (provider, service, service_action, model)
  • actor and origin information
  • conversation metadata
  • constraints
  • collected sensor values

Routing strategies can create sub-contexts, enabling per-branch observability and post-execution failure inspection.

Sensors make the platform observable by exposing execution metrics across the full request context tree.

For full technical details, see docs/arch.md.

Addons in This Repository

  • addons/llimona_azure_openai: Azure OpenAI provider addon.
  • addons/llimona_smart_provider: smart/virtual provider routing addon.

Development

Install development tools

uv sync --group dev

Run tests

poe test

Lint and format

poe fix

Branching and Versioning

The repository follows a GitFlow-like model with:

  • main as the default integration branch
  • feat/*, fix/*, and chore/* working branches
  • squash-merge pull requests
  • SemVer/PEP 440 release semantics

See the branching model document for the complete policy.

Security Notes

  • Do not commit real API keys or secrets in provider files.
  • Inject credentials at runtime through your deployment environment.

License

This project is licensed under the GNU Affero General Public License (AGPL). See LICENSE for details.

Download files

Source Distribution

llimona-0.3.0b2.tar.gz (44.8 kB)

Built Distribution

llimona-0.3.0b2-py3-none-any.whl (54.5 kB)

File details

Details for the file llimona-0.3.0b2.tar.gz.

File metadata

  • Download URL: llimona-0.3.0b2.tar.gz
  • Size: 44.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llimona-0.3.0b2.tar.gz:

  Algorithm     Hash digest
  SHA256        b63c1954b533dcdaab7e30186b5ed8c6800315971f17852a339bfdcb53484f50
  MD5           53ed5ff5c9d38e02856ec903ec58136c
  BLAKE2b-256   df135c792d7ae41a4930ca4ab871f8237a9ab302725d6e2f3e85dec5a3e15e23

Provenance

The following attestation bundles were made for llimona-0.3.0b2.tar.gz:

Publisher: push_code.yaml on llimona-org/llimona

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llimona-0.3.0b2-py3-none-any.whl.

File metadata

  • Download URL: llimona-0.3.0b2-py3-none-any.whl
  • Size: 54.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llimona-0.3.0b2-py3-none-any.whl:

  Algorithm     Hash digest
  SHA256        ee74a993181655e276df848297251b303987d60ce3bc496597fb74d3e1df69a2
  MD5           cc74743d04579c1ac4560d0fdb1a9f78
  BLAKE2b-256   bad82058c92a9906030f58e955a7749877549740449734f8e98a34d0e368ade1

Provenance

The following attestation bundles were made for llimona-0.3.0b2-py3-none-any.whl:

Publisher: push_code.yaml on llimona-org/llimona

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
