Universal LLM adapter service for Sekha AI - Multi-provider bridge supporting 100+ LLMs via LiteLLM

Sekha LLM Bridge

Universal LLM Adapter - The Bridge Between Memory and Intelligence


🎯 What is Sekha LLM Bridge?

LLM-Bridge is a REQUIRED component of the Sekha ecosystem. It acts as the universal adapter layer that enables the Sekha Controller to work with any LLM provider - from local Ollama to cloud services like OpenAI, Anthropic, and Google.

Why is it Required?

The Controller (Rust) focuses on memory orchestration, storage, and retrieval. LLM-Bridge (Python) handles all LLM-specific operations, providing:

  • Provider Abstraction: Switch between Ollama, GPT-4, Claude, Gemini without changing Controller code
  • Universal Compatibility: Powered by LiteLLM for 100+ LLM providers
  • Async Processing: Celery-based task queue for expensive LLM operations
  • Retry Logic: Automatic retries with exponential backoff for reliability
  • Type Safety: Pydantic models for request/response validation

🏗️ Architecture Role

┌─────────────────────────────────────────┐
│      Sekha Controller (Rust)            │
│  • Memory Orchestration                 │
│  • Context Assembly                     │
│  • Storage (SQLite + Chroma)            │
└──────────────┬──────────────────────────┘
               │ HTTP Calls
               ▼
┌─────────────────────────────────────────┐
│      LLM-Bridge (Python) ← YOU ARE HERE │
│  • Universal LLM Adapter                │
│  • Embedding Generation                 │
│  • Summarization                        │
│  • Entity Extraction                    │
│  • Importance Scoring                   │
└──────────────┬──────────────────────────┘
               │ LiteLLM
               ▼
    ┌──────────┴────────────┐
    │                       │
    ▼                       ▼
┌─────────┐            ┌──────────┐
│ Ollama  │            │ OpenAI   │
│ (Local) │            │ GPT-4    │
└─────────┘            └──────────┘
    ▼                        ▼
┌─────────┐            ┌──────────┐
│Anthropic│            │  Google  │
│ Claude  │            │  Gemini  │
└─────────┘            └──────────┘

Multi-LLM Workflow Example:

  1. Morning: Use Claude for code review → Sekha captures via Bridge
  2. Afternoon: Switch to ChatGPT for docs → Bridge forwards to OpenAI
  3. Evening: Use Ollama locally for planning → Bridge uses local LLM
  4. All stored in unified sekha.db regardless of which LLM was used!

✨ Features

Core Services

Endpoint                  | Purpose                                           | Used By
--------------------------|---------------------------------------------------|-------------------------------------
POST /embed               | Generate embeddings for semantic search           | Controller (on conversation storage)
POST /summarize           | Hierarchical summarization (daily/weekly/monthly) | Controller orchestrator
POST /extract             | Extract entities from conversations               | Controller (future: auto-labeling)
POST /score               | Score conversation importance (1-10)              | Controller pruning engine
POST /v1/chat/completions | OpenAI-compatible chat endpoint                   | Proxy (optional component)

Current Capabilities

  • ✅ Ollama Integration: Full support for local LLMs
  • ✅ LiteLLM Powered: Ready for 100+ providers (OpenAI, Anthropic, etc.)
  • ✅ Async Processing: Celery task queue for background jobs
  • ✅ Retry Logic: 3 retries with exponential backoff
  • ✅ Health Monitoring: /health endpoint with model availability checks
  • ✅ Prometheus Metrics: /metrics for observability
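
The retry behaviour listed above (3 retries with exponential backoff) could look roughly like the sketch below. It is an illustration only, not the bridge's actual code: `retry_with_backoff`, `base_delay`, and the injectable `sleep` parameter are hypothetical names, and the 1-second starting delay is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,  # assumed starting delay, not taken from the project
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying up to `retries` times after a failure.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(retries + 1):  # one initial try plus `retries` retries
        try:
            return call()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; a production version would likely also restrict which exception types are retried.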

Supported LLM Providers (via LiteLLM)

Currently Tested:

  • Ollama (nomic-embed-text, llama3.1, etc.)

Ready to Enable:

  • OpenAI (GPT-4, GPT-3.5-turbo, text-embedding-ada-002)
  • Anthropic (Claude 3 Opus, Sonnet, Haiku)
  • Google (Gemini Pro, Gemini Flash)
  • Cohere (Command, Embed)
  • Azure OpenAI
  • AWS Bedrock
  • 100+ more via LiteLLM

🚀 Quick Start

Installation

# From PyPI (recommended)
pip install sekha-llm-bridge

# Or from source
git clone https://github.com/sekha-ai/sekha-llm-bridge.git
cd sekha-llm-bridge
pip install -e .

With Docker (Full Stack)

LLM-Bridge is included in the full Sekha stack:

git clone https://github.com/sekha-ai/sekha-docker.git
cd sekha-docker/docker
cp .env.example .env

# Edit .env to configure your LLM provider
nano .env

docker compose -f docker-compose.prod.yml up -d

Standalone Development

# Configure (copy and edit)
cp .env.example .env

# Start Redis (required for Celery)
docker run -d -p 6379:6379 redis:7-alpine

# Run
python -m sekha_llm_bridge.main

⚙️ Configuration

Environment Variables

# Server
HOST=0.0.0.0
PORT=5001

# Ollama (local LLMs)
OLLAMA_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text:latest
SUMMARIZATION_MODEL=llama3.1:8b

# Redis (Celery task queue)
REDIS_URL=redis://localhost:6379/0

# Cloud Providers (optional)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Logging
LOG_LEVEL=INFO
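
As a rough illustration of how these variables might be consumed, the sketch below reads them into a typed settings object. `BridgeSettings` and `load_settings` are hypothetical names (the project's own `config.py` may be organised differently), and using the example values above as fallbacks is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BridgeSettings:
    """Hypothetical settings container; fields mirror the variables above."""
    host: str
    port: int
    ollama_url: str
    embedding_model: str
    summarization_model: str
    redis_url: str
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> BridgeSettings:
    """Build settings from an environment mapping.

    Fallback values are the examples from this README (an assumption).
    """
    return BridgeSettings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "5001")),
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434"),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text:latest"),
        summarization_model=env.get("SUMMARIZATION_MODEL", "llama3.1:8b"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        openai_api_key=env.get("OPENAI_API_KEY"),        # optional cloud key
        anthropic_api_key=env.get("ANTHROPIC_API_KEY"),  # optional cloud key
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Accepting any mapping rather than reading `os.environ` directly makes it possible to call `load_settings({"PORT": "8080"})` in tests.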

Using Different LLM Providers

Switch to OpenAI:

EMBEDDING_MODEL=text-embedding-3-small
SUMMARIZATION_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-...

Switch to Claude:

SUMMARIZATION_MODEL=claude-3-5-sonnet-20241022
ANTHROPIC_API_KEY=sk-ant-...

LiteLLM automatically routes to the correct provider based on model name!
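
To make the routing idea concrete, here is a minimal sketch of how a model name could be turned into a LiteLLM call. How the bridge itself maps names is not documented here, so the helper `litellm_kwargs`, the `CLOUD_PREFIXES` heuristic, and `summarize_once` are all assumptions. What the sketch relies on from LiteLLM is only that cloud names such as `gpt-4o-mini` are passed as-is, while Ollama models use an `ollama/` prefix plus an `api_base`.

```python
from typing import Any, Dict, List

# Assumption: unprefixed names starting with one of these are cloud models.
CLOUD_PREFIXES = ("gpt-", "text-embedding-", "claude-")


def litellm_kwargs(model: str, ollama_url: str = "http://localhost:11434") -> Dict[str, Any]:
    """Return keyword arguments for a LiteLLM call (hypothetical helper)."""
    if "/" in model or model.startswith(CLOUD_PREFIXES):
        # Already provider-qualified, or a cloud name LiteLLM recognises.
        return {"model": model}
    # Otherwise treat it as a local Ollama model.
    return {"model": f"ollama/{model}", "api_base": ollama_url}


def summarize_once(model: str, messages: List[Dict[str, str]]) -> str:
    """Send one chat request through LiteLLM and return the reply text."""
    import litellm  # imported lazily so litellm_kwargs works without LiteLLM installed

    response = litellm.completion(messages=messages, **litellm_kwargs(model))
    return response.choices[0].message.content
```

With this heuristic, `litellm_kwargs("llama3.1:8b")` targets the local Ollama server, while `litellm_kwargs("claude-3-5-sonnet-20241022")` leaves the name untouched and relies on `ANTHROPIC_API_KEY` being set in the environment.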


📡 API Reference

POST /embed

Generate embedding for text.

Request:

{
  "text": "What is the meaning of life?",
  "model": "nomic-embed-text:latest"  // optional
}

Response:

{
  "embedding": [0.123, -0.456, ...],  // 768-dim vector
  "model": "nomic-embed-text:latest",
  "dimension": 768,
  "tokens_used": 42
}
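
A small client for this endpoint might look like the sketch below. It uses only the standard library; `build_embed_request` and `embed` are hypothetical helper names, and the base URL assumes the default `PORT=5001` from the configuration section.

```python
import json
import urllib.request
from typing import List, Optional


def build_embed_request(
    text: str,
    model: Optional[str] = None,
    base_url: str = "http://localhost:5001",  # assumed local default
) -> urllib.request.Request:
    """Build the POST /embed request; `model` is omitted when not given."""
    payload = {"text": text}
    if model is not None:
        payload["model"] = model
    return urllib.request.Request(
        f"{base_url}/embed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, model: Optional[str] = None,
          base_url: str = "http://localhost:5001") -> List[float]:
    """Send the request and return the embedding vector."""
    request = build_embed_request(text, model, base_url)
    with urllib.request.urlopen(request, timeout=30) as resp:
        body = json.load(resp)
    # Sanity check against the `dimension` field shown in the response above.
    if len(body["embedding"]) != body["dimension"]:
        raise ValueError("embedding length does not match reported dimension")
    return body["embedding"]
```

Calling `embed("What is the meaning of life?")` requires a running bridge; `build_embed_request` can be inspected offline.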

POST /summarize

Generate hierarchical summary.

Request:

{
  "messages": [
    "User discussed Python best practices",
    "Assistant recommended type hints"
  ],
  "level": "daily",  // daily | weekly | monthly
  "model": "llama3.1:8b",  // optional
  "max_words": 200
}

Response:

{
  "summary": "Discussed Python type hints and best practices...",
  "level": "daily",
  "model": "llama3.1:8b",
  "message_count": 2,
  "tokens_used": 156
}
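
The request shape above can be validated before it is sent. The project lists Pydantic for this; the sketch below swaps in a standard-library dataclass so it runs without extra dependencies. `SummarizeRequest`, `to_payload`, and the default values for `level` and `max_words` are assumptions drawn from the example, not the bridge's actual models.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional

VALID_LEVELS = ("daily", "weekly", "monthly")


@dataclass
class SummarizeRequest:
    """Hypothetical stand-in for the bridge's request model."""
    messages: List[str]
    level: str = "daily"         # assumed default
    model: Optional[str] = None  # optional, per the example above
    max_words: int = 200         # assumed default

    def __post_init__(self) -> None:
        if not self.messages:
            raise ValueError("messages must not be empty")
        if self.level not in VALID_LEVELS:
            raise ValueError(f"level must be one of {VALID_LEVELS}")
        if self.max_words <= 0:
            raise ValueError("max_words must be positive")

    def to_payload(self) -> Dict[str, Any]:
        """Return the JSON body, leaving out unset optional fields."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Constructing `SummarizeRequest(messages=[...], level="yearly")` raises a `ValueError` locally instead of producing a request the server would reject.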

POST /v1/chat/completions

OpenAI-compatible chat endpoint.

Request:

{
  "model": "llama3.1:8b",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}

Response: Standard OpenAI format
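
Because the response follows the standard OpenAI chat completion format, the reply text sits at `choices[0].message.content`. The sketch below shows one way to call the endpoint and read that field; `build_chat_request`, `first_reply`, and `chat` are hypothetical helper names, and the base URL assumes the default port.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_chat_request(
    model: str,
    messages: List[Dict[str, str]],
    base_url: str = "http://localhost:5001",  # assumed local default
) -> urllib.request.Request:
    """Build the POST /v1/chat/completions request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_reply(response: Dict[str, Any]) -> str:
    """Extract the assistant text from an OpenAI-format response."""
    return response["choices"][0]["message"]["content"]


def chat(model: str, messages: List[Dict[str, str]],
         base_url: str = "http://localhost:5001") -> str:
    """Send the chat request and return the first reply."""
    request = build_chat_request(model, messages, base_url)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return first_reply(json.load(resp))
```

Existing OpenAI-compatible client libraries pointed at the bridge's base URL should work in the same way, since only the host changes.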


🔧 Development

Setup

# Install dev dependencies
pip install -e ".[dev]"

# Or with Poetry
poetry install --with dev

Testing

# Run tests
pytest

# With coverage
pytest --cov=sekha_llm_bridge --cov-report=html

# Type checking
mypy src/

# Linting
ruff check .
black --check .

Project Structure

sekha-llm-bridge/
├── src/
│   └── sekha_llm_bridge/
│       ├── main.py              # FastAPI app
│       ├── config.py            # Settings
│       ├── models.py            # Pydantic models
│       ├── tasks.py             # Celery tasks
│       ├── services/
│       │   ├── embedding_service.py
│       │   ├── summarization_service.py
│       │   ├── entity_extraction_service.py
│       │   └── importance_scorer.py
│       └── utils/
│           └── llm_client.py    # LiteLLM wrapper
├── tests/
├── requirements.txt
└── pyproject.toml

🤝 Integration with Controller

The Controller calls LLM-Bridge for:

  1. Embedding Generation: When storing new conversations

    let embedding = llm_bridge.embed_text(&message_content).await?;
    
  2. Summarization: For hierarchical summaries

    let summary = llm_bridge.summarize(messages, "daily").await?;
    
  3. Importance Scoring: For pruning decisions

    let score = llm_bridge.score_importance(&message).await?;
    

All operations are async and include automatic retries.


📊 Monitoring

Health Check

curl http://localhost:5001/health

Response:

{
  "status": "healthy",
  "timestamp": "2026-01-25T20:00:00Z",
  "ollama_status": {
    "status": "healthy",
    "models_available": ["nomic-embed-text:latest", "llama3.1:8b"]
  }
}
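
A deployment script could use this response to confirm that the configured models are actually available before sending traffic. The sketch below works on the parsed JSON shown above; `missing_models` and `is_ready` are hypothetical helper names.

```python
from typing import Any, Dict, Iterable, List


def missing_models(health: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """Return the required models absent from `models_available`."""
    ollama = health.get("ollama_status", {})
    available = set(ollama.get("models_available", []))
    return [name for name in required if name not in available]


def is_ready(health: Dict[str, Any], required: Iterable[str]) -> bool:
    """True when the bridge and Ollama report healthy and no model is missing."""
    return (
        health.get("status") == "healthy"
        and health.get("ollama_status", {}).get("status") == "healthy"
        and not missing_models(health, required)
    )
```

For example, passing the values of `EMBEDDING_MODEL` and `SUMMARIZATION_MODEL` as `required` catches the case where a model was configured but never pulled into Ollama.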

Prometheus Metrics

curl http://localhost:5001/metrics

📝 Changelog

See CHANGELOG.md for full release history.


🗺️ Roadmap

Q1 2026

  • Ollama integration
  • LiteLLM foundation
  • OpenAI production testing
  • Anthropic Claude integration
  • Google Gemini support

Q2 2026

  • Multi-provider load balancing
  • Cost tracking per provider
  • Custom model fine-tuning support
  • Streaming responses

🔗 Related Projects


📚 Documentation

Full docs: docs.sekha.dev


📄 License

AGPL-3.0-or-later - License Details


🙋 Support


Built with ❤️ by the Sekha AI team
