Framework for collecting and managing LLM context

llm-kelt

A framework for collecting, managing, and leveraging context for LLM applications. Supports fact storage, feedback collection, preference pairs, RAG-based retrieval, and fine-tuning workflows.

Features

  • Facts & Context Injection - Store facts that get injected into LLM system prompts
  • RAG Retrieval - Semantic search for relevant facts using embeddings
  • Feedback Collection - Record explicit signals (positive/negative/dismiss)
  • Preference Pairs - Store chosen vs rejected responses for DPO training
  • Training Export - Export to DPO, SFT, and classifier formats
  • LoRA Fine-Tuning - Train adapters with QLoRA support
  • Multi-Tenant - Context-scoped data isolation

Installation

# Basic installation
pip install llm-kelt

# With training dependencies (PyTorch, transformers, PEFT, TRL)
pip install "llm-kelt[training]"

# Development installation
git clone https://github.com/llm-works/llm-kelt.git
cd llm-kelt
pip install -e ".[dev]"

Quick Start

Setup

from llm_kelt import Client

# Create client scoped to a context
kelt = Client(context_key="default")
kelt.migrate()  # Create database tables

# Add facts about the user
kelt.facts.add("Prefers concise explanations", category="preferences")
kelt.facts.add("Expert Python developer", category="background")
kelt.facts.add("Always include code examples", category="rules")

Context Injection

from llm_kelt.inference import ContextBuilder

# Build system prompt with facts injected
builder = ContextBuilder(kelt.facts)
system_prompt = builder.build_system_prompt(
    base_prompt="You are a helpful assistant.",
    categories=["preferences", "rules"],  # Optional: filter by category
)
# Result: "You are a helpful assistant.\n\n## About the user:\n### Preferences\n- ..."
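To make the shape of that result concrete, here is a minimal standalone sketch of how facts could be grouped by category and appended to a base prompt. It is not the library's implementation: `render_system_prompt` and the `(text, category)` tuple shape are hypothetical, chosen only to mirror the output format shown in the comment above.

```python
from collections import defaultdict


def render_system_prompt(base_prompt, facts, categories=None):
    """Append facts, grouped by category, to a base system prompt.

    `facts` is a list of (text, category) tuples. This shape is an
    assumption for illustration, not the llm-kelt data model.
    """
    grouped = defaultdict(list)
    for text, category in facts:
        # Keep every fact when no filter is given, else only listed categories.
        if categories is None or category in categories:
            grouped[category].append(text)
    if not grouped:
        return base_prompt

    lines = [base_prompt, "", "## About the user:"]
    for category, texts in grouped.items():
        lines.append(f"### {category.capitalize()}")
        lines.extend(f"- {text}" for text in texts)
    return "\n".join(lines)


facts = [
    ("Prefers concise explanations", "preferences"),
    ("Expert Python developer", "background"),
    ("Always include code examples", "rules"),
]
prompt = render_system_prompt(
    "You are a helpful assistant.", facts, categories=["preferences", "rules"]
)
print(prompt)
```

Filtering by category keeps the prompt short: the "background" fact is stored but left out of this particular prompt.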

RAG-Based Retrieval

RAG (Retrieval-Augmented Generation) finds facts relevant to each query using semantic similarity.

from llm_infer.client import LLMClient
from llm_kelt.inference import (
    ContextBuilder, ContextQuery, Embedder, RAGArgs, embed_missing_facts
)

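# 1. Embed facts for semantic search (model name discovered from server).
#    Run the awaits below inside an async function; `logger` is assumed to be
#    an existing logger, e.g. the appinfra one created under LoRA Fine-Tuning.
embedder = Embedder(base_url="http://localhost:8001/v1")
await embed_missing_facts(logger, embedder, kelt.facts)

# 2. Create context-aware query interface
#    (`config` is assumed to be your loaded application configuration)
llm_client = LLMClient.from_config(config)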
query = ContextQuery(
    client=llm_client,
    context_builder=ContextBuilder(kelt.facts),
    base_system_prompt="You are a helpful assistant.",
    embedder=embedder,
)

# 3. Ask questions - RAG finds relevant facts automatically
response = await query.ask(
    "What's my preferred coding style?",
    rag=RAGArgs(top_k=5, min_similarity=0.3),
)

# Filter by category
response = await query.ask(
    "What rules should I follow?",
    rag=RAGArgs(top_k=5, categories=["rules"]),
)

# Clean up
await embedder.close_async()
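The roles of `top_k` and `min_similarity` can be illustrated with a small pure-Python sketch of similarity-based retrieval. It assumes cosine similarity over toy 3-dimensional vectors; the `retrieve` helper and the vectors are hypothetical, and the library's actual scoring and storage (for example via pgvector) may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, fact_vecs, top_k=5, min_similarity=0.3):
    """Return up to `top_k` (score, text) pairs scoring at least `min_similarity`."""
    scored = [
        (cosine_similarity(query_vec, vec), text) for text, vec in fact_vecs.items()
    ]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy embeddings, invented for this example only.
fact_vecs = {
    "Prefers concise explanations": [0.9, 0.1, 0.0],
    "Expert Python developer": [0.1, 0.9, 0.1],
    "Always include code examples": [0.0, 0.2, 0.9],
}
results = retrieve([1.0, 0.0, 0.0], fact_vecs, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The threshold is applied before the cut to `top_k`, so a query can return fewer than `top_k` facts when few are similar enough.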

Training Data Export

from llm_kelt.training import export_feedback_sft
from llm_kelt.training.dpo import export_preferences

# Record preference pairs
kelt.atomic.preferences.record(
    context="Explain gradient descent",
    chosen="Concise, accurate explanation",
    rejected="Verbose, rambling explanation",
)

# Export to DPO format for TRL
result = export_preferences(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="preferences.jsonl",
)
# Output: {"prompt": str, "chosen": str, "rejected": str}

# Export feedback to SFT format
result = export_feedback_sft(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="feedback_sft.jsonl",
    signal="positive",
)
# Output: {"instruction": str, "output": str}
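For readers unfamiliar with JSONL, the sketch below writes one JSON object per line in the DPO shape listed above. The `write_dpo_jsonl` helper is hypothetical, and mapping the recorded `context` field to the exported `prompt` key is an assumption inferred from the two snippets, not confirmed library behavior.

```python
import json
import tempfile
from pathlib import Path


def write_dpo_jsonl(pairs, output_path):
    """Write preference pairs as JSONL: one {"prompt", "chosen", "rejected"} per line."""
    path = Path(output_path)
    with path.open("w", encoding="utf-8") as fh:
        for pair in pairs:
            record = {
                "prompt": pair["context"],  # assumed mapping: context -> prompt
                "chosen": pair["chosen"],
                "rejected": pair["rejected"],
            }
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(pairs)


pairs = [
    {
        "context": "Explain gradient descent",
        "chosen": "Concise, accurate explanation",
        "rejected": "Verbose, rambling explanation",
    }
]

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "preferences.jsonl"
    count = write_dpo_jsonl(pairs, out)
    # Read the file back to check that every line is a standalone JSON object.
    rows = [json.loads(line) for line in out.read_text(encoding="utf-8").splitlines()]

print(count, rows[0]["prompt"])
```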

LoRA Fine-Tuning

from appinfra.log import LogConfig, LoggerFactory
from llm_kelt.training import train_lora, RunConfig
from llm_kelt.training.lora import Config as LoraConfig

lg = LoggerFactory.create_root(LogConfig.from_params(level="info"))

# Train LoRA adapter (requires pip install llm-kelt[training])
result = train_lora(
    lg=lg,
    data_path="feedback_sft.jsonl",
    output_dir="./my_adapter",
    base_model="Qwen/Qwen2.5-7B-Instruct",
    lora_config=LoraConfig(r=16, lora_alpha=32),
    training_config=RunConfig(
        num_epochs=3,
        batch_size=4,
        learning_rate=2e-4,
    ),
    quantize=True,  # QLoRA for lower VRAM
)

print(f"Adapter saved to: {result.adapter_path}")
print(f"Train loss: {result.metrics['train_loss']:.4f}")
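To give a feel for what `r=16, lora_alpha=32` means, the following sketch computes the size of a LoRA update for a single weight matrix and the conventional scaling factor `lora_alpha / r`. The helper names and the 4096-wide layer are hypothetical (not read from any model config), and it assumes the standard LoRA formulation rather than anything specific to llm-kelt.

```python
def lora_trainable_params(d_in, d_out, r):
    """Parameters in one LoRA update: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return lora_alpha / r


r, lora_alpha = 16, 32
hidden = 4096  # hypothetical square projection, for illustration only

full = hidden * hidden
adapter = lora_trainable_params(hidden, hidden, r)

print(f"full matrix:    {full:,} parameters")
print(f"LoRA update:    {adapter:,} parameters ({adapter / full:.2%} of full)")
print(f"scaling factor: {lora_scaling(lora_alpha, r)}")
```

Doubling `r` doubles the adapter size; keeping `lora_alpha` at twice `r` keeps the scaling factor at 2.0.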

Architecture

Isolation Context (context_key)
  ├── Facts           → Injected into prompts (with embeddings for RAG)
  ├── Feedback        → Explicit signals (positive/negative)
  ├── Preferences     → DPO training pairs (chosen/rejected)
  ├── Interactions    → Implicit signals (view, click, scroll)
  ├── Content         → Deduplicated content storage
  ├── Directives      → Goals and rules
  └── Predictions     → Hypothesis tracking
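The idea of scoping every record to a `context_key` can be sketched with an in-memory store. This is only an illustration of the isolation principle: `ScopedStore` is hypothetical, whereas the library keeps its data in PostgreSQL.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ScopedStore:
    """Toy store in which every row belongs to exactly one context_key."""

    _rows: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, context_key, kind, payload):
        self._rows[context_key].append((kind, payload))

    def query(self, context_key, kind):
        # Only rows written under the same context_key are visible.
        return [p for k, p in self._rows.get(context_key, []) if k == kind]


store = ScopedStore()
store.add("alice", "facts", "Prefers concise explanations")
store.add("bob", "facts", "Prefers detailed walkthroughs")

print(store.query("alice", "facts"))
print(store.query("carol", "facts"))
```

Because the key is part of every read and write, one tenant's facts never surface in another tenant's prompts.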

Data Flow

┌─────────────────────────────────────────────────────────────────────┐
│                           COLLECTION                                 │
│  Facts  │  Feedback  │  Preferences  │  Interactions  │  Content    │
└────────────────────────────┬────────────────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│    INFERENCE    │ │    TRAINING     │ │    ANALYSIS     │
│                 │ │                 │ │                 │
│ • Context       │ │ • Export DPO    │ │ • Stats         │
│   Injection     │ │ • Export SFT    │ │ • Trends        │
│ • RAG Retrieval │ │ • LoRA Training │ │ • Insights      │
│ • Embeddings    │ │ • DPO Training  │ │                 │
└─────────────────┘ └─────────────────┘ └─────────────────┘

Examples

See the examples/ directory for complete working examples.

API Reference

Core

Class              Description
Client             Main entry point, scoped to a context
FactsClient        Store and retrieve facts
FeedbackClient     Record explicit feedback signals
PreferencesClient  Store preference pairs

Inference

Class/Function       Description
ContextBuilder       Build system prompts with injected facts
ContextQuery         High-level context-aware query interface
Embedder             Generate embeddings via OpenAI-compatible API
RAGArgs              Configuration for RAG retrieval
embed_missing_facts  Batch embed facts without embeddings

Training

Class/Function              Description
dpo.export_preferences      Export preference pairs for DPO
export_feedback_sft         Export feedback for SFT
export_feedback_classifier  Export for binary classification
train_lora                  Train LoRA adapter with SFT
train_dpo                   Train with Direct Preference Optimization
lora.Config                 LoRA hyperparameters
RunConfig                   Training hyperparameters
AdapterRegistry             Manage trained adapters

Requirements

  • Python 3.11+
  • PostgreSQL 16+ with pgvector extension
  • For training: CUDA GPU (or MPS on Apple Silicon)

Configuration

  1. Copy the environment template and customize paths:
cp .env.yaml.example .env.yaml

Edit .env.yaml with your local paths:

paths:
  models: !path ~/models/huggingface    # HuggingFace models directory
  adapters: !path ~/models/adapters     # Trained LoRA adapters

  2. The main config is in etc/llm-kelt.yaml. Database and LLM settings:
dbs:
  main:
    url: postgresql://user:pass@localhost:5432/llm_kelt
    extensions: [vector]

llm:
  default_backend: local
  backends:
    local:
      base_url: http://localhost:8000/v1
      model: default
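When a database connection fails, it often helps to check how the `url` value breaks down into its parts. The sketch below does that with the standard library only; `parse_db_url` is a hypothetical helper, not part of llm-kelt, and it uses the placeholder URL from the config above.

```python
from urllib.parse import urlsplit


def parse_db_url(url):
    """Split a database URL into its connection components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


settings = parse_db_url("postgresql://user:pass@localhost:5432/llm_kelt")
for key, value in settings.items():
    print(f"{key}: {value}")
```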

License

Apache 2.0
