Framework for collecting and managing LLM context
llm-kelt
A framework for collecting, managing, and leveraging context for LLM applications. Supports fact storage, feedback collection, preference pairs, RAG-based retrieval, and fine-tuning workflows.
Features
- Facts & Context Injection - Store facts that get injected into LLM system prompts
- RAG Retrieval - Semantic search for relevant facts using embeddings
- Feedback Collection - Record explicit signals (positive/negative/dismiss)
- Preference Pairs - Store chosen vs rejected responses for DPO training
- Training Export - Export to DPO, SFT, and classifier formats
- LoRA Fine-Tuning - Train adapters with QLoRA support
- Multi-Tenant - Context-scoped data isolation
Installation
# Basic installation
pip install llm-kelt
# With training dependencies (PyTorch, transformers, PEFT, TRL)
pip install "llm-kelt[training]"
# Development installation
git clone https://github.com/llm-works/llm-kelt.git
cd llm-kelt
pip install -e ".[dev]"
Quick Start
Setup
from llm_kelt import Client
# Create client scoped to a context
kelt = Client(context_key="default")
kelt.migrate() # Create database tables
# Add facts about the user
kelt.facts.add("Prefers concise explanations", category="preferences")
kelt.facts.add("Expert Python developer", category="background")
kelt.facts.add("Always include code examples", category="rules")
Context Injection
from llm_kelt.inference import ContextBuilder
# Build system prompt with facts injected
builder = ContextBuilder(kelt.facts)
system_prompt = builder.build_system_prompt(
    base_prompt="You are a helpful assistant.",
    categories=["preferences", "rules"],  # Optional: filter by category
)
# Result: "You are a helpful assistant.\n\n## About the user:\n### Preferences\n- ..."
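To make the result format concrete, here is a minimal sketch of the grouping step in plain Python. It is not the `ContextBuilder` implementation: the standalone `build_system_prompt` function and its `(category, text)` input are hypothetical, and the heading layout is assumed from the result comment above.

```python
from collections import defaultdict


def build_system_prompt(base_prompt, facts, categories=None):
    """Group (category, text) facts under per-category headings.

    Hypothetical stand-in for illustration; the library's builder may differ.
    """
    grouped = defaultdict(list)
    for category, text in facts:
        # With no filter, every category is kept.
        if categories is None or category in categories:
            grouped[category].append(text)
    if not grouped:
        return base_prompt

    lines = [base_prompt, "", "## About the user:"]
    for category, texts in grouped.items():
        lines.append(f"### {category.capitalize()}")
        lines.extend(f"- {text}" for text in texts)
    return "\n".join(lines)


facts = [
    ("preferences", "Prefers concise explanations"),
    ("background", "Expert Python developer"),
    ("rules", "Always include code examples"),
]
prompt = build_system_prompt(
    "You are a helpful assistant.", facts, categories=["preferences", "rules"]
)
print(prompt)
```

Because the `background` category is not in the filter, that fact is left out of the prompt.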
RAG-Based Retrieval
RAG (Retrieval-Augmented Generation) finds facts relevant to each query using semantic similarity.
from llm_infer.client import LLMClient
from llm_kelt.inference import (
    ContextBuilder, ContextQuery, Embedder, RAGArgs, embed_missing_facts
)
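# Note: `logger` and `config` are not defined in this snippet; supply your own
# logger and LLM client configuration. The `await` calls must run inside an
# async function (for example, one started with asyncio.run()).
# 1. Embed facts for semantic search (model name discovered from server)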
embedder = Embedder(base_url="http://localhost:8001/v1")
await embed_missing_facts(logger, embedder, kelt.facts)
# 2. Create context-aware query interface
llm_client = LLMClient.from_config(config)
query = ContextQuery(
    client=llm_client,
    context_builder=ContextBuilder(kelt.facts),
    base_system_prompt="You are a helpful assistant.",
    embedder=embedder,
)
# 3. Ask questions - RAG finds relevant facts automatically
response = await query.ask(
    "What's my preferred coding style?",
    rag=RAGArgs(top_k=5, min_similarity=0.3),
)
# Filter by category
response = await query.ask(
    "What rules should I follow?",
    rag=RAGArgs(top_k=5, categories=["rules"]),
)
# Clean up
await embedder.close_async()
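The roles of `top_k` and `min_similarity` can be sketched without a server or database. The example below is an illustration only: it assumes cosine similarity as the measure (the README does not state which one is used), the `retrieve` helper is hypothetical, and the three-dimensional vectors are toy stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, fact_vecs, top_k=5, min_similarity=0.3):
    """Return up to top_k fact texts whose similarity meets the threshold."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in fact_vecs.items()
    ]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [text for _, text in kept[:top_k]]


# Toy vectors standing in for real embeddings (assumption for illustration).
fact_vecs = {
    "Prefers concise explanations": [0.9, 0.1, 0.0],
    "Expert Python developer": [0.1, 0.9, 0.0],
    "Always include code examples": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.5, 0.0]
results = retrieve(query_vec, fact_vecs, top_k=5, min_similarity=0.3)
print(results)
```

Here the third fact falls below the similarity threshold, so it is dropped even though `top_k` would allow it.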
Training Data Export
from llm_kelt.training import export_feedback_sft
from llm_kelt.training.dpo import export_preferences
# Record preference pairs
kelt.atomic.preferences.record(
    context="Explain gradient descent",
    chosen="Concise, accurate explanation",
    rejected="Verbose, rambling explanation",
)
# Export to DPO format for TRL
result = export_preferences(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="preferences.jsonl",
)
# Output: {"prompt": str, "chosen": str, "rejected": str}
# Export feedback to SFT format
result = export_feedback_sft(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="feedback_sft.jsonl",
    signal="positive",
)
# Output: {"instruction": str, "output": str}
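Before handing an exported file to a trainer, it can help to check that every line has the expected keys. The sketch below is one way to do that for the DPO format, using only the standard library. The `load_dpo_records` helper is hypothetical, and the sample record is written by hand so the example runs without a database; it assumes the recorded `context` ends up as `prompt`, which the output comment suggests but does not confirm.

```python
import json
import tempfile
from pathlib import Path

DPO_KEYS = {"prompt", "chosen", "rejected"}


def load_dpo_records(path):
    """Read a JSONL file and check that every record has the DPO keys."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = DPO_KEYS - set(record)
            if missing:
                raise ValueError(
                    f"line {line_number}: missing keys {sorted(missing)}"
                )
            records.append(record)
    return records


# Hand-made record in the documented shape (not produced by llm-kelt).
sample = {
    "prompt": "Explain gradient descent",
    "chosen": "Concise, accurate explanation",
    "rejected": "Verbose, rambling explanation",
}
path = Path(tempfile.mkdtemp()) / "preferences.jsonl"
path.write_text(json.dumps(sample) + "\n", encoding="utf-8")

records = load_dpo_records(path)
print(len(records), records[0]["prompt"])
```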
LoRA Fine-Tuning
from appinfra.log import LogConfig, LoggerFactory
from llm_kelt.training import train_lora, RunConfig
from llm_kelt.training.lora import Config as LoraConfig
lg = LoggerFactory.create_root(LogConfig.from_params(level="info"))
# Train LoRA adapter (requires pip install llm-kelt[training])
result = train_lora(
    lg=lg,
    data_path="feedback_sft.jsonl",
    output_dir="./my_adapter",
    base_model="Qwen/Qwen2.5-7B-Instruct",
    lora_config=LoraConfig(r=16, lora_alpha=32),
    training_config=RunConfig(
        num_epochs=3,
        batch_size=4,
        learning_rate=2e-4,
    ),
    quantize=True,  # QLoRA for lower VRAM
)
print(f"Adapter saved to: {result.adapter_path}")
print(f"Train loss: {result.metrics['train_loss']:.4f}")
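For a rough sense of what `r=16` and `lora_alpha=32` mean, the sketch below works through the arithmetic of the standard LoRA formulation, where the low-rank update is scaled by `lora_alpha / r` and adds `r * (d_in + d_out)` trainable parameters per adapted weight matrix. It assumes llm-kelt's `Config` follows that formulation, and the 4096 x 4096 layer size is a hypothetical value, not a property of any particular model.

```python
def lora_scaling(r, lora_alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def lora_trainable_params(d_out, d_in, r):
    """Parameters in the two low-rank factors: (d_out x r) and (r x d_in)."""
    return r * (d_in + d_out)


r, lora_alpha = 16, 32
d_out = d_in = 4096  # hypothetical layer size, chosen for illustration

scaling = lora_scaling(r, lora_alpha)
added = lora_trainable_params(d_out, d_in, r)
full = d_out * d_in  # parameters in the frozen full-rank weight matrix

print(f"scaling={scaling}, adapter params={added}, fraction={added / full:.4%}")
```

Under these assumptions the adapter trains well under 1% of the parameters of the matrix it adapts, which is why LoRA fits on smaller GPUs.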
Architecture
Isolation Context (context_key)
├── Facts → Injected into prompts (with embeddings for RAG)
├── Feedback → Explicit signals (positive/negative/dismiss)
├── Preferences → DPO training pairs (chosen/rejected)
├── Interactions → Implicit signals (view, click, scroll)
├── Content → Deduplicated content storage
├── Directives → Goals and rules
└── Predictions → Hypothesis tracking
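The idea of scoping every record to a `context_key` can be sketched with a toy in-memory store. This is an illustration of the isolation concept only: the `InMemoryFacts` class is hypothetical, and the real library stores its data in PostgreSQL rather than a dictionary.

```python
from collections import defaultdict


class InMemoryFacts:
    """Toy fact store where every read and write is scoped to one context_key."""

    # Shared backing store standing in for a database: context_key -> facts.
    _store = defaultdict(list)

    def __init__(self, context_key):
        self.context_key = context_key

    def add(self, text, category):
        self._store[self.context_key].append((category, text))

    def items(self):
        # Only rows that belong to this client's context are visible.
        return list(self._store[self.context_key])


alice = InMemoryFacts("alice")
bob = InMemoryFacts("bob")
alice.add("Prefers concise explanations", category="preferences")
bob.add("Expert Python developer", category="background")

print(alice.items())
print(bob.items())
```

Both clients share one backing store, yet neither can see the other's facts because every access goes through its own key.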
Data Flow
┌─────────────────────────────────────────────────────────────────────┐
│                             COLLECTION                              │
│       Facts │ Feedback │ Preferences │ Interactions │ Content       │
└────────────────────────────┬────────────────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│    INFERENCE    │ │    TRAINING     │ │    ANALYSIS     │
│                 │ │                 │ │                 │
│ • Context       │ │ • Export DPO    │ │ • Stats         │
│   Injection     │ │ • Export SFT    │ │ • Trends        │
│ • RAG Retrieval │ │ • LoRA Training │ │ • Insights      │
│ • Embeddings    │ │ • DPO Training  │ │                 │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Examples
See the examples/ directory for complete working examples:
- 01_facts_and_context.py - Facts storage and context injection
- 02_rag_retrieval.py - RAG with semantic search
- 03_training_export.py - Export to training formats
- 04_lora_training.py - LoRA fine-tuning workflow
API Reference
Core
| Class | Description |
|---|---|
| Client | Main entry point, scoped to a context |
| FactsClient | Store and retrieve facts |
| FeedbackClient | Record explicit feedback signals |
| PreferencesClient | Store preference pairs |
Inference
| Class/Function | Description |
|---|---|
| ContextBuilder | Build system prompts with injected facts |
| ContextQuery | High-level context-aware query interface |
| Embedder | Generate embeddings via OpenAI-compatible API |
| RAGArgs | Configuration for RAG retrieval |
| embed_missing_facts | Batch embed facts without embeddings |
Training
| Class/Function | Description |
|---|---|
| dpo.export_preferences | Export preference pairs for DPO |
| export_feedback_sft | Export feedback for SFT |
| export_feedback_classifier | Export for binary classification |
| train_lora | Train LoRA adapter with SFT |
| train_dpo | Train with Direct Preference Optimization |
| lora.Config | LoRA hyperparameters |
| RunConfig | Training hyperparameters |
| AdapterRegistry | Manage trained adapters |
Requirements
- Python 3.11+
- PostgreSQL 16+ with pgvector extension
- For training: CUDA GPU (or MPS on Apple Silicon)
Configuration
- Copy the environment template and customize paths:
cp .env.yaml.example .env.yaml
Edit .env.yaml with your local paths:
paths:
  models: !path ~/models/huggingface  # HuggingFace models directory
  adapters: !path ~/models/adapters  # Trained LoRA adapters
- The main config is in etc/llm-kelt.yaml. Database and LLM settings:
dbs:
  main:
    url: postgresql://user:pass@localhost:5432/llm_kelt
    extensions: [vector]
llm:
  default_backend: local
  backends:
    local:
      base_url: http://localhost:8000/v1
      model: default
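When adapting the database URL to your own setup, it can be useful to see how its parts break down. The sketch below splits the example URL with the standard library. The `settings` dictionary and its key names are illustrative and not part of llm-kelt's configuration schema.

```python
from urllib.parse import urlsplit

# Example URL copied from the config above; replace with your own credentials.
url = "postgresql://user:pass@localhost:5432/llm_kelt"
parts = urlsplit(url)

settings = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),  # path is "/llm_kelt"
}
print(settings)
```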
License
Apache 2.0