Framework for collecting and managing LLM context

llm-kelt

A framework for collecting, managing, and leveraging context for LLM applications. Supports fact storage, feedback collection, preference pairs, RAG-based retrieval, and fine-tuning workflows.

Features

Facts & Context Injection - Store facts that get injected into LLM system prompts
RAG Retrieval - Semantic search for relevant facts using embeddings
Feedback Collection - Record explicit signals (positive/negative/dismiss)
Preference Pairs - Store chosen vs rejected responses for DPO training
Training Export - Export to DPO, SFT, and classifier formats
LoRA Fine-Tuning - Train adapters with QLoRA support
Multi-Tenant - Context-scoped data isolation

Installation

# Basic installation
pip install llm-kelt

# With training dependencies (PyTorch, transformers, PEFT, TRL)
pip install "llm-kelt[training]"

# Development installation
git clone https://github.com/llm-works/llm-kelt.git
cd llm-kelt
pip install -e ".[dev]"

Quick Start

Setup

from llm_kelt import Client

# Create client scoped to a context
kelt = Client(context_key="default")
kelt.migrate()  # Create database tables

# Add facts about the user
kelt.facts.add("Prefers concise explanations", category="preferences")
kelt.facts.add("Expert Python developer", category="background")
kelt.facts.add("Always include code examples", category="rules")

Context Injection

from llm_kelt.inference import ContextBuilder

# Build system prompt with facts injected
builder = ContextBuilder(kelt.facts)
system_prompt = builder.build_system_prompt(
    base_prompt="You are a helpful assistant.",
    categories=["preferences", "rules"],  # Optional: filter by category
)
# Result: "You are a helpful assistant.\n\n## About the user:\n### Preferences\n- ..."
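To make the shape of the result concrete, here is a minimal, library-independent sketch of how facts grouped by category could be rendered into a system prompt. It is not ContextBuilder's actual implementation: the function name `render_system_prompt`, the `(text, category)` tuple format, and the exact heading layout are assumptions inferred from the result comment above.

```python
from collections import defaultdict


def render_system_prompt(base_prompt, facts, categories=None):
    """Append facts, grouped by category, to a base prompt.

    `facts` is a list of (text, category) tuples (a hypothetical format).
    `categories`, when given, restricts which categories are included.
    """
    grouped = defaultdict(list)
    for text, category in facts:
        if categories is None or category in categories:
            grouped[category].append(text)

    # With no matching facts, return the base prompt unchanged.
    if not grouped:
        return base_prompt

    lines = [base_prompt, "", "## About the user:"]
    for category, texts in grouped.items():
        lines.append(f"### {category.capitalize()}")
        lines.extend(f"- {text}" for text in texts)
    return "\n".join(lines)


facts = [
    ("Prefers concise explanations", "preferences"),
    ("Expert Python developer", "background"),
    ("Always include code examples", "rules"),
]
prompt = render_system_prompt(
    "You are a helpful assistant.",
    facts,
    categories=["preferences", "rules"],
)
print(prompt)
```

Filtering by category before rendering keeps the prompt short, which matters when many facts are stored for one context.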

RAG-Based Retrieval

RAG (Retrieval-Augmented Generation) finds facts relevant to each query using semantic similarity.

from llm_infer.client import LLMClient
from llm_kelt.inference import (
    ContextBuilder, ContextQuery, Embedder, RAGArgs, embed_missing_facts
)

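# 1. Embed facts for semantic search (model name discovered from server)
# Assumes `logger` and `config` are defined elsewhere, and that this code
# runs inside an async function (it uses `await`).
embedder = Embedder(base_url="http://localhost:8001/v1")
await embed_missing_facts(logger, embedder, kelt.facts)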

# 2. Create context-aware query interface
llm_client = LLMClient.from_config(config)
query = ContextQuery(
    client=llm_client,
    context_builder=ContextBuilder(kelt.facts),
    base_system_prompt="You are a helpful assistant.",
    embedder=embedder,
)

# 3. Ask questions - RAG finds relevant facts automatically
response = await query.ask(
    "What's my preferred coding style?",
    rag=RAGArgs(top_k=5, min_similarity=0.3),
)

# Filter by category
response = await query.ask(
    "What rules should I follow?",
    rag=RAGArgs(top_k=5, categories=["rules"]),
)

# Clean up
await embedder.close_async()
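The retrieval step can be pictured as ranking facts by cosine similarity to the query embedding, dropping anything below `min_similarity`, and keeping the best `top_k`. The sketch below illustrates that idea in plain Python with toy three-dimensional vectors. It is an assumption about the general technique, not llm-kelt's code (which, given the pgvector requirement, presumably performs the search in PostgreSQL); the names `cosine_similarity` and `retrieve` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, fact_vecs, top_k=5, min_similarity=0.3):
    """Return up to `top_k` (score, text) pairs scoring at least `min_similarity`."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in fact_vecs.items()
    ]
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


# Toy embeddings, invented for illustration only.
toy_embeddings = {
    "Prefers concise explanations": [1.0, 0.0, 0.0],
    "Expert Python developer": [0.0, 1.0, 0.0],
    "Always include code examples": [0.6, 0.8, 0.0],
}
hits = retrieve([1.0, 0.0, 0.0], toy_embeddings, top_k=2)
for score, text in hits:
    print(f"{score:.2f}  {text}")
```

A higher `min_similarity` trades recall for precision: fewer facts reach the prompt, but those that do are more likely to be relevant.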

Training Data Export

from llm_kelt.training import export_feedback_sft
from llm_kelt.training.dpo import export_preferences

# Record preference pairs
kelt.atomic.preferences.record(
    context="Explain gradient descent",
    chosen="Concise, accurate explanation",
    rejected="Verbose, rambling explanation",
)

# Export to DPO format for TRL
result = export_preferences(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="preferences.jsonl",
)
# Output: {"prompt": str, "chosen": str, "rejected": str}

# Export feedback to SFT format
result = export_feedback_sft(
    session_factory=kelt.database.session,
    context_key=kelt.context_key,
    output_path="feedback_sft.jsonl",
    signal="positive",
)
# Output: {"instruction": str, "output": str}
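Before handing an exported file to a trainer, it can help to check that every line carries the expected keys. The following sketch validates JSONL lines against the DPO record shape shown above (`prompt`, `chosen`, `rejected`). The helper `validate_dpo_lines` is hypothetical and not part of llm-kelt, and the sample record assumes that the `context` argument of `preferences.record` ends up as `prompt` in the export.

```python
import json

REQUIRED_KEYS = {"prompt", "chosen", "rejected"}


def validate_dpo_lines(lines):
    """Parse JSONL lines and check each record has the DPO keys.

    Blank lines are skipped; a record with missing keys raises ValueError.
    """
    records = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"line {number}: missing keys {sorted(missing)}")
        records.append(record)
    return records


# A sample line built from the preference pair recorded above.
sample_lines = [
    json.dumps({
        "prompt": "Explain gradient descent",
        "chosen": "Concise, accurate explanation",
        "rejected": "Verbose, rambling explanation",
    }),
    "",
]
records = validate_dpo_lines(sample_lines)
print(f"{len(records)} valid record(s)")
```

To check a real export, pass an open file handle, for example `validate_dpo_lines(open("preferences.jsonl", encoding="utf-8"))`.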

LoRA Fine-Tuning

from appinfra.log import LogConfig, LoggerFactory
from llm_kelt.training import train_lora, RunConfig
from llm_kelt.training.lora import Config as LoraConfig

lg = LoggerFactory.create_root(LogConfig.from_params(level="info"))

# Train LoRA adapter (requires pip install llm-kelt[training])
result = train_lora(
    lg=lg,
    data_path="feedback_sft.jsonl",
    output_dir="./my_adapter",
    base_model="Qwen/Qwen2.5-7B-Instruct",
    lora_config=LoraConfig(r=16, lora_alpha=32),
    training_config=RunConfig(
        num_epochs=3,
        batch_size=4,
        learning_rate=2e-4,
    ),
    quantize=True,  # QLoRA for lower VRAM
)

print(f"Adapter saved to: {result.adapter_path}")
print(f"Train loss: {result.metrics['train_loss']:.4f}")
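For intuition about `r=16` and `lora_alpha=32`: in the standard LoRA formulation, each adapted weight matrix of shape d_out x d_in gains two low-rank factors with r * (d_in + d_out) trainable parameters in total, and the update is scaled by lora_alpha / r. The sketch below computes both figures. The 4096 x 4096 layer size is a hypothetical example, not a dimension taken from the model above, and the helper names are illustrative only.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


def lora_scaling(lora_alpha, r):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return lora_alpha / r


d_in = d_out = 4096  # hypothetical layer size, for illustration only
added = lora_param_count(d_in, d_out, r=16)
full = d_in * d_out

print(f"LoRA parameters per matrix: {added}")
print(f"Fraction of full matrix: {added / full:.4%}")
print(f"Scaling factor: {lora_scaling(32, 16)}")
```

Doubling `r` doubles the adapter size; keeping `lora_alpha` at twice `r` keeps the scaling factor at 2.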

Architecture

Isolation Context (context_key)
  ├── Facts           → Injected into prompts (with embeddings for RAG)
  ├── Feedback        → Explicit signals (positive/negative)
  ├── Preferences     → DPO training pairs (chosen/rejected)
  ├── Interactions    → Implicit signals (view, click, scroll)
  ├── Content         → Deduplicated content storage
  ├── Directives      → Goals and rules
  └── Predictions     → Hypothesis tracking
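The idea of context-scoped isolation can be sketched without a database: every record is stored under a `context_key`, and a scoped handle only ever reads and writes under its own key. This in-memory toy is an assumption about the concept, not llm-kelt's storage layer; the class names `InMemoryBackend` and `ScopedFacts` are hypothetical.

```python
from collections import defaultdict


class InMemoryBackend:
    """Shared storage that partitions facts by context key."""

    def __init__(self):
        self._facts = defaultdict(list)

    def add(self, context_key, text, category):
        self._facts[context_key].append({"text": text, "category": category})

    def find(self, context_key, category=None):
        return [
            fact
            for fact in self._facts.get(context_key, [])
            if category is None or fact["category"] == category
        ]


class ScopedFacts:
    """A handle bound to one context key; it cannot see other contexts."""

    def __init__(self, backend, context_key):
        self._backend = backend
        self._context_key = context_key

    def add(self, text, category):
        self._backend.add(self._context_key, text, category)

    def find(self, category=None):
        return self._backend.find(self._context_key, category)


backend = InMemoryBackend()
alice = ScopedFacts(backend, "alice")
bob = ScopedFacts(backend, "bob")

alice.add("Prefers concise explanations", "preferences")
bob.add("Prefers detailed walkthroughs", "preferences")

print([fact["text"] for fact in alice.find()])
print([fact["text"] for fact in bob.find()])
```

Binding the key once, at construction time, means callers cannot forget to pass it on individual reads and writes.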

Data Flow

┌─────────────────────────────────────────────────────────────────────┐
│                           COLLECTION                                 │
│  Facts  │  Feedback  │  Preferences  │  Interactions  │  Content    │
└────────────────────────────┬────────────────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│    INFERENCE    │ │    TRAINING     │ │    ANALYSIS     │
│                 │ │                 │ │                 │
│ • Context       │ │ • Export DPO    │ │ • Stats         │
│   Injection     │ │ • Export SFT    │ │ • Trends        │
│ • RAG Retrieval │ │ • LoRA Training │ │ • Insights      │
│ • Embeddings    │ │ • DPO Training  │ │                 │
└─────────────────┘ └─────────────────┘ └─────────────────┘

Examples

See the examples/ directory for complete working examples:

01_facts_and_context.py - Facts storage and context injection
02_rag_retrieval.py - RAG with semantic search
03_training_export.py - Export to training formats
04_lora_training.py - LoRA fine-tuning workflow

API Reference

Core

| Class | Description |
|---|---|
| `Client` | Main entry point, scoped to a context |
| `FactsClient` | Store and retrieve facts |
| `FeedbackClient` | Record explicit feedback signals |
| `PreferencesClient` | Store preference pairs |

Inference

| Class/Function | Description |
|---|---|
| `ContextBuilder` | Build system prompts with injected facts |
| `ContextQuery` | High-level context-aware query interface |
| `Embedder` | Generate embeddings via OpenAI-compatible API |
| `RAGArgs` | Configuration for RAG retrieval |
| `embed_missing_facts` | Batch embed facts without embeddings |

Training

| Class/Function | Description |
|---|---|
| `dpo.export_preferences` | Export preference pairs for DPO |
| `export_feedback_sft` | Export feedback for SFT |
| `export_feedback_classifier` | Export for binary classification |
| `train_lora` | Train LoRA adapter with SFT |
| `train_dpo` | Train with Direct Preference Optimization |
| `lora.Config` | LoRA hyperparameters |
| `RunConfig` | Training hyperparameters |
| `AdapterRegistry` | Manage trained adapters |

Requirements

Python 3.11+
PostgreSQL 16+ with pgvector extension
For training: CUDA GPU (or MPS on Apple Silicon)

Configuration

Copy the environment template and customize paths:

cp .env.yaml.example .env.yaml

Edit .env.yaml with your local paths:

paths:
  models: !path ~/models/huggingface    # HuggingFace models directory
  adapters: !path ~/models/adapters     # Trained LoRA adapters

The main config lives in etc/llm-kelt.yaml. Its database and LLM settings look like this:

dbs:
  main:
    url: postgresql://user:pass@localhost:5432/llm_kelt
    extensions: [vector]

llm:
  default_backend: local
  backends:
    local:
      base_url: http://localhost:8000/v1
      model: default
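When a connection fails, it is often useful to confirm how the database URL breaks down into host, port, and database name. The sketch below does this with the standard library only. It does not reflect how llm-kelt reads its config, and `parse_db_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def parse_db_url(url):
    """Split a database URL into its components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


settings = parse_db_url("postgresql://user:pass@localhost:5432/llm_kelt")
for key, value in settings.items():
    # Avoid echoing the password when printing connection details.
    shown = "***" if key == "password" else value
    print(f"{key}: {shown}")
```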

License

Apache 2.0

Project details

Release history

0.2.0 (this version) - Mar 14, 2026
0.1.0 - Feb 26, 2026

Download files

Source Distribution

llm_kelt-0.2.0.tar.gz (237.4 kB)

Uploaded Mar 14, 2026 Source

Built Distribution

llm_kelt-0.2.0-py3-none-any.whl (165.2 kB)

Uploaded Mar 14, 2026 Python 3

File details

Details for the file llm_kelt-0.2.0.tar.gz.

File metadata

Download URL: llm_kelt-0.2.0.tar.gz
Upload date: Mar 14, 2026
Size: 237.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_kelt-0.2.0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | `88d5dbec3f4411947932c50750f0bd2715faa265c1b7827c08a4046533f647b1` |
| MD5 | `7ac0b752168ed62dbcfcc08aeef1036d` |
| BLAKE2b-256 | `a1e686860c0bd0166ea246bf7ec53df262bc23b61c15c9ee97266ea11bcce305` |
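A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a generic sketch; the helper names `sha256_of_file` and `matches_digest` are illustrative, and the local file path in the usage comment is an assumption.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the sdist was saved in the current directory:
# ok = matches_digest(
#     "llm_kelt-0.2.0.tar.gz",
#     "88d5dbec3f4411947932c50750f0bd2715faa265c1b7827c08a4046533f647b1",
# )
```

Reading in chunks keeps memory use constant regardless of file size.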

Provenance

The following attestation bundles were made for llm_kelt-0.2.0.tar.gz:

Publisher: release.yml on llm-works/llm-kelt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_kelt-0.2.0.tar.gz
- Subject digest: 88d5dbec3f4411947932c50750f0bd2715faa265c1b7827c08a4046533f647b1
- Sigstore transparency entry: 1101132595
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: llm-works/llm-kelt@0fe66187c6cba3981c3cc99518e06f2a0c076afc
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/llm-works
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0fe66187c6cba3981c3cc99518e06f2a0c076afc
- Trigger Event: push

File details

Details for the file llm_kelt-0.2.0-py3-none-any.whl.

File metadata

Download URL: llm_kelt-0.2.0-py3-none-any.whl
Upload date: Mar 14, 2026
Size: 165.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_kelt-0.2.0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5971dc216763b9113e8b89f26277fd6a0ced661643dfcc8932a4c8096d53b74e` |
| MD5 | `1307a799fca778f91ee8614bda8dc308` |
| BLAKE2b-256 | `a2fd23bc621b2ff36c0db212f4b056ee41c444e1c4e2f8df5a6cd07a025463f3` |

Provenance

The following attestation bundles were made for llm_kelt-0.2.0-py3-none-any.whl:

Publisher: release.yml on llm-works/llm-kelt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_kelt-0.2.0-py3-none-any.whl
- Subject digest: 5971dc216763b9113e8b89f26277fd6a0ced661643dfcc8932a4c8096d53b74e
- Sigstore transparency entry: 1101132598
- Sigstore integration time: Mar 14, 2026
Source repository:
- Permalink: llm-works/llm-kelt@0fe66187c6cba3981c3cc99518e06f2a0c076afc
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/llm-works
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0fe66187c6cba3981c3cc99518e06f2a0c076afc
- Trigger Event: push
