# LeanPrompt (Backend)
LeanPrompt is an engineering-centric LLM integration framework based on FastAPI. It helps you use LLMs as reliable and predictable software components, not just text generators.
## ✨ Key Features
- **FastAPI Native**: Integrates instantly into existing FastAPI apps as a plugin.
- **Markdown-Driven Prompts**: Manage prompts as `.md` files, separated from code. Filenames become API paths.
- **Session-Based Context Caching**: Saves token costs by sending prompts only at the start of a session and then sending only input deltas.
- **Output Guardrails**: Built-in output validation and automatic retry logic via Pydantic models.
- **WebSocket First**: Highly optimized WebSocket support for real-time streaming feedback.
## 🚀 Quick Start

### Installation

```shell
pip install leanprompt
```

### Basic Usage
```python
from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

app = FastAPI()

# Initialize LeanPrompt with your preferred provider
# Configure via environment variable: LEANPROMPT_LLM_PROVIDER="provider|api_key"
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)
lp = LeanPrompt(app, provider=provider_name, prompt_dir="prompts", api_key=api_key)

# Define output model for validation
class CalculationResult(BaseModel):
    result: int

# Create a calculator endpoint
@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass  # LeanPrompt handles the logic
```
### API Prefix and WebSocket Path
You can apply a shared prefix to all LeanPrompt routes and the WebSocket endpoint:
```python
app = FastAPI()

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    api_prefix="/api",
    ws_path="ws",  # relative -> /api/ws/{client_id}
)

@lp.route("/calc/add", prompt_file="add.md")
async def add(user_input: str):
    pass
```
Clients can keep using the same LeanPrompt path value (`/calc/add`) while connecting to `ws://localhost:8000/api/ws/{client_id}`. Using an absolute `ws_path` (e.g., `"/ws"`) keeps the WebSocket route outside the `api_prefix`. Avoid `ws_path="/"` to prevent route collisions.
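The resolution rule above can be sketched as a small helper (illustrative only; `resolve_ws_path` is not part of the LeanPrompt API):

```python
def resolve_ws_path(api_prefix: str, ws_path: str) -> str:
    """Sketch of the documented rule: an absolute ws_path ignores the
    prefix, while a relative ws_path is joined under api_prefix."""
    if ws_path.startswith("/"):
        base = ws_path
    else:
        base = f"{api_prefix.rstrip('/')}/{ws_path}"
    return f"{base.rstrip('/')}/{{client_id}}"

print(resolve_ws_path("/api", "ws"))   # -> /api/ws/{client_id}
print(resolve_ws_path("/api", "/ws"))  # -> /ws/{client_id}
```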
If you already mount a sub-application under a prefix, LeanPrompt can attach to it directly:
```python
app = FastAPI()
api = FastAPI()
app.mount("/api", api)

lp = LeanPrompt(
    api,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_path="/ws",  # -> /api/ws/{client_id}
)
```
### JWT Annotation Example
LeanPrompt routes can reuse a JWT validator annotation for HTTP requests:
```python
from fastapi import Request
from leanprompt import Guard

def require_jwt(request: Request) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(request.headers.get("authorization"))

@lp.route("/secure/add", prompt_file="add.md")
@Guard.auth(require_jwt)
@Guard.validate(CalculationResult)
async def secure_add(user_input: str):
    pass
```
For WebSocket authentication, pass a validation hook when you construct LeanPrompt:
```python
from fastapi import WebSocket

def require_ws_jwt(websocket: WebSocket) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(websocket.headers.get("authorization"))

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_auth=require_ws_jwt,
)
```
### WebSocket Interceptors
You can intercept inbound/outbound WebSocket messages for metering, auditing, or billing.
If the request interceptor returns `False` or `{"error": "..."}`, the request is blocked and the error payload is returned immediately.
Interceptor signature:

```python
def interceptor(websocket: WebSocket, event: dict):
    ...
```

Event payload shape:

```
{
  "direction": "inbound" | "outbound",
  "client_id": "...",
  "path": "/route",
  "payload": { "path": "/route", "message": "..." } | { "response": "...", "path": "/route" },
  "raw": "{...}",
  "byte_length": 123
}
```
Return behavior:

- Request interceptor (`ws_request_interceptor`)
  - Return `None` / no return: the request continues to normal processing.
  - Return `False`: the request is blocked and `{"error": "WebSocket request rejected"}` is sent.
  - Return `{"error": "..."}`: the request is blocked and the dict is sent as-is (`path` is added if missing).
  - Raise an exception: treated as blocked and `{"error": "<exception message>"}` is sent.
- Response interceptor (`ws_response_interceptor`)
  - The return value is ignored; it never blocks the response.
  - Exceptions are logged and the response still proceeds.
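These blocking rules can be condensed into a dispatch sketch (illustrative only; `apply_request_interceptor` is not part of LeanPrompt's public API):

```python
def apply_request_interceptor(interceptor, websocket, event: dict):
    """Apply the documented return-value rules for a request interceptor.
    Returns an error payload if the request is blocked, else None."""
    try:
        result = interceptor(websocket, event)
    except Exception as exc:
        # An exception is treated as a blocked request
        return {"error": str(exc)}
    if result is None:
        return None  # request continues to normal processing
    if result is False:
        return {"error": "WebSocket request rejected"}
    if isinstance(result, dict) and "error" in result:
        result.setdefault("path", event.get("path"))  # path is added if missing
        return result
    return None
```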
```python
from fastapi import WebSocket

billing_state = {
    "credits": 10_000,  # bytes
    "usage": 0,
}

def ws_billing(websocket: WebSocket, event: dict):
    # event keys: direction, client_id, path, payload, raw, byte_length
    if event["direction"] == "inbound":
        projected = billing_state["usage"] + event["byte_length"]
        if projected > billing_state["credits"]:
            return {"error": "Billing failed: insufficient credits", "code": "billing_failed"}
        billing_state["usage"] = projected
    else:
        billing_state["usage"] += event["byte_length"]

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_request_interceptor=ws_billing,
    ws_response_interceptor=ws_billing,
)
```
### Complete Example Server
Here's a full example with multiple endpoints:
```python
from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

# Define output models
class MoodJson(BaseModel):
    current_mood: str
    confidence: float
    reason: str

class CalculationResult(BaseModel):
    result: int

app = FastAPI()

# Initialize LeanPrompt
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)
lp = LeanPrompt(app, provider=provider_name, prompt_dir="examples/prompts", api_key=api_key)

@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass

@lp.route("/calc/multiply", prompt_file="multiply.md")
@Guard.validate(CalculationResult)
async def multiply(user_input: str):
    """Performs multiplication based on user input."""
    pass

@lp.route("/mood/json", prompt_file="mood_json.md")
@Guard.validate(MoodJson)
async def get_mood_json(user_input: str):
    """Returns the mood analysis in JSON format."""
    pass

# Custom validation for markdown content
def validate_markdown_content(text: str):
    if "##" not in text and "**" not in text:
        raise ValueError("Response does not look like Markdown")
    if "Meanings" not in text:
        raise ValueError("Missing required section: 'Meanings'")
    return {"raw_markdown": text}

@lp.route("/linguist", prompt_file="word_relationships.md")
@Guard.custom(validate_markdown_content)
async def analyze_words(user_input: str):
    """Analyzes word relationships and returns markdown."""
    pass
```
### Using a Local LLM (Ollama)
You can use local LLMs like Qwen 2.5 Coder or DeepSeek-Coder-V2 via Ollama.
1. Install and run Ollama:

   ```shell
   ollama run qwen2.5-coder
   ```

2. Initialize LeanPrompt with the `ollama` provider:

   ```python
   lp = LeanPrompt(
       app,
       provider="ollama",
       base_url="http://localhost:11434",  # Optional, defaults to this
       model="qwen2.5-coder",  # Specify the model name here or in prompt frontmatter
   )
   ```
### Supported Providers

LeanPrompt supports multiple LLM providers:

- OpenAI: `provider="openai"`
- DeepSeek: `provider="deepseek"`
- Google Gemini: `provider="google"`
- Ollama (Local): `provider="ollama"`
## 📂 Project Structure

```
leanprompt/
├── leanprompt/          # Main library code
│   ├── core.py          # Core logic (FastAPI integration)
│   ├── guard.py         # Validation logic
│   └── providers/       # LLM provider implementations
├── examples/            # Usage examples
│   ├── main.py          # Example FastAPI app
│   └── prompts/         # Example prompt files
├── tests/               # Unit tests
├── setup.py             # Package installation script
└── requirements.txt     # Dependencies
```
## 🏃 Running the Example

1. Install Dependencies:

   ```shell
   pip install -r requirements.txt
   ```

2. Set Environment Variable:

   ```shell
   # Format: provider|api_key
   export LEANPROMPT_LLM_PROVIDER="openai|your_openai_api_key"
   # Or for DeepSeek:
   export LEANPROMPT_LLM_PROVIDER="deepseek|your_deepseek_api_key"
   ```

3. Run the Example Server:

   ```shell
   # Run from the root directory
   export PYTHONPATH=$PYTHONPATH:$(pwd)
   python examples/main.py
   ```
## 📡 API Examples

### HTTP Endpoints
**Calculation (Add):**

```shell
curl -X POST "http://localhost:8000/calc/add" \
  -H "Content-Type: application/json" \
  -d '{"message": "50 + 50"}'
# Response: {"result": 100}
```

**Calculation (Multiply):**

```shell
curl -X POST "http://localhost:8000/calc/multiply" \
  -H "Content-Type: application/json" \
  -d '{"message": "10 * 5"}'
# Response: {"result": 50}
```

**Mood Analysis (JSON):**

```shell
curl -X POST "http://localhost:8000/mood/json" \
  -H "Content-Type: application/json" \
  -d '{"message": "I am feeling great today!"}'
# Response: {"current_mood": "Happy", "confidence": 0.9, "reason": "Positive language used"}
```

**Word Relationship Analysis:**

```shell
curl -X POST "http://localhost:8000/linguist" \
  -H "Content-Type: application/json" \
  -d '{"message": "apple, banana, cherry"}'
# Response: Markdown formatted analysis with meanings and relationships
```
### WebSocket Interface
LeanPrompt provides a WebSocket interface for real-time streaming and context management:
```python
import websocket  # pip install websocket-client
import json

def on_message(ws, message):
    response = json.loads(message)
    print(f"Path: {response.get('path')}")
    print(f"Response: {response['response']}")

def on_open(ws):
    # Send different requests to test routing and context
    ws.send(json.dumps({"path": "/calc/add", "message": "10 + 20"}))
    ws.send(json.dumps({"path": "/calc/multiply", "message": "5 * 5"}))
    ws.send(json.dumps({"path": "/linguist", "message": "apple, banana, cherry"}))
    ws.send(json.dumps({"path": "/linguist", "message": "What color are they?"}))

ws = websocket.WebSocketApp(
    "ws://localhost:8000/ws/test_client",
    on_open=on_open,
    on_message=on_message,
)
ws.run_forever()  # messages are sent once the connection opens
```
### Context Chaining Example
The WebSocket interface maintains separate conversation contexts for each path:
```python
# First message to the /linguist path
ws.send(json.dumps({
    "path": "/linguist",
    "message": "apple, banana, cherry"
}))

# Follow-up message - the AI remembers the previous context
ws.send(json.dumps({
    "path": "/linguist",
    "message": "What color are they?"
}))
# The response will mention red, yellow, etc., showing context memory
```
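The per-path context separation can be modeled as a mapping from (client, path) to an independent message history. This is a sketch of the concept only, not LeanPrompt's actual storage:

```python
from collections import defaultdict

class ContextStore:
    """Keeps an independent message history per (client_id, path)."""
    def __init__(self):
        self._histories = defaultdict(list)

    def append(self, client_id: str, path: str, role: str, content: str):
        self._histories[(client_id, path)].append({"role": role, "content": content})

    def history(self, client_id: str, path: str) -> list:
        return self._histories[(client_id, path)]

store = ContextStore()
store.append("test_client", "/linguist", "user", "apple, banana, cherry")
store.append("test_client", "/linguist", "user", "What color are they?")
store.append("test_client", "/calc/add", "user", "10 + 20")
# /linguist keeps both turns; /calc/add has its own isolated history
```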
## 📝 Prompt Templates

LeanPrompt uses markdown files with frontmatter for prompt templates:

**Example: add.md**
```markdown
---
model: deepseek-chat
temperature: 0.1
---
You are a calculator.
Perform the addition requested by the user.
Return the result in valid JSON format matching this schema:
{"result": integer}

Example:
User: 1 + 1
AI: {"result": 2}

Only return the JSON object.
```
**Example: word_relationships.md**
```markdown
---
model: deepseek-chat
---
You are a helpful linguist.
The user will provide three English words.
Please provide the meaning of each word and explain the relationships between them.
Return the response in Markdown format.
Use headers like "## Meanings" and "## Relationships" to structure your response.
```
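Frontmatter of this shape can be split from the prompt body with a few lines of parsing. The sketch below shows the idea; LeanPrompt's own loader may differ:

```python
def parse_prompt(text: str):
    """Split '---'-delimited frontmatter from the prompt body.
    Returns (metadata dict, body string)."""
    meta = {}
    body = text
    if text.startswith("---"):
        _, front, body = text.split("---", 2)
        for line in front.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.strip()

meta, body = parse_prompt("---\nmodel: deepseek-chat\ntemperature: 0.1\n---\nYou are a calculator.")
# meta -> {"model": "deepseek-chat", "temperature": "0.1"}
```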
## 🛡️ Output Validation
LeanPrompt provides built-in output validation using Pydantic models:
```python
from pydantic import BaseModel
from leanprompt import Guard

class MoodResponse(BaseModel):
    mood: str
    intensity: int  # 1-10
    notes: str

@lp.route("/mood", prompt_file="mood.md")
@Guard.validate(MoodResponse)
async def analyze_mood(user_input: str):
    pass  # Automatically validates and converts the LLM response
```
For custom validation logic:
```python
def validate_markdown(text: str):
    if "##" not in text:
        raise ValueError("Invalid markdown format")
    return text

@lp.route("/custom", prompt_file="custom.md")
@Guard.custom(validate_markdown)
async def custom_endpoint(user_input: str):
    pass
```
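The automatic retry behavior mentioned under Key Features can be sketched as a validate-and-retry loop. This is illustrative only: `generate_validated`, `call_llm`, and `max_retries` are assumed names, and a plain validator stands in for Pydantic to keep the sketch dependency-free:

```python
import json

def validate_calculation(raw: str) -> dict:
    """Stand-in for Pydantic validation: parse JSON, require an int 'result'."""
    data = json.loads(raw)
    if not isinstance(data.get("result"), int):
        raise ValueError("missing integer 'result'")
    return data

def generate_validated(call_llm, validator, max_retries: int = 2):
    """Call the LLM, validate its output, and retry on validation failure."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return validator(call_llm())
        except ValueError as exc:  # JSONDecodeError is a ValueError subclass
            last_error = exc       # a real system would feed this back to the model
    raise last_error

# Stub LLM that fails once, then returns valid JSON
responses = iter(["oops, not JSON", '{"result": 100}'])
validated = generate_validated(lambda: next(responses), validate_calculation)
# validated -> {"result": 100} after one retry
```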