
A FastAPI-based LLM integration framework for engineering-centric AI development.


LeanPrompt (Backend)

LeanPrompt is an engineering-centric LLM integration framework based on FastAPI. It helps you use LLMs as reliable and predictable software components, not just text generators.

✨ Key Features

  • FastAPI Native: Integrates instantly into existing FastAPI apps as a plugin.
  • Markdown-Driven Prompts: Manage prompts as .md files, separated from code. Filenames become API paths.
  • Session-Based Context Caching: Cuts token costs by sending the full prompt only at the start of a session and only input deltas afterward.
  • Output Guardrails: Built-in output validation and automatic retry logic via Pydantic models.
  • WebSocket First: Highly optimized WebSocket support for real-time streaming feedback.

🚀 Quick Start

Installation

pip install leanprompt

Basic Usage

from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

app = FastAPI()

# Initialize LeanPrompt with your preferred provider
# Configure via environment variable: LEANPROMPT_LLM_PROVIDER="provider|api_key"
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)

lp = LeanPrompt(app, provider=provider_name, prompt_dir="prompts", api_key=api_key)

# Define output model for validation
class CalculationResult(BaseModel):
    result: int

# Create a calculator endpoint
@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass  # LeanPrompt handles the logic

API Prefix and WebSocket Path

You can apply a shared prefix to all LeanPrompt routes and the WebSocket endpoint:

app = FastAPI()

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    api_prefix="/api",
    ws_path="ws",  # relative -> /api/ws/{client_id}
)

@lp.route("/calc/add", prompt_file="add.md")
async def add(user_input: str):
    pass

Clients can keep using the same LeanPrompt path value (/calc/add) while connecting to ws://localhost:8000/api/ws/{client_id}.

Using an absolute ws_path (e.g., "/ws") keeps the WebSocket route outside the api_prefix. Avoid ws_path="/" to prevent route collisions.

If you already configure a FastAPI router prefix, LeanPrompt can attach to it directly:

app = FastAPI()
api = FastAPI()
app.mount("/api", api)

lp = LeanPrompt(
    api,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_path="/ws",  # -> /api/ws/{client_id}
)

JWT Annotation Example

LeanPrompt routes can reuse a JWT validator annotation for HTTP requests:

from fastapi import Request
from leanprompt import Guard

def require_jwt(request: Request) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(request.headers.get("authorization"))

@lp.route("/secure/add", prompt_file="add.md")
@Guard.auth(require_jwt)
@Guard.validate(CalculationResult)
async def secure_add(user_input: str):
    pass
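
The placeholder above only checks that an Authorization header exists. A real validator must verify the signature and expiry. As an illustration, here is a stdlib-only HS256 sketch; the helper names (`verify_hs256`, `sign_hs256`) are hypothetical, not part of LeanPrompt, and production code should prefer a maintained library such as PyJWT:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url_decode(part: str) -> bytes:
    # JWTs strip base64 padding; restore it before decoding
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify an HS256 JWT's signature and expiry, returning its claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", float("inf")) < time.time():
        raise ValueError("token expired")
    return claims

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Produce a token (useful for testing the verifier)."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj, separators=(",", ":")).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    head, body = enc({"alg": "HS256", "typ": "JWT"}), enc(claims)
    sig = hmac.new(secret, f"{head}.{body}".encode(), hashlib.sha256).digest()
    return f"{head}.{body}." + base64.urlsafe_b64encode(sig).rstrip(b"=").decode()
```

Inside `require_jwt`, you would strip the `Bearer ` prefix from the header value and return `True` only if `verify_hs256` succeeds.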

For WebSocket authentication, pass a validation hook when you construct LeanPrompt:

from fastapi import WebSocket

def require_ws_jwt(websocket: WebSocket) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(websocket.headers.get("authorization"))

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_auth=require_ws_jwt,
)

WebSocket Interceptors

You can intercept inbound/outbound WebSocket messages for metering, auditing, or billing. If the request interceptor returns False or {"error": "..."}, the request is blocked and the error payload is returned immediately.

Interceptor signature:

def interceptor(websocket: WebSocket, event: dict):
    ...

Event payload shape:

{
  "direction": "inbound" | "outbound",
  "client_id": "...",
  "path": "/route",
  "payload": { "path": "/route", "message": "..." } | { "response": "...", "path": "/route" },
  "raw": "{...}",
  "byte_length": 123
}

Return behavior:

  • Request interceptor (ws_request_interceptor)
    • Return None / no return: request continues to normal processing.
    • Return False: request is blocked and { "error": "WebSocket request rejected" } is sent.
    • Return { "error": "..." }: request is blocked and the dict is sent as-is (path is added if missing).
    • Raise an exception: treated as blocked and { "error": "<exception message>" } is sent.
  • Response interceptor (ws_response_interceptor)
    • Return value is ignored; it never blocks the response.
    • Exceptions are logged and the response still proceeds.
Example: a byte-metering interceptor shared by the request and response hooks:

from fastapi import WebSocket

billing_state = {
    "credits": 10_000,  # bytes
    "usage": 0,
}

def ws_billing(websocket: WebSocket, event: dict):
    # event keys: direction, client_id, path, payload, raw, byte_length
    if event["direction"] == "inbound":
        projected = billing_state["usage"] + event["byte_length"]
        if projected > billing_state["credits"]:
            return {"error": "Billing failed: insufficient credits", "code": "billing_failed"}
        billing_state["usage"] = projected
    else:
        billing_state["usage"] += event["byte_length"]

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_request_interceptor=ws_billing,
    ws_response_interceptor=ws_billing,
)

Complete Example Server

Here's a full example with multiple endpoints:

from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

# Define output models
class MoodJson(BaseModel):
    current_mood: str
    confidence: float
    reason: str

class CalculationResult(BaseModel):
    result: int

app = FastAPI()

# Initialize LeanPrompt
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)
lp = LeanPrompt(app, provider=provider_name, prompt_dir="examples/prompts", api_key=api_key)

@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass

@lp.route("/calc/multiply", prompt_file="multiply.md")
@Guard.validate(CalculationResult)
async def multiply(user_input: str):
    """Performs multiplication based on user input."""
    pass

@lp.route("/mood/json", prompt_file="mood_json.md")
@Guard.validate(MoodJson)
async def get_mood_json(user_input: str):
    """Returns the mood analysis in JSON format."""
    pass

# Custom validation for markdown content
def validate_markdown_content(text: str):
    if "##" not in text and "**" not in text:
        raise ValueError("Response does not look like Markdown")
    if "Meanings" not in text:
        raise ValueError("Missing required section: 'Meanings'")
    return {"raw_markdown": text}

@lp.route("/linguist", prompt_file="word_relationships.md")
@Guard.custom(validate_markdown_content)
async def analyze_words(user_input: str):
    """Analyzes word relationships and returns markdown."""
    pass

Using Local LLM (Ollama)

You can use local LLMs like Qwen 2.5 Coder or DeepSeek-Coder-V2 via Ollama.

  1. Install and run Ollama:

    ollama run qwen2.5-coder
    
  2. Initialize LeanPrompt with ollama provider:

    lp = LeanPrompt(
        app, 
        provider="ollama", 
        base_url="http://localhost:11434", # Optional, defaults to this
        model="qwen2.5-coder" # Specify the model name here or in prompt frontmatter
    )
    

Supported Providers

LeanPrompt supports multiple LLM providers:

  • OpenAI: provider="openai"
  • DeepSeek: provider="deepseek"
  • Google Gemini: provider="google"
  • Ollama (Local): provider="ollama"

📂 Project Structure

leanprompt/
├── leanprompt/          # Main library code
│   ├── core.py          # Core logic (FastAPI integration)
│   ├── guard.py         # Validation logic
│   └── providers/       # LLM provider implementations
├── examples/            # Usage examples
│   ├── main.py          # Example FastAPI app
│   └── prompts/         # Example prompt files
├── tests/               # Unit tests
├── setup.py             # Package installation script
└── requirements.txt     # Dependencies

🏃 Running the Example

  1. Install Dependencies:

    pip install -r requirements.txt
    
  2. Set Environment Variable:

    # Format: provider|api_key
    export LEANPROMPT_LLM_PROVIDER="openai|your_openai_api_key"
    
    # Or for DeepSeek:
    export LEANPROMPT_LLM_PROVIDER="deepseek|your_deepseek_api_key"
    
  3. Run the Example Server:

    # Run from the root directory
    export PYTHONPATH=$PYTHONPATH:$(pwd)
    python examples/main.py
    

📡 API Examples

HTTP Endpoints

Calculation (Add):

curl -X POST "http://localhost:8000/calc/add" \
     -H "Content-Type: application/json" \
     -d '{"message": "50 + 50"}'
# Response: {"result": 100}

Calculation (Multiply):

curl -X POST "http://localhost:8000/calc/multiply" \
     -H "Content-Type: application/json" \
     -d '{"message": "10 * 5"}'
# Response: {"result": 50}

Mood Analysis (JSON):

curl -X POST "http://localhost:8000/mood/json" \
     -H "Content-Type: application/json" \
     -d '{"message": "I am feeling great today!"}'
# Response: {"current_mood": "Happy", "confidence": 0.9, "reason": "Positive language used"}

Word Relationship Analysis:

curl -X POST "http://localhost:8000/linguist" \
     -H "Content-Type: application/json" \
     -d '{"message": "apple, banana, cherry"}'
# Response: Markdown formatted analysis with meanings and relationships

WebSocket Interface

LeanPrompt provides a WebSocket interface for real-time streaming and context management:

import websocket
import json

def on_message(ws, message):
    response = json.loads(message)
    print(f"Path: {response.get('path')}")
    print(f"Response: {response['response']}")

def on_open(ws):
    # Send different requests to test routing and context.
    # Sending must happen after the connection opens, i.e. inside on_open.
    ws.send(json.dumps({"path": "/calc/add", "message": "10 + 20"}))
    ws.send(json.dumps({"path": "/calc/multiply", "message": "5 * 5"}))
    ws.send(json.dumps({"path": "/linguist", "message": "apple, banana, cherry"}))
    ws.send(json.dumps({"path": "/linguist", "message": "What color are they?"}))

ws = websocket.WebSocketApp(
    "ws://localhost:8000/ws/test_client",
    on_open=on_open,
    on_message=on_message,
)
ws.run_forever()  # connects and dispatches callbacks

Context Chaining Example

The WebSocket interface maintains separate conversation contexts for each path:

# First message to /linguist path
ws.send(json.dumps({
    "path": "/linguist", 
    "message": "apple, banana, cherry"
}))

# Follow-up message - AI remembers the previous context
ws.send(json.dumps({
    "path": "/linguist", 
    "message": "What color are they?"
}))
# Response will mention red, yellow, etc. showing context memory
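
Conceptually, this behavior amounts to keeping one message history per client and per path. The sketch below illustrates the idea with a nested mapping; it is not LeanPrompt's actual internal storage:

```python
from collections import defaultdict

# client_id -> path -> list of chat messages (illustrative only)
contexts = defaultdict(lambda: defaultdict(list))

def record_turn(client_id: str, path: str, user_msg: str, reply: str) -> list:
    """Append one user/assistant exchange to that path's history."""
    history = contexts[client_id][path]
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": reply})
    return history

record_turn("test_client", "/linguist", "apple, banana, cherry", "## Meanings ...")
record_turn("test_client", "/linguist", "What color are they?", "Red, yellow, red.")
# The /calc/add history stays independent of /linguist
record_turn("test_client", "/calc/add", "10 + 20", '{"result": 30}')
```

Because each path keys its own list, a follow-up question on `/linguist` sees the earlier fruit list, while `/calc/add` starts from its own context.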

📝 Prompt Templates

LeanPrompt uses markdown files with frontmatter for prompt templates:

Example: add.md

---
model: deepseek-chat
temperature: 0.1
---
You are a calculator.
Perform the addition requested by the user.
Return the result in valid JSON format matching this schema:
{"result": integer}

Example:
User: 1 + 1
AI: {"result": 2}

Only return the JSON object.

Example: word_relationships.md

---
model: deepseek-chat
---
You are a helpful linguist.
The user will provide three English words.
Please provide the meaning of each word and explain the relationships between them.
Return the response in Markdown format.
Use headers like "## Meanings" and "## Relationships" to structure your response.
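
A minimal sketch of how such a file could be split into frontmatter and prompt body (LeanPrompt's real parser may differ; this assumes simple `key: value` frontmatter and leaves values as strings, where a real parser would use YAML):

```python
def parse_prompt(text: str):
    """Split a prompt file into a frontmatter dict and the prompt body."""
    meta, body = {}, text
    if text.startswith("---"):
        # text is "---\n<frontmatter>\n---\n<body>"
        _, frontmatter, body = text.split("---", 2)
        for line in frontmatter.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, body.strip()

meta, body = parse_prompt(
    "---\nmodel: deepseek-chat\ntemperature: 0.1\n---\nYou are a calculator."
)
```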

🛡️ Output Validation

LeanPrompt provides built-in output validation using Pydantic models:

from pydantic import BaseModel
from leanprompt import Guard

class MoodResponse(BaseModel):
    mood: str
    intensity: int  # 1-10
    notes: str

@lp.route("/mood", prompt_file="mood.md")
@Guard.validate(MoodResponse)
async def analyze_mood(user_input: str):
    pass  # Automatically validates and converts LLM response

For custom validation logic:

def validate_markdown(text: str):
    if "##" not in text:
        raise ValueError("Invalid markdown format")
    return text

@lp.route("/custom", prompt_file="custom.md")
@Guard.custom(validate_markdown)
async def custom_endpoint(user_input: str):
    pass
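
Conceptually, output validation with automatic retry reduces to a loop: call the model, validate, and re-ask on failure. This stdlib-only sketch illustrates the idea (`call_with_validation` and `validate_result` are hypothetical names, not the library's actual retry logic):

```python
import json

def validate_result(text: str) -> dict:
    """Accept only a JSON object with an integer 'result' field."""
    data = json.loads(text)
    if not isinstance(data.get("result"), int):
        raise ValueError("missing integer 'result'")
    return data

def call_with_validation(llm_call, validator, max_retries: int = 2):
    """Call the model, validate its output, and retry on failure."""
    last_err = None
    for _ in range(max_retries + 1):
        raw = llm_call()
        try:
            return validator(raw)
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            last_err = err  # re-ask the model and try again
    raise last_err

# A fake model that answers badly once, then correctly
replies = iter(["fifty plus fifty is 100", '{"result": 100}'])
validated = call_with_validation(lambda: next(replies), validate_result)
```

With a Pydantic model as in `Guard.validate`, the validator step would parse the raw text into the model instead of a plain dict.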
