
LLM Program

llmprogram is a Python package that provides a structured and powerful way to create and run programs that use Large Language Models (LLMs). It uses a YAML-based configuration to define the behavior of your LLM programs, making them easy to create, manage, and share.

How is llmprogram different?

There are many libraries and frameworks available for working with LLMs. Here’s what makes llmprogram different:

  • Focus on Programmatic LLM-Chains: llmprogram is designed to create self-contained, reusable "programs" that can be chained together to build more complex applications (see the sketch after this list). The YAML-based configuration makes it easy to define and version these programs.
  • Data Quality and Validation: The built-in input and output validation using JSON schemas ensures that your programs are robust and that the data flowing through them is correct. This is crucial for building reliable LLM-powered applications.
  • Dataset Generation as a First-Class Citizen: llmprogram is designed with the entire lifecycle of an LLM application in mind, from development to production and fine-tuning. The automatic logging to a SQLite database makes it incredibly easy to create high-quality datasets for fine-tuning your own models.
  • Simplicity and Intuitiveness: The YAML configuration is easy to read and write, and the Python API is simple and intuitive. This makes it easy to get started and to build complex applications without a steep learning curve.
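
A minimal chaining sketch, assuming (as in Getting Started below) that awaiting a program returns its validated output as a dict, and that reply_writer.yaml is a hypothetical second program whose inputs include the first program's sentiment field:

import asyncio
from llmprogram import LLMProgram

async def main():
    # First program: classify the sentiment of a review.
    classify = LLMProgram('sentiment_analysis.yaml')
    # Hypothetical second program: draft a reply matching that sentiment.
    reply = LLMProgram('reply_writer.yaml')

    review = 'The battery died after two days.'
    analysis = await classify(text=review)
    # Pass the first program's validated output into the second program.
    print(await reply(text=review, sentiment=analysis['sentiment']))

asyncio.run(main())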

Features

  • YAML-based Configuration: Define your LLM programs using simple and intuitive YAML files.
  • Input/Output Validation: Use JSON schemas to validate the inputs and outputs of your programs, ensuring data integrity.
  • Jinja2 Templating: Use the power of Jinja2 templates to create dynamic prompts for your LLMs.
  • Caching: Built-in support for Redis caching to save time and reduce costs.
  • Execution Logging: Automatically log program executions to a SQLite database for analysis and debugging.
  • Streaming: Support for streaming responses from the LLM.
  • Extensible with Tools: Extend the functionality of your programs by adding custom tools (functions) that the LLM can call.
  • Batch Processing: Process multiple inputs in parallel for improved performance.
  • CLI for Dataset Generation: A command-line interface to generate instruction datasets for LLM fine-tuning from your logged data.
  • Web Service: Expose your programs as REST API endpoints with automatic OpenAPI documentation.
  • Analytics: Comprehensive analytics tracking with DuckDB for token usage, LLM calls, program usage, and timing metrics.
  • AI-Assisted YAML Generation: Generate LLM program YAML files automatically based on natural language descriptions.

Getting Started

Installation

pip install llmprogram

Usage

  1. Set your OpenAI API Key:

    export OPENAI_API_KEY='your-api-key'
    
  2. Create a program YAML file:

    Create a file named sentiment_analysis.yaml:

    name: sentiment_analysis
    description: Analyzes the sentiment of a given text.
    version: 1.0.0
    
    model:
      provider: openai
      name: gpt-4.1-mini
      temperature: 0.5
      max_tokens: 100
      response_format: json_object
    
    system_prompt: |
      You are a sentiment analysis expert. Analyze the sentiment of the given text and return a JSON response with the following format:
      - sentiment (string): "positive", "negative", or "neutral"
      - score (number): A score from -1 (most negative) to 1 (most positive)
    
    input_schema:
      type: object
      required:
        - text
      properties:
        text:
          type: string
          description: The text to analyze.
    
    output_schema:
      type: object
      required:
        - sentiment
        - score
      properties:
        sentiment:
          type: string
          enum: ["positive", "negative", "neutral"]
        score:
          type: number
          minimum: -1
          maximum: 1
    
    template: |
      Analyze the following text:
      {{text}}
    
  3. Run the program using the CLI:

    # Using a JSON input file
    llmprogram run sentiment_analysis.yaml --inputs sentiment_inputs.json
    
    # Using inline JSON
    llmprogram run sentiment_analysis.yaml --input-json '{"text": "I love this product!"}'
    

    Or create a file named run_sentiment_analysis.py:

    import asyncio
    from llmprogram import LLMProgram
    
    async def main():
        program = LLMProgram('sentiment_analysis.yaml')
        result = await program(text='I love this new product! It is amazing.')
        print(result)
    
    if __name__ == '__main__':
        asyncio.run(main())
    

    Run the script:

    python run_sentiment_analysis.py
    

Configuration

The behavior of each LLM program is defined in a YAML file. Here are the key sections:

  • name, description, version: Basic metadata for your program.
  • model: Defines the LLM provider, model name, and other parameters like temperature and max_tokens.
  • system_prompt: The instructions that are given to the LLM to guide its behavior.
  • input_schema: A JSON schema that defines the expected input for the program. The program will validate the input against this schema before execution.
  • output_schema: A JSON schema that defines the expected output from the LLM. The program will validate the LLM's output against this schema.
  • template: A Jinja2 template used to generate the prompt sent to the LLM. It is rendered with the input variables (see the fragment below).
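
Because the template is plain Jinja2, the usual constructs (loops, conditionals, filters) work as expected. A fragment for a hypothetical program whose input_schema declares an items array of strings:

template: |
  Summarize the following {{ items | length }} reviews:
  {% for item in items %}
  - {{ item }}
  {% endfor %}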

Using with other OpenAI-compatible endpoints

You can use llmprogram with any OpenAI-compatible endpoint, such as Ollama. To do this, you can pass the api_key and base_url to the LLMProgram constructor:

program = LLMProgram(
    'your_program.yaml',
    api_key='your-api-key',  # optional, defaults to OPENAI_API_KEY env var
    base_url='http://localhost:11434/v1'  # example for Ollama
)
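
Note that the name under model in your YAML must correspond to a model the endpoint actually serves (for Ollama, one you have pulled locally). A hypothetical fragment; provider presumably stays openai since the endpoint speaks the OpenAI protocol:

model:
  provider: openai  # assumption: unchanged, as the endpoint is OpenAI-compatible
  name: llama3      # must match a model available on your Ollama server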

Caching

llmprogram supports caching of LLM responses in Redis to improve performance and reduce costs. Caching is enabled by default and requires a running Redis server.

You can disable it, or configure the Redis connection and cache TTL (time-to-live), when you create an LLMProgram instance:

program = LLMProgram(
    'your_program.yaml',
    enable_cache=True,
    redis_url="redis://localhost:6379",
    cache_ttl=3600  # in seconds
)

Logging and Dataset Generation

llmprogram automatically logs every execution of a program to a SQLite database. The database file is created in the same directory as the program YAML file, with a .db extension.

This logging feature is not just for debugging; it's also a powerful tool for creating high-quality datasets for fine-tuning your own LLMs. Each record in the log contains:

  • function_input: The input given to the program.
  • function_output: The output received from the LLM.
  • llm_input: The prompt sent to the LLM.
  • llm_output: The raw response from the LLM.
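
Because the log is a plain SQLite file, you can inspect it with Python's standard sqlite3 module. A minimal sketch; the table schema is not documented here, so list the tables first and adjust the commented query to match:

import sqlite3

# The database sits next to the program YAML, e.g. sentiment_analysis.db.
conn = sqlite3.connect('sentiment_analysis.db')

# Discover the actual table names before querying.
for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type='table'"):
    print(name)

# Hypothetical query; substitute the real table name from the listing above.
# for row in conn.execute('SELECT function_input, function_output FROM <table> LIMIT 5'):
#     print(row)

conn.close()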

Generating a Dataset

You can use the built-in CLI to generate an instruction dataset from the logged data. The dataset is created in JSONL format, which is commonly used for fine-tuning.

llmprogram generate-dataset /path/to/your_program.db /path/to/your_dataset.jsonl

Each line in the output file will be a JSON object with the following keys:

  • instruction: The system prompt and the user prompt, combined to form the instruction for the LLM.
  • output: The output from the LLM.
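
An illustrative record for the sentiment program above (shown wrapped here; in the file each record occupies a single line, and the exact field contents depend on your logged data):

{"instruction": "You are a sentiment analysis expert. [...] Analyze the following text:\nI love this new product! It is amazing.", "output": "{\"sentiment\": \"positive\", \"score\": 0.9}"}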

Command-Line Interface (CLI)

llmprogram comes with a command-line interface for common tasks.

run

Run an LLM program with inputs from command line or files.

Usage:

# First, set your OpenAI API key
export OPENAI_API_KEY='your-api-key'

# Run with inputs from a JSON file
llmprogram run program.yaml --inputs inputs.json

# Run with inputs from command line
llmprogram run program.yaml --input-json '{"text": "I love this product!"}'

# Run with inputs from stdin
echo '{"text": "I love this product!"}' | llmprogram run program.yaml

# Run with streaming output
llmprogram run program.yaml --inputs inputs.json --stream

# Save output to a file
llmprogram run program.yaml --inputs inputs.json --output result.json

Arguments:

  • program_path: The path to the program YAML file.
  • --inputs, -i: Path to JSON/YAML file containing inputs.
  • --input-json: JSON string of inputs.
  • --output, -o: Path to output file (default: stdout).
  • --stream, -s: Stream the response.
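
For the sentiment_analysis.yaml program from Getting Started, an inputs.json file simply carries the fields declared in its input_schema:

{"text": "I love this product!"}

For batch processing, the inputs file presumably holds a JSON array of such objects, one per run; see sentiment_batch_inputs.json in the examples directory for the exact format.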

generate-yaml

Generate an LLM program YAML file based on a description using an AI assistant.

Usage:

# Generate a YAML program with a simple description
llmprogram generate-yaml "Create a program that analyzes the sentiment of text" --output sentiment_analyzer.yaml

# Generate a YAML program with examples
llmprogram generate-yaml "Create a program that extracts key information from customer reviews" \
  --example-input "The battery life on this phone is amazing! It lasts all day." \
  --example-output '{"product_quality": "positive", "battery": "positive", "durability": "neutral"}' \
  --output review_analyzer.yaml

# Generate a YAML program and output to stdout
llmprogram generate-yaml "Create a program that summarizes long texts"

Arguments:

  • description: A detailed description of what the LLM program should do.
  • --example-input: Example of the input the program will receive.
  • --example-output: Example of the output the program should generate.
  • --output, -o: Path to output YAML file (default: stdout).
  • --api-key: OpenAI API key (optional, defaults to OPENAI_API_KEY env var).

analytics

Show analytics data collected from LLM program executions.

Usage:

# Show all analytics data
llmprogram analytics

# Show analytics for a specific program
llmprogram analytics --program sentiment_analysis

# Show analytics for a specific model
llmprogram analytics --model gpt-4

# Use a custom analytics database path
llmprogram analytics --db-path /path/to/custom/analytics.duckdb

Arguments:

  • --db-path: Path to the analytics database (default: llmprogram_analytics.duckdb).
  • --program: Filter by program name.
  • --model: Filter by model name.

generate-dataset

Generate an instruction dataset for LLM fine-tuning from a SQLite log file.

Usage:

llmprogram generate-dataset <database_path> <output_path>

Arguments:

  • database_path: The path to the SQLite database file.
  • output_path: The path to write the generated dataset to.

Web Service

llmprogram includes a built-in web service that exposes your LLM programs as REST API endpoints with automatic OpenAPI documentation.

Running the Web Service

To run the web service, use the llmprogram-web command:

# Run the web service with default settings (examples directory, localhost:8000)
llmprogram-web

# Run the web service with custom directory
llmprogram-web --directory /path/to/your/programs

# Run the web service on a different host/port
llmprogram-web --host 0.0.0.0 --port 8080

# Run with auto-reload for development
llmprogram-web --reload

# Use a custom analytics database path
llmprogram-web --analytics-db /path/to/custom/analytics.duckdb

API Endpoints

The web service automatically generates REST endpoints for each YAML file in your programs directory:

  • GET / - Root endpoint with API information
  • GET /programs - List all available programs
  • GET /programs/{program_name} - Get detailed information about a specific program
  • POST /programs/{program_name}/run - Run a specific program
  • GET /analytics/llm-calls - Get LLM call statistics
  • GET /analytics/program-usage - Get program usage statistics
  • GET /analytics/token-usage - Get token usage statistics

For each program, the service generates:

  1. A POST endpoint at /programs/{program_name}/run
  2. Automatic request/response validation based on the program's input/output schemas

The service as a whole also exposes:

  3. Interactive OpenAPI documentation at /docs and /redoc
  4. The OpenAPI specification at /openapi.json

Analytics Endpoints

The web service includes comprehensive analytics endpoints:

  • GET /analytics/llm-calls - Get LLM call statistics including call count, token usage, execution time, cache hits, and unique users
  • GET /analytics/program-usage - Get program usage statistics including usage count, successful/failed calls, execution time, and unique users
  • GET /analytics/token-usage - Get token usage statistics including prompt/completion tokens, total tokens, estimated cost, and unique users

All analytics endpoints support filtering by program name, model name, and date range.
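
For example (program_name and model_name appear in the examples below; the date-range parameter names are not documented here, so start_date and end_date are assumptions; verify the real query parameters at /docs):

# start_date/end_date are assumed names; check /docs for the actual parameters
curl "http://localhost:8000/analytics/llm-calls?program_name=sentiment_analysis&start_date=2024-01-01&end_date=2024-01-31"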

Example Usage

After starting the web service, you can interact with it using curl or any HTTP client:

# List available programs
curl http://localhost:8000/programs

# Get information about a specific program
curl http://localhost:8000/programs/sentiment_analysis

# Run a program
curl -X POST http://localhost:8000/programs/sentiment_analysis/run \
  -H "Content-Type: application/json" \
  -d '{"inputs": {"text": "I love this product!"}}'

# Get LLM call statistics
curl http://localhost:8000/analytics/llm-calls

# Get program usage statistics for a specific program
curl "http://localhost:8000/analytics/program-usage?program_name=sentiment_analysis"

# Get token usage statistics with filtering
curl "http://localhost:8000/analytics/token-usage?program_name=sentiment_analysis&model_name=gpt-4"

OpenAPI Documentation

The web service automatically generates comprehensive OpenAPI documentation. The generated specification includes:

  • Endpoint definitions for each program
  • Request/response schemas based on your program's input/output schemas
  • Example requests and responses
  • Detailed descriptions from your program's metadata
  • Analytics endpoints with filter parameters

Examples

You can find more examples in the examples directory:

  • Sentiment Analysis: A simple program to analyze the sentiment of a piece of text. (examples/sentiment_analysis.yaml)

To run the examples:

  1. Navigate to the examples directory.

  2. Run the corresponding run_*.py script, or use the CLI:

    # Using the CLI with a JSON input file
    poetry run llmprogram run sentiment_analysis.yaml --inputs sentiment_inputs.json
    
    # Using the CLI with inline JSON
    poetry run llmprogram run sentiment_analysis.yaml --input-json '{"text": "I love this product!"}'
    
    # Using the CLI with batch processing
    poetry run llmprogram run sentiment_analysis.yaml --inputs sentiment_batch_inputs.json
    
    # Using the CLI with streaming
    poetry run llmprogram run sentiment_analysis.yaml --inputs sentiment_inputs.json --stream
    
    # Using the CLI and saving output to a file
    poetry run llmprogram run sentiment_analysis.yaml --inputs sentiment_inputs.json --output result.json
    
    # View analytics data
    poetry run llmprogram analytics
    
    # View analytics for a specific program
    poetry run llmprogram analytics --program sentiment_analysis
    
    # Generate a new YAML program
    poetry run llmprogram generate-yaml "Create a program that classifies email priority" \
      --example-input "Subject: Urgent meeting tomorrow. Body: Please prepare the Q3 report." \
      --example-output '{"priority": "high", "category": "work", "response_required": true}' \
      --output email_classifier.yaml
    
  3. Or run the web service:

    # Run the web service
    poetry run llmprogram-web --directory examples
    
    # Then interact with it using curl or any HTTP client
    curl -X POST http://localhost:8000/programs/sentiment_analysis/run \
      -H "Content-Type: application/json" \
      -d '{"inputs": {"text": "I love this product!"}}'
      
    # View analytics via the web API
    curl http://localhost:8000/analytics/llm-calls
    curl http://localhost:8000/analytics/program-usage
    curl http://localhost:8000/analytics/token-usage
    

Other examples:

  • Code Generator: A program that generates Python code from a natural language description. (examples/code_generator.yaml)
  • Email Generator: A program that generates a professional email based on a few inputs. (examples/email_generator.yaml)

These run the same way: from the examples directory, use the corresponding run_*.py script or the CLI commands shown above.

Development

To run the tests for this package, you will need to install pytest:

pip install pytest

Then, you can run the tests from the root directory of the project:

pytest
