Skip to main content

Language Model Development Kit.

Project description

Language Model Development Kit

What it offers:

  • Simplest interface to call different Language Model APIs
  • Minimal dependencies: HTTP requests only, no third party packages
  • Streaming
  • Comfy structured outputs via Pydantic models, only if the provider / model supports it natively
  • Parallel completions
  • Unified HTTP error handling
  • Easy location config (for providers with multiple datacenters like AWS Bedrock, GCP Vertex and Azure)
  • Model fallbacks
  • Bring Your Own Key (for each provider)
  • Optional Telemetry following OpenTelemetry GenAI Semantic Conventions
  • In-process observation hook (observe()) to capture request/response pairs from wrapped code

What it does NOT offer:

  • Tools / function calling / MCP
  • Agents
  • Multimodality (only text-in, text-out)
  • Shady under-the-hood prompt modification (e.g. to force structured output)
  • API gateways

If you are looking for a more constrained but out-of-the-box agent interface, I'd recommend pydantic-ai or haystack-ai. If you are looking to keep granular control but extend on tools or multimodality, I'd recommend litellm or leveraging the OpenAI-compatible endpoints that providers normally set up. If you want a unified a token for all providers and are willing to give away telemetry data, check Gateways like openrouter.

Installation

uv add lmdk

Optional OpenTelemetry support:

uv add 'lmdk[telemetry]'

Usage

from lmdk import complete

model = "mistral:mistral-small-2603"
# supports locations as in "vertex:gemini-2.5-flash@europe-west4"
Single prompt
response = complete(model=model, prompt="Tell me a joke")
Multi-turn conversation
messages = [
    UserMessage("My name is Alice."),
    AssistantMessage("Nice to meet you, Alice!"),
    UserMessage("What is my name?"),
]
response = complete(model=model, prompt=messages)
System prompt and generation kwargs
response = complete(
    model=model,
    prompt="Hi!",
    system_instruction="Talk like a pirate",
    generation_kwargs={"temperature": 0.9, "max_tokens": 10}
)
Streaming
token_iter = complete(model=model, prompt="Count from 1 to 5.", stream=True)
Model fallbacks
response = complete(model=["mistral:nonexistent-model", model], prompt="Hi")
# first request will raise NotFoundError bc model does not exist, second will work
Structured output
class Ingredient(BaseModel):
    name: str
    quantity: int
    unit: str = ""

class Recipe(BaseModel):
    ingredients: list[Ingredient]

response = complete(model=model, prompt="How do I make cheescake?", output_schema=Recipe)
# response.parsed will have a Recipe instance
Parallel calls
from lmdk import complete_batch

batch = complete_batch(model=model, prompt_list=["Greet in english", "Saluda en espanyol."])
# `batch` is a CompletionBatch. Iterate it to handle each outcome:
for result in batch:
    if isinstance(result, Exception):
        ...  # this prompt failed
    else:
        ...  # CompletionResponse

# Aggregates over successful responses:
batch.input_tokens, batch.output_tokens, batch.latency
batch.responses  # successes only
batch.errors     # exceptions only
Template Rendering
from lmdk import render_template

# Render a template string with variables
result = render_template(
    template="Hello, {{ name }}!",
    name="World"
)
# Output: "Hello, World!"

# Render a template from a jinja file
result = render_template(
    path="path/to/template.jinja2",
    name="World"
)
Observing wrapped code
from lmdk import observe

with observe() as obs:
    answer = my_function_that_calls_complete()

for record in obs.records:
    record.request    # CompletionRequest sent to the LM
    record.response   # CompletionResponse returned

Useful for tests, evals, and debug tooling where the wrapped function only returns its own result but you also want to inspect the underlying LM calls. Streaming completions are not recorded.

Telemetry

Telemetry is off by default and adds no required dependencies to the default install. To enable OpenTelemetry-based spans and metrics, install the optional extra and set LMDK_TELEMETRY:

uv add 'lmdk[telemetry]'
export LMDK_TELEMETRY=metadata  # spans/metrics without prompt text
# export LMDK_TELEMETRY=content  # also records prompt, system-instruction, and response text

We follows the experimental Gen AI semconv v1.41.0. We only instrument non-streaming responses for now.

lmdk only emits telemetry through the OpenTelemetry SDK. Your application owns exporter, processor, reader, collector endpoint, i.e.: you decide how and where to send the emitted traces.

Below are some minimal exporter setups. Call them once at process start before invoking complete / complete_batch.

Console (debugging)

Prints spans to stdout. Useful to verify instrumentation locally without any backend.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


def configure_console_traces() -> None:
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
Pydantic Logfire

Logfire installs itself as the global TracerProvider, so spans emitted by lmdk are forwarded automatically. Requires uv add logfire and a LOGFIRE_TOKEN.

import os
import logfire


def configure_logfire_traces() -> None:
    logfire.configure(
        token=os.environ["LOGFIRE_TOKEN"],
        service_name="my-app",
        # lmdk already controls prompt/response redaction via LMDK_TELEMETRY;
        # don't let Logfire second-guess scrubbing of content.
        scrubbing=False,
        send_to_logfire=True,
    )
Grafana (OTLP / Tempo)

Ship spans over OTLP to Grafana Cloud (or a self-hosted Tempo + OTel Collector). Requires uv add opentelemetry-exporter-otlp.

import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor


def configure_grafana_traces() -> None:
    # For Grafana Cloud OTLP, set:
    #   OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-<region>.grafana.net/otlp
    #   OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic%20<base64(instanceID:token)>
    exporter = OTLPSpanExporter(
        endpoint=os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] + "/v1/traces",
    )
    provider = TracerProvider(resource=Resource.create({"service.name": "my-app"}))
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)

Development

Structure

src/lmdk/
├── core.py         # Entry points: complete, complete_batch
├── datatypes.py    # Common message and response schemas
├── provider.py     # Base Provider class and registry
├── providers/      # Concrete implementations (Mistral, Vertex, etc.)
├── errors.py       # Unified HTTP and API error handling
└── utils.py        # Shared helper functions

Tooling

We use just for development tasks. Use:

  • just sync: Updates lockfile and syncs environment.
  • just format: Lints and formats with ruff.
  • just check-types: Static analysis with ty.
  • just check-complexity: Cyclomatic complexity checks with complexipy.
  • just test: Runs pytest with 90% coverage threshold.

See justfile for a complete list of dev commands.

Contribute

  1. Hooks: Install pre-commit hooks via just install-hooks. PRs will fail CI if linting/formatting is not applied.
  2. Issues: Open an issue first using the default template.
  3. PRs: Link your PR to the relevant issue using the PR template.

You can use just validate <model> (runs example.py) to verify which features run properly and which do not for a new provider / model. Not all of them have to pass to open a PR: some providers do not even support native structured output. Do at least the normal non-structured, non-streamed completion. The rest can raise NotImplementedError.

License

MIT

Made with mold template

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lmdk-2.1.1.tar.gz (22.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

lmdk-2.1.1-py3-none-any.whl (30.0 kB view details)

Uploaded Python 3

File details

Details for the file lmdk-2.1.1.tar.gz.

File metadata

  • Download URL: lmdk-2.1.1.tar.gz
  • Upload date:
  • Size: 22.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.15 {"installer":{"name":"uv","version":"0.11.15","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for lmdk-2.1.1.tar.gz
Algorithm Hash digest
SHA256 2f26d27e7c9996e58e2205e8483d934aa6707510947018b14cac42419fa31899
MD5 3f75f2b26b0a5ee0ebfb6505f3f77562
BLAKE2b-256 29c100477f2c16a630ce4a60d7c465229b2f3ebd93eab5d74dd5fd11fb966829

See more details on using hashes here.

File details

Details for the file lmdk-2.1.1-py3-none-any.whl.

File metadata

  • Download URL: lmdk-2.1.1-py3-none-any.whl
  • Upload date:
  • Size: 30.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.15 {"installer":{"name":"uv","version":"0.11.15","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for lmdk-2.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 9cbcc3c4aeaf1d226e5ac9c002e6d05e5a589456cf78f75079e7ff5fc1d64731
MD5 2f0bebefc28f2b07534579080cd2f7c7
BLAKE2b-256 e5d3d0a658e772a3b17a206316eeb6027303d65a7e2a2172183a2a17b358b605

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page