Language Model Development Kit.
What it offers:
- Simplest interface to call different Language Model APIs
- Minimal dependencies: HTTP requests only, no third party packages
- Streaming
- Comfy structured outputs via Pydantic models, only if the provider / model supports it natively
- Parallel completions
- Unified HTTP error handling
- Easy location config (for providers with multiple datacenters like AWS Bedrock, GCP Vertex and Azure)
- Model fallbacks
- Bring Your Own Key (for each provider)
- Optional Telemetry following OpenTelemetry GenAI Semantic Conventions
- In-process observation hook (observe()) to capture request/response pairs from wrapped code
What it does NOT offer:
- Tools / function calling / MCP
- Agents
- Multimodality (only text-in, text-out)
- Shady under-the-hood prompt modification (e.g. to force structured output)
- API gateways
If you are looking for a more constrained but out-of-the-box agent interface, I'd recommend pydantic-ai or haystack-ai.
If you are looking to keep granular control but extend on tools or multimodality, I'd recommend litellm or leveraging the OpenAI-compatible endpoints that providers normally set up.
The closest to the "less intrusive path to unified LM calling" idea in lmdk is chatlas, but their main public interface is stateful (versus the full stateless design of lmdk).
If you want a single token for all providers and are willing to give away telemetry data, check gateways like openrouter.
Installation
uv add lmdk
Optional OpenTelemetry support:
uv add 'lmdk[telemetry]'
Usage
from lmdk import complete
model = "mistral:mistral-small-2603"
# supports locations as in "vertex:gemini-2.5-flash@europe-west4"
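The model identifier packs the provider, the model name, and an optional location into one string. As a rough illustration of that format only (this is not lmdk's parser; `ModelId` and `parse_model_id` are hypothetical names), the pieces could be split like this:

```python
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    provider: str
    name: str
    location: Optional[str]


def parse_model_id(model: str) -> ModelId:
    """Split 'provider:name[@location]' into its parts (illustrative only)."""
    provider, sep, rest = model.partition(":")
    if not sep or not provider or not rest:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    # Split on the last '@' so the location is whatever follows it.
    name, at, location = rest.rpartition("@")
    if not at:
        # No '@' present: rpartition leaves the whole string in the last slot.
        return ModelId(provider, rest, None)
    return ModelId(provider, name, location)


with_location = parse_model_id("vertex:gemini-2.5-flash@europe-west4")
without_location = parse_model_id("mistral:mistral-small-2603")
```

Here `with_location.location` is "europe-west4", while `without_location.location` is None.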
Single prompt
response = complete(model=model, prompt="Tell me a joke")
Multi-turn conversation
messages = [
    UserMessage("My name is Alice."),
    AssistantMessage("Nice to meet you, Alice!"),
    UserMessage("What is my name?"),
]
response = complete(model=model, prompt=messages)
System prompt and generation kwargs
response = complete(
    model=model,
    prompt="Hi!",
    system_instruction="Talk like a pirate",
    generation_kwargs={"temperature": 0.9, "max_tokens": 10},
)
Streaming
token_iter = complete(model=model, prompt="Count from 1 to 5.", stream=True)
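With stream=True the call returns an iterator instead of a finished response. A minimal sketch of consuming it, assuming each item is a plain text chunk (the item type is not stated here, so treat that as an assumption); `fake_stream` is a hypothetical stand-in so the snippet runs without an API key:

```python
from typing import Iterable, Iterator


def print_and_collect(token_iter: Iterable[str]) -> str:
    """Print chunks as they arrive and return the full text."""
    chunks = []
    for chunk in token_iter:
        print(chunk, end="", flush=True)
        chunks.append(chunk)
    print()
    return "".join(chunks)


def fake_stream() -> Iterator[str]:
    """Stand-in for complete(..., stream=True); yields hard-coded chunks."""
    yield from ["1", ", 2", ", 3", ", 4", ", 5"]


text = print_and_collect(fake_stream())
```

In real use you would pass token_iter from the call above instead of fake_stream().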
Model fallbacks
response = complete(model=["mistral:nonexistent-model", model], prompt="Hi")
# The first request fails with NotFoundError because the model does not exist; the second one succeeds.
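The general idea behind a fallback list can be sketched in a few lines. This is not lmdk's implementation, and which errors actually trigger a fallback in lmdk is not covered here; `complete_with_fallbacks` and `fake_call` are hypothetical names used only for illustration:

```python
from typing import Callable, Sequence


def complete_with_fallbacks(
    call: Callable[[str, str], str], models: Sequence[str], prompt: str
) -> str:
    """Try each model in order; return the first success, re-raise the last error."""
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        try:
            return call(model, prompt)
        except Exception as exc:  # assumption: any error moves on to the next model
            last_error = exc
    raise last_error


def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a provider call; fails for one hard-coded model name."""
    if model == "mistral:nonexistent-model":
        raise LookupError(f"unknown model: {model}")
    return f"[{model}] reply to {prompt!r}"


result = complete_with_fallbacks(
    fake_call, ["mistral:nonexistent-model", "mistral:mistral-small-2603"], "Hi"
)
```

If every model in the list fails, the sketch raises the error from the last attempt.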
Structured output
from pydantic import BaseModel

class Ingredient(BaseModel):
    name: str
    quantity: int
    unit: str = ""

class Recipe(BaseModel):
    ingredients: list[Ingredient]

response = complete(model=model, prompt="How do I make cheesecake?", output_schema=Recipe)
# response.parsed will have a Recipe instance
Parallel calls
from lmdk import complete_batch
batch = complete_batch(model=model, prompt_list=["Greet in english", "Saluda en espanyol."])
# `batch` is a CompletionBatch. Iterate it to handle each outcome:
for result in batch:
    if isinstance(result, Exception):
        ...  # this prompt failed
    else:
        ...  # CompletionResponse
# Aggregates over successful responses:
batch.input_tokens, batch.output_tokens, batch.latency
batch.responses # successes only
batch.errors # exceptions only
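The "exception as a value" pattern above, where a failed prompt does not abort the rest of the batch, can be sketched with the standard library. This is only an illustration of the pattern, not how complete_batch is implemented; `run_parallel` and `fake_complete` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Union


def run_parallel(
    call: Callable[[str], str], prompts: List[str], max_workers: int = 4
) -> List[Union[str, Exception]]:
    """Run call(prompt) concurrently; keep input order and capture failures as values."""

    def safe_call(prompt: str) -> Union[str, Exception]:
        try:
            return call(prompt)
        except Exception as exc:
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input prompts.
        return list(pool.map(safe_call, prompts))


def fake_complete(prompt: str) -> str:
    """Stand-in for a single completion; rejects empty prompts."""
    if not prompt:
        raise ValueError("empty prompt")
    return prompt.upper()


results = run_parallel(fake_complete, ["hello", "", "hola"])
responses = [r for r in results if not isinstance(r, Exception)]
errors = [r for r in results if isinstance(r, Exception)]
```

The two list comprehensions mirror the successes-only and exceptions-only views described above.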
Template Rendering
from lmdk import render_template
# Render a template string with variables
result = render_template(
    template="Hello, {{ name }}!",
    name="World",
)
# Output: "Hello, World!"
# Render a template from a jinja file
result = render_template(
    path="path/to/template.jinja2",
    name="World",
)
Observing wrapped code
from lmdk import observe
with observe() as obs:
    answer = my_function_that_calls_complete()
for record in obs.records:
    record.request  # CompletionRequest sent to the LM
    record.response  # CompletionResponse returned
Useful for tests, evals, and debug tooling where the wrapped function only returns its own result but you also want to inspect the underlying LM calls. Streaming completions are not recorded.
Telemetry
Telemetry is off by default and adds no required dependencies to the default install.
To enable OpenTelemetry-based spans and metrics, install the optional extra and set LMDK_TELEMETRY:
uv add 'lmdk[telemetry]'
export LMDK_TELEMETRY=metadata # spans/metrics without prompt text
# export LMDK_TELEMETRY=content # also records prompt, system-instruction, and response text
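If your own application code needs to branch on the same setting (for example, to warn when prompt text will be recorded), a small helper can read the variable. This is a sketch, not part of lmdk: `telemetry_mode` is a hypothetical name, and treating unrecognized values as "off" is an assumption.

```python
import os
from typing import Mapping, Optional


def telemetry_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return 'off', 'metadata', or 'content' based on LMDK_TELEMETRY."""
    env = os.environ if env is None else env
    value = env.get("LMDK_TELEMETRY", "").strip().lower()
    # Assumption: anything other than the two documented values means "off".
    return value if value in ("metadata", "content") else "off"


mode = telemetry_mode({"LMDK_TELEMETRY": "content"})
if mode == "content":
    print("Prompt and response text will be attached to telemetry.")
```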
We follow the experimental GenAI semantic conventions v1.41.0. Only non-streaming responses are instrumented for now.
lmdk only emits telemetry through the OpenTelemetry SDK. Your application owns the exporter, processor, reader, and collector endpoint; that is, you decide how and where the emitted traces are sent.
Below are some minimal exporter setups. Call them once at process start before invoking complete / complete_batch.
Console (debugging)
Prints spans to stdout. Useful to verify instrumentation locally without any backend.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
def configure_console_traces() -> None:
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
Pydantic Logfire
Logfire installs itself as the global TracerProvider, so spans emitted by lmdk are forwarded automatically. Requires uv add logfire and a LOGFIRE_TOKEN.
import os
import logfire
def configure_logfire_traces() -> None:
    logfire.configure(
        token=os.environ["LOGFIRE_TOKEN"],
        service_name="my-app",
        # lmdk already controls prompt/response redaction via LMDK_TELEMETRY;
        # don't let Logfire second-guess scrubbing of content.
        scrubbing=False,
        send_to_logfire=True,
    )
Grafana (OTLP / Tempo)
Ship spans over OTLP to Grafana Cloud (or a self-hosted Tempo + OTel Collector). Requires uv add opentelemetry-exporter-otlp.
import os
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
def configure_grafana_traces() -> None:
    # For Grafana Cloud OTLP, set:
    #   OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-<region>.grafana.net/otlp
    #   OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic%20<base64(instanceID:token)>
    exporter = OTLPSpanExporter(
        endpoint=os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] + "/v1/traces",
    )
    provider = TracerProvider(resource=Resource.create({"service.name": "my-app"}))
    provider.add_span_processor(BatchSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
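The OTEL_EXPORTER_OTLP_HEADERS value mentioned in the comment is standard HTTP Basic auth with the space percent-encoded. A small sketch for building it with the standard library; `grafana_otlp_headers` is a hypothetical helper name, and the credentials below are placeholders:

```python
import base64


def grafana_otlp_headers(instance_id: str, token: str) -> str:
    """Build the OTEL_EXPORTER_OTLP_HEADERS value for HTTP Basic auth."""
    credentials = f"{instance_id}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    # The space after "Basic" is percent-encoded as %20, as in the comment above.
    return f"Authorization=Basic%20{encoded}"


# Placeholder credentials, for illustration only.
header_value = grafana_otlp_headers("123456", "example-token")
print(f"export OTEL_EXPORTER_OTLP_HEADERS={header_value}")
```

Export the printed value in the environment before the process starts.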
Development
Structure
src/lmdk/
├── core.py # Entry points: complete, complete_batch
├── datatypes.py # Common message and response schemas
├── provider.py # Base Provider class and registry
├── providers/ # Concrete implementations (Mistral, Vertex, etc.)
├── errors.py # Unified HTTP and API error handling
└── utils.py # Shared helper functions
Tooling
We use just for development tasks. Use:
- just sync: Updates the lockfile and syncs the environment.
- just format: Lints and formats with ruff.
- just check-types: Static analysis with ty.
- just check-complexity: Cyclomatic complexity checks with complexipy.
- just test: Runs pytest with a 90% coverage threshold.
See justfile for a complete list of dev commands.
Contribute
- Hooks: Install pre-commit hooks via just install-hooks. PRs will fail CI if linting/formatting is not applied.
- Issues: Open an issue first using the default template.
- PRs: Link your PR to the relevant issue using the PR template.
You can use just validate <model> (runs example.py) to verify which features run properly and which do not for a new provider / model.
Not every check has to pass before you open a PR: some providers do not support native structured output at all. At minimum, the plain non-structured, non-streamed completion must work; the rest can raise NotImplementedError.
License
MIT
Made with mold template