Production LLM calls. Just the three lines. Reliability, native caching, and reversible context compression on by default.

justllm

Production LLM calls. Just the three lines.

from justllm import LLM

llm = LLM("anthropic/claude-opus-4-8")
llm("Summarize this contract.")

That call already does the work you'd normally wire up yourself, on by default:

  • Context compression. Headroom shrinks tool output by 50–95% before it reaches the model.
  • Prompt-cache optimization. Cache breakpoints go where each provider wants them (Anthropic, OpenAI, Google).
  • Reliability. Calls retry with backoff, then fail over to the next provider.

Install:

pip install 'justllm[all]'
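The reliability bullet above (retry with backoff, then fail over to the next provider) can be sketched in plain Python. This is a minimal illustration of the pattern and not justllm's actual implementation: `call_with_failover`, the `providers` callables, and the `retries`, `base_delay`, and `sleep` parameters are all hypothetical names chosen for this example.

```python
import time
from typing import Callable, Sequence


def call_with_failover(
    providers: Sequence[Callable[[str], str]],
    prompt: str,
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each provider in order; retry each one with exponential backoff."""
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider(prompt)
            except Exception as exc:  # a real client would catch only transient errors
                last_error = exc
                if attempt < retries:
                    # Delay doubles on each retry: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** attempt)
        # Retries exhausted for this provider: fall through to the next one.
    raise RuntimeError("all providers failed") from last_error


# Hypothetical stand-ins for two provider clients.
def primary(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def secondary(prompt: str) -> str:
    return f"secondary answered: {prompt}"


if __name__ == "__main__":
    # sleep is stubbed out so the demo does not actually wait.
    print(call_with_failover([primary, secondary], "Summarize this contract.",
                             sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real delays. A production version would typically add jitter and distinguish retryable errors (timeouts, rate limits) from permanent ones.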

A little more

Same three lines. Each of these is one call or one kwarg:

llm.extract(Invoice, text)                    # structured output (validated Pydantic)
llm.stream("...")                             # token streaming
await llm.acall("...")                        # async
llm.map(prompts, concurrency=8)               # many prompts at once, in order
llm.embed(texts)                              # embeddings
chat = llm.chat(); chat.send("..."); chat.send("...")   # multi-turn, remembers history
llm.agent(system="...").run("...")            # tool-calling loop
llm.judge(output, criteria="...")             # LLM-as-judge score
llm.evaluate(cases)                           # run + grade a test set
LLM(router=Cascade(small=cheap, large=big))   # cheap first, escalate when needed
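The "many prompts at once, in order" behavior of `llm.map` above can be illustrated with a bounded-concurrency pattern in standard asyncio. This is a sketch under assumptions, not how justllm implements it: `map_in_order` and `fake_call` are hypothetical names, and the fake call stands in for a real model request.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def map_in_order(
    call: Callable[[str], Awaitable[str]],
    prompts: Sequence[str],
    concurrency: int = 8,
) -> list[str]:
    """Run `call` over all prompts with at most `concurrency` in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def guarded(prompt: str) -> str:
        async with gate:
            return await call(prompt)

    # gather returns results in argument order, not completion order.
    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_call(prompt: str) -> str:
    # Hypothetical stand-in for a model call; longer prompts finish sooner,
    # so completion order differs from input order.
    await asyncio.sleep(0.01 / len(prompt))
    return prompt.upper()


if __name__ == "__main__":
    print(asyncio.run(map_in_order(fake_call, ["a", "bbb", "cc"], concurrency=2)))
```

The semaphore caps the number of simultaneous requests, while `asyncio.gather` keeps each result aligned with the position of its prompt.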

A few more things sit behind opt-in extras: OpenTelemetry traces that include the per-call dollar cost (most setups leave that out), Langfuse-backed prompts, semantic cascade escalation, and exact-match caching. The hard parts are already wired; you just call them.
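As a rough picture of what exact-match caching means, the sketch below keys a response on a hash of the model, the prompt, and the call parameters, so only a byte-identical request is served from the cache. It is an assumption-laden illustration, not justllm's cache: `ExactMatchCache` and `fake_model` are hypothetical names, and the store is a plain in-memory dict.

```python
import hashlib
import json


class ExactMatchCache:
    """Wrap a call function and reuse results for identical requests."""

    def __init__(self, call):
        self._call = call   # hypothetical upstream: call(model, prompt, **params)
        self._store = {}    # cache key -> cached response
        self.hits = 0

    @staticmethod
    def _key(model, prompt, params):
        # sort_keys makes the key independent of keyword-argument order.
        payload = json.dumps(
            {"model": model, "prompt": prompt, "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, model, prompt, **params):
        key = self._key(model, prompt, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._call(model, prompt, **params)
        self._store[key] = result
        return result


def fake_model(model, prompt, **params):
    # Hypothetical stand-in for a real model call.
    return f"{model} says: {prompt[::-1]}"


if __name__ == "__main__":
    cached = ExactMatchCache(fake_model)
    cached("some-model", "hello", temperature=0)
    cached("some-model", "hello", temperature=0)  # identical request: cache hit
    print(cached.hits)
```

Any change to the prompt or a parameter produces a different key, which is what separates exact-match caching from semantic caching, where near-duplicate prompts can share an entry.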

Runnable recipes: cookbook

Why

The ecosystem splits two ways. You can have powerful but heavy (LiteLLM, LangChain), or simple but thin (aisuite, any-llm). justllm sits in the middle: every optimization is on, and the surface stays at three lines. Keeping it that small was most of the work.

Feature                    justllm                    LiteLLM       aisuite
three-line call            yes                        yes           yes
cross-provider fallback    on by default              config        no
context compression        on by default (Headroom)   manual trim   no
prompt-cache optimization  on by default              passthrough   no
structured output          yes (instructor)           passthrough   no
tool-calling agent         yes (minimal)              no            no
surface area               tiny                       large         tiny

It runs on LiteLLM underneath, so think of it as the opinionated layer on top rather than a replacement.


Alpha. The wiring is tested on CI (Python 3.10–3.13) and the call paths are checked against live models.

Cookbook · Roadmap · Changelog · Contributing · MIT

Download files

Source Distribution

justllm-0.7.0.tar.gz (98.6 kB)

Uploaded Source

Built Distribution

justllm-0.7.0-py3-none-any.whl (23.2 kB)

Uploaded Python 3

File details

Details for the file justllm-0.7.0.tar.gz.

File metadata

  • Download URL: justllm-0.7.0.tar.gz
  • Size: 98.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for justllm-0.7.0.tar.gz
Algorithm Hash digest
SHA256 687466d51027cb03a2f31ea02b98828abddd84dc204b64117279c8bdb418911e
MD5 4043ca740cf6281bd32b8bf8024022cf
BLAKE2b-256 2b755ed4a27095967ac5d54fd1af4c9887cdf9bce38c2358d9521ab59a0769a5

File details

Details for the file justllm-0.7.0-py3-none-any.whl.

File metadata

  • Download URL: justllm-0.7.0-py3-none-any.whl
  • Size: 23.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for justllm-0.7.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e21b198e07acec1dd5587d7a94f02abaddbc2b77af3a6b1b4e6d48bbcd8fe33f
MD5 421012119b47d842be580f3cb2e6b5d5
BLAKE2b-256 e22446a68fdf5efc82ec78c7533b551d0f4854e935b832ee27950b065bf2af5c
