Production LLM calls. Just the three lines. Reliability, native caching, and reversible context compression on by default.
justllm
Production LLM calls. Just the three lines.
from justllm import LLM
llm = LLM("anthropic/claude-opus-4-8")
reply = llm("Summarize this contract.")
Cross-provider fallback, native prompt caching, and reversible context compression are on by default. No config. The surface stays tiny on purpose — the moment you need a dozen knobs, that is what LiteLLM is for.
Why
The ecosystem split in two: feature-complete but heavy (LiteLLM, LangChain),
or simple but feature-thin (aisuite, any-llm). Nobody ships the production
layer behind a three-line front door. justllm is that middle.
The one number that makes it worth a switch: compressing the dynamic junk that
bloats agent calls — tool outputs, logs, RAG dumps — cuts the input-token bill
without touching your code. Measured here (gpt-4o token basis): 53% saved on a
JSON API tool result, 97% on repetitive logs, with a safe no-op when
compression wouldn't help. The engine is
Headroom (PyPI: headroom-ai,
content-aware and reversible); justllm applies it only to tool/retrieved
content, never to your prompts. See benchmarks/.
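To make the "reversible, with a safe no-op" idea concrete, here is a minimal sketch of one structural technique: collapsing runs of identical lines in a log. This is not Headroom's algorithm and not justllm's API. The function names `compress` and `decompress` are hypothetical, and savings are measured in characters rather than gpt-4o tokens, so the numbers it produces are not comparable to the benchmark figures above.

```python
import json


def compress(text: str) -> tuple[str, bool]:
    """Collapse consecutive duplicate lines into [line, count] pairs.

    Returns (payload, packed). When packing would not shrink the input,
    the original text is returned untouched with packed=False.
    """
    runs = []
    for line in text.split("\n"):
        if runs and runs[-1][0] == line:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([line, 1])    # start a new run
    payload = json.dumps(runs, separators=(",", ":"))
    if len(payload) < len(text):
        return payload, True
    return text, False                # safe no-op


def decompress(payload: str, packed: bool) -> str:
    """Invert compress(): expand each [line, count] pair back into lines."""
    if not packed:
        return payload
    runs = json.loads(payload)
    return "\n".join(line for line, count in runs for _ in range(count))


if __name__ == "__main__":
    log = "\n".join(["GET /health 200"] * 50 + ["POST /login 500"])
    payload, packed = compress(log)
    saved = 1 - len(payload) / len(log)
    print(f"packed={packed} saved={saved:.0%} of characters")
```

The `packed` flag is what keeps the round trip exact: a caller can always recover the original tool output, and inputs that would grow under packing pass through unchanged.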
With knobs (when you actually need them)
llm = LLM(
    chain=["anthropic/claude-opus-4-8", "openai/gpt-5", "groq/llama-3.1-70b"],
    compress=True,   # reversible, dynamic-context only
    cache="prompt",  # "cache" never silently means semantic
)
Status
Pre-alpha (0.0.1). Working today and covered by the benchmark suite:
- Reliability: with_fallback + RetryPolicy. Retry with jitter on retryable errors only, then ordered cross-provider failover. One retry layer.
- Compression: compress, a thin adapter over Headroom, with a conservative structural fallback when Headroom is not installed.
The unified LLM client wires these together; its provider transport
(_complete) is the one piece still to be finished before 0.1.0.
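The retry and failover behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not justllm's actual `with_fallback` or `RetryPolicy`: the names `SketchRetryPolicy`, `RetryableError`, and `call_with_fallback` are hypothetical, providers are modeled as plain callables, and the choice to let non-retryable errors propagate immediately is an assumption the source does not confirm.

```python
import random
import time
from dataclasses import dataclass


class RetryableError(Exception):
    """Hypothetical marker for transient failures (timeouts, rate limits)."""


@dataclass
class SketchRetryPolicy:
    max_attempts: int = 3
    base_delay: float = 0.5   # seconds
    max_delay: float = 8.0    # cap on the backoff window

    def delay(self, attempt: int) -> float:
        # Exponential backoff with full jitter: pick a random wait
        # between 0 and the capped exponential window.
        window = min(self.max_delay, self.base_delay * 2 ** attempt)
        return random.uniform(0, window)


def call_with_fallback(providers, prompt, policy, sleep=time.sleep):
    """Try each provider in order; retry only on RetryableError."""
    last_error = None
    for provider in providers:
        for attempt in range(policy.max_attempts):
            try:
                return provider(prompt)
            except RetryableError as exc:
                last_error = exc
                if attempt + 1 < policy.max_attempts:
                    sleep(policy.delay(attempt))
            # Any other exception propagates at once (assumed behavior).
    raise RuntimeError("all providers failed") from last_error
```

Keeping the retry loop in one place is what "one retry layer" amounts to in this sketch: the provider callables themselves never retry, so attempts do not multiply across layers.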
Benchmarks
pip install -e '.[benchmarks]'
python -m benchmarks.run
Measures token/cost savings from compression, the overhead the layer adds, and that fallback actually recovers provider failures. The suite runs even without the optional deps (using fallbacks), so it is never a hard blocker.
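One common way to keep a suite runnable without optional dependencies is to probe for the module before importing it. The sketch below shows that pattern only; it is not taken from justllm's benchmark code. The function name `backend_for` is hypothetical, and the import name `headroom` is an assumption, since the source gives only the PyPI name headroom-ai.

```python
import importlib.util


def backend_for(module_name: str) -> str:
    """Report which compression path a run would take.

    find_spec() returns None when a top-level module is not installed,
    so no ImportError has to be caught.
    """
    if importlib.util.find_spec(module_name) is not None:
        return "optional-engine"
    return "structural-fallback"


if __name__ == "__main__":
    # "headroom" as the import name is an assumption, not confirmed by the source.
    print(backend_for("headroom"))
```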
License
MIT © Robert Walmsley
File details
Details for the file justllm-0.0.1.tar.gz.
File metadata
- Download URL: justllm-0.0.1.tar.gz
- Upload date:
- Size: 12.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3bdbbedc3a6bc05a1c601cee6c8779116f458b4554a2c99582e9b01682aebe51 |
| MD5 | 6f74528a753c9eba5a263364ca5ee5cb |
| BLAKE2b-256 | 0f34a4683a37d5894d71b1cb14680f8bc343499fb9942488a3a667419ee5fcf6 |
File details
Details for the file justllm-0.0.1-py3-none-any.whl.
File metadata
- Download URL: justllm-0.0.1-py3-none-any.whl
- Upload date:
- Size: 8.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7f9e256a83b4d408554eec580ad6ac47e721effef78de69f5bcb8eb474ac771 |
| MD5 | f2562c0f1136196f851687a1a81b13e1 |
| BLAKE2b-256 | 2203293c5280ece422422981dbe8981195912597589b0cf2ef0698c7b467f6da |