
A tiny, dependency-free tool belt for LLMs: token counting, cost estimation, retries, prompt templates, and text chunking.


llmbelt 🧰

A tiny, zero-dependency tool belt for working with LLMs. The small utilities you end up re-writing on every project — token counting, cost estimation, retries, prompt templates, and text chunking — in one clean import.


  • 🪶 Zero required dependencies — pure standard library.
  • 🔌 Provider-agnostic — works with Anthropic, OpenAI, Gemini, or anything else.
  • 🧪 Fully tested across Python 3.9–3.12.

Install

pip install llmbelt

# Optional: exact token counts for OpenAI-family models
pip install "llmbelt[tiktoken]"

Usage

Count tokens

from llmbelt import count_tokens, truncate_to_tokens

count_tokens("Hello, world!")              # exact if tiktoken is installed, else estimated
count_tokens("Hello", model="gpt-4o")      # use a model-specific encoding

# Trim text to fit a token budget (useful before sending context to an API).
# long_document stands for any str you already have in scope.
truncate_to_tokens(long_document, max_tokens=4000)
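
When tiktoken is not installed, the count is an estimate. This README does not describe the heuristic, so the block below is only a sketch of one common rule of thumb (about four characters per token for English text). The names rough_token_estimate and rough_truncate and the ratio itself are assumptions made for illustration, not llmbelt's implementation.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: common rule of thumb for English text


def rough_token_estimate(text: str) -> int:
    """Estimate tokens from the character count (hypothetical helper)."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def rough_truncate(text: str, max_tokens: int) -> str:
    """Cut text so its rough estimate stays within max_tokens."""
    return text[: max_tokens * CHARS_PER_TOKEN]


print(rough_token_estimate("Hello, world!"))  # 13 characters -> 4
```

A character-based estimate like this can drift a long way from real tokenizer output on code or non-English text, which is why an exact count through the optional tiktoken extra may be preferable for OpenAI-family models.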

Estimate cost

from llmbelt import estimate_cost, Price

estimate_cost(input_tokens=1_500, output_tokens=800, model="gpt-4o-mini")   # -> USD

# Bring your own prices (the built-in table is approximate — always verify):
my_prices = {"my-model": Price(input_per_1m=2.0, output_per_1m=6.0)}
estimate_cost(1000, 500, "my-model", pricing=my_prices)
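
The arithmetic behind a per-million-token price can be sketched as below. This assumes the cost is simply linear in the two token counts, which the README does not state outright. ModelPrice and cost_usd are hypothetical stand-ins, not the library's Price or estimate_cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """USD per one million tokens (hypothetical stand-in for Price)."""
    input_per_1m: float
    output_per_1m: float


def cost_usd(input_tokens: int, output_tokens: int, price: ModelPrice) -> float:
    """Linear cost: tokens times the per-million rate, divided by one million."""
    total = input_tokens * price.input_per_1m + output_tokens * price.output_per_1m
    return total / 1_000_000


price = ModelPrice(input_per_1m=2.0, output_per_1m=6.0)
# 1000 * 2.0 + 500 * 6.0 = 5000.0; 5000.0 / 1_000_000 = 0.005
print(cost_usd(1000, 500, price))  # 0.005
```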

Retry with backoff

from llmbelt import retry

@retry(attempts=5, exceptions=(ConnectionError, TimeoutError))
def call_api():
    ...   # retried with exponential backoff + jitter on failure

# Works on async functions too — awaited, with non-blocking asyncio.sleep backoff
@retry(attempts=5, exceptions=(ConnectionError, TimeoutError))
async def call_api_async():
    ...
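
For readers curious about what "exponential backoff + jitter" usually means, here is a minimal synchronous sketch. The parameter names mirror the retry signature in the API reference, but the defaults, the delay formula, and the name retry_sketch are assumptions; llmbelt's actual behaviour may differ.

```python
import functools
import random
import time


def retry_sketch(attempts=3, base_delay=0.5, backoff=2.0, jitter=0.1,
                 exceptions=(Exception,)):
    """Retry a sync function with exponential backoff (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts - 1:
                        raise  # out of attempts: surface the last error
                    # Delay grows as base_delay * backoff**attempt,
                    # plus a random offset in [0, jitter] seconds.
                    delay = base_delay * (backoff ** attempt)
                    time.sleep(delay + random.uniform(0, jitter))
        return wrapper
    return decorator


calls = {"count": 0}


@retry_sketch(attempts=5, base_delay=0.0, jitter=0.0, exceptions=(ConnectionError,))
def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(flaky(), calls["count"])  # ok 3
```

The random offset spreads out clients that failed at the same moment, so they do not all retry in lockstep.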

Extract JSON from a model reply

from llmbelt import extract_json

extract_json('Sure!\n```json\n{"ok": true}\n```')   # -> {"ok": True}
extract_json('The score is {"value": 0.9}.')         # -> {"value": 0.9}
extract_json("no json here", default=None)           # -> None; without default, raises ValueError
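
One way to pull a JSON value out of surrounding chatter is to try the standard library decoder at each opening bracket. The sketch below does that for objects and arrays only. first_json_value and its sentinel are hypothetical names, and nothing here is claimed to match how extract_json works internally.

```python
import json

_MISSING = object()  # sentinel so that default=None is a valid choice


def first_json_value(text, default=_MISSING):
    """Return the first JSON object or array embedded in text."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON from here; try the next bracket
    if default is _MISSING:
        raise ValueError("no JSON value found")
    return default


print(first_json_value('Sure! Here it is: {"ok": true} Hope that helps.'))  # {'ok': True}
print(first_json_value('The score is {"value": 0.9}.'))                     # {'value': 0.9}
print(first_json_value("no json here", default=None))                       # None
```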

Prompt templates

from llmbelt import PromptTemplate

t = PromptTemplate("Translate {text} into {language}.")
t.render(text="hello", language="French")   # "Translate hello into French."
t.render(text="hello")                      # raises KeyError: Missing template variables: ['language']
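
Missing-variable validation of this kind can be built on the standard library's string.Formatter, which reports the field names a template uses. The following is a sketch under that assumption, restricted to simple {name} fields. render_checked is a hypothetical helper, not PromptTemplate itself.

```python
from string import Formatter


def render_checked(template: str, **variables) -> str:
    """Format template, raising a KeyError that lists every missing name."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = sorted(needed - set(variables))
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return template.format(**variables)


print(render_checked("Translate {text} into {language}.",
                     text="hello", language="French"))
# Translate hello into French.
```

Checking up front means the error names every missing variable at once, where plain str.format would stop at the first one.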

Chunk text for RAG

from llmbelt import chunk_text, chunk_by_tokens

chunks = chunk_text(document, chunk_size=1000, overlap=100)
# overlapping chunks so answers aren't split across a boundary

# Budget by tokens instead of characters (exact with tiktoken installed):
chunks = chunk_by_tokens(document, chunk_size=500, overlap=50)
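
Character-based chunking with overlap amounts to a sliding window that advances by chunk_size minus overlap each step. The sketch below shows that idea. The name overlapping_chunks and its handling of edge cases are assumptions, not a description of chunk_text.

```python
def overlapping_chunks(text, chunk_size, overlap):
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(overlapping_chunks("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk ends with the character that starts the next one, so a phrase that straddles a boundary still appears whole in at least one chunk as long as it is no longer than the overlap.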

API reference

Function                                                         Description
count_tokens(text, model=None)                                   Exact (tiktoken) or estimated token count
estimate_tokens(text)                                            Dependency-free heuristic count
truncate_to_tokens(text, max_tokens, model=None)                 Trim text to a token budget
estimate_cost(input_tokens, output_tokens, model, pricing=None)  USD cost estimate
retry(attempts, base_delay, backoff, jitter, exceptions, ...)    Backoff retry decorator (sync and async)
PromptTemplate(template)                                         Templating with missing-variable validation
chunk_text(text, chunk_size, overlap)                            Overlapping text chunks (by character)
chunk_by_tokens(text, chunk_size, overlap, model=None)           Overlapping text chunks (by token budget)
extract_json(text, default=...)                                  Parse the first JSON value out of an LLM reply

Development

git clone https://github.com/YoungAlpaccino/llmbelt
cd llmbelt
pip install -e ".[dev]"
pytest          # run tests
ruff check .    # lint

Publishing to PyPI (maintainer notes)

python -m build
twine upload dist/*

Before first publish: confirm the name llmbelt is free on PyPI. If taken, rename in pyproject.toml, the src/ folder, and imports (a single find-and-replace).


License

MIT — see LICENSE. Use it anywhere, including commercially.

