Skip to main content

Fast and easy-to-use package for data science

Project description

Speedy Utils

PyPI Python Versions License

Speedy Utils is a Python utility library for caching, parallel processing, file I/O, LLM integration, dataset inspection, and image processing. The repo ships multiple importable packages and keeps import time under the repository's 0.4s hook budget by keeping heavy external dependencies lazy.

Table of Contents

Installation

pip install speedy-utils
# or
uv pip install speedy-utils

Install from source:

pip install git+https://github.com/anhvth/speedy_utils
# or
uv pip install git+https://github.com/anhvth/speedy_utils

Local development:

git clone https://github.com/anhvth/speedy_utils
cd speedy_utils
uv sync

Upgrading from older split packages:

pip uninstall speedy_llm_utils speedy_utils
pip install -U speedy-utils

Packages

The wheel currently installs four packages from src/:

Package Purpose
speedy_utils Core utilities: caching, I/O, formatting, parallelism, timing
llm_utils OpenAI-compatible LLM wrappers and chat-format helpers
vision_utils Image loading, plotting, and mmap-backed image datasets
datasets_utils Dataset inspection helpers, including the viz_chat CLI

Core Utilities

memoize and imemoize

from speedy_utils import memoize, imemoize

@memoize
def expensive_function(x):
    import time

    time.sleep(2)
    return x * x


@imemoize
def fast_function(x):
    return x + 1

memoize uses memory, disk, or both. The default disk cache root is ~/.cache/speedy_cache.

@memoize(
    keys=["x"],
    cache_dir="/tmp/my_cache",
    cache_type="both",   # "memory" | "disk" | "both"
    size=512,
    verbose=True,
)
def fn(x, ignored_arg):
    ...

Both decorators support sync and async functions.

multi_thread

from speedy_utils import multi_thread

results = multi_thread(lambda x: x * 2, [1, 2, 3, 4, 5])

Important public options:

multi_thread(
    func,
    inputs,
    workers=None,
    batch=1,
    ordered=True,
    progress=True,
    progress_update=10,
    progress_total=None,
    progress_weight=None,
    prefetch_factor=4,
    timeout=None,
    error_handler="raise",   # "raise" | "ignore" | "log"
    max_error_files=100,
    store_output_pkl_file=None,
    **fixed_kwargs,
)

Error handling:

def process(item):
    if item == 3:
        raise ValueError("bad item")
    return item * 2


multi_thread(process, [1, 2, 3], error_handler="raise")
multi_thread(process, [1, 2, 3], error_handler="ignore")
multi_thread(process, [1, 2, 3], error_handler="log")

error_handler="log" writes rich error reports under .cache/speedy_utils/error_logs/.

multi_process

from speedy_utils import multi_process

results = multi_process(
    func,
    items,
    num_procs=4,
    num_threads=1,
    backend="spawn",      # "spawn" | "fork"
    error_handler="log",  # "raise" | "ignore" | "log"
    progress=True,
    dump_in_thread=True,
    log_worker="first",   # "zero" | "first" | "all"
)

Current behavior worth knowing:

  • num_procs=None normalizes to 1, not automatic process-count detection.
  • num_procs <= 1 and num_threads <= 1 uses a local sequential backend.
  • num_procs <= 1 and num_threads > 1 uses the in-process thread backend.

File I/O

Use load_jsonl() for JSONL and load_json_or_pickle() for .json and pickle.

from speedy_utils import (
    dump_json_or_pickle,
    dump_jsonl,
    jdumps,
    jloads,
    load_by_ext,
    load_json_or_pickle,
    load_jsonl,
)

records = load_jsonl("data/file.jsonl")
records = load_jsonl("data/**/*.jsonl")
records = load_jsonl(["train/*.jsonl", "val/file.jsonl"])

data = load_json_or_pickle("data.json")
data = load_json_or_pickle("data.pkl")

dump_json_or_pickle({"name": "Alice"}, "out.json")
dump_jsonl([{"a": 1}, {"a": 2}], "out.jsonl")

obj = jloads('{"key": "value",}')
text = jdumps(obj)

data = load_by_ext("data.csv")
data = load_by_ext(["part1.jsonl", "part2.jsonl"])

For streaming or compressed JSONL, use fast_load_jsonl directly:

from speedy_utils.common.utils_io import fast_load_jsonl

for record in fast_load_jsonl(
    "data/large.jsonl.gz",
    progress=True,
    on_error="skip",
    max_lines=1000,
    use_orjson=True,
):
    ...

Data, Printing, and Timing Helpers

from speedy_utils import (
    Clock,
    convert_to_builtin_python,
    dedup,
    flatten_dict,
    flatten_list,
    fprint,
    print_table,
    timef,
)

flatten_list([[1, 2], [3, 4]])
flatten_dict({"a": {"b": 1}, "c": 2})
dedup([3, 1, 2, 1, 3])

fprint({"name": "Dana", "scores": [95, 87, 92]})
print_table([{"a": 1, "b": 2}, {"a": 3, "b": 4}])

@timef
def slow_function():
    ...

clock = Clock()

CLI Tools

The installed console scripts are:

CLI Purpose
mpython Launch sharded Python runs across tmux windows
kill-mpython Kill mpython tmux sessions
sp_chat Launch a Chainlit chat UI for an OpenAI-compatible backend
spu-prefetch-large-model Read large model files into the OS page cache
viz_chat Inspect chat datasets from JSON, JSONL, folders, or HF saves
openapi_client_codegen Generate a sync client from an OpenAPI JSON spec

Examples:

mpython -t 8 script.py
kill-mpython

sp_chat client=8000
sp_chat client=http://10.0.0.3:8000/v1 port=5010 model=Qwen/Qwen2.5-7B-Instruct

spu-prefetch-large-model /path/to/model -j 8

viz_chat data/my_dataset.jsonl
viz_chat data/hf_dataset/ --count 5
viz_chat data/tokenized_dataset/ --tokenizer Qwen/Qwen3-8B

openapi_client_codegen openapi.json -o generated_client.py

LLM

llm_utils wraps OpenAI-compatible chat and completion APIs.

LLM main entry points

from llm_utils import LLM

llm = LLM(client=8000)

The three main sync entry points are:

  • chat_completion(...) for chat responses.
  • generate(...) for raw prompt continuation through the completions API.
  • pydantic_parse(...) for structured outputs.

The convenience llm(...) wrapper routes like this:

  • llm("prompt") -> chat_completion(...)
  • llm("prompt", response_model=MyModel) -> pydantic_parse(...)
  • llm("prompt", return_dict=True) -> normalized dict with raw artifacts

Basic chat completion

from llm_utils import LLM

llm = LLM(model="gpt-4o-mini")
message = llm("What is Python?")
print(message.content)

Equivalent explicit call:

message = llm.chat_completion(
    [
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "What is Python?"},
    ]
)

Structured output with Pydantic

from pydantic import BaseModel
from llm_utils import LLM


class Sentiment(BaseModel):
    sentiment: str
    confidence: float


llm = LLM(model="gpt-4o-mini")
result = llm.pydantic_parse(
    "Return JSON for the sentiment of: I love this product!",
    response_model=Sentiment,
)
print(result.sentiment, result.confidence)

Normalized dict output

result = llm(
    "Return JSON for the sentiment of: I love this product!",
    response_model=Sentiment,
    return_dict=True,
)

print(result.keys())
# dict_keys(["completion", "message", "messages", "parsed"])

Streaming chat responses

Streaming is only supported for text completions, not Pydantic parsing.

from llm_utils import LLM

llm = LLM(model="gpt-4o-mini")

for chunk in llm("Tell me a story", stream=True):
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

Raw prompt continuation with generate()

generate() uses the completions API and returns an OpenAI CompletionChoice-like object.

choice = llm.generate(
    "Write a haiku about coding:",
    max_tokens=50,
    temperature=0.8,
)

print(choice.text)
print(choice.finish_reason)
print(choice.usage.total_tokens)

Current public behavior:

  • generate() expects prompt to be a string.
  • n=1 only; multi-choice generation is rejected.
  • backend-specific metadata such as token_ids or prompt_logprobs is kept when the backend returns it.

Client configuration

from llm_utils import LLM
from openai import OpenAI

llm = LLM(
    client=OpenAI(base_url="http://localhost:8000/v1", api_key="sk-..."),
    model="llama-3",
)

llm = LLM(client=8000, model="llama-3")
llm = LLM(client="http://localhost:8000/v1", model="llama-3")
llm = LLM(client=[8000, 8001, 8002], model="llama-3")

Caching and history inspection

llm = LLM(model="gpt-4o-mini", cache=True)

message = llm("What is 2+2?")
again = llm("What is 2+2?")
fresh = llm("What is 2+2?", cache=False)

history = llm.inspect_history()

inspect_history() returns the recent conversation that was recorded for the last response.

LLMSignature

LLMSignature binds a Signature class to default structured output.

from llm_utils import Input, LLMSignature, Output, Signature


class SentimentSignature(Signature):
    text: str = Input("Text to analyze")
    sentiment: str = Output("positive | negative | neutral")
    confidence: float = Output("Confidence score")


sig = LLMSignature(signature=SentimentSignature, model="gpt-4o-mini")
result = sig("Analyze: I love this!")
print(result.sentiment, result.confidence)

Qwen3LLM

Qwen3LLM adds staged prefix continuation for Qwen3-style reasoning flows.

Standard chat path:

from llm_utils import Qwen3LLM

llm = Qwen3LLM(client=8000)
message = llm.chat_completion(
    [{"role": "user", "content": "Solve x^2 + 2x + 1 = 0"}],
    thinking_max_tokens=32,
    content_max_tokens=128,
)

print(message.content)
print(getattr(message, "reasoning_content", None))
print(getattr(message, "call_count", None))

Custom staged prefix flow:

memory_state = llm.complete_until(
    [{"role": "user", "content": "Plan the answer in stages"}],
    "<memory>",
    stop="</memory>",
    max_tokens=128,
)

think_state = llm.complete_until(
    [{"role": "user", "content": "Plan the answer in stages"}],
    memory_state.assistant_prompt_prefix + "\n<think_efficient>",
    stop="</think_efficient>",
    max_tokens=256,
)

final_state = llm.complete_until(
    [{"role": "user", "content": "Plan the answer in stages"}],
    think_state.assistant_prompt_prefix,
    stop="<|im_end|>",
    max_tokens=256,
)

print(final_state.generated_text)
print(final_state.assistant_prompt_prefix)
print(final_state.call_count)

complete_until() returns a continuation state object, not a ChatCompletionMessage.

Dataset Tools

datasets_utils.viz_chat is a lightweight dataset inspector for conversation data.

Supported inputs:

  • HuggingFace datasets saved with save_to_disk()
  • JSONL files
  • JSON files containing one object or a list of objects
  • Folders of JSON files
  • tokenized datasets when --tokenizer is provided

Examples:

viz_chat data/my_dataset
viz_chat data/conversations.jsonl
viz_chat data/sharegpt.jsonl --format sharegpt
viz_chat data/tokenized_dataset/ --tokenizer Qwen/Qwen3-8B
viz_chat data/with_tools.jsonl --show-tools

Vision Utils

vision_utils exports:

  • read_images
  • read_images_cpu
  • read_images_gpu
  • plot_images_notebook
  • ImageMmap
  • ImageMmapDynamic

Image loading

The image loaders return a dict mapping each input path to a NumPy array or None on failure.

from vision_utils import read_images, read_images_cpu, read_images_gpu

paths = ["img1.jpg", "img2.png"]

images = read_images(paths)
cpu_images = read_images_cpu(paths)
gpu_images = read_images_gpu(paths)

first_image = images[paths[0]]

Notebook plotting

plot_images_notebook() accepts NumPy arrays, PyTorch tensors, lists, or tuples of image arrays. If you loaded images with read_images*, pass the values.

from vision_utils import plot_images_notebook, read_images

paths = ["img1.jpg", "img2.png"]
images = read_images(paths)

plot_images_notebook(list(images.values()))

The current defaults include dpi=300, automatic grid sizing, and automatic format normalization for (H, W), (H, W, C), (C, H, W), (B, H, W, C), and (B, C, H, W) inputs.

Mmap-backed datasets

Both mmap dataset classes take image paths, not a prebuilt mmap filename as the only positional argument.

from vision_utils import ImageMmap, ImageMmapDynamic

paths = ["img1.jpg", "img2.jpg"]

fixed = ImageMmap(paths, size=(224, 224))
dynamic = ImageMmapDynamic(paths)

img = fixed[0]
img2 = dynamic[0]

Testing and Checks

# Run all tests with xdist workers
./tools/uv_test.sh -n 32

# Single test file
./tools/uv_test.sh tests/test_thread.py

# Verbose
./tools/uv_test.sh -v

# Check import-time budget
uv run python scripts/debug_import_time.py speedy_utils llm_utils vision_utils \
    --max-total-sec 0.4 --top 12 --min-sec 0.01 --no-stdlib

# Type checking
uv run python tools/check_syntax.py

# Ruff
uv run ruff check .
uv run ruff format .

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

speedy_utils-2.0.5.tar.gz (736.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

speedy_utils-2.0.5-py3-none-any.whl (141.5 kB view details)

Uploaded Python 3

File details

Details for the file speedy_utils-2.0.5.tar.gz.

File metadata

  • Download URL: speedy_utils-2.0.5.tar.gz
  • Upload date:
  • Size: 736.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for speedy_utils-2.0.5.tar.gz
Algorithm Hash digest
SHA256 143b2be7e817f1f8dd9966754db6b2b1fa526b1769ef531184dd2ed4dc484f69
MD5 9afab86dae43f7fbcb8f6bcc23f1eefd
BLAKE2b-256 bcbb8615b9d014a3ec214bbe0ae5ad82181148b163518bb3a0cb40c21ec1467b

See more details on using hashes here.

File details

Details for the file speedy_utils-2.0.5-py3-none-any.whl.

File metadata

  • Download URL: speedy_utils-2.0.5-py3-none-any.whl
  • Upload date:
  • Size: 141.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for speedy_utils-2.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 1c25126650c9106930592da0580dd2f9753d021b3589356ee4652a247636ed81
MD5 1e77564b1e64ef463cd04197e5c2c5c0
BLAKE2b-256 cc2422c3d93e4f4859c1ef3c3b7f0414e45314db2c689b96f34895ac637d8d71

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page