Stop OOM crashes in vLLM, SGLang, Unsloth, and HuggingFace. Proactive memory estimation and runtime KV-cache monitoring for LLM inference serving and fine-tuning on Apple Silicon, CUDA, and CPU.
memory-guard
Stop "No available memory for cache blocks" in vLLM. Stop "CUDA out of memory" in Unsloth. Stop frozen Macs in mlx_lm. Works across inference serving and fine-tuning — and learns optimal configs from experience.
No more gpu_memory_utilization trial-and-error. No more KV cache crashes at 3 AM. No more wasted GPU-hours on jobs that OOM in the first minute.
→ How much is your team wasting on OOM crashes? Find your number.
```bash
pip install ml-memguard            # core (zero dependencies)
pip install ml-memguard[hf]        # + HuggingFace Transformers adapter
pip install ml-memguard[unsloth]   # + Unsloth adapter
pip install ml-memguard[apple]     # + MLX Metal ground-truth monitoring
pip install ml-memguard[cuda]      # + CUDA OOM recovery
pip install ml-memguard[vllm]      # + vLLM inference serving adapter
pip install ml-memguard[sglang]    # + SGLang inference serving adapter
```
The Problem
Inference serving (vLLM / SGLang / Ollama)
"No available memory for the cache blocks. Try increasing gpu_memory_utilization" is vLLM's most-filed error. It appears when --gpu-memory-utilization, --max-num-seqs, and --max-num-batched-tokens are misconfigured — which they almost always are on first deploy. There is no formula; the official advice is to tune until it stops crashing. When it does crash mid-serving, it takes live user traffic down with it.
- vLLM has 30+ distinct numbered OOM issues. SGLang has a dedicated OOM tracking issue. Ollama makes the host unresponsive.
- KV cache grows linearly with context length × batch size × layers. A 128k-context Llama 3 70B needs ~40 GB of KV cache on top of ~140 GB for weights.
- There is no built-in tool that tells you the right `max_num_seqs` before you launch.
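The KV-cache figure above can be checked with the standard formula (a K and a V tensor per layer, per token). The helper below is illustrative, not part of the library; it uses Llama 3 70B's public shape (80 layers, GQA with 8 KV heads, head dim 128) and an fp16 cache:

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_elem=2):
    """K and V tensors, per layer, per token (fp16 cache by default)."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * num_tokens

# Llama 3 70B: 80 layers, 8 KV heads, head dim 128, 128k-token context
total = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128,
                       num_tokens=128 * 1024)
print(f"{total / 2**30:.1f} GiB")  # → 40.0 GiB
```

The same arithmetic explains why KV cache grows linearly in each of context length, batch size, and layer count: every factor enters the product exactly once.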
Fine-tuning (Unsloth / HuggingFace / mlx_lm)
- Apple Silicon: No OOM exception exists. When you exceed memory, macOS silently swaps to disk, your Mac freezes for minutes, and eventually the OS kills your process.
- CUDA: `torch.cuda.OutOfMemoryError` crashes your training run. You restart, guess a smaller batch size, and pray.
- Containers: cgroups silently kill your process with no warning when you hit the memory limit.
Existing solutions (PyTorch Lightning BatchSizeFinder, HuggingFace accelerate) are CUDA-only and reactive — they catch OOM exceptions that don't exist on Apple Silicon and do nothing for inference servers.
The Solution
memory-guard is proactive, not reactive. For inference: it calculates the safe max_num_seqs before you launch the server and monitors KV cache utilization at runtime. For fine-tuning: it estimates peak memory before training starts and auto-adjusts batch size, LoRA rank, and sequence length to fit. Both paths use an RL optimizer that learns your specific device over time.
Inference serving (vLLM)
```python
from memory_guard import guard_vllm
from vllm import LLM

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", gpu_memory_utilization=0.9)
safe = guard_vllm(llm)
# InferenceSafeConfig:
#   max_num_seqs: 32       <- largest concurrent batch that fits
#   max_seq_len: 4096
#   estimated memory: 18,240 MB
#   budget: 19,456 MB

# Wire the KV cache monitor — fires on_shed_load at 92% utilization
safe.monitor.on_shed_load = lambda u: load_balancer.reduce_weight("primary", 0)
safe.monitor.on_warning = lambda u: logger.warning("KV cache at %.0f%%", u * 100)

with safe.monitor.session():
    server.serve_forever()  # monitor runs in a background thread
```
Fine-tuning (Unsloth / HuggingFace / mlx_lm)
```python
from memory_guard import MemoryGuard

guard = MemoryGuard.auto()

# Pre-flight: estimate memory and auto-downgrade config
safe = guard.preflight(
    model_params=9_000_000_000,  # 9B parameter model
    model_bits=4,                # 4-bit quantized
    hidden_dim=4096,
    num_heads=32,
    num_layers=32,
    batch_size=4,
    seq_length=2048,
    lora_rank=32,
    lora_layers=16,
)
print(safe)
# SafeConfig (FITS):
#   batch_size: 2          <- auto-reduced from 4
#   grad_checkpoint: True  <- auto-enabled
#   grad_accumulation: 4   <- compensates for smaller batch
#   estimated memory: 3835 MB
#   budget: 4643 MB

# Runtime monitoring: polls memory pressure every 5s
with guard.monitor(safe.batch_size) as mon:
    for step in range(1000):
        # Batch size may decrease mid-training if pressure rises
        train_step(batch_size=mon.current_batch_size)
```
Features
Inference serving
| Feature | vLLM | SGLang | Ollama / custom |
|---|---|---|---|
| Safe `max_num_seqs` pre-flight | Yes | Yes | Yes (via `preflight_inference`) |
| KV cache utilization monitoring | Yes | Yes | Yes (KVCacheMonitor) |
| Load-shed signal at 92% KV utilization | Yes | Yes | Yes |
| Warning signal at 80% KV utilization | Yes | Yes | Yes |
| RL optimizer (learns per device/model) | Yes | Yes | Yes |
| Architecture auto-introspection | Yes (hf_config) | Yes (server_args) | Manual |
Fine-tuning
| Feature | Apple Silicon | CUDA | Linux CPU | Windows |
|---|---|---|---|---|
| Proactive memory estimation | Yes | Yes | Yes | Yes |
| Auto-downgrade config | Yes | Yes | Yes | Yes |
| RL optimizer (learns per device) | Yes | Yes | Yes | Yes |
| Runtime pressure monitoring | Yes (Mach kernel + MLX Metal) | Yes (torch.cuda) | Yes (PSI, cgroups) | Yes (kernel32) |
| MLX Metal ground-truth | Yes (mx.metal.get_active_memory) | N/A | N/A | N/A |
| OOM catch & retry | N/A (no OOM on Metal) | Yes | N/A | N/A |
| Container-aware (cgroups v1/v2) | N/A | Yes | Yes | N/A |
| Auto-calibration | Yes | Yes | Yes | Yes |
| FlashAttention-aware | Yes | Yes | Yes | Yes |
| GQA / MoE / Multi-modal | Yes | Yes | Yes | Yes |
How It Works
Inference serving path
preflight_inference() computes the memory footprint of model weights + KV cache at a given max_num_seqs and max_seq_len, then binary-searches for the largest concurrent batch that fits within your GPU budget (default: 80% of available VRAM). The RL optimizer learns which max_num_seqs value worked well on your device and model over time, replacing binary search with a confident recommendation after a handful of runs.
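The search step can be sketched as follows. `fits` here is a hypothetical predicate standing in for the memory estimate at a given `max_num_seqs`; this illustrates the binary search, not the library's internals:

```python
def largest_fitting_seqs(fits, lo=1, hi=256):
    """Binary-search the largest max_num_seqs for which fits(n) is True.
    Assumes fits is monotone: if n fits, every smaller n fits too."""
    if not fits(lo):
        return 0  # not even a single sequence fits
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias upward so the loop terminates
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Toy budget: 14 GB of weights plus 0.3 GB of KV cache per sequence, 24 GB total
print(largest_fitting_seqs(lambda n: 14 + 0.3 * n <= 24))  # → 33
```

Because the cost is monotone in the number of sequences, the search needs only O(log n) estimate evaluations.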
KVCacheMonitor runs a background thread polling the live KV cache token counts from vLLM or SGLang. At 80% utilization it fires on_warning; at 92% it fires on_shed_load. Neither callback does anything by default — they are signals. Your load balancer or health endpoint decides what to do (reduce upstream weight, return 503, etc.). The engine is never mutated while serving.
Fine-tuning path
1. Proactive Estimation
Calculates peak memory from model architecture, accounting for:
- Per-projection LoRA input buffers (Q, K, V, O)
- FlashAttention O(n) vs standard O(n^2) attention scores
- GQA-aware KV cache (uses `num_kv_heads`, not `num_heads`)
- MoE routing buffers and active expert activations
- Optimizer states (Adam 3x, SGD 2x, Adafactor 1.5x)
- MLX lazy evaluation discount (20% reduction on Apple Silicon)
- Framework overhead (25% proportional + 400MB fixed runtime cost)
With gradient checkpointing, the number of layers whose activations are held simultaneously drops from `layers` to roughly `sqrt(layers)`.
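As a toy illustration of that scaling (the per-layer size is invented; this is not the library's estimator):

```python
import math

def activation_mb(per_layer_mb, num_layers, grad_checkpoint=False):
    """Activations held simultaneously: every layer without checkpointing,
    roughly sqrt(num_layers) checkpointed segments with it."""
    live = math.sqrt(num_layers) if grad_checkpoint else num_layers
    return per_layer_mb * live

print(activation_mb(64, 32))                               # → 2048
print(round(activation_mb(64, 32, grad_checkpoint=True)))  # → 362
```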
2. Auto-Downgrade (quality-preserving order)
When the estimate exceeds the budget (available × 80%):
- Enable gradient checkpointing (free quality, ~40% activation savings)
- Halve batch size (compensate with gradient accumulation)
- Halve sequence length
- Halve LoRA rank
- Halve LoRA layers
3. Runtime Monitoring
Background thread polls memory pressure every 5 seconds:
- Apple Silicon: `mx.metal.get_active_memory()` (ground-truth from the Metal allocator), with `kern.memorystatus_level` as fallback. Detects the monotonic memory-growth pattern from mlx-examples#1262.
- CUDA: `torch.cuda.memory_allocated()` vs total VRAM
- Linux: `/proc/pressure/memory` (PSI), cgroup-aware (`memory.high` preferred over `memory.max`)
- Windows: `GlobalMemoryStatusEx`
When pressure exceeds 85%, batch size is halved mid-training.
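The polling loop can be sketched as a context manager. This is a simplified stand-in for the real RuntimeMonitor; `read_pressure` is a hypothetical callable returning a 0.0-1.0 reading:

```python
import threading

class PressureMonitor:
    """Background poller that halves current_batch_size when pressure
    exceeds the threshold. Illustrative sketch only."""
    def __init__(self, batch_size, read_pressure, threshold=0.85, interval=5.0):
        self.current_batch_size = batch_size
        self._read = read_pressure
        self._threshold = threshold
        self._interval = interval
        self._stop = threading.Event()

    def _run(self):
        # Event.wait doubles as the poll timer and the shutdown signal
        while not self._stop.wait(self._interval):
            if self._read() > self._threshold and self.current_batch_size > 1:
                self.current_batch_size //= 2

    def __enter__(self):
        threading.Thread(target=self._run, daemon=True).start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
```

A training loop would read `mon.current_batch_size` each step, as in the examples above.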
4. Auto-Calibration
After each training run, the actual peak memory (from mx.metal.get_peak_memory() or torch.cuda.max_memory_allocated()) is recorded alongside the formula estimate. After 3+ runs, a median correction factor is applied to future estimates, narrowing the gap between predicted and actual memory usage over time.
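The median-correction step is simple enough to show directly. A sketch, assuming the history is a list of (estimated, actual) pairs (the function and numbers are invented, not the library's calibration store):

```python
from statistics import median

def correction_factor(history, min_runs=3):
    """history: (estimated_mb, actual_mb) pairs from past runs. Once
    min_runs exist, return the median actual/estimated ratio; before
    that, 1.0 (no correction)."""
    if len(history) < min_runs:
        return 1.0
    return median(actual / est for est, actual in history)

runs = [(6000, 7000), (8000, 8400), (5000, 5600)]  # invented numbers
print(round(correction_factor(runs), 2))  # → 1.12
```

Using the median rather than the mean keeps a single outlier run (e.g. one with an unusual workload) from skewing future estimates.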
5. RL Optimizer (v0.4)
A contextual bandit that learns which (batch_size, lora_rank) combination works best on your specific device and model. On cold start it falls back to the binary-search path from step 2. After a handful of runs it starts recommending configs it has learned are safe and efficient — and still falls back to binary search on the 5% exploration floor so novel model architectures always get probed.
```python
guard = MemoryGuard.auto()   # loads ~/.memory-guard/rl_policy.json from disk
safe = guard.preflight(...)  # bandit recommends once it has learned; binary search until then
# ... training loop ...
guard.record_result(
    actual_peak_mb=get_peak_memory(),
    oom_occurred=False,  # set True if training crashed with OOM
)
# → updates the Q-table and saves the policy file atomically
```
The policy is a plain JSON file — human-readable, editable, and deletable. See docs/rl_optimizer.md for the full reference and docs/decisions/004-rl-contextual-bandit.md for the design rationale.
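The epsilon-greedy mechanism behind this can be sketched in a few lines. The helpers below are illustrative (state keys, actions, and the learning rate are invented), not the BanditPolicy source:

```python
import random

def recommend(q_table, state_key, epsilon=0.05):
    """Epsilon-greedy lookup over a dict-of-dicts Q-table. On the 5%
    exploration floor, or for an unseen state, return None to signal
    the binary-search fallback."""
    if random.random() < epsilon or state_key not in q_table:
        return None
    return max(q_table[state_key], key=q_table[state_key].get)

def update(q_table, state_key, action, reward, lr=0.3):
    """Move the Q-value estimate toward the observed reward."""
    q = q_table.setdefault(state_key, {}).setdefault(action, 0.0)
    q_table[state_key][action] = q + lr * (reward - q)

policy = {}
update(policy, "m4max|9B|4bit", (2, 16), reward=1.0)    # ran fast, no OOM
update(policy, "m4max|9B|4bit", (4, 32), reward=-1.0)   # OOM'd
print(recommend(policy, "m4max|9B|4bit", epsilon=0.0))  # → (2, 16)
```

An OOM run pushes its action's Q-value negative, so that config stops being recommended, which matches the `oom_occurred=True` behaviour described above.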
Framework Integration
With vLLM
```python
from memory_guard import guard_vllm
from vllm import LLM

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", gpu_memory_utilization=0.9)
safe = guard_vllm(llm)
# safe.max_num_seqs → pass to --max-num-seqs
# safe.monitor      → KVCacheMonitor, ready to start

safe.monitor.on_shed_load = lambda u: load_balancer.reduce_weight("primary", 0)
safe.monitor.on_warning = lambda u: logger.warning("KV cache %.0f%%", u * 100)

with safe.monitor.session():
    server.serve_forever()
```
guard_vllm reads architecture directly from model_config.hf_config — no manual hidden_dim or num_layers required.
With SGLang
```python
from memory_guard import guard_sglang
from sglang import Runtime

runtime = Runtime(model_path="meta-llama/Llama-3.1-8B-Instruct")
safe = guard_sglang(runtime)

safe.monitor.on_shed_load = lambda u: nginx.upstream_weight("primary", 0)
with safe.monitor.session():
    runtime.wait()
```
Polls token_to_kv_pool (preferred) or scheduler.get_stats() as fallback. Rolling-max smoothing suppresses false-recovery signals from RadixAttention prefix-cache evictions.
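The rolling-max smoothing mentioned above can be sketched as a small helper (hypothetical class, not the adapter's code): reporting the maximum over a short window means a transient dip in KV utilization, such as a prefix-cache eviction, does not read as genuine recovery.

```python
from collections import deque

class RollingMax:
    """Report the max utilization seen over the last `window` polls."""
    def __init__(self, window=6):
        self._recent = deque(maxlen=window)

    def observe(self, utilization):
        self._recent.append(utilization)
        return max(self._recent)

smooth = RollingMax(window=3)
print([smooth.observe(u) for u in (0.90, 0.95, 0.60, 0.70)])
# → [0.9, 0.95, 0.95, 0.95]
```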
With mlx_lm (Apple Silicon)
```python
import mlx.optimizers as optim
from memory_guard import MemoryGuard
from mlx_lm import load
from mlx_lm.tuner.trainer import train, TrainingArgs
from mlx_lm.tuner.utils import linear_to_lora_layers

guard = MemoryGuard.auto()
model, tokenizer = load("mlx-community/Qwen3.5-9B-MLX-4bit")

safe = guard.preflight(
    model_params=9e9, model_bits=4,
    hidden_dim=4096, num_heads=32, num_layers=32,
    batch_size=4, seq_length=2048,
    lora_rank=32, lora_layers=16,
)

model.freeze()
linear_to_lora_layers(
    model, safe.lora_layers,
    {"rank": safe.lora_rank, "scale": 20.0, "dropout": 0.0},
)
optimizer = optim.Adam(learning_rate=1e-4)

# The monitor runs in the background and logs if memory pressure rises.
# Note: mlx_lm's train() uses a fixed batch size. For dynamic adjustment,
# use a custom training loop that reads mon.current_batch_size each step.
with guard.monitor(safe.batch_size) as mon:
    train(
        model=model, optimizer=optimizer, train_dataset=train_set,
        args=TrainingArgs(
            batch_size=safe.batch_size,
            iters=1000,
            max_seq_length=safe.seq_length,
            grad_checkpoint=safe.grad_checkpoint,
            adapter_file="adapters.safetensors",
        ),
    )
```
With HuggingFace Transformers (CUDA / CPU)
One call reads the model's architecture, runs preflight, patches trainer.args,
and attaches MemoryGuardCallback for mid-training batch-size downgrade.
```python
from memory_guard import guard_trainer
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./output", max_steps=1000),
    train_dataset=train_set,
)

guard_trainer(trainer)  # reads model, runs preflight, patches args + callback
trainer.train()
```
Pass preflight_overrides to lock in specific values the adapter can't infer
(e.g. guard_trainer(trainer, batch_size=8, seq_length=4096, lora_rank=32)).
With Unsloth
Three lines — no manual architecture spelunking required. guard_unsloth_model
introspects the loaded model, runs preflight, and returns a SafeConfig you
thread directly into get_peft_model.
```python
from memory_guard import guard_unsloth_model, guard_sft_trainer
from unsloth import FastLanguageModel
from trl import SFTTrainer

# 1. Load model (before LoRA)
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# 2. Preflight — reads architecture automatically, auto-downgrades if needed
safe = guard_unsloth_model(model)  # ← the one line that replaces all the math

# 3. Attach LoRA using the safe values
model = FastLanguageModel.get_peft_model(
    model,
    r=safe.lora_rank,
    lora_alpha=safe.lora_rank * 2,
    max_seq_length=safe.seq_length,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# 4. Train with mid-training downgrade protection
trainer = SFTTrainer(model=model, tokenizer=tokenizer, ...)
guard_sft_trainer(trainer)  # patches trainer.args + adds MemoryGuardCallback
trainer.train()
```
BnB double-quantization: Unsloth loads with `bnb_4bit_use_double_quant=True` by default. memory-guard detects this and applies a 5% correction to the weight-memory estimate. Auto-calibration refines the correction after 3+ training runs.
Framework Adapters
New in v0.2.0 — adapters read the model's architecture automatically so you
don't have to look up hidden_size, num_heads, or num_layers.
Inference serving adapters added in v0.3.0; RL optimizer integrated in v0.4.0.
Full reference: docs/adapters.md.
How model introspection works
introspect_model(model) reads directly from model.config and
model.parameters() without importing torch or transformers at the call site:
| Field read | Source |
|---|---|
| `hidden_size` | `model.config.hidden_size` |
| `num_attention_heads` | `model.config.num_attention_heads` |
| `num_hidden_layers` | `model.config.num_hidden_layers` |
| `num_key_value_heads` | `model.config.num_key_value_heads` (falls back to `num_attention_heads` for MHA) |
| `model_bits` | `quantization_config.load_in_4bit` / `load_in_8bit`, else `model.dtype` (fp16/bf16 → 16, fp32 → 32) |
| `num_parameters` | `sum(p.numel() for p in model.parameters())` |
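The fallback behaviour in the table can be mimicked with plain `getattr`, which is also how a duck-typed reader avoids importing transformers. A sketch, not the actual `introspect_model` source:

```python
from types import SimpleNamespace

def introspect_config(config):
    """Duck-typed read of a HF-style config object. Illustrative only."""
    heads = config.num_attention_heads
    return {
        "hidden_dim": config.hidden_size,
        "num_heads": heads,
        "num_layers": config.num_hidden_layers,
        # GQA models carry num_key_value_heads; pure MHA falls back to num_heads
        "num_kv_heads": getattr(config, "num_key_value_heads", heads),
    }

# A GQA config in the shape of Llama 3 8B:
cfg = SimpleNamespace(hidden_size=4096, num_attention_heads=32,
                      num_hidden_layers=32, num_key_value_heads=8)
print(introspect_config(cfg)["num_kv_heads"])  # → 8
```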
When to pass preflight_overrides
Introspected values cover most cases. Override when:
| Scenario | Pass |
|---|---|
| Specific training batch size | batch_size=8 |
| Non-default sequence length | seq_length=4096 |
| Fixed LoRA config | lora_rank=32, lora_layers=24 |
| Model loaded at different precision | model_bits=16 |
```python
# Override batch_size and lora_rank; everything else is introspected
guard_trainer(trainer, batch_size=8, lora_rank=32)
safe = guard_unsloth_model(model, seq_length=4096, lora_rank=16)
```
QLoRA with BnB double-quantization
When Unsloth (or any HF model) loads with bnb_4bit_use_double_quant=True,
guard_unsloth_model automatically applies a 5% correction to the weight-memory
estimate. model_bits stays 4; only num_parameters is scaled down to proxy
the reduced quantization-constant footprint. After 3+ runs, auto-calibration
refines the correction further.
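The correction arithmetic can be illustrated as follows. This is a sketch: the real adapter scales `num_parameters` internally, and the function name and flag here are invented for the example:

```python
def weight_memory_mb(num_params, model_bits, double_quant=False):
    """Weight footprint in MB; the 5% double-quant correction is applied
    by scaling the parameter count down, as described above."""
    if double_quant:
        num_params = int(num_params * 0.95)
    return num_params * model_bits / 8 / 2**20

print(round(weight_memory_mb(8e9, 4)))                     # → 3815
print(round(weight_memory_mb(8e9, 4, double_quant=True)))  # → 3624
```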
API Reference
Inference Serving (v0.3+)
guard.preflight_inference(...) -> InferenceSafeConfig
Find the largest max_num_seqs that fits in your GPU budget. Binary-searches from your requested max down to 1; uses the RL optimizer if it has learned this device/model combination.
```python
safe = guard.preflight_inference(
    model_params=8e9, model_bits=4,
    hidden_dim=4096, num_kv_heads=8, num_layers=32,
    max_seq_len=4096, max_num_seqs=256,
)
# safe.max_num_seqs → int, largest safe concurrent batch
# safe.monitor      → KVCacheMonitor (not yet started)
```
guard_vllm(llm, ...) -> InferenceSafeConfig
One call. Reads architecture from model_config.hf_config, runs preflight, returns InferenceSafeConfig. No manual config spelunking.
guard_sglang(runtime, ...) -> InferenceSafeConfig
Same as guard_vllm for SGLang. Reads server_args.context_length and server_args.max_running_requests.
KVCacheMonitor
Background-thread monitor. Fires on_warning at 80% KV utilization, on_shed_load at 92%. Start with safe.monitor.session() context manager or safe.monitor.start() / safe.monitor.stop() directly.
Fine-Tuning
MemoryGuard.auto(safety_ratio=0.80)
Create with auto-detected platform. safety_ratio controls headroom (0.80 = use 80% of available).
guard.preflight(**config) -> SafeConfig
Estimate memory and auto-downgrade. Returns safe config.
guard.monitor(batch_size) -> RuntimeMonitor
Context manager for runtime monitoring. Use mon.current_batch_size in training loop.
guard.estimate(**config) -> MemoryEstimate
Pure estimation without auto-downgrade.
estimate_training_memory(**config) -> MemoryEstimate
Standalone estimation function.
auto_downgrade(budget_mb, **config) -> DowngradeResult
Standalone downgrade function.
CUDAOOMRecovery(initial_batch_size)
CUDA-specific OOM catch-and-retry wrapper.
RL Optimizer (v0.4)
guard.record_result(actual_peak_mb=None, oom_occurred=False, policy_update=True, model_name="")
Call after each training run to update the calibration store and the RL
policy. actual_peak_mb is auto-detected from MLX/CUDA if not supplied.
Set oom_occurred=True if the run ended with OOM — the policy learns a
negative reward and avoids that config in future.
BanditPolicy.load(path=None) -> BanditPolicy
Load the policy from disk (defaults to ~/.memory-guard/rl_policy.json).
Returns a fresh cold-start policy silently if the file is absent or corrupt.
BanditPolicy.q_value(state_key, action) -> float
Read the current Q-value for a (StateKey, ConfigAction) pair (0.0 for
unseen entries).
StateKey.from_values(available_mb, backend, model_params, model_bits)
Convenience constructor for use with BanditPolicy.q_value() and
BanditPolicy.update() directly.
Full reference: docs/rl_optimizer.md.
Fine-Tuning Adapters (v0.2, pip install ml-memguard[hf])
guard_trainer(trainer, guard=None, **preflight_overrides) -> SafeConfig
Attach memory-guard to a HuggingFace Trainer in one call. Introspects the
model, runs preflight, writes safe values to trainer.args, and appends
MemoryGuardCallback.
MemoryGuardCallback(guard)
TrainerCallback subclass. on_train_begin starts the monitor;
on_step_begin records a pending batch-size downgrade when the monitor signals
pressure; on_epoch_begin applies it (scales gradient_accumulation_steps to
preserve effective batch); on_train_end stops the monitor and records
calibration data.
guard_unsloth_model(model, guard=None, **preflight_overrides) -> SafeConfig
Run preflight on an Unsloth model before FastLanguageModel.get_peft_model is
called. Thread safe.lora_rank, safe.lora_layers, safe.seq_length into
get_peft_model. Detects BnB double-quantization and applies a 5% correction.
guard_sft_trainer(trainer, guard=None, **preflight_overrides) -> SafeConfig
Identical to guard_trainer but named for TRL SFTTrainer workflows.
Design constraint (ADR 003): guard_vllm and guard_sglang emit signals only — they never mutate a running engine. Load-shedding requires a load balancer or health endpoint in front of the engine. See docs/adapters.md for the full reference.
Estimation Accuracy
Measured accuracy on real training runs. We need your help expanding this table — see Contributing below.
| Model | Device | Batch | Seq | Rank | Estimated | Actual | Error |
|---|---|---|---|---|---|---|---|
| Qwen3.5-9B-4bit | M4 Max 36GB | 1 | 512 | 8 | 6,193 MB | 7,048 MB | 12.1% under |
| Qwen3.5-9B-4bit | M4 Max 36GB | 1 | 128 | 16 | 9,522 MB | 8,879 MB | 7.2% over |
What's tested:
- LoRA fine-tuning with mlx_lm on Apple Silicon (M4 Max)
- HuggingFace `Trainer` + `MemoryGuardCallback` end-to-end (distilgpt2, CPU, fp32) — integration smoke test passes in CI
- BnB 4-bit + double-quantization detection and 5% correction (unit-tested against mock models)
What's NOT tested yet:
- CUDA GPUs (RTX 3060/4090, A100, H100)
- AMD ROCm (RX 7900, MI300X)
- Smaller devices (M1/M2 MacBook Air 8-16GB)
- Models below 7B or above 13B
- MoE architectures (Mixtral, DeepSeek-MoE)
- Multi-modal models (LLaVA, Qwen-VL)
- DoRA, full fine-tuning
- PyTorch Lightning, Axolotl, LitGPT
The estimation formula is based on published research (FlashAttention, HyC-LoRA, LoRA-FA) and verified on one configuration. Auto-calibration improves accuracy after 3+ runs on any given setup.
Known Limitations
- Single validation point: Estimation accuracy is verified on one model/device combination. Your results may differ significantly — please report them.
- Inference monitoring: `KVCacheMonitor` (v0.3.0) emits signals only — it never mutates a running vLLM or SGLang engine. Load-shedding requires a load balancer or health-endpoint pattern in front of the engine.
- Calibration cold start: Auto-calibration needs 3+ training runs on a given device before corrections kick in.
- Custom kernels: Frameworks with heavily fused kernels (Unsloth) use less memory than the formula predicts. Calibration corrects this over time.
- MLX Metal thread safety: `mx.metal.get_active_memory()` is called from a background thread. MLX's Metal backend has known thread-safety limitations. Memory counter reads work in practice but aren't guaranteed thread-safe by the MLX API.
- Windows: The CUDA path uses well-tested `torch.cuda` APIs. The CPU-only fallback (`GlobalMemoryStatusEx`) hasn't been validated across Windows versions.
Contributing
Help Us Benchmark
The single most valuable contribution right now is running the benchmark on your hardware and sharing the results. This directly improves estimation accuracy for everyone.
```bash
# Install
pip install ml-memguard mlx-lm

# Run with default small model (fast, ~2 minutes)
python bench/bench_accuracy.py

# Run with a specific model
python bench/bench_accuracy.py --model mlx-community/Mistral-7B-Instruct-v0.3-4bit

# Generate a pre-formatted GitHub issue with your results
python bench/bench_accuracy.py --model mlx-community/Qwen3.5-9B-MLX-4bit --submit
```
Then open a GitHub issue with the output. We'll add your results to the accuracy table above.
Devices we especially need data from:
- M1/M2 MacBook Air (8GB, 16GB)
- M3/M4 MacBook Pro (18GB, 36GB)
- RTX 3060/3090, RTX 4070/4090
- A100, H100
- AMD Radeon RX 7900 / MI300X
- Docker/Kubernetes containers with memory limits
Other Contributions
- Framework adapters: PyTorch Lightning, Axolotl, LitGPT wrappers (HF Transformers and Unsloth ship in v0.2.0; vLLM and SGLang in v0.3.0; RL optimizer in v0.4.0)
- Accuracy data: Real training runs on CUDA or non-Apple hardware — see the table above
- Bug reports: If the estimate was off by >30%, that's a bug — please report it with your config
License
Apache 2.0