kestrel-kernels
Precompiled CUDA kernels for Kestrel, a high-performance inference engine for Moondream, the world's most efficient vision-language model.
License: These kernels are provided for use with Kestrel only. Other use is not permitted.
These kernels target NVIDIA Ampere/Ada/Hopper GPUs (SM80/SM86/SM89/SM90) and are distributed as precompiled shared libraries for fast installation without CUDA compilation.
Kernel Library
CUDA Kernels (compiled via CMake)
These kernels are implemented in CUDA C++ and compiled during wheel build.
activation - GELU Residual Activation
Computes the fused gated activation GELU(h) * (g + 1) used in MoE expert layers. The input tensor is split in half along the hidden dimension: h passes through GELU, and g acts as a gate with a +1 bias.
| Tokens | CUDA | PyTorch (eager) | torch.compile | vs PyTorch |
|---|---|---|---|---|
| 1 | 3.8 us | 64 us | 63 us | 17x |
| 64 | 2.9 us | 49 us | 69 us | 17x |
| 740 | 3.5 us | 49 us | 68 us | 14x |
| 1024 | 3.9 us | 49 us | 68 us | 13x |
| 2048 | 5.1 us | 49 us | 68 us | 10x |
PyTorch eager launches separate kernels for slice, erf, multiply, and add, with intermediate tensors hitting global memory. Our kernel fuses everything into a single pass. torch.compile is slower than eager here, likely because the dynamic x[:, :hidden] slicing prevents effective fusion.
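For reference, a minimal PyTorch sketch of the operation the kernel fuses (names are illustrative, not the kernel's actual interface):

```python
import torch
import torch.nn.functional as F

def gelu_residual_reference(x: torch.Tensor) -> torch.Tensor:
    # Split the input in half along the hidden dim: h is activated, g gates.
    hidden = x.shape[-1] // 2
    h, g = x[..., :hidden], x[..., hidden:]
    # The CUDA kernel computes this in a single pass; eager PyTorch launches
    # separate kernels for the slice, erf, multiply, and add.
    return F.gelu(h) * (g + 1)
```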
fused_linear_residual - Linear + Bias + Residual
Fused out = x @ W.T + bias + residual using cuBLASLt epilogues.
| Crops | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| 1 | 729 | 9.0 us | 24 us | 2.7x |
| 2 | 1458 | 12 us | 24 us | 2.0x |
| 4 | 2916 | 16 us | 29 us | 1.8x |
| 8 | 5832 | 46 us | 50 us | 1.1x |
| 13 | 9477 | 44 us | 77 us | 1.7x |
cuBLASLt epilogues fuse bias addition and residual into the matmul, avoiding extra kernel launches and memory traffic.
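The unfused eager equivalent, for reference (illustrative names):

```python
import torch

def linear_bias_residual_reference(x, weight, bias, residual):
    # Three chained ops in eager mode; the cuBLASLt epilogue folds the bias
    # and residual adds into the GEMM itself.
    return x @ weight.T + bias + residual
```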
fused_mlp - Fused MLP with cuBLASLt
Fused out = residual + gelu(x @ W1.T + b1) @ W2.T + b2 using cuBLASLt epilogues.
| Crops | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| 1 | 729 | 43 us | 56 us | 1.3x |
| 2 | 1458 | 72 us | 89 us | 1.2x |
| 4 | 2916 | 97 us | 124 us | 1.3x |
| 8 | 5832 | 214 us | 259 us | 1.2x |
| 13 | 9477 | 283 us | 379 us | 1.3x |
MLP is matmul-dominated so the speedup is modest. The gain comes from fusing GELU and residual add into cuBLASLt epilogues.
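Eager reference of the same computation (illustrative names):

```python
import torch.nn.functional as F

def mlp_reference(x, w1, b1, w2, b2, residual):
    # Two GEMMs; the fused version moves the GELU and the residual add
    # into cuBLASLt epilogues.
    return residual + F.gelu(x @ w1.T + b1) @ w2.T + b2
```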
kv_cache_write - KV Cache Write with FP8 Quantization
Writes BF16 key/value tensors to FP8 paged KV cache with quantization.
| Tokens | Kestrel | vLLM | PyTorch (eager) | vs vLLM | vs PyTorch |
|---|---|---|---|---|---|
| 1 | 3.7 us | 4.9 us | 67 us | 1.3x | 18x |
| 8 | 3.5 us | 4.8 us | 35 us | 1.4x | 10x |
| 64 | 3.7 us | 4.8 us | 35 us | 1.3x | 9x |
| 256 | 4.1 us | 4.8 us | 36 us | 1.2x | 9x |
| 1024 | 8.6 us | 9.7 us | 51 us | 1.1x | 6x |
| 4096 | 31 us | 46 us | 124 us | 1.5x | 4x |
Fused K/V processing and optimized vectorization provide 1.1-1.5x speedup over vLLM's implementation.
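A hedged sketch of the unfused equivalent; slot_mapping, the per-tensor scale, and the cache layout are assumptions, not the kernel's actual interface:

```python
import torch

def kv_cache_write_reference(k, v, k_cache, v_cache, slot_mapping, scale):
    # Quantize BF16 K/V with a scale, then scatter into the FP8 paged cache.
    # The CUDA kernel handles K and V in one fused, vectorized pass.
    k_cache[slot_mapping] = (k.float() / scale).to(torch.float8_e4m3fn)
    v_cache[slot_mapping] = (v.float() / scale).to(torch.float8_e4m3fn)
```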
layernorm_cuda - Fast LayerNorm Forward
Optimized LayerNorm forward pass for common hidden dimensions.
Vision Encoder (N=1152):
| Crops | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| 1 | 729 | 3.9 us | 8.4 us | 2.2x |
| 2 | 1458 | 4.2 us | 8.4 us | 2.0x |
| 4 | 2916 | 5.5 us | 10 us | 1.8x |
| 8 | 5832 | 8.3 us | 18 us | 2.1x |
| 13 | 9477 | 18 us | 28 us | 1.6x |
Text Decoder (N=2048):
| Context | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| decode | 1 | 4.2 us | 8.4 us | 2.0x |
| prefill | 740 | 3.7 us | 8.4 us | 2.3x |
Specialized kernels for N=1152 and N=2048 use 4 rows per block with warp-only reductions, avoiding shared-memory overhead. Two epilogue strategies trade off register pressure against memory bandwidth.
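The operation itself is standard LayerNorm over the hidden dimension; the win comes entirely from the launch configuration. A reference with illustrative names:

```python
import torch.nn.functional as F

# N=1152 (vision encoder) and N=2048 (text decoder) hit the specialized
# 4-rows-per-block, warp-reduction kernels.
y_vision = F.layer_norm(x_vision, (1152,), weight_v, bias_v)
y_text = F.layer_norm(x_text, (2048,), weight_t, bias_t)
```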
moe_sum - MoE Output Summation
Sums the weighted outputs from top-k MoE experts back into a single hidden state per token. Computes out[t] = sum(expert_outputs[t, 0:k]) where each token selects k=8 experts.
| Context | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| decode | 1 | 3.0 us | 5.6 us | 1.9x |
| batch 4 | 4 | 3.0 us | 5.4 us | 1.8x |
| batch 16 | 16 | 2.9 us | 5.3 us | 1.8x |
| prefill | 740 | 5.5 us | 10 us | 1.9x |
| long | 1024 | 10 us | 15 us | 1.5x |
Vectorized 16-byte loads (8 bf16 at once), fully unrolled k=8 reduction. FP32 accumulation provides better numerical stability than bf16 accumulation. Note: vLLM has a similar kernel, but only supports topk=2,3,4 and falls back to PyTorch for topk=8.
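Reference semantics, assuming a [tokens, k, hidden] layout for the expert outputs:

```python
import torch

def moe_sum_reference(expert_outputs: torch.Tensor) -> torch.Tensor:
    # expert_outputs: [tokens, k=8, hidden], already weighted per expert.
    # Accumulate in FP32 as the kernel does, then cast back to bf16.
    return expert_outputs.float().sum(dim=1).to(expert_outputs.dtype)
```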
rotary_embedding - Rotary Position Embedding
Applies rotary position embedding to query and key tensors (n_heads=32, head_dim=64).
| Context | Tokens | Kestrel | vLLM | PyTorch (eager) | vs vLLM | vs PyTorch |
|---|---|---|---|---|---|---|
| decode | 1 | 3.3 us | 4.9 us | 118 us | 1.5x | 36x |
| batch 4 | 4 | 3.1 us | 4.5 us | 117 us | 1.5x | 38x |
| batch 16 | 16 | 3.1 us | 4.7 us | 117 us | 1.5x | 38x |
| prefill | 740 | 5.0 us | 8.0 us | 119 us | 1.6x | 24x |
Vectorized bfloat162 pair processing, shared memory caching of cos/sin values, FP32 math for numerical stability. Split-head kernel for decode increases SM utilization on small batch sizes.
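A reference sketch of the rotation, assuming interleaved even/odd pairing (consistent with the bfloat162 pair processing, but the exact layout is an assumption):

```python
import torch

def rope_reference(x, cos, sin):
    # Assumed shapes: x [tokens, n_heads, 64]; cos/sin [tokens, 32],
    # broadcast over heads.
    x = x.float()  # FP32 math for numerical stability, as in the kernel
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos.unsqueeze(1) - x2 * sin.unsqueeze(1)
    out[..., 1::2] = x1 * sin.unsqueeze(1) + x2 * cos.unsqueeze(1)
    return out.to(torch.bfloat16)
```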
fp8_quant - FP8 Quantization
Converts BF16 tensors to FP8 (e4m3fn) with per-row dynamic scale computation. Used for quantizing MoE activations before FP8 GEMM.
| Context | Rows | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| decode | 8 | 3.1 us | 53 us | 17x |
| batch 4 | 32 | 3.1 us | 52 us | 17x |
| batch 16 | 128 | 3.1 us | 52 us | 17x |
| prefill | 5920 | 6.6 us | 67 us | 10x |
Two kernel variants: warp-per-row for large batches (better SM utilization), block-per-row for small batches. Vectorized 16-byte loads/stores, fused absmax reduction.
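Per-row dynamic scaling, for reference (448 is the e4m3fn maximum representable value; the clamp floor is an assumption):

```python
import torch

def fp8_quant_reference(x: torch.Tensor):
    # One scale per row: absmax / FP8_E4M3_MAX, then quantize.
    amax = x.abs().amax(dim=-1, keepdim=True).float()
    scale = (amax / 448.0).clamp(min=1e-12)
    x_fp8 = (x.float() / scale).to(torch.float8_e4m3fn)
    return x_fp8, scale.squeeze(-1)
```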
tau_tail - TAU Attention Scaling
Applies per-head TAU scaling to Q and V in packed QKV. Computes scale = tanh(tok_linear) + tau_pos_table[position] then scales each head: Q *= scale_q, V *= scale_v.
| Context | Tokens | CUDA | PyTorch (eager) | vs PyTorch |
|---|---|---|---|---|
| decode | 1 | 4.6 us | 45 us | 10x |
| batch 4 | 4 | 4.4 us | 46 us | 10x |
| batch 16 | 16 | 9.0 us | 88 us | 10x |
| prefill | 740 | 6.5 us | 63 us | 10x |
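A heavily hedged sketch of the semantics; all shapes and names below are assumptions, and only the scale formula comes from the description above:

```python
import torch

def tau_tail_reference(qkv, tok_linear, tau_pos_table, positions):
    # Assumed shapes: qkv [tokens, 3, 32, 64] (packed Q/K/V),
    # tok_linear [tokens, 2, 32], tau_pos_table [max_pos, 2, 32].
    scale = torch.tanh(tok_linear.float()) + tau_pos_table[positions].float()
    qkv[:, 0] *= scale[:, 0].unsqueeze(-1).to(qkv.dtype)  # Q *= scale_q
    qkv[:, 2] *= scale[:, 1].unsqueeze(-1).to(qkv.dtype)  # V *= scale_v
    return qkv
```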
CuTe DSL Kernels (precompiled for wheel distribution)
These kernels are written in NVIDIA CuTe DSL (Python) and precompiled to .so files during wheel build. The kernel source templates are excluded from wheel distribution.
Current runtime status:
- Production runtime for these kernels still uses the CuTe-generated AOT shared library path, loaded through the existing `tvm_ffi` wrapper.
- We now have a DLPack-based direct-cubin `topk` path in the source tree that does not use `cutlass`, `libcute_dsl_runtime`, or `tvm_ffi` in the migrated hot path.
- That path builds the kernel on Linux, ships the emitted `cubin` plus manifest, and launches it through `_pybridge` using the DLPack C exchange API for tensor and stream interop.
- On B200 (`sm100`), the preallocated `topk` direct-cubin path is now at parity or better than the current production-style precompiled path:
  - batch 257: 6.77 us direct cubin vs 7.24 us existing precompiled path
  - `topk_fwd`, batch 257: 8.95 us direct cubin vs 9.79 us existing precompiled path
- On the Windows L4 dev host, the same Linux-built `sm89` cubin ran successfully through the rebuilt `_pybridge` path, with correct results and correct non-default stream behavior.
- The long-term runtime direction is now: Linux-only CuTe builders, bundled cubin artifacts, `_pybridge` launchers, and `torch-c-dlpack-ext` as the dependency that guarantees the DLPack C exchange API is available for runtime interop.
Design notes for the ongoing refactor live in docs/CUTE_RUNTIME_REFACTOR_DESIGN.md.
topk - Bitonic Top-K Selection
GPU top-k selection using bitonic sort network with optional fused softmax.
| Context | Tokens | Kestrel | Quack | PyTorch (eager) | vs Quack | vs PyTorch |
|---|---|---|---|---|---|---|
| decode | 1 | 23 us | 29 us | 17 us | 1.3x | 0.8x |
| batch 16 | 16 | 22 us | 27 us | 17 us | 1.2x | 0.8x |
| prefill | 740 | 22 us | 28 us | 17 us | 1.2x | 0.7x |
Note: currently slower than PyTorch at N=64, k=8, because PyTorch's radix-based QuickSelect is more efficient for small N. The algorithm should be revisited.
An experimental direct-cubin runtime also exists for topk in the source tree. It demonstrates that this CuTe kernel can be built on Linux and run through our own native launcher on both Linux and Windows without a runtime dependency on cutlass or tvm_ffi.
Python API:
```python
from kestrel_kernels.topk import topk_fwd

values, indices = topk_fwd(scores, k=8, softmax=True)
```
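For comparison, the PyTorch baseline in the table above (assuming softmax=True normalizes the selected values, as in MoE routing):

```python
import torch

values_ref, indices_ref = torch.topk(scores, k=8, dim=-1)
values_ref = torch.softmax(values_ref, dim=-1)  # the softmax=True path
```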
sampling - Top-p Token Sampling
CuTe DSL rejection-based top-p sampler for probability tensors.
Runtime dispatch uses the CuTe kernel path by default on CUDA, with fallback retained for unsupported cases and runtime errors.
Benchmarks below are H100 (sm90) dispatch-like timings (uniform generation + kernel launch), measured with heavy warmup and interleaved randomized runs:
| Shape (batch, vocab) | Kestrel CuTe | FlashInfer | vs FlashInfer |
|---|---|---|---|
| (1, 51200) | 17.37 us | 20.78 us | 1.20x |
| (4, 51200) | 21.17 us | 21.84 us | 1.03x |
| (128, 51200) | 38.96 us | 42.44 us | 1.09x |
| (32, 1024) | 15.25 us | 20.50 us | 1.34x |
Python API:
```python
from kestrel_kernels.sampling import top_p_sampling_from_probs

sampled_ids = top_p_sampling_from_probs(probs, top_p, generator=generator)
```
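A fuller usage sketch; that probs must be pre-normalized and that a scalar top_p is accepted are assumptions:

```python
import torch
from kestrel_kernels.sampling import top_p_sampling_from_probs

logits = torch.randn(4, 51200, device="cuda")
probs = torch.softmax(logits, dim=-1)  # sampler consumes probabilities
generator = torch.Generator(device="cuda").manual_seed(0)
sampled_ids = top_p_sampling_from_probs(probs, 0.9, generator=generator)
```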
cute_moe - MoE Matrix Multiplications
Grouped GEMM kernels for Mixture-of-Experts layers, written in CuTe DSL for H100 (SM90). Supports BF16 and FP8 (W8A8) precision with both warp-level and WGMMA variants, automatically selected based on batch size.
FP8 W8A8 Full MoE Layer (up + activation + down + sum, E=64, k=8, with CUDA Graphs):
| Context | Tokens | Kestrel | vLLM (Triton) | vs vLLM |
|---|---|---|---|---|
| decode | 1 | 29 us | 51 us | 1.72x |
| batch 4 | 4 | 79 us | 103 us | 1.30x |
| batch 16 | 16 | 146 us | 169 us | 1.16x |
| prefill | 740 | 245 us | 481 us | 1.96x |
Python API:
```python
from kestrel_kernels import (
    invoke_cute_moe_up,
    invoke_cute_moe_down,
    invoke_cute_moe_up_fp8,
    invoke_cute_moe_down_fp8,
)

# BF16 up projection
out_up = invoke_cute_moe_up(
    hidden_states, w1, w2,
    topk_weights, topk_ids,
    sorted_token_ids, expert_ids, num_tokens_post_pad,
)

# BF16 down projection
out_down = invoke_cute_moe_down(
    moe_out, w3,
    topk_weights, topk_ids,
    sorted_token_ids, expert_ids, num_tokens_post_pad,
)
```
moe_align - MoE Token Alignment
Prepares sorted token indices for block-sparse MoE operations. Given topk_ids, outputs sorted token IDs grouped by expert for block-sparse matmul.
| Context | Tokens | Kestrel | vLLM | vs vLLM |
|---|---|---|---|---|
| decode | 1 | 6.7 us | 9.8 us | 1.5x |
| batch 4 | 4 | 6.5 us | 9.8 us | 1.5x |
| batch 16 | 16 | 7.0 us | 10 us | 1.4x |
| prefill | 740 | 12 us | 9.2 us | 0.8x |
| long | 1024 | 12 us | 9.5 us | 0.8x |
Uses an optimized single-CTA shared-memory histogram for decode (numel < 1024); the prefill path still needs optimization.
Python API:
```python
from kestrel_kernels.moe_align import moe_align_block_size

moe_align_block_size(
    topk_ids, num_experts, block_size,
    sorted_token_ids, expert_ids, num_tokens_post_pad,
    expert_map,  # optional, for expert parallelism
)
```
gelu_residual - GELU Residual Activation (CuTe DSL)
CuTe DSL implementation of the GELU residual activation for BF16. Computes the same fused gated activation GELU(h) * (g + 1) used in MoE expert layers. Uses vectorized memory access and streaming stores.
| Context | Rows | CuTe | CUDA | PyTorch | vs CUDA | vs PyTorch |
|---|---|---|---|---|---|---|
| decode | 8 | 2.3 us | 2.5 us | 7.5 us | 1.10x | 3.3x |
| batch 4 | 32 | 2.4 us | 3.0 us | 8.6 us | 1.24x | 3.6x |
| batch 16 | 128 | 2.6 us | 2.9 us | 8.9 us | 1.09x | 3.4x |
| prefill | 5920 | 9.9 us | 11.2 us | 55.9 us | 1.14x | 5.6x |
fp8_quant_cute - FP8 Quantization (CuTe DSL)
CuTe DSL implementation of FP8 row-wise quantization. Converts BF16 tensors to FP8 (e4m3fn) with per-row dynamic scaling.
hidden=1024 (MoE down projection input):
| Context | Rows | CuTe | CUDA | vs CUDA |
|---|---|---|---|---|
| decode | 8 | 2.5 us | 2.7 us | 1.09x |
| batch 4 | 32 | 2.8 us | 3.0 us | 1.07x |
| batch 16 | 128 | 2.8 us | 3.0 us | 1.08x |
| prefill | 5920 | 5.3 us | 6.6 us | 1.23x |
hidden=2048 (MoE up projection input):
| Context | Rows | CuTe | CUDA | vs CUDA |
|---|---|---|---|---|
| decode | 8 | 2.6 us | 2.7 us | 1.02x |
| batch 4 | 32 | 2.9 us | 3.0 us | 1.04x |
| batch 16 | 128 | 2.9 us | 3.0 us | 1.04x |
| prefill | 5920 | 8.2 us | 10.7 us | 1.31x |
flash_attn - Flash Attention (Prefill & Decode)
Flash Attention kernels written in CuTe DSL, with a dedicated decode path optimized for paged FP8 KV cache. 1.3-2.5x faster than FlashInfer on typical Moondream workloads.
- FP8 KV cache with per-tensor scaling
- Paged KV (page_size=1) for fine-grained memory management
- CUDA graph compatible
- Causal and prefix-LM masking, variable-length sequences, GQA/MQA
FP8 KV Paged Decode (with CUDA Graphs):
| Batch | KV Len | Kestrel | FlashInfer | vs FlashInfer |
|---|---|---|---|---|
| 1 | 740 | 9.6 us | 12.9 us | 1.34x |
| 1 | 1024 | 8.7 us | 13.1 us | 1.50x |
| 4 | 740 | 17.1 us | 23.9 us | 1.40x |
| 8 | 512 | 10.0 us | 25.2 us | 2.51x |
| 16 | 256 | 9.6 us | 17.6 us | 1.83x |
| 32 | 128 | 11.8 us | 26.5 us | 2.24x |
FP8 KV Paged Prefill:
| Seq Len | Kestrel | FlashInfer | vs FlashInfer |
|---|---|---|---|
| 740 | 19.9 us | 47.6 us | 2.40x |
| 1024 | 27.3 us | 58.9 us | 2.16x |
Python API:
kestrel-kernels is shipped as an inference-only backend for Moondream/kestrel, so flash_attn has a single forward entry point. Pass fixed-length tensors (with seqlen_q / seqlen_k implicit in the shape), or paged/variable-length tensors via page_table / seqused_k / cu_seqlens_*.
```python
from kestrel_kernels.flash_attn.cute.interface import _flash_attn_fwd

# Fixed-length attention
out, _ = _flash_attn_fwd(q, k, v, causal=True)

# Paged / variable-length (one call handles both; pass whichever kwargs apply)
out, _ = _flash_attn_fwd(
    q, k, v,
    page_table=page_table,
    seqused_k=seqused_k,
    causal=True,
)
```
Autograd wrappers (flash_attn_func / flash_attn_varlen_func) and the backward pass were removed; this package no longer supports training.
Download files
Built Distributions
File details
Details for the file kestrel_kernels-0.3.1-cp313-cp313-win_amd64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp313-cp313-win_amd64.whl
- Upload date:
- Size: 34.9 MB
- Tags: CPython 3.13, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `3fcc8c13364a76123fc396add287967e8376121a53897670d01cb841e64f5ef0` |
| MD5 | `5a0050f2506831e45baa094c875e2084` |
| BLAKE2b-256 | `ff239c72f86b89f0497c401f9a5f374355af6e78b762fb47c55d15ba4e316cd6` |
File details
Details for the file kestrel_kernels-0.3.1-cp313-cp313-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp313-cp313-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl
- Upload date:
- Size: 14.4 MB
- Tags: CPython 3.13, manylinux: glibc 2.34+ ARM64, manylinux: glibc 2.35+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `4fe677976e67f5750d50ca383d18e35f8a1287fe0bf8639a153c860d19f5925a` |
| MD5 | `389168a3b44065de9db069c7452c881e` |
| BLAKE2b-256 | `840ad14ed612d9c1c9e4c9dcd973a72391bd08c2a997c2cb8657520aef2771a9` |
File details
Details for the file kestrel_kernels-0.3.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl
- Upload date:
- Size: 33.4 MB
- Tags: CPython 3.13, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.31+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0d03cf34a637add77ffa8b371ceb69e453fbb2c897b2b8ccf6b0f11d796c15fc` |
| MD5 | `ebc76074d9a8f810b45bcb192ba52a3a` |
| BLAKE2b-256 | `f4f85e834e291baf2bd8f1ad9512a4eb4c7c8247510a32fd3927a045aa366c6d` |
File details
Details for the file kestrel_kernels-0.3.1-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 34.9 MB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `75a9b6b31ed3cfa5ac40e3527a56458019e113465c382757969c304c5ae2b733` |
| MD5 | `186972f7f4e9dadc9402932739053024` |
| BLAKE2b-256 | `b359f2d5a16849a302d3211a455f269233a2badf31db8fb7f93cf80fe3722ae7` |
File details
Details for the file kestrel_kernels-0.3.1-cp312-cp312-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp312-cp312-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl
- Upload date:
- Size: 14.4 MB
- Tags: CPython 3.12, manylinux: glibc 2.34+ ARM64, manylinux: glibc 2.35+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `987607e36e1ddbeb29d579789239e2b733c9017c421cb52a4dcc4a7c78613e14` |
| MD5 | `b5c1bf7e8ea0255fb2ed7b0560d826e4` |
| BLAKE2b-256 | `fbd4c0c78701acba386e544155e7369c1a35a4f773eef8f7e144b01667b07047` |
File details
Details for the file kestrel_kernels-0.3.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl
- Upload date:
- Size: 33.4 MB
- Tags: CPython 3.12, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.31+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a6866e90adb73b70506fd7520f6f4f816403730b32746289d2304b2eff2fbb96` |
| MD5 | `7bd34d88e66722cfc044da7d83c673bf` |
| BLAKE2b-256 | `13c36d4ae7632aa4787364a6e33de6befb83c707dc98f013af149815821aafee` |
File details
Details for the file kestrel_kernels-0.3.1-cp312-cp312-macosx_13_0_arm64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp312-cp312-macosx_13_0_arm64.whl
- Upload date:
- Size: 401.5 kB
- Tags: CPython 3.12, macOS 13.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5d5d85f49c9d95982c060aeb926fd4b2f28efc46b91d525e6d185b376c0370a6` |
| MD5 | `91aeb5b3ac8f3af54bfebf464f1d4af8` |
| BLAKE2b-256 | `dbaff33e5fa3359ef3ae7585d6cac78a32cc92c70dd14a244f588fa991c069e2` |
File details
Details for the file kestrel_kernels-0.3.1-cp311-cp311-win_amd64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp311-cp311-win_amd64.whl
- Upload date:
- Size: 34.9 MB
- Tags: CPython 3.11, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9f1ed1e6699f90574a6f8517c0620a730b1cfee31f3c65e75545f855ae879c2e` |
| MD5 | `d1f0e57c0ce2e3a12c89283398ca2afc` |
| BLAKE2b-256 | `a2e44727184e29cb0ef0ba6b2e35d3ea33651ab11d0d3e9b6fe68cf9e22eb2e0` |
File details
Details for the file kestrel_kernels-0.3.1-cp311-cp311-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp311-cp311-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl
- Upload date:
- Size: 14.4 MB
- Tags: CPython 3.11, manylinux: glibc 2.34+ ARM64, manylinux: glibc 2.35+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0ba56a2c6ea7d667311d7f3e1103ba00a630250b7a7f686edb78e113d5a43602` |
| MD5 | `8329b595929ab66a38deca7883a2ed43` |
| BLAKE2b-256 | `99a838898ef0486092f9cd887c8fd36583491fab3c2fb86d3289ba7577e1cde5` |
File details
Details for the file kestrel_kernels-0.3.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl
- Upload date:
- Size: 33.4 MB
- Tags: CPython 3.11, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.31+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `5048a06a6e00d385503c7ab12c5b181d3dcb18fd0c10b20722d26f838f484cc6` |
| MD5 | `3a9d4c3e731e8ea8d3e7380db8620ff2` |
| BLAKE2b-256 | `07c59933e7ba39b993e0388589d7423c8b461c7aa36de8f659f6257bd710e4a7` |
File details
Details for the file kestrel_kernels-0.3.1-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 34.9 MB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a2bedecbd386c70474da479d8a672535f6e3d260eb88eba86cd2b1ad6e49b6c4` |
| MD5 | `468752022fba0cfc5c08a3330426e0b2` |
| BLAKE2b-256 | `69f20c2d399965edc30252f10c72bd2f540d1d04ab4fb0fe82f64a1488c14475` |
File details
Details for the file kestrel_kernels-0.3.1-cp310-cp310-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp310-cp310-manylinux_2_34_aarch64.manylinux_2_35_aarch64.whl
- Upload date:
- Size: 14.4 MB
- Tags: CPython 3.10, manylinux: glibc 2.34+ ARM64, manylinux: glibc 2.35+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2d857903f91a34bd0e26c31b2b058f27807228ff69ca310035658c0d87477256` |
| MD5 | `81ce03fa2c7a9d0731e348093ca58a9e` |
| BLAKE2b-256 | `03dccdafd48ad9a21e0d244300f56e7344061dd369868ac0eea8b2c170ac9ef6` |
File details
Details for the file kestrel_kernels-0.3.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl.
File metadata
- Download URL: kestrel_kernels-0.3.1-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_31_x86_64.whl
- Upload date:
- Size: 33.4 MB
- Tags: CPython 3.10, manylinux: glibc 2.24+ x86-64, manylinux: glibc 2.31+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `01908fb2834f076af0c35fa4e88a8409979922c3d6ea3fcbcf477ed98b555a50` |
| MD5 | `1102a71fb9d13978a67d21043dc22c26` |
| BLAKE2b-256 | `3802d691bbe4d0433d90aafc6a147c12ae5b925be3a761748188a0a87cc73663` |