Asynchronous Self-Healing KV Cache for Silicon-Native LLMs

ASH-KV: The Self-Healing Middleware for LLMs

ASH-KV is a high-performance, hardware-aware middleware layer designed to provide Runtime Integrity for Large Language Models. By intercepting and correcting the KV cache inside Metal and CUDA kernels during inference, it is designed to prevent logical drift and clinical hallucinations without measurable added latency.


🏛️ Core Value Pillars

⚡ Zero-Latency Integrity

KV cache mutation runs in fused Metal (Apple Silicon) and CUDA (NVIDIA) kernels, so the "Immune System" adds virtually no overhead to inference throughput.

🔌 Hardware Agnostic (Universal HAL)

The Hardware Abstraction Layer (HAL) automatically detects your silicon and hot-swaps between MLX and PyTorch backends. The same code runs on an M4 MacBook or an NVIDIA H100 server.
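The backend detection described above can be pictured with a small sketch. This is not ASH-KV's actual HAL; `module_available` and `select_backend` are hypothetical names, and the rule "prefer MLX on arm64 macOS, otherwise PyTorch, otherwise CPU" is an assumption made for illustration.

```python
import importlib.util
import platform


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def select_backend(is_available=module_available, system=None, machine=None) -> str:
    """Pick a backend name: 'mlx' on Apple Silicon, else 'torch', else 'cpu'.

    Hypothetical helper for illustration; not part of the ASH-KV API.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    # MLX targets Apple Silicon, so prefer it only on arm64 macOS.
    if system == "Darwin" and machine == "arm64" and is_available("mlx"):
        return "mlx"
    if is_available("torch"):
        return "torch"
    return "cpu"
```

Passing `is_available`, `system`, and `machine` explicitly makes the selection logic testable on any machine, independent of which libraries are installed.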

🛡️ Adaptive Shielding & Real-Time Healing

The AdaptiveSensitivity Agent scales sensitivity autonomously. Integrated with a Deterministic Clinical Rules Engine (DCRE), ASH-KV monitors token generation in real time and prunes attention heads as soon as a contraindication is detected.
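As a rough illustration of a deterministic rule check over a token stream, the sketch below flags the first token that completes a forbidden pair. It does not reproduce the DCRE or any head pruning; the rule table, the placeholder terms `drug_a` and `drug_b`, and the function `first_violation` are all hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical rule table: sets of terms that must not co-occur in one answer.
RULES = {frozenset({"drug_a", "drug_b"})}


def first_violation(tokens: Iterable[str], rules=RULES) -> Optional[int]:
    """Return the index of the token that completes a forbidden set, else None."""
    seen = set()
    for i, tok in enumerate(tokens):
        seen.add(tok.lower())
        # A rule fires once every term in it has appeared in the stream.
        if any(rule <= seen for rule in rules):
            return i
    return None
```

Because the check is a pure function of the tokens seen so far, the same input always yields the same decision, which is the property a deterministic rules engine relies on.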

♾️ Infinite Horizon (NVMe Paging)

Break the VRAM ceiling. ASH-KV dynamically offloads "Cold" context chunks to NVMe storage, allowing for 100k+ token windows on consumer-grade hardware without OOM crashes.
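The hot/cold split can be sketched with a small least-recently-used store that spills chunks to disk. This is only an assumption about how such paging might work, not ASH-KV's implementation: `PagedStore`, `max_hot`, and the pickle-file layout are hypothetical, and a real system would page tensors rather than Python lists.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class PagedStore:
    """Keep at most `max_hot` chunks in memory; spill the rest to disk."""

    def __init__(self, max_hot, directory=None):
        self.max_hot = max_hot
        self.dir = directory or tempfile.mkdtemp(prefix="kv_pages_")
        self.hot = OrderedDict()  # chunk_id -> data, least recently used first
        self.cold = {}            # chunk_id -> path of the spilled file

    def _evict(self):
        # Move least recently used chunks to disk until we fit in memory.
        while len(self.hot) > self.max_hot:
            old_id, data = self.hot.popitem(last=False)
            path = os.path.join(self.dir, f"{old_id}.pkl")
            with open(path, "wb") as f:
                pickle.dump(data, f)
            self.cold[old_id] = path

    def put(self, chunk_id, data):
        self.hot[chunk_id] = data
        self.hot.move_to_end(chunk_id)
        self._evict()

    def get(self, chunk_id):
        if chunk_id not in self.hot:
            # Page the cold chunk back in; raises KeyError for unknown ids.
            path = self.cold.pop(chunk_id)
            with open(path, "rb") as f:
                self.hot[chunk_id] = pickle.load(f)
            os.remove(path)
        self.hot.move_to_end(chunk_id)
        data = self.hot[chunk_id]
        self._evict()
        return data
```

Reading a cold chunk promotes it to hot and may push another chunk out, so memory use stays bounded by `max_hot` regardless of total context length.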


🚀 Quick Start

1. Installation

From the root of a source checkout:

pip install .

2. Corporate Integration (3 Lines of Code)

Integrate ASH-KV into any production pipeline to add an immediate safety layer.

from mlx_ash_kv.api import protect

# `model` is a model you have already loaded.
# Wrap it with the ASH-KV shield; protect() also returns the cache, shield, and proxies.
protected_model, cache, shield, proxies = protect(model, sensitivity=0.85)

# Inference continues normally, but with real-time surgical healing

🛠️ Command Center (CLI)

ASH-KV comes with a professional CLI for systems verification and benchmarking.

  • ash-kv install: Verify hardware drivers and the silicon backend, and run the NVMe Paging Stress Test.
  • ash-kv benchmark: Run the 100-case "Hard Truth" evaluation suite.
  • ash-kv monitor: Launch the Live Diagnostic TUI to see layer-wise health and [HOT/WARM] memory distribution.
  • ash-kv demo: Launch the Gradio B2B Reliability Playground.

🔬 Scientific Foundation

ASH-KV implements Asynchronous Self-Healing protocols that offload hallucination detection to secondary silicon, such as the Apple Neural Engine (ANE) or secondary GPU cores, so the main generation loop remains unobstructed.
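The general pattern of keeping detection off the generation path can be sketched with a background worker and a queue. This is a plain-Python illustration under stated assumptions, not the ASH-KV protocol: `run_with_async_checker` and `is_suspicious` are hypothetical, and a CPython thread stands in for secondary silicon.

```python
import queue
import threading


def run_with_async_checker(tokens, is_suspicious):
    """Emit tokens on the main thread while a worker inspects them in the background.

    Returns (emitted_tokens, flagged_indices).
    """
    pending = queue.Queue()  # unbounded, so put() never blocks the main loop
    flagged = []

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: generation finished
                break
            index, token = item
            if is_suspicious(token):
                flagged.append(index)

    checker = threading.Thread(target=worker, daemon=True)
    checker.start()

    emitted = []
    for index, token in enumerate(tokens):
        emitted.append(token)        # generation never waits on the checker
        pending.put((index, token))  # hand the token off for inspection
    pending.put(None)
    checker.join()  # wait for the checker only after generation is done
    return emitted, flagged
```

A single worker consumes the queue in order, so flagged indices come back in generation order; acting on a flag while generation is still running would need an additional signalling path that this sketch leaves out.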


⚠️ DISCLAIMER

ASH-KV is a hardware-level reliability layer designed to assist professionals. It is NOT a substitute for professional medical or legal judgment. All AI-generated outputs, even those "healed" by ASH-KV, must be verified by qualified human professionals before making clinical or legal decisions.


Built for the future of mission-critical Agentic Reasoning.

Source Distribution

mlx_ash_kv-8.2.0.tar.gz (14.5 kB)

Built Distribution

mlx_ash_kv-8.2.0-py3-none-any.whl (16.2 kB)

File details

Details for the file mlx_ash_kv-8.2.0.tar.gz.

File metadata

  • Download URL: mlx_ash_kv-8.2.0.tar.gz
  • Upload date:
  • Size: 14.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for mlx_ash_kv-8.2.0.tar.gz
Algorithm Hash digest
SHA256 dae8b3f492e9a2defa9f9b8ecd07b12bb9ad4386f45ef07e98ade5db24d245c2
MD5 f56b3c245b5fd5a3621a442746fd1f81
BLAKE2b-256 76b4edae68f4f0a200a3bb397cdb90ece9afa106274718faf3b7e3c8d68aba8d

File details

Details for the file mlx_ash_kv-8.2.0-py3-none-any.whl.

File metadata

  • Download URL: mlx_ash_kv-8.2.0-py3-none-any.whl
  • Upload date:
  • Size: 16.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for mlx_ash_kv-8.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0fdfc44fbcec2ba713394099a16d62227ad2839a0c9c9b5be50262fea135bd22
MD5 8bd22930b419b34022e80355ac08955e
BLAKE2b-256 d44fe0dc0ae94459adf782d7451a0c00319516f20244f39a6b944f11fef5979a
