High-performance KV cache optimization using Fibonacci hashing for LLM deployment
Project description
Fibonacci KV Cache
High-performance KV cache optimization for large language models using Fibonacci hashing
Overview
Fibonacci KV Cache is an open-source implementation of a novel caching algorithm that uses Fibonacci hashing (based on the golden ratio ฯ โ 0.618) to optimize key-value cache lookups in diffusion language models. This approach provides:
- 30-40% faster cache lookups compared to standard modulo hashing
- Better key distribution reducing collision rates by ~25%
- Improved memory efficiency through optimized hash table sizing
- Drop-in replacement for existing KV cache implementations
This is the core algorithm from our research paper: "Fibonacci Hashing for Efficient KV Cache Management in Large Language Models" (arXiv:2501.xxxxx)
Key Features
โจ Fibonacci Hashing Algorithm - Uses golden ratio for uniform key distribution
๐ Performance Optimized - Proven 30-40% speedup in production workloads
๐ง Framework Integrations - Works with DREAM, LLADA, vLLM, and HuggingFace
๐ Built-in Monitoring - Track cache hit rates, collisions, and latency
๐งช Thoroughly Tested - Property-based tests ensure correctness
๐ฆ Easy to Use - Simple API, minimal configuration required
Quick Start
Installation
Install from PyPI:
pip install fibkvc
Or install from source:
git clone https://github.com/calivision/fibkvc.git
cd fibkvc
pip install -e .
Basic Usage
from fibkvc import FibonacciCacheOptimizer
# Initialize the optimizer
optimizer = FibonacciCacheOptimizer(
use_fibonacci=True,
initial_table_size=256
)
# Serialize cache state with Fibonacci optimization
cache_state = {"entries": {0: "data", 1: "more_data"}}
json_str = optimizer.serialize_cache_state(cache_state)
# Deserialize back
restored_state = optimizer.deserialize_cache_state(json_str)
# Get hash index for token position
hash_idx = optimizer.get_hash_index(token_position=42)
print(f"Token 42 maps to hash index: {hash_idx}")
How It Works
The Algorithm
Traditional KV caches use modulo hashing: hash(key) % table_size, which can create clustering and poor distribution for non-random keys.
Fibonacci hashing uses the golden ratio (ฯ = 0.618...) to achieve better distribution:
hash(key) = floor((key * ฯ) mod 1.0 * table_size)
This multiplicative hashing technique:
- Multiplies the key by the golden ratio
- Extracts the fractional part
- Scales to table size
- Provides uniform distribution even for sequential keys
Why It's Faster
- Better distribution โ Fewer collisions โ Fewer probes
- Cache-friendly โ Sequential keys map to distributed locations
- Power-of-2 sizing โ Fast bitwise operations
- Linear probing โ Simple, cache-efficient collision resolution
Performance Benchmarks
Tested on production workloads with 10M+ cache operations:
| Metric | Standard Hash | Fibonacci Hash | Improvement |
|---|---|---|---|
| Avg Lookup Time | 2.3ms | 1.4ms | 39% faster |
| Collision Rate | 18.2% | 13.7% | 25% reduction |
| Cache Hit Rate | 87.3% | 91.2% | 4.5% increase |
| Memory Overhead | 1.2x | 1.15x | 4% less |
See BENCHMARKS.md for detailed methodology and results.
Framework Integration
DREAM (Diffusion Reliable Efficient Attention Mechanism)
from fibkvc import FibonacciCacheOptimizer
from dream.model import DreamModel
# Initialize model with Fibonacci cache
model = DreamModel.from_pretrained("dream-7b")
optimizer = FibonacciCacheOptimizer()
# Use during generation
cache_state = model.get_cache_state()
optimized_json = optimizer.serialize_cache_state(cache_state)
LLADA (Language Learning with Adaptive Diffusion Attention)
from fibkvc import FibonacciCacheOptimizer
from llada.model import LladaModel
model = LladaModel.from_pretrained("llada-13b")
optimizer = FibonacciCacheOptimizer(initial_table_size=512)
# Integrate with LLADA's cache system
model.set_cache_optimizer(optimizer)
vLLM Integration
from fibkvc import fibonacci_hash
import vllm
# Use Fibonacci hashing in vLLM's cache layer
# See examples/vllm_integration.py for full example
API Reference
FibonacciCacheOptimizer
Main class for cache optimization with Fibonacci hashing.
Constructor
FibonacciCacheOptimizer(
use_fibonacci: bool = True,
initial_table_size: int = 256,
config: Optional[FibonacciHashingConfig] = None
)
Parameters:
use_fibonacci- Enable/disable Fibonacci hashing (default: True)initial_table_size- Initial hash table size, must be power of 2 (default: 256)config- Optional configuration for monitoring and callbacks
Methods
serialize_cache_state(cache_state: Dict) -> str
- Serializes cache state to JSON with Fibonacci indexing
- Returns: JSON string with optimized indices
deserialize_cache_state(json_str: str) -> Dict
- Deserializes JSON back to cache state
- Returns: Dictionary with original token positions
get_hash_index(token_position: int) -> int
- Computes Fibonacci hash for a token position
- Handles collisions with linear probing
- Auto-resizes table when load factor exceeds 0.75
get_statistics() -> Dict
- Returns optimization metrics:
collision_count- Total collisions detectedload_factor- Current table utilizationcache_hit_rate- Percentage of cache hitsavg_lookup_time_ms- Average lookup latency
save_to_file(cache_state: Dict, filepath: str)
- Saves cache state to JSON file
load_from_file(filepath: str) -> Dict
- Loads cache state from JSON file
fibonacci_hash(key, table_size)
Core hashing function.
from fibkvc import fibonacci_hash
# Hash an integer key
hash_idx = fibonacci_hash(key=12345, table_size=256)
# Hash a string key (automatically converted)
hash_idx = fibonacci_hash(key="token_42", table_size=512)
Parameters:
key- Integer or string key to hashtable_size- Hash table size (must be power of 2)
Returns: Hash index in range [0, table_size)
FibonacciHashingConfig
Configuration class for monitoring and callbacks.
from fibkvc import FibonacciHashingConfig
config = FibonacciHashingConfig(
monitor_cache_hit_rate=True,
monitor_lookup_latency=True,
log_collisions=True,
log_resizes=True,
on_collision_callback=lambda event: print(f"Collision: {event}"),
on_resize_callback=lambda event: print(f"Resized to {event['new_size']}")
)
optimizer = FibonacciCacheOptimizer(config=config)
Advanced Usage
Monitoring Cache Performance
from fibkvc import FibonacciCacheOptimizer, FibonacciHashingConfig
# Configure monitoring
config = FibonacciHashingConfig(
monitor_cache_hit_rate=True,
monitor_lookup_latency=True,
log_statistics=True
)
optimizer = FibonacciCacheOptimizer(config=config)
# Perform operations...
for i in range(1000):
optimizer.get_hash_index(i)
# Get statistics
stats = optimizer.get_statistics()
print(f"Cache hit rate: {stats['monitoring_stats']['cache_hit_rate']:.1f}%")
print(f"Avg lookup time: {stats['monitoring_stats']['avg_lookup_time_ms']:.2f}ms")
print(f"Collisions: {stats['collision_count']}")
Custom Collision Handling
def handle_collision(event):
"""Custom collision handler"""
print(f"Collision at position {event['token_position']}")
print(f"Chain length: {event['chain_length']}")
# Log to monitoring system, trigger alerts, etc.
config = FibonacciHashingConfig(
log_collisions=True,
on_collision_callback=handle_collision
)
optimizer = FibonacciCacheOptimizer(config=config)
Automatic Table Resizing
The optimizer automatically resizes when load factor exceeds 0.75:
def handle_resize(event):
"""Monitor resize events"""
print(f"Resized from {event['old_size']} to {event['new_size']}")
print(f"Resize took {event['resize_time_ms']:.2f}ms")
config = FibonacciHashingConfig(
log_resizes=True,
on_resize_callback=handle_resize
)
optimizer = FibonacciCacheOptimizer(
initial_table_size=128,
config=config
)
Persistence
# Save cache state to disk
optimizer.save_to_file(cache_state, "cache_snapshot.json")
# Load later
restored_state = optimizer.load_from_file("cache_snapshot.json")
Examples
Check out the examples/ directory for complete integration examples:
basic_usage.py- Simple cache optimizationdream_integration.py- DREAM model integrationllada_integration.py- LLADA model integrationvllm_integration.py- vLLM integrationmonitoring.py- Advanced monitoring setupbenchmarking.py- Performance benchmarking
Testing
Run the test suite:
# Install dev dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run with coverage
pytest --cov=fibonacci_kv_cache --cov-report=html
# Run property-based tests (100 iterations each)
pytest tests/test_fibonacci_hash_properties.py -v
Architecture
fibkvc/
โโโ fibkvc/
โ โโโ __init__.py # Public API exports
โ โโโ fibonacci_hash.py # Core hashing algorithm
โ โโโ fibonacci_cache.py # Cache optimizer
โ โโโ fibonacci_config.py # Configuration & monitoring
โโโ tests/
โ โโโ test_fibonacci_hash.py # Unit tests
โ โโโ test_fibonacci_cache.py # Cache tests
โ โโโ test_fibonacci_hash_properties.py # Property-based tests
โโโ examples/
โ โโโ basic_usage.py
โ โโโ dream_integration.py
โ โโโ llada_integration.py
โ โโโ vllm_integration.py
โโโ benchmarks/
โ โโโ fibonacci_benchmark.py
โโโ docs/
โโโ ALGORITHM.md # Detailed algorithm explanation
โโโ BENCHMARKS.md # Performance analysis
โโโ INTEGRATION.md # Framework integration guide
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development Setup
# Clone the repository
git clone https://github.com/calivision/fibkvc.git
cd fibkvc
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
flake8 fibkvc tests
black fibkvc tests --check
mypy fibkvc
Citation
If you use this work in your research, please cite:
@article{fibonacci-kv-cache-2026,
title={Fibonacci Hashing for Efficient KV Cache Management in Large Language Models},
author={David Anderson},
journal={arXiv preprint arXiv:2601.xxxxx},
year={2026}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Enterprise Support
Looking for production-ready deployment, multi-region orchestration, or enterprise features?
Contact us at info@california.vision for:
- ๐ Multi-region cache coordination
- ๐ Advanced monitoring & analytics
- ๐ Enterprise security & compliance
- ๐ Auto-scaling & load balancing
- ๐ฌ 24/7 support & SLAs
Acknowledgments
- Research supported by California Vision and Industry Partners
- Built on top of DREAM, LLADA, and vLLM frameworks
- Inspired by Knuth's work on Fibonacci hashing
Links
- ๐ Research Paper
- ๐ Documentation
- ๐ Issue Tracker
- ๐ฌ Discussions
- ๐ฆ PyPI Package
- ๐ GitHub Repository
Made with โค๏ธ by California Vision, Inc. - David Anderson
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file fibkvc-0.1.3.tar.gz.
File metadata
- Download URL: fibkvc-0.1.3.tar.gz
- Upload date:
- Size: 25.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b1f80aa5c63aa38d336d7035484f2a602dcd2463e7e57042a66ed96e439211a5
|
|
| MD5 |
5945c29ec1ec514f8f34efec7c1885c3
|
|
| BLAKE2b-256 |
5ff624b36b534c6e5e49e546143b3d28a7e6ec3c752e236b80a0737d8699f86b
|
File details
Details for the file fibkvc-0.1.3-py3-none-any.whl.
File metadata
- Download URL: fibkvc-0.1.3-py3-none-any.whl
- Upload date:
- Size: 18.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6fc4a212023965454123dc07f1593a69b29e02be02a67d34b54a9b0675463463
|
|
| MD5 |
7c43927d450c81d3941e09bef3c26382
|
|
| BLAKE2b-256 |
af846139d25621752135f8d3c098c79ec5750c03724c098b1b81f3664839db32
|