fennec-memory
Production-grade memory, intelligent caching, and persistence for LLM and RAG applications
fennec-memory gives your AI application a complete memory stack — from short-term conversation buffers and semantic long-term memory, to a multi-level intelligent cache with semantic deduplication, to a durable persistence layer with encryption, versioning, and multi-tenant support.
What's Inside
The library is organized into three independent but composable subsystems:
| Subsystem | Module | Purpose |
|---|---|---|
| Memory | fennec_memory.memory | Conversation and agent memory types |
| Cache | fennec_memory.cache | Multi-level semantic + key-value cache |
| Persistence | fennec_memory.persistence | Durable storage with security and versioning |
Installation
pip install fennec-memory
Install optional extras based on your stack:
pip install fennec-memory[redis] # Redis storage backend
pip install fennec-memory[faiss] # FAISS vector index for semantic cache
pip install fennec-memory[openai] # OpenAI embeddings
pip install fennec-memory[huggingface] # HuggingFace / sentence-transformers
pip install fennec-memory[ollama] # Ollama local embeddings
pip install fennec-memory[crypto] # Encryption for persistence layer
pip install fennec-memory[all] # Everything
Memory System
Conversation Memory Types
from fennec_memory.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
    ConversationSummaryMemory,
    ConversationEntityMemory,
)
# Simple buffer — keeps full history
mem = ConversationBufferMemory()
mem.add("user", "What is RAG?")
mem.add("assistant", "RAG stands for Retrieval-Augmented Generation.")
print(mem.get_history())
# Sliding window — keeps last N turns
window_mem = ConversationBufferWindowMemory(window_size=5)
# Summary memory — compresses old turns into a summary
summary_mem = ConversationSummaryMemory()
# Entity memory — tracks named entities across the conversation
entity_mem = ConversationEntityMemory()
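To make the sliding-window idea concrete, here is a minimal, library-independent sketch. `SlidingWindowBuffer` is a hypothetical stand-in, not the implementation behind `ConversationBufferWindowMemory`, and it assumes the window counts individual messages rather than user/assistant pairs.

```python
from collections import deque


class SlidingWindowBuffer:
    """Hypothetical stand-in for a window memory; not the library's class."""

    def __init__(self, window_size: int) -> None:
        # A deque with maxlen drops the oldest item once it is full.
        self._turns = deque(maxlen=window_size)

    def add(self, role: str, content: str) -> None:
        self._turns.append({"role": role, "content": content})

    def get_history(self) -> list:
        return list(self._turns)


buf = SlidingWindowBuffer(window_size=2)
buf.add("user", "What is RAG?")
buf.add("assistant", "Retrieval-Augmented Generation.")
buf.add("user", "Give me an example.")
history = buf.get_history()  # only the two most recent messages remain
```

A bounded deque keeps memory use constant no matter how long the conversation runs, which is the main reason to prefer a window over a full buffer.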
Long-Term Memory
from fennec_memory.memory import LongTermMemory, MemoryConfig
config = MemoryConfig(
    max_long_term=10_000,
    importance_threshold=0.6,
    enable_decay=True,
    decay_rate=0.01,
    enable_persistence=True,
    persistence_path="./memory_store",
)
ltm = LongTermMemory(config=config)
ltm.store("The user prefers concise answers.", importance=0.85)
results = ltm.retrieve("user preferences", top_k=5)
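The README does not say how `decay_rate` and `importance_threshold` interact. One common scheme is exponential decay, sketched below under that assumption. The function names and the time unit are hypothetical and are not taken from fennec-memory.

```python
import math


def decayed_importance(importance: float, age: float, decay_rate: float) -> float:
    """Exponentially decay an importance score (assumed scheme, not the library's)."""
    return importance * math.exp(-decay_rate * age)


def should_keep(importance: float, age: float,
                decay_rate: float = 0.01, threshold: float = 0.6) -> bool:
    # A memory survives while its decayed score stays at or above the threshold.
    return decayed_importance(importance, age, decay_rate) >= threshold


fresh = decayed_importance(0.85, age=0, decay_rate=0.01)
old = decayed_importance(0.85, age=100, decay_rate=0.01)
```

Under this model a memory stored with importance 0.85 falls below a 0.6 threshold after roughly 35 time units, so the decay rate effectively sets how long unreinforced memories live.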
Semantic Memory
from fennec_memory.memory import SemanticMemory
sem = SemanticMemory()
sem.add("Paris is the capital of France.")
sem.add("The Eiffel Tower is in Paris.")
results = sem.search("French landmarks", top_k=3)
for r in results:
    print(r.text, r.score)
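Semantic search of this kind usually ranks stored texts by cosine similarity between embeddings. The sketch below shows the ranking step with a toy bag-of-words embedding. It only matches on shared words, so it would not link "French landmarks" to the Eiffel Tower the way a neural embedding model could. The `embed`, `cosine`, and `search` helpers are hypothetical, not part of fennec-memory.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().replace(".", "").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query: str, texts: list, top_k: int = 3) -> list:
    q = embed(query)
    scored = [(text, cosine(q, embed(text))) for text in texts]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


facts = ["Paris is the capital of France.", "The Eiffel Tower is in Paris."]
hits = search("capital of France", facts, top_k=1)
```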
Intelligent Memory Manager
from fennec_memory.memory import AIMemoryManager, MemoryConfig
manager = AIMemoryManager(config=MemoryConfig())
# Automatically routes to short-term, working, or long-term
manager.remember("User said they are a software engineer.", importance=0.7)
# Retrieve relevant memories for a query
context = manager.recall("What do I know about this user?", top_k=5)
User Profile Manager
from fennec_memory.memory import UserProfileManager
profiles = UserProfileManager()
profiles.update("user_123", preferences={"language": "Arabic", "tone": "formal"})
profile = profiles.get("user_123")
print(profile.preferences)
Privacy & Security
from fennec_memory.memory import SensitiveDataMasker, MemoryEncryptor, AccessController, Permission
masker = SensitiveDataMasker()
safe_text = masker.mask("My SSN is 123-45-6789")
# "My SSN is ***-**-****"
encryptor = MemoryEncryptor(key="your-secret-key")
encrypted = encryptor.encrypt("sensitive memory content")
decrypted = encryptor.decrypt(encrypted)
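For readers curious what pattern-based masking involves, the following sketch reproduces the SSN example with a regular expression. It is an illustration only: `mask_ssn` is a hypothetical helper, and the patterns `SensitiveDataMasker` actually detects are not documented here.

```python
import re

# US Social Security numbers in the common 3-2-4 digit layout.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def mask_ssn(text: str) -> str:
    """Replace every SSN-shaped substring with a fixed mask."""
    return SSN_PATTERN.sub("***-**-****", text)


masked = mask_ssn("My SSN is 123-45-6789")
untouched = mask_ssn("Call me at 555-1234")
```

Regex masking catches well-formed identifiers but misses free-form variants such as digits separated by spaces, so it is best treated as one layer of protection rather than a guarantee.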
RAG Context Builder
from fennec_memory.memory import ContextBuilder, Document
builder = ContextBuilder()
docs = [Document(content="...", metadata={"source": "wiki"})]
context = builder.build(query="What is RAG?", memories=mem.get_history(), documents=docs)
print(context.formatted_prompt)
Cache System
Multi-Level Semantic Cache
from fennec_memory.cache import MultiLevelCache, CacheConfig, StorageBackend
config = CacheConfig(
    storage_backend=StorageBackend.SQLITE,  # or REDIS, MEMORY
    semantic_threshold=0.92,  # cosine similarity for cache hit
    ttl_seconds=3600,
)
cache = MultiLevelCache(config=config)
# Store a response
cache.set(query="What is the capital of France?", response="Paris.")
# Retrieve — returns on exact or semantic match
result = cache.get("Capital city of France?")
if result:
    print(result.response)  # "Paris." (semantic hit)
    print(result.similarity)  # 0.96
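The lookup order described above (exact match first, then the nearest stored query if it clears a similarity threshold) can be sketched as follows. `TinySemanticCache` is a hypothetical illustration, not the `MultiLevelCache` implementation. It uses a toy bag-of-words embedding, which is why the threshold here is 0.7 rather than a value like 0.92 tuned for real embedding models.

```python
import math
from collections import Counter
from typing import Optional, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: lowercase word counts with trailing punctuation stripped.
    return Counter(text.lower().strip("?.! ").split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class TinySemanticCache:
    """Hypothetical sketch of exact-then-semantic lookup."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._entries = {}  # stored query -> response

    def set(self, query: str, response: str) -> None:
        self._entries[query] = response

    def get(self, query: str) -> Optional[Tuple[str, float]]:
        if query in self._entries:  # level 1: exact key match
            return self._entries[query], 1.0
        q = embed(query)  # level 2: nearest stored query
        best_score, best_response = 0.0, None
        for stored_query, response in self._entries.items():
            score = cosine(q, embed(stored_query))
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            return best_response, best_score
        return None


cache = TinySemanticCache(threshold=0.7)
cache.set("What is the capital of France?", "Paris.")
exact = cache.get("What is the capital of France?")
hit = cache.get("the capital of France")
miss = cache.get("Who wrote Hamlet?")
```

The linear scan is fine for a sketch. With many entries, a vector index such as FAISS would replace the loop.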
Cache Manager with Embeddings
from fennec_memory.cache import CacheManager, CacheConfig
from fennec_memory.cache import OpenAIEmbeddingProvider, EmbeddingConfig
embedder = OpenAIEmbeddingProvider(
    config=EmbeddingConfig(api_key="sk-..."),
)
cache_manager = CacheManager(config=CacheConfig(), embedder=embedder)
Redis Backend
from fennec_memory.cache import CacheConfig, StorageBackend, RedisConfig
config = CacheConfig(
    storage_backend=StorageBackend.REDIS,
    redis=RedisConfig(host="localhost", port=6379, db=0),
)
Multi-Tenant Cache
from fennec_memory.cache import TenantManager, TenantConfig
tenants = TenantManager()
tenants.register("tenant_a", TenantConfig(max_entries=5000, ttl_seconds=1800))
tenants.register("tenant_b", TenantConfig(max_entries=1000, ttl_seconds=600))
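As a rough picture of what per-tenant limits imply, the sketch below gives each tenant an isolated store with its own entry cap and TTL. `TenantCache` is hypothetical, and the eviction policy (drop the oldest-written entry) is an assumption; the README does not state which policy `TenantManager` applies.

```python
import time
from collections import OrderedDict


class TenantCache:
    """Hypothetical per-tenant store with an entry cap and a TTL."""

    def __init__(self, max_entries: int, ttl_seconds: float, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value) -> None:
        self._data.pop(key, None)  # rewriting a key moves it to the end
        self._data[key] = (self._clock() + self.ttl_seconds, value)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the oldest-written entry

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:  # expired: drop lazily on read
            del self._data[key]
            return None
        return value


# One isolated cache per tenant, so keys never leak across tenants.
caches = {
    "tenant_a": TenantCache(max_entries=5000, ttl_seconds=1800),
    "tenant_b": TenantCache(max_entries=1000, ttl_seconds=600),
}
caches["tenant_a"].set("greeting", "hello")
leaked = caches["tenant_b"].get("greeting")
```

The injectable `clock` parameter exists so that expiry can be tested without sleeping.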
Embedding Providers
from fennec_memory.cache import (
    OpenAIEmbeddingProvider,
    HuggingFaceEmbeddingProvider,
    OllamaEmbeddingProvider,
)
# OpenAI
provider = OpenAIEmbeddingProvider(api_key="sk-...")
# HuggingFace (local)
provider = HuggingFaceEmbeddingProvider(model="sentence-transformers/all-MiniLM-L6-v2")
# Ollama (local)
provider = OllamaEmbeddingProvider(model="nomic-embed-text")
Persistence System
Persistence Manager
import asyncio
from fennec_memory.persistence import PersistenceManager, PersistenceManagerConfig, StorageType
config = PersistenceManagerConfig(
    default_storage=StorageType.KEY_VALUE,
    enable_versioning=True,
    enable_encryption=True,
)
async def main():
    pm = PersistenceManager(config=config)
    # Store data
    await pm.save(key="session:abc123", data={"messages": [...]})
    # Retrieve
    result = await pm.load(key="session:abc123")
    print(result.data)
# save() and load() are awaited, so they must run inside an event loop
asyncio.run(main())
Storage Backends
import asyncio
from fennec_memory.persistence import KeyValueStorage, VectorStorage, DatabaseStorage, ObjectStorage
async def main():
    # Key-value store
    kv = KeyValueStorage()
    await kv.set("key", {"value": 42})
    data = await kv.get("key")
    # Vector store
    vs = VectorStorage()
    await vs.upsert(id="doc1", vector=[0.1, 0.2, ...], metadata={"source": "wiki"})
    results = await vs.search(query_vector=[0.1, 0.2, ...], top_k=5)
asyncio.run(main())
Versioning & Snapshots
from fennec_memory.persistence import VersionManager, BackupManager
version_mgr = VersionManager()
version_mgr.save_version(key="config:v1", data={...})
history = version_mgr.get_history("config:v1")
backup_mgr = BackupManager()
snapshot = backup_mgr.create_snapshot("session:abc123")
backup_mgr.restore(snapshot)
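Conceptually, versioning keeps an append-only list of copies per key. The sketch below shows that idea in isolation. `TinyVersionStore` and its 1-based version numbers are hypothetical and do not describe how `VersionManager` stores data internally.

```python
import copy


class TinyVersionStore:
    """Hypothetical append-only version history, kept in memory."""

    def __init__(self) -> None:
        self._history = {}  # key -> list of snapshots, oldest first

    def save_version(self, key: str, data) -> int:
        versions = self._history.setdefault(key, [])
        # Deep-copy so later mutation of `data` cannot rewrite history.
        versions.append(copy.deepcopy(data))
        return len(versions)  # 1-based version number

    def get_history(self, key: str) -> list:
        return copy.deepcopy(self._history.get(key, []))

    def restore(self, key: str, version: int):
        return copy.deepcopy(self._history[key][version - 1])


store = TinyVersionStore()
settings = {"temperature": 0.2}
v1 = store.save_version("config", settings)
settings["temperature"] = 0.9  # mutate after saving
v2 = store.save_version("config", settings)
original = store.restore("config", v1)
```

Copying on both write and read is what makes old versions trustworthy: callers can never mutate a stored snapshot by accident.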
Security & Encryption
from fennec_memory.persistence import EncryptionEngine, AccessControlManager, DataSanitizer, Permission
enc = EncryptionEngine(secret_key="your-32-byte-key")
encrypted = enc.encrypt(b"sensitive payload")
plain = enc.decrypt(encrypted)
acl = AccessControlManager()
acl.grant(user="alice", resource="session:*", permission=Permission.READ)
acl.check(user="bob", resource="session:xyz", permission=Permission.WRITE)
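The `session:*` resource above suggests wildcard matching on resource names. A minimal sketch of such a check, using the standard library's `fnmatch`, follows. `TinyACL` and this `Permission` enum are hypothetical, and whether the real `acl.check` returns a boolean or raises on denial is not stated in this README.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Permission(Enum):
    READ = "read"
    WRITE = "write"


class TinyACL:
    """Hypothetical grant list with glob-style resource patterns."""

    def __init__(self) -> None:
        self._grants = []  # (user, resource_pattern, permission)

    def grant(self, user: str, resource: str, permission: Permission) -> None:
        self._grants.append((user, resource, permission))

    def check(self, user: str, resource: str, permission: Permission) -> bool:
        # Deny by default: access requires at least one matching grant.
        return any(
            granted_user == user
            and granted_permission == permission
            and fnmatchcase(resource, pattern)
            for granted_user, pattern, granted_permission in self._grants
        )


acl = TinyACL()
acl.grant(user="alice", resource="session:*", permission=Permission.READ)
alice_can_read = acl.check("alice", "session:xyz", Permission.READ)
alice_can_write = acl.check("alice", "session:xyz", Permission.WRITE)
bob_can_write = acl.check("bob", "session:xyz", Permission.WRITE)
```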
Storage Routing
from fennec_memory.persistence import StorageRouter, RoutingRule, DataTier, StorageType
router = StorageRouter()
router.add_rule(RoutingRule(tier=DataTier.HOT, storage_type=StorageType.KEY_VALUE))
router.add_rule(RoutingRule(tier=DataTier.COLD, storage_type=StorageType.OBJECT))
Integration with fennec-community
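fennec-memory is designed to work alongside fennec-community and fennec-guard. The example below checks the semantic cache, recalls relevant memories, and only then runs a fennec-community RAG query: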
from fennec_memory.memory import AIMemoryManager, ContextBuilder
from fennec_memory.cache import MultiLevelCache
from fennec_community.rag.core import RAGSystem
memory = AIMemoryManager()
cache = MultiLevelCache()
rag = RAGSystem(...)
def chat(user_id: str, query: str) -> str:
    # Check semantic cache first
    cached = cache.get(query)
    if cached:
        return cached.response
    # Retrieve relevant memories
    past_context = memory.recall(query, top_k=5)
    # Run RAG
    answer = rag.query(query, extra_context=past_context)
    # Store in memory and cache
    memory.remember(f"Q: {query} A: {answer}", importance=0.6)
    cache.set(query=query, response=answer)
    return answer
Requirements
- Python >= 3.9
- pydantic >= 2.0
- numpy >= 1.24
All other dependencies are optional — install only what you need.
License
MIT License — see LICENSE for details.
Contributing
Contributions are welcome! Please open an issue or pull request on GitHub.