Hierarchical Paged Context Management for Constraint-Preserving LLM Conversations
🧠 HierMem
Hierarchical context management for long-horizon LLM conversations with explicit constraint preservation.
📌 Release Summary (v1.0.1)
The v1.0.1 release marks our transition to a stable, research-grade Python package available on PyPI, while solidifying the architecture ahead of the upcoming research paper. It introduces:
- Full multi-provider support (Ollama, OpenAI, Anthropic, Google Gemini, Groq) via integrated LiteLLM routing.
- Constraint-adherence benchmarking that shows quality and cost gains over a raw-LLM baseline (see Benchmark Snapshot).
- Dynamic Context Pacing, automatically balancing L0-L3 memory access via the underlying curator model.
📖 Table of Contents
- Why HierMem?
- Installation Guide (Crucial)
- Quick Start
- Setup & Configuration
- Images & Dimension Setup
- Benchmark Snapshot
- Research & Paper
⚡ Why HierMem?
Long conversations tend to degrade as the context grows: early rules and constraints get buried or truncated, a failure often loosely described as "catastrophic forgetting". HierMem tackles this at the system architecture layer:
- Constraint-first Protection: Active rules have designated zones and are strictly protected against context overflow.
- Four-Level Memory Hierarchy:
  - L0: Topic Index
  - L1: Summaries
  - L2: Embeddings
  - L3: Raw Turns
- Curator Orchestration: Uses a smaller, efficient underlying model (e.g. Qwen2.5 3B) to select information and dynamically adapt the pipeline, significantly reducing compute costs in the main inference model.
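The constraint-first idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not HierMem's actual implementation: `MemoryStore`, `estimate_tokens`, and `assemble_context` are hypothetical names, the whitespace token count is a stand-in for a real tokenizer, and the selection step that the curator model would perform is reduced to a simple in-order loop.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical container for the four levels; names are illustrative."""
    topic_index: dict = field(default_factory=dict)  # L0: topic -> list of turn ids
    summaries: dict = field(default_factory=dict)    # L1: turn id -> short summary
    embeddings: dict = field(default_factory=dict)   # L2: turn id -> embedding vector
    raw_turns: dict = field(default_factory=dict)    # L3: turn id -> full turn text


def estimate_tokens(text: str) -> int:
    """Crude whitespace count; a real system would use the model's tokenizer."""
    return len(text.split())


def assemble_context(constraints, candidates, budget):
    """Keep every constraint, then add candidates only while the budget allows."""
    context = list(constraints)
    used = sum(estimate_tokens(c) for c in constraints)
    for item in candidates:
        cost = estimate_tokens(item)
        if used + cost > budget:
            continue  # candidate does not fit; constraints are never evicted
        context.append(item)
        used += cost
    return context


store = MemoryStore()
store.topic_index["python"] = [0]
store.summaries[0] = "User asked what Python is."
store.raw_turns[0] = "word " * 50  # stands in for a long raw turn

constraints = ["Always answer in bullet points."]
candidates = [store.summaries[0], store.raw_turns[0]]
context = assemble_context(constraints, candidates, budget=15)
print(context)
```

With a budget of 15 tokens, the constraint and the short summary are kept while the long raw turn is skipped. The point of the design is that overflow pressure only ever removes retrieved material, never the active rules.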
🛠️ Installation Guide
⚠️ CRITICAL: Virtual Environment Required
To avoid dependency clashes with your global Python environment, you MUST create a virtual environment (venv) before installing.
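If you are unsure whether a virtual environment is active, a quick check from Python can confirm it before you install anything. The helper name `in_virtual_env` is hypothetical; the comparison of `sys.prefix` and `sys.base_prefix` is the standard way to detect an environment created with `python -m venv`.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_env())
```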
Standard Installation
# 1. Clone the repository
git clone https://github.com/yashdoke7/llm-hiermem.git
cd llm-hiermem
# 2. Create and activate a virtual environment (Mandatory)
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
# 3. Install the package
pip install hiermem
# ...or install from local source:
pip install -e .
Optional Extras
If you want to use the benchmarking suite, development utilities, or the Streamlit UI demo, install the respective extras (the quotes keep shells such as zsh from expanding the brackets):
pip install -e ".[eval]"
pip install -e ".[dev]"
pip install -e ".[demo]"
🚀 Quick Start
Once installed, usage is straightforward in your custom Python apps.
With Python API:
from core.pipeline import HierMemPipeline
# Automatically initiates the pipeline based on `.env` configuration
pipeline = HierMemPipeline.create()
# Interaction 1
r1 = pipeline.process_turn("Always answer in bullet points. What is Python?")
print(r1.assistant_response)
# Interaction 2
r2 = pipeline.process_turn("Now compare Python and Go for backend systems.")
print(r2.assistant_response)
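To see whether the formatting rule from the first turn survived into the second, one could run a small check on the response text. This is a hedged sketch: `follows_bullet_constraint` is a hypothetical helper, not part of HierMem, and it is deliberately strict (any non-bullet line, such as a heading, counts as a violation).

```python
def follows_bullet_constraint(response: str) -> bool:
    """Return True if every non-empty line starts with a bullet marker."""
    markers = ("- ", "* ", "• ")
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    # An empty response does not satisfy the constraint.
    return bool(lines) and all(line.startswith(markers) for line in lines)


bulleted = "- Python: fast to write\n- Go: fast to run\n"
prose = "Python is fast to write, while Go is fast to run."
print(follows_bullet_constraint(bulleted), follows_bullet_constraint(prose))
```

In the Quick Start above, the same function could be applied to `r2.assistant_response`.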
With the built-in CLI Tool:
hiermem config
hiermem chat
hiermem ask "Give me a 3-point summary of memory hierarchies"
⚙️ Setup & Configuration
Define your configuration in a .env file or through environment variables.
- Copy the example environment file:
cp .env.example .env
API Keys & Environment Variables
Set keys only for the providers you actually intend to use:
# Credentials
GROQ_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here
# Provider Routing
DEFAULT_PROVIDER=ollama
MAIN_PROVIDER=openai
CURATOR_PROVIDER=ollama
SUMMARIZER_PROVIDER=ollama
# Model Assignments
MAIN_LLM_MODEL=gpt-4o-mini
CURATOR_MODEL=ollama/qwen2.5:3b
SUMMARIZER_MODEL=ollama/qwen2.5:3b
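The routing variables above suggest a per-role provider with a shared default. The sketch below shows one way such a lookup could work; it is an assumption for illustration, not HierMem's confirmed resolution logic. `resolve_providers` is a hypothetical name, and the fallback to `DEFAULT_PROVIDER` (and to `ollama` when that is unset) is assumed from the example values.

```python
import os
from typing import Mapping, Optional

ROLES = ("MAIN", "CURATOR", "SUMMARIZER")


def resolve_providers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map each role to its provider, falling back to DEFAULT_PROVIDER."""
    env = os.environ if env is None else env
    # Assumed fallback chain: <ROLE>_PROVIDER, then DEFAULT_PROVIDER, then "ollama".
    default = env.get("DEFAULT_PROVIDER", "ollama")
    return {role.lower(): env.get(f"{role}_PROVIDER", default) for role in ROLES}


example_env = {"DEFAULT_PROVIDER": "ollama", "MAIN_PROVIDER": "openai"}
print(resolve_providers(example_env))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the function easy to test without touching the real environment.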
Deployment Details
HierMem functions as a memory-orchestration runtime library. It stores memory entirely locally (vector DB via ChromaDB) and can query local and cloud inference providers side by side.
Note: CLEAR_VECTOR_ON_START defaults to false so that vector storage persists across runs. When running evaluation benchmarks, set it to true so each run starts with clean context blocks.
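Boolean settings in a .env file arrive as strings, so a value like `false` has to be parsed explicitly; a naive truthiness check would treat any non-empty string as true. A minimal sketch of such parsing follows. `env_flag` and the accepted spellings are assumptions for illustration, and HierMem may parse the flag differently.

```python
from typing import Optional

TRUTHY = {"1", "true", "yes", "on"}


def env_flag(value: Optional[str], default: bool = False) -> bool:
    """Interpret an environment string as a boolean, with a default when unset."""
    if value is None or not value.strip():
        return default
    return value.strip().lower() in TRUTHY


# The string "false" is non-empty, so bool("false") would wrongly be True.
print(env_flag("false"), env_flag("true"), env_flag(None))
```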
🖼️ Images & Dimension Setup
Note for Contributors: When adding layout and diagram images to the repository (especially for academic or PyPI use), it is highly recommended to stick to standard ratios to prevent scaling blur.
- Recommended Ratio: 16:9
- Recommended Dimensions: 1920x1080 (or 1280x720 for lighter files)
- Hosting: Images in this README have been deliberately hyperlinked via absolute raw GitHub URLs instead of relative paths, ensuring complete visibility on PyPI deployment pages.
📊 Benchmark Snapshot (Qwen2.5-14B)
Against a raw-LLM baseline, HierMem scores higher on judged quality and constraint survival at roughly one-third lower cost, as measured by our Gemini 3.1 Pro continuous tracking evaluation across 15 synthetic datasets.
| Metric | HierMem | Raw LLM | Delta |
|---|---|---|---|
| Mean Judge Score (out of 10) | 8.461 | 6.908 | +1.553 |
| Constraint Survival | 0.933 | 0.740 | +0.193 |
| Mean Compute Cost/Turn | $0.0176 | $0.0264 | -33.3% |
| Mean Session Cost | $0.881 | $1.322 | -33.3% |
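The Delta column can be re-derived from the two value columns. The short script below does that arithmetic using only the numbers in the table; the function names are illustrative.

```python
# (HierMem, Raw LLM) pairs copied from the table above.
judge_score = (8.461, 6.908)
constraint_survival = (0.933, 0.740)
cost_per_turn = (0.0176, 0.0264)
session_cost = (0.881, 1.322)


def absolute_delta(hiermem: float, baseline: float) -> float:
    """Difference in the metric's own units (used for scores)."""
    return hiermem - baseline


def percent_change(hiermem: float, baseline: float) -> float:
    """Relative change versus the baseline, in percent (used for costs)."""
    return (hiermem - baseline) / baseline * 100.0


print(f"judge score delta:         {absolute_delta(*judge_score):+.3f}")
print(f"constraint survival delta: {absolute_delta(*constraint_survival):+.3f}")
print(f"cost per turn change:      {percent_change(*cost_per_turn):+.2f}%")
print(f"session cost change:       {percent_change(*session_cost):+.2f}%")

# Session cost divided by per-turn cost gives the implied turns per session.
implied_turns = session_cost[0] / cost_per_turn[0]
print(f"implied turns per session: {implied_turns:.1f}")
```

Computed from the rounded table values, the session-cost change lands slightly above one third, which suggests the published percentages were derived from unrounded costs. The ratio of session cost to per-turn cost implies sessions of roughly 50 turns.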
Evaluation Metrics Breakdown
[Figures: Overall Quality Trend · Overall Pareto Frontier · Cost Breakdown · Latency Trend]
🎓 Research & Paper
In our synthetic benchmarks, HierMem preserves constraints over long conversations more reliably than the raw-LLM baseline (0.933 vs. 0.740 constraint survival). Find the draft manuscript in the repository under docs/paper.tex.
- Release Paper Snapshot: v1.0.0-paper Release Tag
- Dataset Access: HF Datasets: hiermem-constraint-tracking
Citation
@software{doke2026hiermem,
title={HierMem: Constraint-Preserving Hierarchical Context Management for Long-Horizon LLM Conversations},
author={Yash Doke},
year={2026},
url={https://github.com/yashdoke7/llm-hiermem}
}
Maintained by Yash Doke • LinkedIn
File details
Details for the file hiermem-1.0.2.tar.gz.
File metadata
- Download URL: hiermem-1.0.2.tar.gz
- Upload date:
- Size: 49.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1cef327166fe42da4c67b0a24db653ef600613405947aae3c6cc80b9b27ce310 |
| MD5 | 5ad3371e1ba09c59aaafc065ca2dad3f |
| BLAKE2b-256 | 01f266ad827a06fa4e86b6747a702f0fc1265fe7d8eaab02727bfe0fdbcc8d78 |
File details
Details for the file hiermem-1.0.2-py3-none-any.whl.
File metadata
- Download URL: hiermem-1.0.2-py3-none-any.whl
- Upload date:
- Size: 54.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 03b5290e92ed534f7bf62d5714ba352e878719df720530f5c03b98c92471f26f |
| MD5 | 4019ce15a31b1011f8f43748cbe450e4 |
| BLAKE2b-256 | e9a158d67c29e32aaaa689e438dff193d25271aac0f37066f5c567105ec7bd2e |