Hierarchical Paged Context Management for Constraint-Preserving LLM Conversations

🧠 HierMem

Hierarchical context management for long-horizon LLM conversations with explicit constraint preservation.

📌 Release Summary (v1.0.3)

The v1.0.3 release marks the transition to a stable, research-grade Python package on PyPI and settles the architecture ahead of the planned research paper. It introduces:

  • Full multi-provider support (Ollama, OpenAI, Anthropic, Google Gemini, Groq) via integrated LiteLLM routing.
  • Constraint-adherence benchmarking that shows efficiency gains over standard RAG and baseline LLM techniques.
  • Dynamic Context Pacing, automatically balancing L0-L3 memory access via the underlying curator model.
  • Context Pressure Tracking, which charts context usage over the conversation and shows an overall 4.7x compression ratio for conversation context.
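The README does not define "context pressure" or "compression ratio" formally. The sketch below shows one plausible reading, assuming that compression ratio means full-history tokens divided by the tokens actually placed in the prompt, and that context pressure means the share of the model's context window the prompt occupies. Both function names and all numbers are hypothetical illustrations, not HierMem's API or measurements.

```python
def compression_ratio(raw_tokens: int, sent_tokens: int) -> float:
    """Full-history tokens divided by prompt tokens (assumed definition)."""
    if sent_tokens <= 0:
        raise ValueError("sent_tokens must be positive")
    return raw_tokens / sent_tokens


def context_pressure(sent_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window the prompt occupies (assumed definition)."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return sent_tokens / window_tokens


# Hypothetical numbers: a 47,000-token history condensed into a 10,000-token
# prompt for a model with a 32,000-token window.
ratio = compression_ratio(47_000, 10_000)
pressure = context_pressure(10_000, 32_000)
print(f"compression: {ratio:.1f}x, pressure: {pressure:.2%}")
```

Under this reading, a higher compression ratio at the same quality means the main model sees fewer tokens per turn for the same conversation length.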

⚡ Why HierMem?

Long conversations typically degrade as context grows: early rules and key reasoning get buried or truncated, a failure often described as "catastrophic forgetting". HierMem addresses this at the system-architecture layer:

  • Constraint-first Protection: Active rules have designated zones and are strictly protected against context overflow.
  • Four-Level Memory Hierarchy:
    • L0: Topic Index
    • L1: Summaries
    • L2: Embeddings
    • L3: Raw Turns
  • Curator Orchestration: Uses a smaller, efficient underlying model (e.g. Qwen2.5 3B) to select information and dynamically adapt the pipeline, significantly reducing compute costs in the main inference model.
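To make the design above concrete, here is a minimal sketch of constraint-first context assembly over a layered store. It is an illustration under stated assumptions, not HierMem's implementation: `MemoryStore`, `build_context`, and `count_tokens` are hypothetical names, the L2 embedding level is left out to keep the example dependency-free, and a simple topic lookup stands in for the curator model.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude whitespace count; a real system would use the model's tokenizer."""
    return len(text.split())


@dataclass
class MemoryStore:
    constraints: list[str] = field(default_factory=list)             # protected zone
    topic_index: dict[str, list[int]] = field(default_factory=dict)  # L0: topic -> turn ids
    summaries: dict[str, str] = field(default_factory=dict)          # L1: topic -> summary
    raw_turns: list[str] = field(default_factory=list)               # L3: verbatim turns
    # L2 (embeddings) is omitted in this sketch.


def build_context(store: MemoryStore, topic: str, budget: int) -> list[str]:
    """Constraints always go in; other items are added only while they fit."""
    parts = list(store.constraints)  # never dropped, even if over budget
    used = sum(count_tokens(p) for p in parts)

    candidates = []
    if topic in store.summaries:
        candidates.append(store.summaries[topic])
    # Newest raw turns for this topic come first.
    for turn_id in reversed(store.topic_index.get(topic, [])):
        candidates.append(store.raw_turns[turn_id])

    for text in candidates:
        cost = count_tokens(text)
        if used + cost > budget:
            continue  # too large; a smaller item may still fit
        parts.append(text)
        used += cost
    return parts


store = MemoryStore(
    constraints=["Always answer in bullet points."],
    topic_index={"python": [0, 1]},
    summaries={"python": "User asked what Python is."},
    raw_turns=[
        "What is Python?",
        "Python is a general-purpose language with a long history and many features",
    ],
)
context = build_context(store, "python", budget=13)
print(context)
```

The point of the design is the ordering: the constraint is placed before anything else is considered, so shrinking the budget removes summaries and raw turns but never the rule.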

🛠️ Installation Guide

⚠️ CRITICAL: Virtual Environment Required

To avoid dependency clashes with your global Python environment, you MUST create a virtual environment (venv) before installing.

Standard Installation

# 1. Clone the repository
git clone https://github.com/yashdoke7/llm-hiermem.git
cd llm-hiermem

# 2. Create and activate a virtual environment (Mandatory)
python -m venv .venv

# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate

# 3. Install the released package from PyPI
pip install hiermem
# ...or install an editable copy from the cloned source:
pip install -e .

Optional Extras

If you want to use the benchmarking suite, development utilities, or Streamlit UI demo, install the respective extras (the quotes keep shells such as zsh from expanding the brackets):

pip install -e ".[eval]"
pip install -e ".[dev]"
pip install -e ".[demo]"

🚀 Quick Start

Once installed, usage is straightforward in your custom Python apps.

With Python API:

from core.pipeline import HierMemPipeline

# Automatically initiates the pipeline based on `.env` configuration
pipeline = HierMemPipeline.create()

# Interaction 1
r1 = pipeline.process_turn("Always answer in bullet points. What is Python?")
print(r1.assistant_response)

# Interaction 2
r2 = pipeline.process_turn("Now compare Python and Go for backend systems.")
print(r2.assistant_response)
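The first turn above sets a standing rule ("Always answer in bullet points"), so a quick way to spot-check later turns is to test each response against that rule. The helper below is a hypothetical sketch and not part of HierMem; it assumes a response counts as compliant when every non-empty line starts with a bullet or list marker.

```python
import re

# A line counts as a bullet if it starts with -, *, or •, or with "1." / "1)".
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+\S")


def follows_bullet_constraint(response: str) -> bool:
    """True when every non-empty line of the response is a bullet item."""
    lines = [line for line in response.splitlines() if line.strip()]
    return bool(lines) and all(BULLET.match(line) for line in lines)


print(follows_bullet_constraint("- Compiled\n- Statically typed"))
print(follows_bullet_constraint("Go is a compiled language."))
```

In the quick-start flow this could be called as `follows_bullet_constraint(r2.assistant_response)` to confirm that the rule from turn 1 still holds in turn 2.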

With the built-in CLI Tool:

hiermem config
hiermem chat
hiermem ask "Give me a 3-point summary of memory hierarchies"

⚙️ Setup & Configuration

Define your configuration in a .env file or through environment variables. Start by copying the example environment file:

    cp .env.example .env

API Keys & Environment Variables

Set keys only for the providers you actually intend to use:

# Credentials
GROQ_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here

# Provider Routing
DEFAULT_PROVIDER=ollama
MAIN_PROVIDER=openai
CURATOR_PROVIDER=ollama
SUMMARIZER_PROVIDER=ollama

# Model Assignments
MAIN_LLM_MODEL=gpt-4o-mini
CURATOR_MODEL=ollama/qwen2.5:3b
SUMMARIZER_MODEL=ollama/qwen2.5:3b
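The sketch below shows how settings like these could be resolved in Python and checked for missing credentials before a run. It rests on assumptions the README does not confirm: that a role without its own `*_PROVIDER` value falls back to `DEFAULT_PROVIDER`, and that a local Ollama provider needs no API key. `resolve_providers` and `missing_keys` are hypothetical helpers, not HierMem functions.

```python
import os
from typing import Mapping, Optional

ROLES = ("MAIN", "CURATOR", "SUMMARIZER")

# Key names as listed in the credentials block; Ollama is assumed to need none.
KEY_FOR_PROVIDER = {
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def resolve_providers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map each role to its provider, falling back to DEFAULT_PROVIDER (assumed)."""
    env = os.environ if env is None else env
    default = env.get("DEFAULT_PROVIDER", "ollama")
    return {role.lower(): env.get(f"{role}_PROVIDER", default) for role in ROLES}


def missing_keys(env: Optional[Mapping[str, str]] = None) -> list:
    """List the API key variables that the selected providers need but lack."""
    env = os.environ if env is None else env
    needed = {KEY_FOR_PROVIDER[p] for p in resolve_providers(env).values()
              if p in KEY_FOR_PROVIDER}
    return sorted(key for key in needed if not env.get(key))


example = {"DEFAULT_PROVIDER": "ollama", "MAIN_PROVIDER": "openai"}
print(resolve_providers(example))
print(missing_keys(example))
```

With the example mapping, only the main model is routed to OpenAI, so `OPENAI_API_KEY` is the only credential reported as missing.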

Deployment Details

HierMem is a memory-orchestration runtime library. It keeps memory storage entirely local (a vector database via ChromaDB) while sending inference requests to local providers, cloud providers, or both. Note: CLEAR_VECTOR_ON_START defaults to false so that vector storage persists across runs. When running evaluation benchmarks, set it to true so that each run starts from a clean state.


🖼️ Images & Dimension Setup

Note for Contributors: When adding layout and diagram images to the repository (especially for academic or PyPI use), it is highly recommended to stick to standard ratios to prevent scaling blur.

  • Recommended Ratio: 16:9
  • Recommended Dimensions: 1920x1080 (or 1280x720 for lighter files)
  • Hosting: Images in this README are linked through absolute raw GitHub URLs rather than relative paths so that they also render on the PyPI project page.

📊 Benchmark Snapshot (Qwen2.5-14B)

In our Gemini 3.1 Pro continuous-tracking evaluation across 15 synthetic datasets, HierMem outperforms the raw-LLM baseline on every metric below:

| Metric | HierMem | Raw LLM | Delta |
|---|---|---|---|
| Mean Judge Score (out of 10) | 8.461 | 6.908 | +1.553 |
| Constraint Survival | 0.933 | 0.740 | +0.193 |
| Mean Compute Cost/Turn | $0.0176 | $0.0264 | -33.3% |
| Compression Ratio | 4.7x | 3.2x (Truncated) | +46.8% |
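The Delta column mixes absolute differences (the first two rows) with relative changes (the last two). The short check below recomputes each from the rounded values in the table; it assumes the percentages are relative to the Raw LLM column, which the README does not state explicitly.

```python
# (HierMem, Raw LLM) pairs copied from the table above.
absolute_rows = {
    "judge_score": (8.461, 6.908),
    "constraint_survival": (0.933, 0.740),
}
for name, (hiermem, raw) in absolute_rows.items():
    print(f"{name}: {hiermem - raw:+.3f}")

# Relative change versus the Raw LLM baseline, in percent.
cost_change = (0.0176 - 0.0264) / 0.0264 * 100
ratio_gain = (4.7 / 3.2 - 1) * 100
print(f"cost per turn: {cost_change:+.1f}%")
print(f"compression ratio: {ratio_gain:+.3f}%")
```

The first three rows reproduce the table. From the rounded inputs the compression gain comes out at about 46.9%, so the table's +46.8% was presumably computed from unrounded measurements.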

Evaluation Metrics Breakdown

  • Figure: Context Pressure Trend (context pressure and compression ratio)
  • Figure: Overall Quality Trend
  • Figure: Overall Pareto Frontier
  • Figure: Cost Breakdown
  • Figure: Overall Latency Trend


🎓 Research & Paper

On our synthetic benchmarks, HierMem holds up well against memory decay in long contexts. The draft manuscript is in the repository at docs/paper.tex.

Citation

@software{doke2026hiermem,
  title={HierMem: Constraint-Preserving Hierarchical Context Management for Long-Horizon LLM Conversations},
  author={Yash Doke},
  year={2026},
  url={https://github.com/yashdoke7/llm-hiermem}
}

Maintained by Yash Doke
