
Hierarchical Paged Context Management for Constraint-Preserving LLM Conversations


🧠 HierMem

Hierarchical context management for long-horizon LLM conversations with explicit constraint preservation.

License: MIT

📌 Release Summary (v1.0.3)

The v1.0.3 release marks our transition to a stable, research-grade Python package on PyPI and settles the architecture ahead of the upcoming research paper. It introduces:

  • Full multi-provider support (Ollama, OpenAI, Anthropic, Google Gemini, Groq) via integrated LiteLLM routing.
  • Constraint-adherence benchmarking that shows efficiency gains over standard RAG and baseline LLM techniques.
  • Dynamic Context Pacing, which automatically balances L0-L3 memory access via the underlying curator model.
  • Context Pressure Tracking, which reports an overall 4.7x compression ratio for conversation context.


⚡ Why HierMem?

Long conversations typically degrade as critical rules and logic get buried under later turns or truncated out of the context window, a failure often described as "catastrophic forgetting". HierMem tackles this at the system architecture layer:

  • Constraint-first Protection: Active rules have designated zones and are strictly protected against context overflow.
  • Four-Level Memory Hierarchy:
    • L0: Topic Index
    • L1: Summaries
    • L2: Embeddings
    • L3: Raw Turns
  • Curator Orchestration: Uses a smaller, efficient underlying model (e.g. Qwen2.5 3B) to select information and dynamically adapt the pipeline, significantly reducing compute costs in the main inference model.
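As a rough illustration of how constraint-first protection and the memory levels could interact when a prompt is assembled under a token budget, here is a minimal sketch. It is not HierMem's implementation: `MemoryStore`, `assemble_context`, the whitespace-based token count, and the fill order (constraints first, then L0, L1, and L3 newest-first) are all assumptions made for the example. L2 embeddings are left out because they would drive retrieval rather than appear in the prompt as text.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())


@dataclass
class MemoryStore:
    # Hypothetical container; field names mirror the levels listed above.
    constraints: list[str] = field(default_factory=list)  # protected zone
    topic_index: list[str] = field(default_factory=list)  # L0
    summaries: list[str] = field(default_factory=list)    # L1
    raw_turns: list[str] = field(default_factory=list)    # L3, oldest first


def assemble_context(store: MemoryStore, budget: int) -> list[str]:
    """Build a prompt context that never drops an active constraint."""
    context = list(store.constraints)
    used = sum(count_tokens(c) for c in context)
    if used > budget:
        raise ValueError("budget too small to hold the protected constraints")

    # Fill the remaining budget from coarse to fine; raw turns newest-first.
    candidates = store.topic_index + store.summaries + store.raw_turns[::-1]
    for item in candidates:
        cost = count_tokens(item)
        if used + cost <= budget:
            context.append(item)
            used += cost
    return context


store = MemoryStore(
    constraints=["Always answer in bullet points."],
    topic_index=["topics: python, go"],
    summaries=["User asked what Python is."],
    raw_turns=[
        "user: What is Python?",
        "user: Now compare Python and Go for backend systems.",
    ],
)
context = assemble_context(store, budget=16)
```

With a budget of 16 words, the constraint, the topic index entry, and the summary fit, while both raw turns are skipped. The constraint is placed before anything else is considered, so it cannot be crowded out by later material.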

🛠️ Installation Guide

⚠️ CRITICAL: Virtual Environment Required

To avoid dependency clashes with your global Python environment, you MUST create a virtual environment (venv) before installing.

Standard Installation

# 1. Clone the repository
git clone https://github.com/yashdoke7/llm-hiermem.git
cd llm-hiermem

# 2. Create and activate a virtual environment (Mandatory)
python -m venv .venv

# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate

# 3. Install the package from PyPI
pip install hiermem
# ...or, instead, install in editable mode from the cloned source:
pip install -e .

Optional Extras

If you want to use the benchmarking suite, development utilities, or Streamlit UI demo, install the respective extras. The quotes keep shells such as zsh from interpreting the square brackets:

pip install -e ".[eval]"
pip install -e ".[dev]"
pip install -e ".[demo]"

🚀 Quick Start

Once installed, usage is straightforward in your custom Python apps.

With Python API:

from core.pipeline import HierMemPipeline

# Automatically initiates the pipeline based on `.env` configuration
pipeline = HierMemPipeline.create()

# Interaction 1
r1 = pipeline.process_turn("Always answer in bullet points. What is Python?")
print(r1.assistant_response)

# Interaction 2
r2 = pipeline.process_turn("Now compare Python and Go for backend systems.")
print(r2.assistant_response)
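Because the first turn sets a standing rule ("Always answer in bullet points"), a quick way to see whether it survives later turns is to check the format of each response. The helper below is a hypothetical check, not part of HierMem. It works on plain strings, so it could be applied to `r1.assistant_response` and `r2.assistant_response` from the snippet above; the 0.8 threshold is an arbitrary assumption.

```python
import re

# Matches lines that start with a common bullet marker: -, *, •, "1." or "1)".
_BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+\S")


def follows_bullet_rule(response: str, min_ratio: float = 0.8) -> bool:
    """Return True if most non-empty lines of the response are bullet items."""
    lines = [ln for ln in response.splitlines() if ln.strip()]
    if not lines:
        return False
    bulleted = sum(1 for ln in lines if _BULLET.match(ln))
    return bulleted / len(lines) >= min_ratio


good = "- Python is interpreted\n- Go is compiled\n- Both suit backends"
bad = "Python is an interpreted language, while Go is compiled."
print(follows_bullet_rule(good), follows_bullet_rule(bad))  # True False
```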

With the built-in CLI Tool:

hiermem config
hiermem chat
hiermem ask "Give me a 3-point summary of memory hierarchies"

⚙️ Setup & Configuration

Configure HierMem through a .env file or environment variables.

  1. Copy the example environment file:
    cp .env.example .env

API Keys & Environment Variables

Only set keys for the providers you actually intend to use:

# Credentials
GROQ_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
GOOGLE_API_KEY=your_key_here

# Provider Routing
DEFAULT_PROVIDER=ollama
MAIN_PROVIDER=openai
CURATOR_PROVIDER=ollama
SUMMARIZER_PROVIDER=ollama

# Model Assignments
MAIN_LLM_MODEL=gpt-4o-mini
CURATOR_MODEL=ollama/qwen2.5:3b
SUMMARIZER_MODEL=ollama/qwen2.5:3b
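A small pre-flight check can catch a role that is routed to a provider whose key is not set. The sketch below rests on several assumptions: the mapping from provider name to key variable is inferred from the variable names above, Ollama is assumed to need no key, `google` is assumed to be the provider name used for Gemini, and an unset role is assumed to fall back to `DEFAULT_PROVIDER`. `missing_keys` is a hypothetical helper, not a HierMem function.

```python
import os

# Assumed mapping from provider name to its credential variable.
PROVIDER_KEYS = {
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}
ROLE_VARS = ("MAIN_PROVIDER", "CURATOR_PROVIDER", "SUMMARIZER_PROVIDER")


def missing_keys(env: dict[str, str]) -> list[str]:
    """List credential variables the configured providers need but lack."""
    default = env.get("DEFAULT_PROVIDER", "ollama")
    providers = {env.get(role, default).lower() for role in ROLE_VARS}
    needed = {PROVIDER_KEYS[p] for p in providers if p in PROVIDER_KEYS}
    return sorted(key for key in needed if not env.get(key))


example = {
    "DEFAULT_PROVIDER": "ollama",
    "MAIN_PROVIDER": "openai",
    "CURATOR_PROVIDER": "ollama",
}
print(missing_keys(example))           # ['OPENAI_API_KEY']
print(missing_keys(dict(os.environ)))  # same check on the real environment
```

Only variable names are printed, never key values, so the output is safe to paste into an issue report.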

Deployment Details

HierMem is a memory-orchestration runtime library. It stores memory entirely locally (the vector database is ChromaDB) and can query local and cloud inference providers side by side, for example a cloud main model together with a local curator.

Note: CLEAR_VECTOR_ON_START defaults to false so that vector storage persists between runs. When running evaluation benchmarks, set it to true so that each run starts from a clean store.
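Environment variables arrive as strings, so a flag such as `CLEAR_VECTOR_ON_START` has to be parsed before it can be used as a boolean. How HierMem parses it is not shown here; the helper below is a generic sketch, and the set of accepted truthy spellings is an assumption.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (assumed truthy spellings)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# For a benchmark run, request an empty vector store at startup.
os.environ["CLEAR_VECTOR_ON_START"] = "true"
clear_on_start = env_flag("CLEAR_VECTOR_ON_START")
```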


🖼️ Images & Dimension Setup

Note for Contributors: When adding layout and diagram images to the repository (especially for academic or PyPI use), it is highly recommended to stick to standard ratios to prevent scaling blur.

  • Recommended Ratio: 16:9
  • Recommended Dimensions: 1920x1080 (or 1280x720 for lighter files)
  • Hosting: Images in this README are linked via absolute raw GitHub URLs rather than relative paths, so that they also render on the PyPI project page.
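Contributors who want to check an image's dimensions before committing can test the ratio with integer arithmetic. The function name below is hypothetical, and the width and height are assumed to come from whatever image tool is at hand.

```python
def is_16_by_9(width: int, height: int) -> bool:
    """True when width:height is exactly 16:9 (integer math, no rounding)."""
    return width > 0 and height > 0 and width * 9 == height * 16


print(is_16_by_9(1920, 1080), is_16_by_9(1280, 720), is_16_by_9(1024, 768))
# True True False
```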

📊 Benchmark Snapshot (Qwen2.5-14B)

The figures below come from our Gemini 3.1 Pro continuous tracking evaluation across 15 synthetic datasets. HierMem scores higher than the raw LLM baseline on judge score and constraint survival, at a lower compute cost per turn.

| Metric | HierMem | Raw LLM | Delta |
|---|---|---|---|
| Mean Judge Score (out of 10) | 8.461 | 6.908 | +1.553 |
| Constraint Survival | 0.933 | 0.740 | +0.193 |
| Mean Compute Cost/Turn | $0.0176 | $0.0264 | -33.3% |
| Compression Ratio | 4.7x | 3.2x (Truncated) | +46.8% |
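The Delta column can be reproduced from the two value columns: the first two rows are absolute differences, and the last two are relative changes against the Raw LLM value. A short check using only the numbers in the table (the function names are illustrative):

```python
def absolute_delta(hiermem: float, baseline: float) -> float:
    """Difference in the metric's own units, rounded like the table."""
    return round(hiermem - baseline, 3)


def relative_delta_pct(hiermem: float, baseline: float) -> float:
    """Change relative to the baseline, in percent."""
    return (hiermem - baseline) / baseline * 100


print(absolute_delta(8.461, 6.908))                   # 1.553
print(absolute_delta(0.933, 0.740))                   # 0.193
print(f"{relative_delta_pct(0.0176, 0.0264):.1f}%")   # -33.3%
print(f"{relative_delta_pct(4.7, 3.2):.3f}%")
```

The last ratio works out to 46.875% from the rounded table values, slightly above the table's +46.8%, which may have been computed from unrounded ratios.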

Evaluation Metrics Breakdown

[Figure: Context Pressure & Compression Ratio (Context Pressure Trend)]
[Figure: Overall Quality Trend]
[Figure: Overall Pareto Frontier]
[Figure: Cost Breakdown]
[Figure: Latency Trend (Overall Latency Trend)]

🎓 Research & Paper

On our synthetic benchmarks, HierMem retains constraints over long conversations better than the raw LLM baseline (constraint survival 0.933 vs. 0.740). The draft manuscript is in the repository under docs/paper.tex.

Citation

@software{doke2026hiermem,
  title={HierMem: Constraint-Preserving Hierarchical Context Management for Long-Horizon LLM Conversations},
  author={Yash Doke},
  year={2026},
  url={https://github.com/yashdoke7/llm-hiermem}
}

Maintained by Yash Doke
