A 0-LLM-token memory management system for edge-friendly long-term agent memory, with optional LLM enhancement.
EdgeMem
EdgeMem is a 0-LLM-token memory management system for multi-session agent conversations. It builds lightweight local memory indexes from dialogue turns, routes questions through timeline and graph structure, and calls an LLM only for optional steps such as final answer generation and evaluation. Because the core ingestion and retrieval path is low-cost and local-first, it is suitable for edge devices. More optional LLM-enhanced features are planned.
Highlights
- Edge-first ingestion: local entity extraction, keyword extraction, time parsing, and JSON persistence.
- Structured memory retrieval: timeline retrieval, hypergraph routing, tri-graph diffusion, optional embedding retrieval, and score fusion.
- LoCoMo and LongMemEval-S support: reproducible CLIs for long-term memory QA retrieval and generation.
- Explainable evidence: every retrieved snippet includes source, score, session, speaker, and timestamp metadata.
- Standalone visual demo: docs/edgemem_visual_demo.html demonstrates graph-routed memory vs. summary-only memory.
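The highlights mention score fusion across retrievers without giving a formula. As a rough illustration only, the sketch below assumes a weighted sum of min-max normalized scores; the function name `fuse_scores`, the retriever names, and the weights are hypothetical and are not taken from the EdgeMem code.

```python
def fuse_scores(hits_by_source, weights):
    """Min-max normalize each retriever's scores, then add them with per-source weights.

    hits_by_source maps a retriever name to {turn_id: raw_score}.
    Returns (turn_id, fused_score) pairs sorted from best to worst.
    NOTE: this is an assumed fusion rule, not EdgeMem's documented one.
    """
    fused = {}
    for source, hits in hits_by_source.items():
        if not hits:
            continue
        lo, hi = min(hits.values()), max(hits.values())
        span = hi - lo
        for turn_id, raw in hits.items():
            # If all scores from one retriever are equal, each hit gets 1.0.
            norm = (raw - lo) / span if span > 0 else 1.0
            fused[turn_id] = fused.get(turn_id, 0.0) + weights.get(source, 1.0) * norm
    # Ties are broken by turn id so the ordering is deterministic.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    ranked = fuse_scores(
        {"timeline": {"t1": 2.0, "t2": 1.0}, "trigraph": {"t2": 0.9, "t3": 0.1}},
        weights={"timeline": 0.5, "trigraph": 0.5},
    )
    for turn_id, score in ranked:
        print(turn_id, round(score, 3))
```

Normalizing per retriever keeps one source with a larger raw score range from dominating the others; other schemes such as reciprocal rank fusion would serve the same purpose.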
Project Layout
edgemem_op/
  edgemem/                    # Core Python package
    cli/                      # answer/evaluate/baseline CLIs
    data_models.py            # MemoryNode, HyperEdge, HyperGraph, Evidence
    timeline.py               # Timeline retriever
    hypergraph.py             # Hypergraph memory and routing
    trigraph.py               # Entity/keyword-turn-session graph retriever
    embedding_retriever.py    # Optional embedding retriever
    generator.py              # Evidence-to-answer prompt
    eval_judge.py             # LLM-as-judge evaluation
  edgemem_memory_plugin/      # Embeddable memory plugin facade
  scripts/                    # Analysis and demo trace utilities
  docs/                       # Method notes, formulation, visual demo
  evaluation/data/locomo/     # Place LoCoMo data here locally; not committed
  longmemeval/...             # Place LongMemEval-S cleaned data here locally; not committed
  examples/traces/            # Exported retrieval trace example
  requirements.txt            # Minimal runtime dependencies
  pyproject.toml              # Package metadata
Installation
cd edgemem_op
python -m pip install -r requirements.txt
edgemem-init
edgemem-doctor
If you want to use a larger spaCy model:
edgemem-init --model en_core_web_md
export EDGEMEM_SPACY_MODEL=en_core_web_md
Configuration
EdgeMem reads OpenAI-compatible API settings from environment variables:
export LLM_API_KEY="your-key"
export LLM_BASE_URL="https://api.openai.com/v1" # optional
export LLM_MODEL="gpt-4o-mini"
export EMB_API_KEY="$LLM_API_KEY" # only needed for embed/triembed
export EMB_BASE_URL="$LLM_BASE_URL"
export EMB_MODEL="text-embedding-3-small"
You can also copy my_config.example.py to my_config.py for local development, but do not commit my_config.py.
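For readers wiring these variables into their own scripts, here is a minimal sketch of reading them in Python. The helper `load_settings` and the `ApiSettings` class are hypothetical and not part of EdgeMem. Falling back to `https://api.openai.com/v1` when the base URL is unset is an assumption based on that variable being marked optional above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiSettings:
    api_key: str
    base_url: str
    model: str


def load_settings(prefix, default_model, env=None):
    """Read <prefix>_API_KEY, <prefix>_BASE_URL and <prefix>_MODEL.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    The default base URL and default model are assumptions, not EdgeMem behavior.
    """
    env = os.environ if env is None else env
    api_key = env.get(f"{prefix}_API_KEY", "")
    if not api_key:
        raise RuntimeError(f"{prefix}_API_KEY is not set")
    return ApiSettings(
        api_key=api_key,
        base_url=env.get(f"{prefix}_BASE_URL", "https://api.openai.com/v1"),
        model=env.get(f"{prefix}_MODEL", default_model),
    )


if __name__ == "__main__":
    try:
        llm = load_settings("LLM", default_model="gpt-4o-mini")
        print(llm.base_url, llm.model)
    except RuntimeError as exc:
        print(f"LLM features disabled: {exc}")
```

Failing early on a missing key gives a clearer error than a rejected API request later in a long evaluation run.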
Python Memory Plugin
EdgeMem can be embedded directly as a long-term memory plugin:
from edgemem_memory_plugin import EdgeMemMemoryPlugin, MemoryEvent, MemoryQuery

memory = EdgeMemMemoryPlugin()
memory.write_event(
    MemoryEvent(
        user_id="demo_user",
        session_id="session_1",
        text="I care about turn-level R@5 for LongMemEval.",
    )
)
hits = memory.retrieve(
    MemoryQuery(
        user_id="demo_user",
        query="What metric do I care about?",
        top_k=3,
        mode="fused",
        rerank="trigraph",
    )
)
print(memory.format_context(hits))
Run the included example:
python edgemem_memory_plugin/examples/basic_usage.py
Quick Start
Package smoke tests after installation:
edgemem-init
edgemem-doctor
edgemem-memory-demo
edgemem-smoke-longmemeval --help
edgemem-smoke-locomo --help
If the spaCy model download is blocked, the package can still run basic smoke tests with spaCy's blank English fallback, but NER quality will be lower.
Timeline-only baseline:
python -m edgemem.cli.answer \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode timeline \
  --question "When did Caroline go to the LGBTQ support group?"
Fused timeline + tri-graph retrieval:
python -m edgemem.cli.answer \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --context-window 1 \
  --question "When did Caroline go to the LGBTQ support group?"
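This README does not spell out what --context-window does. One plausible reading, stated here as an assumption, is that each retrieved turn is expanded with its neighboring turns from the same session. The sketch below illustrates that reading; `expand_with_context` is a hypothetical helper, not an EdgeMem function.

```python
def expand_with_context(hit_indices, num_turns, window=1):
    """Return sorted turn indices covering each hit plus `window` neighbors on each side.

    ASSUMPTION: this mirrors one possible meaning of --context-window;
    the actual EdgeMem behavior may differ.
    """
    selected = set()
    for idx in hit_indices:
        start = max(0, idx - window)                # clamp at the first turn
        stop = min(num_turns, idx + window + 1)     # clamp at the last turn
        selected.update(range(start, stop))
    return sorted(selected)


if __name__ == "__main__":
    # Hits at turns 0 and 5 in a 7-turn session, one neighbor on each side.
    print(expand_with_context([0, 5], num_turns=7, window=1))
```

Under this reading, a window of 0 would return only the retrieved turns themselves.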
Run the LoCoMo evaluation:
python -m edgemem.cli.evaluate \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --run-name locomo_all_trigraph_context \
  --max-workers 10 \
  --context-window 1 \
  --verbose
Run LoCoMo retrieval-only detailed metrics without answer generation:
python -m edgemem.cli.evaluate_locomo_retrieval \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --top-k 5 \
  --ks 1,3,5,10 \
  --context-window 1 \
  --run-name locomo_retrieval_detailed \
  --verbose
This writes turn/session/context retrieval metrics to
results/locomo_retrieval_summary_<run-name>.json.
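To make the --ks cutoffs concrete, here is a small sketch of recall@k using a common definition: the fraction of gold items found in the top k results. Whether EdgeMem uses this definition or a hit-rate variant is not stated here, so treat the formula and the helper names `recall_at_k` and `recall_table` as assumptions.

```python
def recall_at_k(ranked_ids, gold_ids, k):
    """Fraction of gold items that appear among the top-k ranked items."""
    gold = set(gold_ids)
    if not gold:
        return 0.0  # no gold evidence: report 0 rather than divide by zero
    top_k = set(ranked_ids[:k])
    return len(gold & top_k) / len(gold)


def recall_table(ranked_ids, gold_ids, ks=(1, 3, 5, 10)):
    """Compute recall at several cutoffs, mirroring a list such as --ks 1,3,5,10."""
    return {k: recall_at_k(ranked_ids, gold_ids, k) for k in ks}


if __name__ == "__main__":
    ranked = ["t7", "t2", "t9", "t4", "t1"]   # retrieved turn ids, best first
    gold = ["t2", "t4"]                       # turns annotated as evidence
    print(recall_table(ranked, gold))
```

The same function applies at turn or session granularity; only the ids passed in change.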
Convenience smoke test wrapper:
edgemem-smoke-locomo \
  --dataset evaluation/data/locomo/locomo10.json \
  --limit 5
Add --with-llm and set the LLM_API_KEY, LLM_BASE_URL, and LLM_MODEL environment variables to generate
answers and run LLM-as-judge.
Run EdgeMem retrieval on LongMemEval-S cleaned:
python -m edgemem.cli.evaluate_longmemeval \
  --dataset longmemeval/longmemeval-data-cleaned/data/longmemeval_s_cleaned.json \
  --mode fused \
  --rerank trigraph \
  --retrieval-top-k 50 \
  --generation-top-k 5 \
  --context-window 1 \
  --run-name longmemeval_s_cleaned_edgemem \
  --verbose
This command evaluates retrieval without calling an LLM. Add --generate to also write
results/longmemeval_hypotheses_<run-name>.jsonl with {question_id, hypothesis} lines
for the official LongMemEval QA evaluator.
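The hypotheses file holds one JSON object per line with the keys question_id and hypothesis. The sketch below shows one way to write and read a file of that shape, which can help when inspecting results before passing them to the evaluator. The helper names are hypothetical, and whether EdgeMem adds other keys is not covered here.

```python
import json


def write_hypotheses(path, hypotheses):
    """Write (question_id, hypothesis) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for question_id, hypothesis in hypotheses:
            record = {"question_id": question_id, "hypothesis": hypothesis}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_hypotheses(path):
    """Load a JSONL hypotheses file, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # The file name and the ids below are made up for illustration.
    write_hypotheses("hypotheses_demo.jsonl", [("q1", "turn-level R@5"), ("q2", "unknown")])
    for row in read_hypotheses("hypotheses_demo.jsonl"):
        print(row["question_id"], "->", row["hypothesis"])
```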
Convenience smoke test wrapper:
edgemem-smoke-longmemeval \
  --dataset longmemeval/longmemeval-data-cleaned/data/longmemeval_s_cleaned.json \
  --limit 3
Dataset pages:
- LoCoMo: https://github.com/snap-stanford/locomo
- LongMemEval: https://github.com/xiaowu0162/LongMemEval
Visual Demo
Open this file directly in a browser:
docs/edgemem_visual_demo.html
To export a real retrieval trace from the current code:
python scripts/export_edgemem_trace.py \
  --dataset evaluation/data/locomo/locomo10.json \
  --conversation-index 0 \
  --qa-index 0 \
  --top-k 5 \
  --max-nodes 80 \
  --output examples/traces/edgemem_trace_demo.json
Notes for Open Source Release
- my_config.py, .env, caches, generated results, benchmark datasets, and real API keys are intentionally ignored.
- Download LoCoMo and LongMemEval data separately before running benchmark CLIs.
- The core non-embedding retrieval path requires no LLM during ingestion or retrieval.
- LLM calls are used by generator.py, eval_judge.py, embedding retrieval, and scripts that explicitly rejudge results.