
Engram memory integration for LlamaIndex — durable, explainable memory for agents.


llama-index-memory-engram

Durable, explainable memory for LlamaIndex agents — powered by Engram.

EngramMemory is a BaseMemory implementation that replaces LlamaIndex's built-in chat-history buffer with Engram's hybrid retrieval pipeline (BM25 + vector + knowledge graph + reranker). Every message your agent sees is persisted to an Engram bucket; reads come back in chronological order, and semantic recall is one call away.

Install

pip install llama-index-memory-engram

Usage

import asyncio
import os

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
from llama_index.memory.engram import EngramMemory

os.environ["ENGRAM_API_KEY"] = "eng_live_..."   # or pass api_key=... explicitly

memory = EngramMemory.from_defaults(
    bucket="user-42",         # one bucket per user / session / agent
    read_limit=50,            # how many recent messages get() returns
)

agent = FunctionAgent(
    llm=OpenAI(model="gpt-4o"),
    tools=[...],              # replace with your own tool list
)


async def main() -> None:
    # IMPORTANT: pass memory to .run(), NOT to FunctionAgent(...)
    response = await agent.run(
        "What did we decide about the Q3 launch?",
        memory=memory,
    )
    print(response)


# agent.run() is a coroutine, so a plain script needs an event loop to drive it
asyncio.run(main())

⚠️ Pass memory to agent.run(), not the constructor

In recent LlamaIndex releases, FunctionAgent.__init__ accepts arbitrary **kwargs and silently discards memory=, so the memory backend looks configured but is never read or written. Pass memory on every call instead: agent.run(..., memory=memory). This is a LlamaIndex API quirk, not an issue with this package; our end-to-end test caught it, and it is the most common pitfall when wiring up EngramMemory.

If you want it to feel like a constructor arg, wrap once:

async def chat(prompt: str) -> str:
    r = await agent.run(prompt, memory=memory)
    return str(r)
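
The same idea can be made reusable with a small helper that binds memory to any object exposing an async run(prompt, memory=...) method. This is only a sketch: bind_memory and EchoAgent are hypothetical names that are not part of this package or of LlamaIndex, and the stand-in agent exists solely so the snippet runs without an API key.

```python
import asyncio
from typing import Any, Awaitable, Callable


def bind_memory(agent: Any, memory: Any) -> Callable[[str], Awaitable[str]]:
    """Return an async chat(prompt) that always forwards memory= to agent.run()."""

    async def chat(prompt: str) -> str:
        result = await agent.run(prompt, memory=memory)
        return str(result)

    return chat


class EchoAgent:
    """Hypothetical stand-in for FunctionAgent; records the memory it receives."""

    def __init__(self) -> None:
        self.seen_memory: Any = None

    async def run(self, prompt: str, memory: Any = None) -> str:
        self.seen_memory = memory
        return f"echo: {prompt}"


demo_agent = EchoAgent()
demo_memory = object()  # placeholder for an EngramMemory instance
chat = bind_memory(demo_agent, demo_memory)
reply = asyncio.run(chat("hello"))
print(reply)
```

With a real agent you would call bind_memory(agent, memory) once at startup and hand the returned chat function to the rest of your application.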

Get an API key at https://lumetra.io. Keys look like eng_live_....

Direct semantic recall

get() returns the most recent read_limit messages, which is what agents expect from chat history. When you want hybrid retrieval over the entire bucket, call query() directly:

result = memory.query("regulatory risks we discussed last quarter")
print(result["answer"])
print(result["memories_found"])
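
Because query() returns the raw Engram response, it may be safer to read it defensively instead of indexing keys directly. The sketch below assumes only the two keys shown above; the sample dictionary is invented for illustration, and the type of memories_found is not stated in this README, so it is printed as-is.

```python
from typing import Any, Mapping


def format_recall(result: Mapping[str, Any]) -> str:
    """Render a query() response without raising if a key is missing."""
    answer = result.get("answer") or "(no answer returned)"
    found = result.get("memories_found")
    return f"{answer} [memories_found={found!r}]"


# Invented sample payload; a real response comes from memory.query(...)
sample = {"answer": "Launch moved to October.", "memories_found": 3}
line = format_recall(sample)
empty_line = format_recall({})
print(line)
print(empty_line)
```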

Bucket scoping

Pick a bucket name per logical conversation scope:

EngramMemory(bucket=f"user-{user_id}")           # per user
EngramMemory(bucket=f"session-{session_id}")     # per session
EngramMemory(bucket=f"agent-{agent_id}")         # per agent

Buckets are created on first write — no admin call needed.
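
Since buckets are created on first write, a typo in a bucket name silently creates a new, empty bucket. A small naming helper can keep scopes consistent. This is a sketch under assumptions: bucket_name is a hypothetical helper, and the allowed character set (lowercase letters, digits, hyphen, underscore) is a guess, because Engram's actual naming rules are not documented here.

```python
import re

_DISALLOWED = re.compile(r"[^a-z0-9_-]+")  # assumed safe character set
_SCOPES = {"user", "session", "agent"}


def bucket_name(scope: str, identifier: object) -> str:
    """Build a '<scope>-<id>' bucket name from a known scope and any identifier."""
    if scope not in _SCOPES:
        raise ValueError(f"unknown scope {scope!r}; expected one of {sorted(_SCOPES)}")
    slug = _DISALLOWED.sub("-", str(identifier).strip().lower()).strip("-")
    if not slug:
        raise ValueError("identifier is empty after normalization")
    return f"{scope}-{slug}"


user_bucket = bucket_name("user", 42)
session_bucket = bucket_name("session", " Abc 123 ")
print(user_bucket, session_bucket)
```

The result can then be passed as bucket=... when constructing EngramMemory.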

Self-hosted Engram

EngramMemory(
    bucket="ops",
    base_url="https://engram.internal.example.com",
    api_key="...",
)

API reference

| Method | Behavior |
| --- | --- |
| put(message) | Append one ChatMessage to the bucket. |
| put_messages(messages) | Append many. |
| get(input=None) | Return the most recent read_limit messages, oldest-first. |
| get_all() | Same as get(). |
| set(messages) | Clear the bucket, then write messages. |
| reset() | Clear the bucket. |
| query(question) | Hybrid retrieval over the entire bucket. Returns the raw Engram response. |
| list_buckets(limit, offset) | List buckets visible to this API key. |
| delete_memory(memory_id) | Delete a single memory by id. |
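
To make the read and write semantics above concrete, here is a toy in-memory model of put, get, set, and reset. It is not the package's implementation: ToyBucket is a hypothetical class, it stores plain strings instead of ChatMessage objects, and it never talks to Engram.

```python
from typing import List


class ToyBucket:
    """In-memory stand-in that mimics the documented read/write semantics."""

    def __init__(self, read_limit: int = 50) -> None:
        self.read_limit = read_limit
        self._messages: List[str] = []

    def put(self, message: str) -> None:
        self._messages.append(message)

    def put_messages(self, messages: List[str]) -> None:
        for message in messages:
            self.put(message)

    def get(self) -> List[str]:
        # Most recent read_limit messages, returned oldest-first.
        if self.read_limit <= 0:
            return []
        return self._messages[-self.read_limit:]

    def reset(self) -> None:
        self._messages.clear()

    def set(self, messages: List[str]) -> None:
        # Clear first, then write, so earlier history is discarded.
        self.reset()
        self.put_messages(messages)


bucket = ToyBucket(read_limit=2)
bucket.put_messages(["a", "b", "c"])
recent = bucket.get()
bucket.set(["x"])
after_set = bucket.get()
print(recent, after_set)
```

Note that set() is destructive: anything written before the call is gone afterwards, which is why the second read returns only the new message.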

All methods have async equivalents (aput, aget, ...) inherited from BaseMemory; they currently run the sync implementation in a thread.
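
The "sync implementation in a thread" pattern generally looks like the sketch below, which uses asyncio.to_thread from the standard library (Python 3.9+). SyncStore is a hypothetical class written for this illustration; it is not BaseMemory's actual code and may differ from how LlamaIndex dispatches the call.

```python
import asyncio
import threading
from typing import List


class SyncStore:
    """Hypothetical store with a blocking put() and a thread-backed aput()."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self.ran_in_main_thread: List[bool] = []

    def put(self, message: str) -> None:
        # Record which thread executed the blocking call.
        on_main = threading.current_thread() is threading.main_thread()
        self.ran_in_main_thread.append(on_main)
        self.items.append(message)

    async def aput(self, message: str) -> None:
        # Run the blocking method in a worker thread so the event loop stays free.
        await asyncio.to_thread(self.put, message)


store = SyncStore()
asyncio.run(store.aput("hi"))
print(store.items, store.ran_in_main_thread)
```

One practical consequence of this design is that the async variants avoid blocking the event loop but are no faster than the sync calls, since each one still performs a blocking request in its worker thread.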

Configuration

| Constructor arg | Env var | Default |
| --- | --- | --- |
| api_key | ENGRAM_API_KEY | required |
| bucket | (none) | "default" |
| base_url | (none) | "https://api.lumetra.io" |
| read_limit | (none) | 50 |
| timeout | (none) | 120.0 |
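
The defaults above can be pictured as a small settings object. This is a sketch, not the package's code: EngramSettings and resolve_settings are hypothetical names, and the rule that an explicit api_key wins over ENGRAM_API_KEY is an assumption, since the README does not state the precedence.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class EngramSettings:
    """Mirror of the documented constructor arguments and their defaults."""

    api_key: str
    bucket: str = "default"
    base_url: str = "https://api.lumetra.io"
    read_limit: int = 50
    timeout: float = 120.0


def resolve_settings(
    api_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    **overrides: Any,
) -> EngramSettings:
    # Assumed precedence: explicit argument first, then the environment variable.
    key = api_key or env.get("ENGRAM_API_KEY")
    if not key:
        raise ValueError("api_key is required: pass it or set ENGRAM_API_KEY")
    return EngramSettings(api_key=key, **overrides)


# A fake environment is passed in so the example does not depend on real secrets.
fake_env = {"ENGRAM_API_KEY": "eng_live_example"}
settings = resolve_settings(env=fake_env, bucket="ops")
explicit = resolve_settings(api_key="eng_live_explicit", env=fake_env)
print(settings)
```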

License

MIT — see LICENSE.

For data-handling details see PRIVACY.md and https://lumetra.io/privacy.
