A flexible memory system for Gen AI applications
GLLM Memory
Description
Memory layer for AI agents.
The public API is MemoryManager.
You can use it in two ways:
- HTTP mode: use api_key and optional host
- SDK mode: use MemoryManagerConfig and pass config=...
In SDK mode, you can register your own LLM, embedding model, memory store, and optional reranker without exposing backend-specific config to application code.
Prerequisites
Mandatory
- Python 3.11+ — Install here
- pip — Install here
- uv — Install here
- gcloud CLI (for authentication) — Install here, then log in using:
gcloud auth login
Mem0 Configuration
- Mem0 API key (HTTP client): from Mem0 dashboard.
- Self-hosted URL: set MEM0_HOST if the API is not Mem0 cloud.
Environment variables (typical):
| Variable | Role |
|---|---|
| MEM0_API_KEY | Required for the HTTP client when not passed in code. |
| MEM0_HOST | Optional; base URL for self-hosted Mem0 API. |
| MEMORY_PROVIDER | Optional; default is Mem0 (mem0). |
| TIMEOUT_SEC | Optional; request timeout in seconds (default 30). Used when building clients from env. |
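A typical HTTP-mode environment, for instance (all values below are placeholders; MEMORY_PROVIDER and TIMEOUT_SEC can usually be left at their defaults):
export MEM0_API_KEY="your_api_key_here"
export MEM0_HOST="https://your-mem0-server.com"   # only for self-hosted Mem0
export MEMORY_PROVIDER="mem0"                     # optional; mem0 is the default
export TIMEOUT_SEC=30                             # optional; default is 30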
Two ways to connect
- HTTP API — pass api_key and optionally host to MemoryManager. Same as setting MEM0_API_KEY/MEM0_HOST and using defaults.
- SDK mode — pass config=MemoryManagerConfig(...) to MemoryManager. This path uses the local SDK integration and lets you register LLM, embedding, memory store, and reranker through the builder API. See examples/example_mem0_sdk_client.py.
Do not commit secrets to git.
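Both modes in one minimal sketch (the API key, host, and config values are placeholders; full examples follow in the sections below):
from gllm_memory import MemoryManager, MemoryManagerConfig

# HTTP mode: talks to the Mem0 HTTP API (cloud, or self-hosted when host is set)
http_manager = MemoryManager(api_key="your-api-key", host="https://your-mem0-server.com")

# SDK mode: uses the local SDK integration configured through MemoryManagerConfig
# (the default config expects the env vars listed in the SDK section)
sdk_manager = MemoryManager(config=MemoryManagerConfig.builder().build())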
📦 Installation
Install from Artifact Registry
This requires authentication via the gcloud CLI.
uv pip install \
--extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" \
gllm-memory
🔧 Local Development Setup
Prerequisites
- Python 3.11+ — Install here
- pip — Install here
- uv — Install here
- gcloud CLI — Install here, then log in using:
gcloud auth login
- Git — Install here
- Access to the GDP Labs SDK GitHub repository
1. Clone Repository
git clone git@github.com:GDP-ADMIN/gl-sdk.git
cd gl-sdk/libs/gllm-memory
2. Setup Authentication
Set the following environment variables to authenticate with internal package indexes:
export UV_INDEX_GEN_AI_INTERNAL_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_INTERNAL_PASSWORD="$(gcloud auth print-access-token)"
export UV_INDEX_GEN_AI_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_PASSWORD="$(gcloud auth print-access-token)"
3. Quick Setup
Run:
make setup
4. Activate Virtual Environment
source .venv/bin/activate
🚀 Quick Start
For Using the Library
- Install the package:
uv pip install gllm-memory
- Set your Mem0 API key:
export MEM0_API_KEY="your_api_key_here"
- For Self-Hosted Mem0 (Optional):
export MEM0_API_KEY="your_api_key_here"
export MEM0_HOST="https://your-mem0-server.com"
For Development
- Complete setup (this will install all dependencies, set up pre-commit, and activate the environment):
make setup
source .venv/bin/activate
- Set your Mem0 API key:
export MEM0_API_KEY="your_api_key_here"
- Run an example:
# HTTP API (add, search, list, delete_by_user_query, delete)
python examples/simple_usage.py
# SDK mode with MemoryManagerConfig
python examples/example_mem0_sdk_client.py
Architecture
The system follows the layered architecture shown below:
┌──────────────────────────────────────────────────────────────┐
│ Application Layer │
├──────────────────────────────────────────────────────────────┤
│ Memory Manager │
├──────────────────────────────────────────────────────────────┤
│ Memory Client (Base) │
├──────────────────────────────────────────────────────────────┤
│ Provider Layer (Mem0) │
├──────────────────────────────────────────────────────────────┤
│ Mem0 Platform (HTTP client or Python SDK) │
└──────────────────────────────────────────────────────────────┘
🌐 HTTP Mode
Use this mode if you want to connect to the HTTP API directly.
Point the client at your own server:
from gllm_memory import MemoryManager
manager = MemoryManager(
api_key="your-api-key",
host="https://your-mem0-server.com",
)
If you want local SDK mode, use MemoryManager(config=...) instead of api_key and host.
🧩 SDK Mode With MemoryManagerConfig
Use this mode if you want to:
- register your own LM Invoker
- register your own EM Invoker
- choose the memory store from config
- configure an optional reranker
- keep application code independent from backend-specific config shape
gllm-memory does not create provider-specific invokers for you in normal SDK usage.
You create the invoker instances in your application, then register them in
MemoryManagerConfig.
SDK Mode Example
from gllm_memory import MemoryManager, MemoryManagerConfig
from gllm_inference.lm_invoker.openai_lm_invoker import OpenAILMInvoker
lm_invoker = OpenAILMInvoker(
model_name="gpt-5-nano",
api_key="your_openai_api_key",
)
def build_em_invoker():
from gllm_inference.em_invoker.openai_em_invoker import OpenAIEMInvoker
return OpenAIEMInvoker(
model_name="text-embedding-3-small",
api_key="your_openai_api_key",
)
em_invoker = build_em_invoker()
config = (
MemoryManagerConfig.builder()
.memory_store.elasticsearch(
host="localhost",
port=9200,
collection_name="memories",
embedding_model_dims=1536,
)
.embedding.register(
em_invoker,
model="text-embedding-3-small",
embedding_dims=1536,
)
.llm.register(lm_invoker, model="gpt-5-nano")
.reranker.llm_reranker(
model="gpt-5-nano",
api_key="your_openai_api_key",
top_k=5,
)
.build()
)
memory_manager = MemoryManager(config=config)
gllm-memory does not require a provider-specific helper import for this step.
You only need to pass an LM Invoker instance and an EM Invoker instance. The
reranker is optional; when configured with llm_reranker, the builder emits the
Mem0-compatible reranker section for SDK search calls that use rerank=True.
If your installed gllm_inference version still has a circular import on
OpenAIEMInvoker, instantiate the EM invoker with a local lazy import like the
example above.
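Assuming the memory_manager built above, a reranked search is just a regular search call with rerank=True (the query and scope values here are placeholders):
from gllm_memory.enums import MemoryScope

# The configured llm_reranker is applied to the results when rerank=True
memories = await memory_manager.search(
    query="What does the user like?",
    user_id="user_123",
    scopes={MemoryScope.USER},
    rerank=True,
)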
SDK Mode With Default Config
If you want to use the default SDK setup, you can build an empty config:
from gllm_memory import MemoryManager, MemoryManagerConfig
config = MemoryManagerConfig.builder().build()
memory_manager = MemoryManager(config=config)
Default SDK behavior:
- memory store uses Elasticsearch
- embedding uses the gllm-inference EM Invoker with OpenAI defaults
- llm uses the gllm-inference LM Invoker with OpenAI defaults
- reranker is omitted unless configured explicitly
Required environment variables for the default SDK config:
- ELASTICSEARCH_HOST
- ELASTICSEARCH_PORT
- ELASTICSEARCH_COLLECTION_NAME
- ELASTICSEARCH_EMBEDDING_MODEL_DIMS
- OPENAI_API_KEY
Optional environment variables:
- ELASTICSEARCH_USER
- ELASTICSEARCH_PASSWORD
- OPENAI_BASE_URL
- OPENAI_MODEL_NAME (default SDK LLM model override)
- OPENAI_EMBEDDING_MODEL (used by examples/example_mem0_sdk_client.py)
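For instance, a minimal environment for the default SDK config could look like this (host, port, collection name, and dims are placeholder values matching the builder example above):
export ELASTICSEARCH_HOST="localhost"
export ELASTICSEARCH_PORT=9200
export ELASTICSEARCH_COLLECTION_NAME="memories"
export ELASTICSEARCH_EMBEDDING_MODEL_DIMS=1536
export OPENAI_API_KEY="your_openai_api_key"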
SDK Mode With Another Memory Store
You can register another memory store with the same builder style:
config = (
MemoryManagerConfig.builder()
.memory_store.register(
"pgvector",
{
"host": "localhost",
"port": 5432,
"dbname": "postgres",
"user": "postgres",
"password": "postgres",
"collection_name": "memories",
},
)
.embedding.register(em_invoker, embedding_dims=1536)
.llm.register(lm_invoker)
.build()
)
Notes:
- memory_store is the public config name
- you do not need to know the backend-native config structure for the built-in builder helpers
- non-Elasticsearch stores use the backend's native behavior unless gllm-memory adds custom handling for them
Core API methods
MemoryManager exposes async methods; query is required where noted.
Methods
- add(user_id, agent_id, messages, scopes, metadata, infer, is_important) - Add new memories from message objects
- search(query, user_id, agent_id, scopes, metadata, threshold, top_k, include_important, rerank) - Search and retrieve memories by query (query is required)
- list_memories(user_id, agent_id, scopes, metadata, keywords, page, page_size) - Get all memories with pagination and keyword filtering
- update(memory_id, new_content, metadata, user_id, agent_id, scopes, is_important) - Update an existing memory by ID
- delete(memory_ids, user_id, agent_id, scopes, metadata) - Delete memories by IDs or by user/agent identifiers
- delete_by_user_query(query, user_id, agent_id, scopes, metadata, threshold, top_k) - Delete memories by query (query is required)
Example (HTTP API)
from gllm_memory import MemoryManager
from gllm_inference.schema.message import Message
from gllm_memory.enums import MemoryScope
memory_manager = MemoryManager(api_key="...", host="...") # host optional
messages = [
Message.user("I love pizza"),
Message.assistant("Noted."),
]
await memory_manager.add(
user_id="user_123",
agent_id="agent_456",
messages=messages,
scopes={MemoryScope.USER},
metadata={"conversation_id": "chat_001"}, # Optional
infer=True, # Optional, defaults to True
is_important=False, # Optional, defaults to False
)
memories = await memory_manager.search(
query="What does the user like?",
user_id="user_123",
scopes={MemoryScope.USER},
metadata=None, # Optional
threshold=0.3, # Optional, defaults to 0.3
top_k=10, # Optional, defaults to 10
include_important=False, # Optional, defaults to False
rerank=False, # Optional, defaults to False; if True, applies re-ranking to results
)
await memory_manager.list_memories(
user_id="user_123",
scopes={MemoryScope.USER},
metadata=None, # Optional
keywords="food", # Optional
page=1, # Optional, defaults to 1
page_size=100 # Optional, defaults to 100
)
await memory_manager.update(
memory_id="memory_uuid_123",
new_content="Updated text",
user_id="user_123",
agent_id="agent_456",
scopes={MemoryScope.USER, MemoryScope.ASSISTANT}, # Optional
is_important=None, # Optional; None leaves existing flag unchanged
)
await memory_manager.delete_by_user_query(
query="food preferences",
user_id="user_123",
scopes={MemoryScope.USER, MemoryScope.ASSISTANT},
metadata=None, # Optional
threshold=0.3, # Optional, defaults to 0.3
top_k=10 # Optional, defaults to 10
)
# Delete memories by identifiers
delete_result = await memory_manager.delete(
memory_ids=None, # Optional
user_id="user_123",
scopes={MemoryScope.USER, MemoryScope.ASSISTANT},
metadata=None # Optional
)
# Then use await memory_manager.add(...), search(...), etc.
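The calls above use top-level await for brevity; in a real script they need to run inside an async function, for example with asyncio.run (the main function name here is only illustrative):
import asyncio

from gllm_memory import MemoryManager
from gllm_memory.enums import MemoryScope

async def main():
    memory_manager = MemoryManager(api_key="...", host="...")  # host optional
    memories = await memory_manager.search(
        query="What does the user like?",
        user_id="user_123",
        scopes={MemoryScope.USER},
    )
    print(memories)

asyncio.run(main())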
Example (SDK Mode)
from gllm_memory import MemoryManager, MemoryManagerConfig
from gllm_memory.enums import MemoryScope
from gllm_inference.lm_invoker.openai_lm_invoker import OpenAILMInvoker
from gllm_inference.schema.message import Message
lm_invoker = OpenAILMInvoker(model_name="gpt-5-nano", api_key="...")
def build_em_invoker():
from gllm_inference.em_invoker.openai_em_invoker import OpenAIEMInvoker
return OpenAIEMInvoker(model_name="text-embedding-3-small", api_key="...")
em_invoker = build_em_invoker()
config = (
MemoryManagerConfig.builder()
.memory_store.elasticsearch(
host="localhost",
port=9200,
collection_name="memories",
embedding_model_dims=1536,
)
.embedding.register(em_invoker, embedding_dims=1536)
.llm.register(lm_invoker)
.reranker.llm_reranker(model="gpt-5-nano", api_key="...", top_k=5)
.build()
)
memory_manager = MemoryManager(config=config)
messages = [
Message.user("I love pizza"),
Message.assistant("Noted."),
]
await memory_manager.add(
user_id="user_123",
agent_id="agent_456",
messages=messages,
scopes={MemoryScope.USER},
)
🔧 Code Quality
# Format code with ruff
ruff format gllm_memory/ tests/
# Check code quality
ruff check gllm_memory/ tests/
# Fix auto-fixable issues
ruff check gllm_memory/ tests/ --fix
Local Development Utilities
The following Makefile commands are available for quick operations:
Install uv
make install-uv
Install Pre-Commit
make install-pre-commit
Install Dependencies
make install
Update Dependencies
make update
Run Tests
make test
Contributing
Please refer to the Python Style Guide for information about code style, documentation standards, and SCA requirements.
Contributing Steps
- Fork and clone the repository
- Set up development environment:
# Complete setup: installs uv, configures auth, installs packages, sets up pre-commit
make setup
- Activate virtual environment:
source .venv/bin/activate
- Run tests to ensure everything works:
make test
- Make your changes and ensure tests pass:
# Make your changes
# Ensure tests pass
make test
- Submit a pull request:
# Submit a pull request
git push origin your-branch