A flexible memory system for Gen AI applications

GLLM Memory

Description

Memory layer for AI agents.

The public API is MemoryManager.

You can use it in two ways:

  1. HTTP mode: use api_key and optional host
  2. SDK mode: use MemoryManagerConfig and pass config=...

In SDK mode, you can register your own LLM, embedding model, memory store, and optional reranker without exposing backend-specific config to application code.

Prerequisites

Mandatory

  1. Python 3.11+ — Install here
  2. pip — Install here
  3. uv — Install here
  4. gcloud CLI (for authentication) — Install here, then log in using:
    gcloud auth login
    

Mem0 Configuration

  • Mem0 API key (HTTP client): from Mem0 dashboard.
  • Self-hosted URL: set MEM0_HOST if the API is not Mem0 cloud.

Environment variables (typical):

  • MEM0_API_KEY: Required for the HTTP client when not passed in code.
  • MEM0_HOST: Optional; base URL for a self-hosted Mem0 API.
  • MEMORY_PROVIDER: Optional; default is mem0.
  • TIMEOUT_SEC: Optional; request timeout in seconds (default 30). Used when building clients from env.
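As a sketch of how these variables resolve (the `load_env_settings` helper below is illustrative, not part of the package), the documented defaults can be applied with the standard library:

```python
import os


def load_env_settings() -> dict:
    """Resolve the memory-related environment variables with their documented defaults."""
    return {
        "api_key": os.getenv("MEM0_API_KEY"),  # required in HTTP mode unless passed in code
        "host": os.getenv("MEM0_HOST"),  # optional; base URL for self-hosted Mem0
        "provider": os.getenv("MEMORY_PROVIDER", "mem0"),  # optional; defaults to mem0
        "timeout_sec": float(os.getenv("TIMEOUT_SEC", "30")),  # optional; default 30 seconds
    }
```

In HTTP mode, the resolved api_key and host values map onto the arguments of the same names on MemoryManager.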

Two ways to connect

  1. HTTP API — pass api_key and optionally host to MemoryManager. Same as setting MEM0_API_KEY / MEM0_HOST and using defaults.
  2. SDK mode — pass config=MemoryManagerConfig(...) to MemoryManager. This path uses the local SDK integration and lets you register LLM, embedding, memory store, and reranker through the builder API. See examples/example_mem0_sdk_client.py.

Do not commit secrets to git.


📦 Installation

Install from Artifact Registry

This requires authentication via the gcloud CLI.

uv pip install \
  --extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" \
  gllm-memory

🔧 Local Development Setup

Prerequisites

  1. Python 3.11+ — Install here
  2. pip — Install here
  3. uv — Install here
  4. gcloud CLI — Install here, then log in using:
    gcloud auth login
    
  5. Git — Install here
  6. Access to the GDP Labs SDK GitHub repository

1. Clone Repository

git clone git@github.com:GDP-ADMIN/gl-sdk.git
cd gl-sdk/libs/gllm-memory

2. Setup Authentication

Set the following environment variables to authenticate with internal package indexes:

export UV_INDEX_GEN_AI_INTERNAL_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_INTERNAL_PASSWORD="$(gcloud auth print-access-token)"
export UV_INDEX_GEN_AI_USERNAME=oauth2accesstoken
export UV_INDEX_GEN_AI_PASSWORD="$(gcloud auth print-access-token)"

3. Quick Setup

Run:

make setup

4. Activate Virtual Environment

source .venv/bin/activate

🚀 Quick Start

For Using the Library

  1. Install the package:

    uv pip install gllm-memory
    
  2. Set your Mem0 API key:

    export MEM0_API_KEY="your_api_key_here"
    
  3. For Self-Hosted Mem0 (Optional):

    export MEM0_API_KEY="your_api_key_here"
    export MEM0_HOST="https://your-mem0-server.com"
    

For Development

  1. Complete setup (this will install all dependencies, setup pre-commit, and activate the environment):

    make setup
    source .venv/bin/activate
    
  2. Set your Mem0 API key:

    export MEM0_API_KEY="your_api_key_here"
    
  3. Run an example:

    # HTTP API (add, search, list, delete_by_user_query, delete)
    python examples/simple_usage.py
    # SDK mode with MemoryManagerConfig
    python examples/example_mem0_sdk_client.py
    

Architecture

The system follows the layered architecture shown below:

┌──────────────────────────────────────────────────────────────┐
│                    Application Layer                         │
├──────────────────────────────────────────────────────────────┤
│                    Memory Manager                            │
├──────────────────────────────────────────────────────────────┤
│                    Memory Client (Base)                      │
├──────────────────────────────────────────────────────────────┤
│                    Provider Layer (Mem0)                     │
├──────────────────────────────────────────────────────────────┤
│                    Mem0 Platform (HTTP client or Python SDK) │
└──────────────────────────────────────────────────────────────┘
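The delegation between these layers can be illustrated with a minimal, self-contained sketch. Note that the class names, method signatures, and in-memory store below are conceptual stand-ins, not the package's actual internals:

```python
from typing import Protocol


class MemoryClient(Protocol):
    """Base client layer (conceptual): every provider exposes the same operations."""

    async def add(self, user_id: str, text: str) -> str: ...
    async def search(self, user_id: str, query: str) -> list[str]: ...


class Mem0Client:
    """Provider layer (conceptual): adapts the shared interface to one backend."""

    def __init__(self) -> None:
        self._store: dict[str, list[str]] = {}

    async def add(self, user_id: str, text: str) -> str:
        self._store.setdefault(user_id, []).append(text)
        return text

    async def search(self, user_id: str, query: str) -> list[str]:
        return [m for m in self._store.get(user_id, []) if query.lower() in m.lower()]


class Manager:
    """Manager layer (conceptual): application code only ever touches this object."""

    def __init__(self, client: MemoryClient) -> None:
        self._client = client

    async def add(self, user_id: str, text: str) -> str:
        return await self._client.add(user_id, text)

    async def search(self, user_id: str, query: str) -> list[str]:
        return await self._client.search(user_id, query)
```

The point of the layering is that swapping the provider class changes nothing in application code, which only depends on the manager's interface.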

🌐 HTTP Mode

Use this mode if you want to connect to the HTTP API directly.

Point the client at your own server:

from gllm_memory import MemoryManager

manager = MemoryManager(
    api_key="your-api-key",
    host="https://your-mem0-server.com",
)

If you want local SDK mode, use MemoryManager(config=...) instead of api_key and host.

🧩 SDK Mode With MemoryManagerConfig

Use this mode if you want to:

  1. register your own LM Invoker
  2. register your own EM Invoker
  3. choose the memory store from config
  4. configure an optional reranker
  5. keep application code independent from backend-specific config shape

gllm-memory does not create provider-specific invokers for you in normal SDK usage. You create the invoker instances in your application, then register them in MemoryManagerConfig.

SDK Mode Example

from gllm_memory import MemoryManager, MemoryManagerConfig
from gllm_inference.lm_invoker.openai_lm_invoker import OpenAILMInvoker

lm_invoker = OpenAILMInvoker(
    model_name="gpt-5-nano",
    api_key="your_openai_api_key",
)


def build_em_invoker():
    from gllm_inference.em_invoker.openai_em_invoker import OpenAIEMInvoker

    return OpenAIEMInvoker(
        model_name="text-embedding-3-small",
        api_key="your_openai_api_key",
    )


em_invoker = build_em_invoker()

config = (
    MemoryManagerConfig.builder()
    .memory_store.elasticsearch(
        host="localhost",
        port=9200,
        collection_name="memories",
        embedding_model_dims=1536,
    )
    .embedding.register(
        em_invoker,
        model="text-embedding-3-small",
        embedding_dims=1536,
    )
    .llm.register(lm_invoker, model="gpt-5-nano")
    .reranker.llm_reranker(
        model="gpt-5-nano",
        api_key="your_openai_api_key",
        top_k=5,
    )
    .build()
)

memory_manager = MemoryManager(config=config)

gllm-memory does not require a provider-specific helper import for this step. You only need to pass an LM Invoker instance and an EM Invoker instance. The reranker is optional; when configured with llm_reranker, the builder emits the Mem0-compatible reranker section for SDK search calls that use rerank=True. If your installed gllm_inference version still has a circular import on OpenAIEMInvoker, instantiate the EM invoker with a local lazy import like the example above.

SDK Mode With Default Config

If you want to use the default SDK setup, you can build an empty config:

from gllm_memory import MemoryManager, MemoryManagerConfig

config = MemoryManagerConfig.builder().build()
memory_manager = MemoryManager(config=config)

Default SDK behavior:

  1. memory store uses Elasticsearch
  2. embedding uses gllm-inference: EM Invoker with OpenAI defaults
  3. llm uses gllm-inference: LM Invoker with OpenAI defaults
  4. reranker is omitted unless configured explicitly

Required environment variables for the default SDK config:

  1. ELASTICSEARCH_HOST
  2. ELASTICSEARCH_PORT
  3. ELASTICSEARCH_COLLECTION_NAME
  4. ELASTICSEARCH_EMBEDDING_MODEL_DIMS
  5. OPENAI_API_KEY

Optional environment variables:

  1. ELASTICSEARCH_USER
  2. ELASTICSEARCH_PASSWORD
  3. OPENAI_BASE_URL
  4. OPENAI_MODEL_NAME (default SDK LLM model override)
  5. OPENAI_EMBEDDING_MODEL (used by examples/example_mem0_sdk_client.py)
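For a local Elasticsearch setup, the required variables (plus two common optional overrides) might be exported as follows; every value here is a placeholder:

```shell
# Required by the default SDK config (placeholder values for local development)
export ELASTICSEARCH_HOST="localhost"
export ELASTICSEARCH_PORT="9200"
export ELASTICSEARCH_COLLECTION_NAME="memories"
export ELASTICSEARCH_EMBEDDING_MODEL_DIMS="1536"
export OPENAI_API_KEY="your_openai_api_key"

# Optional overrides
export OPENAI_MODEL_NAME="gpt-5-nano"
export OPENAI_EMBEDDING_MODEL="text-embedding-3-small"
```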

SDK Mode With Another Memory Store

You can register another memory store with the same builder style:

config = (
    MemoryManagerConfig.builder()
    .memory_store.register(
        "pgvector",
        {
            "host": "localhost",
            "port": 5432,
            "dbname": "postgres",
            "user": "postgres",
            "password": "postgres",
            "collection_name": "memories",
        },
    )
    .embedding.register(em_invoker, embedding_dims=1536)
    .llm.register(lm_invoker)
    .build()
)

Notes:

  1. memory_store is the public config name
  2. you do not need to know the backend-native config structure for the built-in builder helpers
  3. non-Elasticsearch stores use the backend's native behavior unless gllm-memory adds custom handling for them

Core API methods

MemoryManager exposes async methods; query is required where noted.

Methods

  • add(user_id, agent_id, messages, scopes, metadata, infer, is_important) - Add new memories from message objects
  • search(query, user_id, agent_id, scopes, metadata, threshold, top_k, include_important, rerank) - Search and retrieve memories by query (query is required)
  • list_memories(user_id, agent_id, scopes, metadata, keywords, page, page_size) - Get all memories with pagination and keyword filtering
  • update(memory_id, new_content, metadata, user_id, agent_id, scopes, is_important) - Update an existing memory by ID
  • delete(memory_ids, user_id, agent_id, scopes, metadata) - Delete memories by IDs or by user/agent identifiers
  • delete_by_user_query(query, user_id, agent_id, scopes, metadata, threshold, top_k) - Delete memories by query (query is required)

Example (HTTP API)

from gllm_memory import MemoryManager
from gllm_inference.schema.message import Message
from gllm_memory.enums import MemoryScope

memory_manager = MemoryManager(api_key="...", host="...")  # host optional

messages = [
    Message.user("I love pizza"),
    Message.assistant("Noted."),
]
await memory_manager.add(
    user_id="user_123",
    agent_id="agent_456",
    messages=messages,
    scopes={MemoryScope.USER},
    metadata={"conversation_id": "chat_001"},  # Optional
    infer=True,  # Optional, defaults to True
    is_important=False,  # Optional, defaults to False
)

memories = await memory_manager.search(
    query="What does the user like?",
    user_id="user_123",
    scopes={MemoryScope.USER},
    metadata=None,  # Optional
    threshold=0.3,  # Optional, defaults to 0.3
    top_k=10,  # Optional, defaults to 10
    include_important=False,  # Optional, defaults to False
    rerank=False,  # Optional, defaults to False; if True, applies re-ranking to results
)

await memory_manager.list_memories(
    user_id="user_123",
    scopes={MemoryScope.USER},
    metadata=None,  # Optional
    keywords="food",  # Optional
    page=1,  # Optional, defaults to 1
    page_size=100  # Optional, defaults to 100
)

await memory_manager.update(
    memory_id="memory_uuid_123",
    new_content="Updated text",
    user_id="user_123",
    agent_id="agent_456",
    scopes={MemoryScope.USER, MemoryScope.ASSISTANT},  # Optional
    is_important=None,  # Optional; None leaves existing flag unchanged
)

await memory_manager.delete_by_user_query(
    query="food preferences",
    user_id="user_123",
    scopes={MemoryScope.USER, MemoryScope.ASSISTANT},
    metadata=None,  # Optional
    threshold=0.3,  # Optional, defaults to 0.3
    top_k=10  # Optional, defaults to 10
)

# Delete memories by identifiers
delete_result = await memory_manager.delete(
    memory_ids=None,  # Optional
    user_id="user_123",
    scopes={MemoryScope.USER, MemoryScope.ASSISTANT},
    metadata=None  # Optional
)
# Then use await memory_manager.add(...), search(...), etc.
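Note that the await expressions above only work inside a coroutine. A minimal entry point looks like this (the body is a placeholder; substitute the calls from the example):

```python
import asyncio


async def main() -> None:
    # Construct MemoryManager and issue the calls shown above, e.g.:
    #   memory_manager = MemoryManager(api_key="...", host="...")
    #   await memory_manager.add(...)
    # This placeholder keeps the sketch runnable without credentials.
    await asyncio.sleep(0)


asyncio.run(main())
```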

Example (SDK Mode)

from gllm_memory import MemoryManager, MemoryManagerConfig
from gllm_memory.enums import MemoryScope
from gllm_inference.lm_invoker.openai_lm_invoker import OpenAILMInvoker
from gllm_inference.schema.message import Message

lm_invoker = OpenAILMInvoker(model_name="gpt-5-nano", api_key="...")


def build_em_invoker():
    from gllm_inference.em_invoker.openai_em_invoker import OpenAIEMInvoker

    return OpenAIEMInvoker(model_name="text-embedding-3-small", api_key="...")


em_invoker = build_em_invoker()

config = (
    MemoryManagerConfig.builder()
    .memory_store.elasticsearch(
        host="localhost",
        port=9200,
        collection_name="memories",
        embedding_model_dims=1536,
    )
    .embedding.register(em_invoker, embedding_dims=1536)
    .llm.register(lm_invoker)
    .reranker.llm_reranker(model="gpt-5-nano", api_key="...", top_k=5)
    .build()
)

memory_manager = MemoryManager(config=config)

messages = [
    Message.user("I love pizza"),
    Message.assistant("Noted."),
]

await memory_manager.add(
    user_id="user_123",
    agent_id="agent_456",
    messages=messages,
    scopes={MemoryScope.USER},
)

🔧 Code Quality

# Format code with ruff
ruff format gllm_memory/ tests/

# Check code quality
ruff check gllm_memory/ tests/

# Fix auto-fixable issues
ruff check gllm_memory/ tests/ --fix

Local Development Utilities

The following Makefile commands are available for quick operations:

Install uv

make install-uv

Install Pre-Commit

make install-pre-commit

Install Dependencies

make install

Update Dependencies

make update

Run Tests

make test

Contributing

Please refer to the Python Style Guide for information about code style, documentation standards, and SCA requirements.

Contributing Steps

  1. Fork and clone the repository

  2. Set up development environment:

    # Complete setup: installs uv, configures auth, installs packages, sets up pre-commit
    make setup
    
  3. Activate virtual environment:

    source .venv/bin/activate
    
  4. Run tests to ensure everything works:

    make test
    
  5. Make your changes and ensure tests pass:

    # Make your changes
    # Ensure tests pass
    make test
    
  6. Submit a pull request:

    # Submit a pull request
    git push origin your-branch
    



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

gllm_memory_binary-0.2.0-cp312-cp312-win_amd64.whl (888.9 kB)

Uploaded: CPython 3.12, Windows x86-64

gllm_memory_binary-0.2.0-cp312-cp312-manylinux_2_31_x86_64.whl (1.4 MB)

Uploaded: CPython 3.12, manylinux glibc 2.31+ x86-64

gllm_memory_binary-0.2.0-cp312-cp312-macosx_13_0_arm64.whl (1.1 MB)

Uploaded: CPython 3.12, macOS 13.0+ ARM64

gllm_memory_binary-0.2.0-cp311-cp311-win_amd64.whl (930.3 kB)

Uploaded: CPython 3.11, Windows x86-64

gllm_memory_binary-0.2.0-cp311-cp311-manylinux_2_31_x86_64.whl (1.2 MB)

Uploaded: CPython 3.11, manylinux glibc 2.31+ x86-64

gllm_memory_binary-0.2.0-cp311-cp311-macosx_13_0_arm64.whl (1.1 MB)

Uploaded: CPython 3.11, macOS 13.0+ ARM64

File details

Details for the file gllm_memory_binary-0.2.0-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 ad57d3bdd4a223254942081136a01e139cd9f8fd3f37cc7cb31ca53e7c9077c4
MD5 b7da31995357d174e920513e1859ea11
BLAKE2b-256 9c93dc98d9b93987f97c22689264ecf47046ffe212eaa163e9d7b3d1b5a5031b

See more details on using hashes here.

Provenance

The following attestation bundles were made for gllm_memory_binary-0.2.0-cp312-cp312-win_amd64.whl:

Publisher: build-binary.yml on GDP-ADMIN/gl-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gllm_memory_binary-0.2.0-cp312-cp312-manylinux_2_31_x86_64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp312-cp312-manylinux_2_31_x86_64.whl
Algorithm Hash digest
SHA256 d6b73ecc7c8822dddee9d47a80a1105af49049cb4a2448407db1c2928a0504a4
MD5 a7da8b0d6b9e38ba4321fbc51fb1a55c
BLAKE2b-256 385994b42ff674522e422aebe68e88dbf96033760955959831e6161d2fc32d65

See more details on using hashes here.

File details

Details for the file gllm_memory_binary-0.2.0-cp312-cp312-macosx_13_0_arm64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp312-cp312-macosx_13_0_arm64.whl
Algorithm Hash digest
SHA256 730920c06eb93f995643a6855da7ea6b6dc266f695118461b0b16efae56afbf4
MD5 dadae38457f8de2dfa4e8df50e624b81
BLAKE2b-256 02b1c6be960bb6c5168e93278e4d92b0fd2ad24ed0ebe0b123192b6560da70f1

See more details on using hashes here.

Provenance

The following attestation bundles were made for gllm_memory_binary-0.2.0-cp312-cp312-macosx_13_0_arm64.whl:

Publisher: build-binary.yml on GDP-ADMIN/gl-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gllm_memory_binary-0.2.0-cp311-cp311-win_amd64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 60f2468a12aee80ed20a0eece29e1e83289995f3ac396261cb935a20fc02489e
MD5 19fad7d88e25b4e505ad9b75d858ec36
BLAKE2b-256 9f644fff56b8a5b8ef6ae0fc8d7cef9236349cc6a45cdcc6e52cd61086e30158

See more details on using hashes here.

Provenance

The following attestation bundles were made for gllm_memory_binary-0.2.0-cp311-cp311-win_amd64.whl:

Publisher: build-binary.yml on GDP-ADMIN/gl-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gllm_memory_binary-0.2.0-cp311-cp311-manylinux_2_31_x86_64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp311-cp311-manylinux_2_31_x86_64.whl
Algorithm Hash digest
SHA256 46eef9b8dbe13672e9d6a574f4dead4f2e4a77a2a27e4c9d9b6cf909dce47cfc
MD5 fb1dd230b0a2af2482a23f877e04cf26
BLAKE2b-256 cdffb7fd5d546b7727e3ce516c31561315aad1005460e7e60eae7a4c6a5decb5

See more details on using hashes here.

File details

Details for the file gllm_memory_binary-0.2.0-cp311-cp311-macosx_13_0_arm64.whl.

File metadata

File hashes

Hashes for gllm_memory_binary-0.2.0-cp311-cp311-macosx_13_0_arm64.whl
Algorithm Hash digest
SHA256 78a7e9028d35aa759a303650577d1521a3bba6d20cc647264398d737519d2eb1
MD5 7540ecdbc09517151031bdce16130b6c
BLAKE2b-256 ffc3c700e3f7aac99aebd8f8587c1a0b38a696ab4ddfc907e3cfb2f07d0bfacd

See more details on using hashes here.

Provenance

The following attestation bundles were made for gllm_memory_binary-0.2.0-cp311-cp311-macosx_13_0_arm64.whl:

Publisher: build-binary.yml on GDP-ADMIN/gl-sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
