Skip to main content

Python SDK for Awareness Memory Cloud

Project description

Awareness Memory SDK — Python

PyPI LongMemEval R@5 Discord

Python SDK for adding persistent memory to AI agents and apps. 95.6% Recall@5 on LongMemEval (ICLR 2025).

Online docs: https://awareness.market/docs?doc=python

Benchmark: LongMemEval (ICLR 2025)

Awareness Memory is evaluated on LongMemEval — the industry standard benchmark for long-term conversational memory, published at ICLR 2025. 500 human-curated questions across 5 core capabilities.

╔══════════════════════════════════════════════════════════════╗
║                                                              ║
║   Awareness Memory — LongMemEval Benchmark Results           ║
║   ─────────────────────────────────────────────────           ║
║                                                              ║
║   Benchmark:  LongMemEval (ICLR 2025)                       ║
║   Dataset:    500 human-curated questions                    ║
║   Variant:    LongMemEval_S (~115k tokens per question)      ║
║                                                              ║
║   ┌─────────────────────────────────────────────────┐        ║
║   │                                                 │        ║
║   │   Recall@1    77.6%    (388 / 500)              │        ║
║   │   Recall@3    91.8%    (459 / 500)              │        ║
║   │   Recall@5    95.6%    (478 / 500)  ◀ PRIMARY   │        ║
║   │   Recall@10   97.4%    (487 / 500)              │        ║
║   │                                                 │        ║
║   └─────────────────────────────────────────────────┘        ║
║                                                              ║
║   Method:     Hybrid RRF (BM25 + Semantic Vector Search)     ║
║   Embedding:  all-MiniLM-L6-v2 (384d)                       ║
║   LLM Calls:  0  (pure retrieval, no generation cost)        ║
║   Hardware:   Apple M1, 8GB RAM — 14 min total               ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝

Leaderboard

┌─────────────────────────────────────────────────────────────┐
│          Long-Term Memory Retrieval — R@5 Leaderboard       │
│          LongMemEval (ICLR 2025, 500 questions)             │
├─────────────────────────────────┬───────────┬───────────────┤
│  System                         │  R@5      │  Note         │
├─────────────────────────────────┼───────────┼───────────────┤
│  MemPalace (ChromaDB raw)       │  96.6%    │  R@5 only *   │
│  ★ Awareness Memory (Hybrid)    │  95.6%    │  Hybrid RRF   │
│  OMEGA                          │  95.4%    │  QA Accuracy  │
│  Mastra (GPT-5-mini)            │  94.9%    │  QA Accuracy  │
│  Mastra (GPT-4o)                │  84.2%    │  QA Accuracy  │
│  Supermemory                    │  81.6%    │  QA Accuracy  │
│  Zep / Graphiti                 │  71.2%    │  QA Accuracy  │
│  GPT-4o (full context)          │  60.6%    │  QA Accuracy  │
├─────────────────────────────────┴───────────┴───────────────┤
│  * MemPalace 96.6% is Recall@5 only, not QA Accuracy.      │
│    Palace hierarchy was NOT used in the evaluation.         │
└─────────────────────────────────────────────────────────────┘

Accuracy by Question Type

┌─────────────────────────────────────────────────────────────┐
│     Awareness Memory — R@5 by Question Type                 │
│                                                             │
│  knowledge-update        ████████████████████████████ 100%  │
│  multi-session           ███████████████████████████▋  98.5%│
│  single-session-asst     ███████████████████████████▌  98.2%│
│  temporal-reasoning      █████████████████████████▊    94.7%│
│  single-session-user     ████████████████████████▎     88.6%│
│  single-session-pref     ███████████████████████▏      86.7%│
│                                                             │
│  Overall                 █████████████████████████▉    95.6%│
│                                                             │
│  ┌───────────────────────────────────────────────┐          │
│  │  Ablation Study                               │          │
│  │  ─────────────────────────────────────────    │          │
│  │  Vector-only:   92.6%  ▓▓▓▓▓▓▓▓▓▓▓▓▓░░░     │          │
│  │  BM25-only:     91.4%  ▓▓▓▓▓▓▓▓▓▓▓▓▓░░░     │          │
│  │  Hybrid RRF:    95.6%  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░  ★  │          │
│  │                        Hybrid = +3% over any  │          │
│  │                        single method alone    │          │
│  └───────────────────────────────────────────────┘          │
│                                                             │
│  arxiv.org/abs/2410.10813          awareness.market         │
└─────────────────────────────────────────────────────────────┘

Zero LLM calls. Runs on Apple M1 8GB in 14 minutes. Reproducible benchmark scripts →


Install

pip install awareness-memory-cloud

Framework extras:

pip install -e ".[langchain]"   # LangChain adapter
pip install -e ".[crewai]"      # CrewAI adapter
pip install -e ".[autogen]"     # AutoGen adapter
pip install -e ".[frameworks]"  # All frameworks

Zero-Code Interceptor

The fastest way to add memory. One line — no changes to your AI logic.

Local mode (no API key needed)

from openai import OpenAI
from memory_cloud import MemoryCloudClient, AwarenessInterceptor

client = MemoryCloudClient(mode="local")  # data stays on your machine
interceptor = AwarenessInterceptor(client=client, memory_id="my-project")

openai_client = OpenAI()
interceptor.wrap_openai(openai_client)  # one line — all conversations remembered

response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Refactor the auth module"}],
)

Cloud mode (team collaboration, semantic search, sync)

from openai import OpenAI
from anthropic import Anthropic
from memory_cloud import MemoryCloudClient, AwarenessInterceptor

client = MemoryCloudClient(api_key="aw_...")
interceptor = AwarenessInterceptor(client=client, memory_id="memory_123")

# Wrap OpenAI
openai_client = OpenAI()
interceptor.wrap_openai(openai_client)

# Or wrap Anthropic
anthropic_client = Anthropic()
interceptor.wrap_anthropic(anthropic_client)

Direct API Quickstart

Local mode

from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(mode="local")  # connects to local daemon at localhost:8765

client.record(content="Refactored auth middleware.")
result = client.retrieve(query="What did we refactor?")
print(result["results"])

Cloud mode

import os
from memory_cloud import MemoryCloudClient

client = MemoryCloudClient(
    base_url=os.getenv("AWARENESS_API_BASE_URL", "https://awareness.market/api/v1"),
    api_key="YOUR_API_KEY",
)

client.write(
    memory_id="memory_123",
    content="Customer asked for SOC2 evidence and retention policy.",
    kwargs={"source": "python-sdk", "session_id": "demo-session"},
)

result = client.retrieve(
    memory_id="memory_123",
    query="What did customer ask for?",
    custom_kwargs={"k": 3},
)
print(result["results"])

MCP-style Helpers

Local mode

client = MemoryCloudClient(mode="local")
client.record(content="Refactored auth middleware.")
ctx = client.recall_for_task(task="summarize auth changes", limit=8)
print(ctx["results"])

Cloud mode

client = MemoryCloudClient(
    base_url="https://awareness.market/api/v1",
    api_key="YOUR_API_KEY",
)

# Record a single step
client.record(memory_id="memory_123", content="Refactored auth middleware and added tests.")

# Record multiple steps at once
client.record(
    memory_id="memory_123",
    content=[
        "Completed migration patch for user aliases.",
        "Risk: API key owner mismatch can cause tenant leakage.",
    ],
)

# Record knowledge-scoped content
client.record(memory_id="memory_123", content="JWT decision doc", scope="knowledge")

ctx = client.recall_for_task(memory_id="memory_123", task="summarize latest auth changes", limit=8)
print(ctx["results"])

Framework Integrations

LangChain

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.langchain import MemoryCloudLangChain
import openai

# Local mode (no API key needed)
client = MemoryCloudClient(mode="local")
mc = MemoryCloudLangChain(client=client)

# Cloud mode (team collaboration, semantic search, multi-device sync)
client = MemoryCloudClient(base_url="https://awareness.market/api/v1", api_key="YOUR_API_KEY")
mc = MemoryCloudLangChain(client=client, memory_id="memory_123")

mc.wrap_llm(openai.OpenAI())
retriever = mc.as_retriever()
docs = retriever._get_relevant_documents("What did we decide yesterday?")

CrewAI

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.crewai import MemoryCloudCrewAI
import openai

# Local mode (no API key needed)
client = MemoryCloudClient(mode="local")
mc = MemoryCloudCrewAI(client=client)

# Cloud mode (team collaboration, semantic search, multi-device sync)
client = MemoryCloudClient(base_url="https://awareness.market/api/v1", api_key="YOUR_API_KEY")
mc = MemoryCloudCrewAI(client=client, memory_id="memory_123")

mc.wrap_llm(openai.OpenAI())
result = mc.memory_search("What happened?")

PraisonAI

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.praisonai import MemoryCloudPraisonAI
import openai

# Local mode (no API key needed)
client = MemoryCloudClient(mode="local")
mc = MemoryCloudPraisonAI(client=client)

# Cloud mode (team collaboration, semantic search, multi-device sync)
client = MemoryCloudClient(base_url="https://awareness.market/api/v1", api_key="YOUR_API_KEY")
mc = MemoryCloudPraisonAI(client=client, memory_id="memory_123")

mc.wrap_llm(openai.OpenAI())
tools = mc.build_tools()

AutoGen / AG2

from memory_cloud import MemoryCloudClient
from memory_cloud.integrations.autogen import MemoryCloudAutoGen

# Local mode (no API key needed)
client = MemoryCloudClient(mode="local")
mc = MemoryCloudAutoGen(client=client)

# Cloud mode (team collaboration, semantic search, multi-device sync)
client = MemoryCloudClient(base_url="https://awareness.market/api/v1", api_key="YOUR_API_KEY")
mc = MemoryCloudAutoGen(client=client, memory_id="memory_123")

mc.inject_into_agent(assistant)
mc.register_tools(caller=assistant, executor=user_proxy)

Perception (Record-Time Signals)

When you call record(), the response may include a perception array -- automatic signals the system surfaces without you asking. These are computed from pure DB queries (no LLM calls), adding less than 50ms of latency.

Signal types:

Type Description
contradiction New content conflicts with an existing knowledge card
resonance Similar past experience found in memory
pattern Recurring theme detected (e.g., same category appearing often)
staleness A related knowledge card hasn't been updated in a long time
related_decision A past decision is relevant to what you just recorded
result = client.record(memory_id, content="Decided to use RS256 for JWT signing", insights={
    "knowledge_cards": [{"title": "JWT signing", "category": "decision", "summary": "Use RS256"}]
})
if result.get("perception"):
    for signal in result["perception"]:
        print(f"[{signal['type']}] {signal['message']}")
        # [pattern] This is the 4th 'decision' card -- recurring theme
        # [resonance] Similar past experience: "JWT auth migration"

API Coverage

MemoryCloudClient includes:

  • Memory: create_memory, list_memories, get_memory, update_memory, delete_memory
  • Content: write, list_memory_content, delete_memory_content
  • Retrieval/Chat: retrieve, chat, chat_stream, memory_timeline
  • MCP ingest: ingest_events, record
  • Export: export_memory_package, save_export_memory_package
  • Async jobs & upload: get_async_job_status, upload_file, get_upload_job_status
  • Insights/API keys/wizard: insights, create_api_key, list_api_keys, revoke_api_key, memory_wizard

Read Exported Packages

from memory_cloud import read_export_package

parsed = read_export_package("memory_export.zip")
print(parsed["manifest"])
print(len(parsed["chunks"]))
print(bool(parsed["safetensors"]))
print(parsed.get("kv_summary"))

Readers: read_export_package(path), read_export_package_bytes(bytes), parse_jsonl_bytes(bytes)


Examples

  • Basic flow: examples/basic_flow.py
  • Export + read package: examples/export_and_read.py
  • LangChain e2e (real cloud API): examples/e2e_langchain_cloud.py
  • CrewAI e2e (real cloud API): examples/e2e_crewai_cloud.py
  • PraisonAI e2e (real cloud API): examples/e2e_praisonai_cloud.py
  • AutoGen e2e (real cloud API): examples/e2e_autogen_cloud.py

End-to-End (Real Cloud API)

export AWARENESS_API_BASE_URL="https://awareness.market/api/v1"
export AWARENESS_API_KEY="aw_xxx"
export AWARENESS_OWNER_ID="your-owner-id"

python examples/e2e_langchain_cloud.py

What makes Awareness different

Most memory systems pick one extraction strategy. Awareness combines them:

  • Hybrid retrieval by default — BM25 full-text + vector cosine + knowledge-graph 1-hop expansion, fused with Reciprocal Rank Fusion. 95.6% R@5 on LongMemEval, zero LLM calls on the retrieval side.
  • Salience-aware extraction — the client's own LLM self-scores every card on novelty / durability / specificity; cards below 0.4 on novelty or durability are dropped server-side. Framework metadata (Sender (untrusted metadata), turn_brief) is filtered before extraction runs.
  • Project isolationX-Awareness-Project-Dir header scopes memory per project.
  • Zero-LLM backend — all extraction runs on your LLM (Claude, GPT-4, Gemini, local Llama). The backend is a coordinator + storage layer; no inference costs pass through to you.
  • One memory, many clients — same data reachable via Claude Code, OpenClaw, npm / pip / ClawHub, MCP server.

See docs/analysis/MEMPALACE_COMPARISON_2026-04-17.md for the honest side-by-side against MemPalace.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

awareness_memory_cloud-2.6.0.tar.gz (86.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

awareness_memory_cloud-2.6.0-py3-none-any.whl (65.7 kB view details)

Uploaded Python 3

File details

Details for the file awareness_memory_cloud-2.6.0.tar.gz.

File metadata

  • Download URL: awareness_memory_cloud-2.6.0.tar.gz
  • Upload date:
  • Size: 86.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for awareness_memory_cloud-2.6.0.tar.gz
Algorithm Hash digest
SHA256 7378f8c9226ab89240f985071cdde1b275f1114aadc84b2cec09e01d6a38b2aa
MD5 85d034176e97339538b671d225e9b478
BLAKE2b-256 224fb77402792498d11d1678def720dd15ecdc7428e98bd5213e0dad3a4e2472

See more details on using hashes here.

File details

Details for the file awareness_memory_cloud-2.6.0-py3-none-any.whl.

File metadata

File hashes

Hashes for awareness_memory_cloud-2.6.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cc887f0cc5c9a385a5565302883922cff47d4ede80b478834f971e0169c731c1
MD5 7ef5dc5f7cb439b41cfb56e6f9a85140
BLAKE2b-256 28427e9bcaa754637bf6961d2690a03fe73bd59f7862445c38682b56b51eb542

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page