
镜我 · Your personal Pensieve. Turn scattered Chinese data into ready-to-use deliverables.

Project description

Memexa · 镜我


Memory layer for AI agents and humans, on Chinese-native data. Self-hosted memory graph over WeChat / QQ / Feishu (飞书) / DingTalk (钉钉) group chats, Chinese email, and Chinese audio. Verbatim storage plus structured extraction; queries return cards with per-claim citations back to the original sentence.

🤖 AI-agent compatible by design. Most real usage is an AI agent (Claude Code, Cursor, Cline, or one you wrote yourself) invoking memexa as a subprocess to answer questions on the user's behalf. The fourteen query subcommands are a small protocol; the contract agents follow is in docs/for_agents.md. Native MCP integration arrives in v0.5; the current first-class path is shell subprocess with --json output.

CI CodeQL License: Apache 2.0 Python 3.10+ PyPI PII scan

Quickstart

Two starting points; pick whichever describes you.

Humans — 30-second visual

pip install memexa
memexa demo

You will see a synthetic conversation set ingested from six sources with the stub extractor, followed by five example queries printed to your terminal — quick, arc, timeline, pending, topic. No backend, no LLM, no configuration. This is the honest first look at what the project does.

AI agents — subprocess CLI today, MCP in v0.5

# Agents already work today via subprocess:
pip install memexa
memexa quick "<your question>" --json   # structured output for agent parsing
memexa arc "<person>" --json
# ... fourteen subcommands total, all with --json mode (v0.1.x)

The fourteen subcommands plus seven hard rules in docs/for_agents.md are the agent contract. Native MCP integration (memexa-mcp server + .mcp.json snippet) arrives in v0.5; until then shell subprocess is the first-class path and it works in any agent that has a shell tool.
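As a rough illustration of the subprocess path described above, an agent written in Python might wrap the CLI along these lines. This is a minimal sketch, not the project's own client: the helper names `build_argv` and `run_query` are hypothetical, and the shape of the JSON that `--json` prints is not specified here, so the result is returned as parsed JSON without further interpretation. See docs/for_agents.md for the actual contract.

```python
from __future__ import annotations

import json
import subprocess
from typing import Any, Callable, Optional, Sequence


def build_argv(subcommand: str, *args: str) -> list[str]:
    """Build the argv for one memexa query, always requesting --json."""
    return ["memexa", subcommand, *args, "--json"]


def _run_cli(argv: Sequence[str]) -> str:
    """Run the CLI and return its stdout; raises if the exit code is non-zero."""
    done = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return done.stdout


def run_query(
    subcommand: str,
    *args: str,
    runner: Optional[Callable[[Sequence[str]], str]] = None,
) -> Any:
    """Invoke memexa and parse its JSON output.

    `runner` is a hypothetical hook so the wrapper can be exercised
    without memexa installed; by default the real CLI is called.
    """
    argv = build_argv(subcommand, *args)
    stdout = (runner or _run_cli)(argv)
    return json.loads(stdout)
```

With memexa installed, `run_query("quick", "<your question>")` would run the same command as the shell line above. Passing a stub `runner` keeps the wrapper testable in environments where the CLI is absent.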

What's next for both

To ingest your own data, configure an LLM provider and pick one source. docs/quickstart.md walks through Tier 1 (5 minutes, one source) and Tier 2 (30 minutes, full production deployment with cron + dashboard).

What you can ask

| Question pattern | Subcommand | Returns |
| --- | --- | --- |
| Who is Alice? | arc "Alice" | Relationship arc, 8 fan-out variants across sources |
| What was the whole story behind X? | topic "Mac purchase" | 80–200 cards with citations |
| What did Y professor want? | person "Y professor" | Profile article + recent events |
| What is project X across all sources? | project "X" | Cross-source pulse, 4 source groups |
| What is on my plate? | pending | Active commitments from calendar |
| What did this period look like? | timeline --start ... --end ... | Chronological card list |
| Synthesise an answer | reflect "question" | LLM-synthesised Markdown |

Fourteen subcommands total. Decision table and composition patterns are in docs/usage_guide.md. See also docs/5_phase_query.md for the state-inference workflow used on yes/no questions.

Why memexa instead of OpenHuman / MemPalace / ReMe?

In short: verbatim raw storage + LLM-extracted V2 envelope + per-claim evidence_quotes citation + cross-alias canonical id, all on Chinese-IM-native data sources the adjacent projects do not target.

The full per-capability comparison and the five user scenarios memexa serves live in docs/why.md.
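To make the per-claim citation idea concrete, the sketch below checks that every quote attached to a claim appears verbatim in the stored source text. The card layout is an assumption made purely for illustration: only the name `evidence_quotes` comes from this README, while the `claims` and `text` keys and the function `uncited_claims` are hypothetical and may not match memexa's real output.

```python
from __future__ import annotations


def uncited_claims(card: dict, source_text: str) -> list[str]:
    """Return the text of claims whose quotes are missing or not verbatim.

    Assumed (hypothetical) card shape:
        {"claims": [{"text": str, "evidence_quotes": [str, ...]}, ...]}
    """
    problems = []
    for claim in card.get("claims", []):
        quotes = claim.get("evidence_quotes", [])
        # A claim passes only if it has at least one quote and every
        # quote is an exact substring of the verbatim source.
        if not quotes or any(quote not in source_text for quote in quotes):
            problems.append(claim.get("text", ""))
    return problems
```

A check like this is only possible because the raw text is stored verbatim alongside the extracted structure, which is the design point the comparison above makes.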

Architecture, one screen

   WeChat ─┐                                              ┌─► "Who is X?"           (arc + quick)
   QQ     ─┤                                              ├─► "Group activity last week?" (topic + trends)
   Email  ─┼──► two-LLM extract ──► PG + pgvector ──┤
   Browser─┤    (gate+extract)      memory graph        ├─► "Project X status?"    (project + timeline)
   AI chat─┤                                              ├─► "What does Y want?"   (person)
   Audio  ─┘                                              └─► "My pending actions?" (pending)
        ↑                                                       ↑
   your raw data                                          14 query subcommands
   (local, fully self-hosted)                             (cross-source composable)

Full architecture in docs/architecture.md.
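The right-hand side of the diagram pairs each kind of question with one or two subcommands. An agent could encode that pairing as a small lookup table, as sketched here. The subcommand names are taken from the diagram; the dictionary keys and the function `subcommands_for` are hypothetical labels invented for this example, and the arguments each subcommand needs are left to the caller.

```python
from __future__ import annotations

# Question kinds mapped to the subcommands the diagram pairs them with.
# The keys are hypothetical labels, not part of the memexa CLI.
QUERY_PLAN: dict[str, list[str]] = {
    "who_is": ["arc", "quick"],
    "group_activity": ["topic", "trends"],
    "project_status": ["project", "timeline"],
    "what_does_person_want": ["person"],
    "pending_actions": ["pending"],
}


def subcommands_for(kind: str) -> list[str]:
    """Return the subcommands to run, in order, for one question kind."""
    try:
        # Copy the list so callers cannot mutate the shared table.
        return list(QUERY_PLAN[kind])
    except KeyError:
        raise ValueError(f"unknown question kind: {kind!r}") from None
```

Keeping the plan as data rather than branching logic makes it easy to extend once the decision table in docs/usage_guide.md has been consulted.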

Documentation

| Topic | Link |
| --- | --- |
| Quickstart (3-tier path: 30 s → 5 min → 30 min) | docs/quickstart.md |
| Architecture | docs/architecture.md |
| Why memexa (vs OpenHuman / MemPalace; 5 user scenarios) | docs/why.md |
| Cost estimation (DeepSeek / GPT-4o / Claude monthly) | docs/cost.md |
| 14 query subcommands in depth | docs/usage_guide.md |
| 5-phase state inference | docs/5_phase_query.md |
| Full environment variables | docs/configuration.md |
| FAQ / troubleshooting | docs/faq.md · docs/troubleshooting.md |
| Per-source onboarding | docs/integrations/ |
| Cross-platform deployment | docs/deployment/ |
| Example walkthroughs (synthetic data) | examples/demo_dataset/walkthroughs/ |
| Case studies | docs/case_studies/ |
| For AI agents (MCP / integration spec) | docs/for_agents.md |
| Roadmap | ROADMAP.md |
| Contributing | CONTRIBUTING.md |
| Security policy | SECURITY.md |
| Governance | GOVERNANCE.md |

Two ways to run the LLM

memexa's core is a two-LLM gate-extract pipeline. The OSS ships everything you need to run it locally with any OpenAI-compatible endpoint.

# Default: bundled prompts + your own LLM provider
export MEMEXA_EXTRACTOR_TIER=bundled

# BYO: bring your own prompt for advanced tuning
export MEMEXA_EXTRACTOR_TIER=byo
export MEMEXA_PROMPT_PATH=/path/to/your_prompts.py
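
When memexa is launched from a Python script rather than an interactive shell, the same two variables can be set on the child process environment. The helper below is a hedged sketch: `extractor_env` is a hypothetical name, and the rule that the `byo` tier requires a prompt path is an assumption drawn from the example above, not a documented constraint. The expected contents of the prompt file are not covered here.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional

KNOWN_TIERS = {"bundled", "byo"}  # the two tiers shown in the shell example


def extractor_env(
    tier: str = "bundled",
    prompt_path: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Return an environment mapping with the extractor tier configured."""
    if tier not in KNOWN_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    # Start from the given base, or from the current process environment.
    env = dict(os.environ if base is None else base)
    env["MEMEXA_EXTRACTOR_TIER"] = tier
    if tier == "byo":
        # Assumption: byo is only meaningful with a prompt file.
        if not prompt_path:
            raise ValueError("byo tier needs prompt_path")
        env["MEMEXA_PROMPT_PATH"] = prompt_path
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` so the setting applies to one invocation without changing the parent shell.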

The recommended provider for Chinese workloads is DeepSeek V4 Flash (gate) + V4 Pro (extractor). Typical cost is ¥0.30 per 1,000 messages. GPT-4o and Claude 4.x are supported but cost 5–10× more. See docs/cost.md for the full breakdown.
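
For a quick budget check, the figures quoted above can be turned into a back-of-the-envelope estimate. This sketch assumes cost scales linearly with message count and reads "5–10× more" as a plain multiplier on the DeepSeek figure; both are simplifications, the function names are hypothetical, and docs/cost.md remains the authoritative breakdown.

```python
from __future__ import annotations

RMB_PER_1000_MESSAGES = 0.30      # DeepSeek gate + extractor figure from the text
PREMIUM_MULTIPLIERS = (5, 10)     # GPT-4o / Claude 4.x range from the text


def deepseek_cost_rmb(messages: int) -> float:
    """Estimated cost in RMB for processing `messages` messages."""
    return round(messages / 1000 * RMB_PER_1000_MESSAGES, 2)


def premium_cost_range_rmb(messages: int) -> tuple[float, float]:
    """Estimated (low, high) cost in RMB with a 5-10x provider."""
    base = deepseek_cost_rmb(messages)
    low, high = PREMIUM_MULTIPLIERS
    return (round(base * low, 2), round(base * high, 2))
```

Under these assumptions, 50,000 messages in a month would come to about ¥15 with DeepSeek and roughly ¥75 to ¥150 with the more expensive providers.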

License

Apache 2.0. See LICENSE. OSS core stays Apache 2.0 forever.
