镜我 · Your personal Pensieve. Turn scattered Chinese data into ready-to-use deliverables.

Memexa · 镜我

Memory layer for AI agents and humans, on Chinese-native data. Self-hosted memory graph over WeChat / QQ / 飞书 / 钉钉 group chats, Chinese email, and Chinese audio. Verbatim storage plus structured extraction; queries return cards with per-claim citations back to the original sentence.

🤖 AI-agent compatible by design. Most real usage is an AI agent (Claude Code, Cursor, Cline, or one you wrote yourself) invoking memexa as a subprocess to answer questions on the user's behalf. The fourteen query subcommands are a small protocol; the contract agents follow is in docs/for_agents.md. Native MCP integration arrives in v0.5; the current first-class path is shell subprocess with --json output.

CI · CodeQL · License: Apache 2.0 · Python 3.10+ · PyPI · PII scan

Quickstart

Two starting points; pick whichever describes you.

Humans — 30-second visual

pip install --pre memexa
memexa demo

You will see a synthetic conversation set ingested from six sources with the stub extractor, followed by five example queries printed to your terminal — quick, arc, timeline, pending, topic. No backend, no LLM, no configuration. This is the honest first look at what the project does.

AI agents — subprocess CLI today, MCP in v0.5

# Agents already work today via subprocess:
pip install --pre memexa
memexa quick "<your question>" --json   # structured output for agent parsing
memexa arc "<person>" --json
# ... fourteen subcommands total, all with --json mode (v0.1.x)

The fourteen subcommands plus seven hard rules in docs/for_agents.md are the agent contract. Native MCP integration (memexa-mcp server + .mcp.json snippet) arrives in v0.5; until then shell subprocess is the first-class path and it works in any agent that has a shell tool.
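As a rough illustration of the subprocess path, an agent written in Python might wrap the CLI as sketched below. Only the `memexa <subcommand> ... --json` invocation comes from this page; the helper names `build_command` and `run_query`, the `runner` parameter, and the 60-second timeout are assumptions made for the example, and the shape of the returned JSON is deliberately left opaque because it is defined in docs/for_agents.md, not here.

```python
import json
import subprocess
from typing import Any, Callable


def build_command(subcommand: str, *args: str) -> list[str]:
    """Build the argv for one memexa query, always requesting --json output."""
    return ["memexa", subcommand, *args, "--json"]


def run_query(
    subcommand: str,
    *args: str,
    runner: Callable[..., Any] = subprocess.run,  # injectable so it can be faked in tests
    timeout: float = 60.0,  # assumed value, not a documented default
) -> Any:
    """Run a memexa subcommand and return whatever JSON it prints to stdout."""
    completed = runner(
        build_command(subcommand, *args),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

With this in place, `run_query("arc", "Alice")` would execute `memexa arc Alice --json` and hand the parsed payload back to the agent; passing argv as a list avoids shell-quoting problems with questions that contain spaces or quotes.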

What's next for both

To ingest your own data, configure an LLM provider and pick one source. docs/quickstart.md walks through Tier 1 (5 minutes, one source) and Tier 2 (30 minutes, full production deployment with cron + dashboard).

What you can ask

Question pattern                      | Subcommand                     | Returns
--------------------------------------+--------------------------------+----------------------------------------------------
Who is Alice?                         | arc "Alice"                    | Relationship arc, 8 fan-out variants across sources
What was the whole story behind X?    | topic "Mac purchase"           | 80–200 cards with citations
What did Y professor want?            | person "Y professor"           | Profile article + recent events
What is project X across all sources? | project "X"                    | Cross-source pulse, 4 source groups
What is on my plate?                  | pending                        | Active commitments from calendar
What did this period look like?       | timeline --start ... --end ... | Chronological card list
Synthesise an answer                  | reflect "question"             | LLM-synthesised Markdown

Fourteen subcommands in total; the decision table and composition patterns are in docs/usage_guide.md. See also docs/5_phase_query.md for the state-inference workflow used on yes/no questions.
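One way an agent could hold the question-to-subcommand mapping from the table is a small lookup, sketched here. The subcommand names are taken from the table; the intent labels on the left and the function `argv_for` are hypothetical names invented for this example. `timeline` is left out because its `--start` / `--end` values are elided on this page.

```python
from typing import Optional

# Left side: hypothetical intent labels. Right side: subcommands listed in the table.
SUBCOMMAND_FOR_INTENT = {
    "who_is": "arc",
    "whole_story": "topic",
    "what_does_person_want": "person",
    "project_status": "project",
    "on_my_plate": "pending",
    "synthesise": "reflect",
}


def argv_for(intent: str, subject: Optional[str] = None) -> list[str]:
    """Return the argv for an intent; raises KeyError for an unknown intent."""
    argv = ["memexa", SUBCOMMAND_FOR_INTENT[intent]]
    if subject is not None:
        argv.append(subject)  # e.g. a person, topic, project, or question
    argv.append("--json")     # structured output for agent parsing
    return argv
```

For instance, `argv_for("who_is", "Alice")` yields the argv for `memexa arc Alice --json`, and `argv_for("on_my_plate")` yields the argv for `memexa pending --json`. The authoritative decision table remains the one in docs/usage_guide.md.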

Why memexa instead of OpenHuman / MemPalace / ReMe?

In short: verbatim raw storage + LLM-extracted V2 envelope + per-claim evidence_quotes citation + cross-alias canonical id, all on Chinese-IM-native data sources the adjacent projects do not target.

The full per-capability comparison and the five user scenarios memexa serves live in docs/why.md.
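To make the idea of per-claim citation concrete, the toy check below flags any claim whose quotes cannot be found verbatim in the raw messages. It is a sketch under loud assumptions: only the field name `evidence_quotes` appears on this page, while the `Claim` class, its `text` field, and `unsupported_claims` are hypothetical and do not describe the real V2 envelope.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """Hypothetical stand-in for one extracted claim; not the real V2 envelope."""
    text: str
    evidence_quotes: list[str] = field(default_factory=list)


def unsupported_claims(claims: list[Claim], raw_messages: list[str]) -> list[Claim]:
    """Return claims that have no quotes, or a quote absent from every raw message."""
    flagged = []
    for claim in claims:
        has_quotes = bool(claim.evidence_quotes)
        all_found = all(
            any(quote in message for message in raw_messages)  # verbatim substring match
            for quote in claim.evidence_quotes
        )
        if not (has_quotes and all_found):
            flagged.append(claim)
    return flagged
```

Because the raw text is stored verbatim, a plain substring test is enough for this kind of check; a summarised or paraphrased store would not allow it.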

Architecture, one screen

   WeChat ─┐                                              ┌─► "Who is X?"                 (arc + quick)
   QQ     ─┤                                              ├─► "Group activity last week?" (topic + trends)
   Email  ─┼──► two-LLM extract ──► PG + pgvector ────────┤
   Browser─┤    (gate+extract)      memory graph          ├─► "Project X status?"         (project + timeline)
   AI chat─┤                                              ├─► "What does Y want?"         (person)
   Audio  ─┘                                              └─► "My pending actions?"       (pending)
        ↑                                                       ↑
   your raw data                                          14 query subcommands
   (local, fully self-hosted)                             (cross-source composable)

Full architecture in docs/architecture.md.
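The gate+extract stage in the diagram can be pictured as a filter followed by a transform. The sketch below is a toy, not the project's implementation: in memexa both stages are LLM calls, whereas here they are plain callables, and the assumption that the gate drops messages not worth extracting is mine. `gate_then_extract`, `stub_gate`, and `stub_extract` are hypothetical names.

```python
from typing import Any, Callable, Iterable, Iterator


def gate_then_extract(
    messages: Iterable[str],
    gate: Callable[[str], bool],
    extract: Callable[[str], Any],
) -> Iterator[dict]:
    """Yield one record per message that passes the gate, keeping the raw text."""
    for message in messages:
        if gate(message):  # stage 1: decide whether extraction is worth running
            # stage 2: structured extraction, stored next to the verbatim message
            yield {"raw": message, "extracted": extract(message)}


def stub_gate(message: str) -> bool:
    """Stand-in for the gate LLM: skip very short messages."""
    return len(message.strip()) > 4


def stub_extract(message: str) -> dict:
    """Stand-in for the extractor LLM: record only the message length."""
    return {"length": len(message)}
```

Running the cheap decision first means the more expensive extraction call is only made for messages that pass it, which is the usual motivation for a two-stage design.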

Documentation

Topic                                                   | Link
--------------------------------------------------------+------------------------------------
Quickstart (3-tier path: 30 s → 5 min → 30 min)         | docs/quickstart.md
Architecture                                            | docs/architecture.md
Why memexa (vs OpenHuman / MemPalace; 5 user scenarios) | docs/why.md
Cost estimation (DeepSeek / GPT-4o / Claude monthly)    | docs/cost.md
14 query subcommands in depth                           | docs/usage_guide.md
5-phase state inference                                 | docs/5_phase_query.md
Full environment variables                              | docs/configuration.md
FAQ / troubleshooting                                   | docs/faq.md · docs/troubleshooting.md
Per-source onboarding                                   | docs/integrations/
Cross-platform deployment                               | docs/deployment/
Example walkthroughs (synthetic data)                   | examples/demo_dataset/walkthroughs/
Case studies                                            | docs/case_studies/
For AI agents (MCP / integration spec)                  | docs/for_agents.md
Roadmap                                                 | ROADMAP.md
Contributing                                            | CONTRIBUTING.md
Security policy                                         | SECURITY.md
Governance                                              | GOVERNANCE.md

Two ways to run the LLM

memexa's core is a two-LLM gate-extract pipeline. The open-source package ships everything you need to run it locally against any OpenAI-compatible endpoint.

# Default: bundled prompts + your own LLM provider
export MEMEXA_EXTRACTOR_TIER=bundled

# BYO: bring your own prompt for advanced tuning
export MEMEXA_EXTRACTOR_TIER=byo
export MEMEXA_PROMPT_PATH=/path/to/your_prompts.py
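When memexa is launched from a Python process and not from an interactive shell, the same two variables can be set on the child environment. This is a minimal sketch: the variable names and the values `bundled` and `byo` come from the snippet above, while the helper `extractor_env` and its rule that `byo` needs a prompt path are assumptions for illustration.

```python
import os
from typing import Mapping, Optional


def extractor_env(
    tier: str,
    prompt_path: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Return a copy of the environment with the extractor tier configured."""
    if tier not in ("bundled", "byo"):
        raise ValueError(f"unknown tier: {tier!r}")
    env = dict(os.environ if base is None else base)  # never mutate the caller's mapping
    env["MEMEXA_EXTRACTOR_TIER"] = tier
    if tier == "byo":
        if not prompt_path:
            # Assumption: a BYO run is only meaningful with a prompt file.
            raise ValueError("tier 'byo' expects prompt_path")
        env["MEMEXA_PROMPT_PATH"] = prompt_path
    return env
```

The returned dict can be passed as `env=` to `subprocess.run`, so the setting applies to that one child process only.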

The recommended provider for Chinese workloads is DeepSeek V4 Flash (gate) + V4 Pro (extractor); typical cost is ¥0.30 per 1,000 messages. GPT-4o and Claude 4.x are supported but cost 5–10× more. See docs/cost.md for the full breakdown.
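For a back-of-the-envelope budget, the quoted figures can be turned into a one-line estimate. The ¥0.30 per 1,000 messages rate and the 5–10× multiplier are the numbers stated above; the function name `estimate_cost_cny` and the 50,000-message volume in the usage note are assumptions for illustration, and real costs depend on message length and provider pricing (see docs/cost.md).

```python
DEEPSEEK_CNY_PER_1000_MESSAGES = 0.30  # typical figure quoted in this README


def estimate_cost_cny(messages: int, multiplier: float = 1.0) -> float:
    """Estimate extraction cost in CNY; multiplier scales the DeepSeek baseline."""
    if messages < 0:
        raise ValueError("messages must be non-negative")
    return messages / 1000 * DEEPSEEK_CNY_PER_1000_MESSAGES * multiplier
```

At an assumed 50,000 messages, the DeepSeek baseline comes to roughly ¥15, and the 5–10× range for GPT-4o or Claude 4.x corresponds to roughly ¥75 to ¥150.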

License

Apache 2.0. See LICENSE. OSS core stays Apache 2.0 forever.
