
Local-first digital memory management library with semantic AI and privacy controls.


Memoria: Local-First Digital Memory Management Library

Tagline: Capture, Recall, Replay: A Privacy-First AI for Your Digital Life

Overview

Memoria is a Python library for managing your digital memories (text, files, emails, chats, browser history, and more) in a secure, local-first manner. It combines semantic search, timeline playback, and advanced AI features to help you instantly recall information (e.g., "Find the PDF about EV flywheels from last month") and retrace your digital activities (e.g., "What did I do last Tuesday 3-5 PM?"). Built for integration into other applications (e.g., note-takers, AI agents, productivity tools), Memoria offers a modular, extensible API with optional cloud enhancements for cutting-edge AI capabilities.

What Memoria Does

Memoria acts as a "digital brain," capturing and organizing your interactions across various sources:

- Capture: Automatically ingests data from files, browser history, emails (Gmail), chats (Slack, WhatsApp), and custom sources.
- Recall: Enables natural language search (e.g., "What did Ramesh send about land costs?") using local semantic embeddings or cloud AI.
- Replay: Provides a chronological timeline of activities with playback and AI-generated insights (e.g., summaries, recommendations).
- Export: Outputs memories to JSON or Notion for integration with other tools.
- Privacy-First: Stores data locally with AES encryption; optional cloud APIs include clear privacy warnings.

Unlike Evernote or Notion, Memoria auto-collects data across apps without manual input, prioritizing local processing for privacy. It stands out from alternatives (e.g., Windows Recall, ActivityWatch) with cross-source semantic search, advanced AI features, and developer-friendly integration.

Features

Core Features

- Encrypted Memory Vault: Stores memories (text, timestamps, sources) in a local SQLite database with AES encryption.
- Semantic Search: Query memories using natural language (e.g., "EV flywheels PDF") with local embeddings (sentence-transformers) or cloud APIs (Gemini, OpenAI, Anthropic, Perplexity, Grok).
- Timeline Playback: View and replay activities in a chronological timeline, with options to filter by time range or source.
- Auto-Ingestion: Monitors files (watchdog), fetches browser history (browserhistory), and ingests emails (Gmail) and chats (Slack, WhatsApp).
- Privacy Controls: Exclude sensitive sources/folders; audit access logs for transparency.
- Export Capabilities: Export memories to JSON or Notion for external use.

Advanced Features

- AI-Enhanced Insights: Generate summaries of activities using local models (transformers) or cloud APIs.
- Question Answering: Answer specific questions about memories (e.g., "When was the meeting?") using local DistilBERT or cloud models.
- Topic Modeling: Cluster memories into topics (e.g., "EV Project") using local KMeans or cloud APIs.
- Sentiment Analysis: Analyze the emotional tone of memories (local with NLTK or cloud).
- Context-Aware Recommendations: Suggest next actions based on recent activities (e.g., "Review this document").
- Cutting-Edge Integrations:
  - Multimodal support (e.g., image OCR via Gemini/OpenAI).
  - Function calling for structured outputs (Gemini, OpenAI).
  - Long context handling (up to 1M tokens with Gemini 2.5).
  - Streaming responses for real-time insights.

Extensibility

- Custom Ingestors: Subclass Ingestor to add new data sources (e.g., WhatsApp, Telegram).
- Custom AI Providers: Extend AIEnhancer or TimelineManager for additional cloud or local models.
- Plugin System: Modular design allows apps to integrate Memoria as a backend.

How It Works

Workflow

Data Ingestion:

Memoria runs locally, connecting to user-specified sources (files, browser, Gmail, Slack, etc.). It snapshots activities (e.g., file changes, emails, chats) in real time or on demand. Data is encrypted and stored in a local SQLite database with semantic embeddings (via sentence-transformers).

Indexing and Storage:

Builds an encrypted local index with metadata (timestamps, sources) and vector embeddings (FAISS for similarity search). All processing can run fully locally; cloud APIs are optional for advanced features.

Query and Search:

Users query in natural language (e.g., "Show me the PDF about EV flywheels"). Local semantic search (FAISS) or cloud APIs (e.g., Gemini) return relevant memories with context.
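Under the hood, local search ranks stored embeddings by similarity to the query embedding. The stdlib-only sketch below shows the cosine-similarity ranking that FAISS performs at scale; the tiny hand-made vectors are illustrative stand-ins for real sentence-transformers embeddings.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: higher means the texts are semantically closer."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy 3-d "embeddings"; real ones come from sentence-transformers (hundreds of dims).
memories = {
    "PDF about EV flywheels": [0.9, 0.1, 0.0],
    "Slack note on land costs": [0.1, 0.8, 0.2],
}
query_vec = [0.85, 0.15, 0.05]  # embedding of the query "EV flywheels PDF"

# Rank stored memories by similarity to the query and take the best match.
best = max(memories, key=lambda m: cosine(query_vec, memories[m]))
print(best)  # "PDF about EV flywheels"
```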

Timeline Playback:

Presents activities as a chronological timeline, filterable by time or source. Playback mode replays events at variable speeds (data for apps to render). AI insights summarize activities or answer questions.
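Conceptually, the playback step reduces to selecting events inside a time window and ordering them chronologically. A minimal sketch, assuming a simple event-dict shape (not necessarily Memoria's internal schema):

```python
from datetime import datetime, timedelta

def playback_sketch(events: list[dict], start: datetime, end: datetime) -> list[dict]:
    """Return events inside [start, end], oldest first, ready for an app to render."""
    window = [e for e in events if start <= e["timestamp"] <= end]
    return sorted(window, key=lambda e: e["timestamp"])

now = datetime(2024, 6, 4, 17, 0)
events = [
    {"timestamp": now - timedelta(hours=2), "content": "Opened PDF", "source": "files"},
    {"timestamp": now - timedelta(hours=1), "content": "Sent Slack message", "source": "slack"},
    {"timestamp": now - timedelta(days=3), "content": "Old browser visit", "source": "browser"},
]

# Only the last day's events survive the filter, in chronological order.
for e in playback_sketch(events, now - timedelta(days=1), now):
    print(e["content"])
```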

Management and Export:

Auto-organizes memories into topical "stacks" (via clustering). Users can tag, delete, or export memories (JSON, Notion). Continuous updates ensure new data is indexed in real time.

Example Workflow

1. Ingestion: Memoria captures a PDF download, a Slack message from Ramesh about land costs, and a browser visit.
2. Query: The user asks, "What did Ramesh send about land costs last month?"
3. Search: AI retrieves the Slack message, linked PDF, and related browser history.
4. Playback: The timeline shows the sequence: "Opened PDF → Sent Slack message → Visited site."
5. Insight: AI summarizes: "Key points: Land cost $X/sqft; see PDF."
6. Export: The user exports to Notion for project notes.

Installation

Prerequisites

- Python 3.10 or higher.
- Virtual environment (recommended): python -m venv venv; source venv/bin/activate (Windows: venv\Scripts\activate).
- Git (optional, for cloning).

Steps

Clone or download the repository:

git clone <repo_url>
cd memoria-tool

Install core dependencies:

pip install .

For cloud features (Gemini, OpenAI, Anthropic, Perplexity, Grok, Gmail, Slack, Notion):

pip install .[cloud]

Dependencies

- Core: faiss-cpu, sentence-transformers, cryptography, sqlalchemy, watchdog, pandas, numpy, browserhistory, transformers.
- Cloud/Advanced: google-generativeai, openai, anthropic, slack-sdk, google-api-python-client, google-auth-oauthlib, google-auth, notion-client, whatsapp-python, nltk, requests.

API Setup

For cloud features, generate API keys from the official consoles and set them as environment variables (recommended) or pass them explicitly.

Gemini (Google AI Studio): Visit ai.google.dev, create a project, and generate an API key. Set: export GEMINI_API_KEY=your_key

OpenAI (ChatGPT): Visit platform.openai.com, create an API key. Set: export OPENAI_API_KEY=your_key

Anthropic (Claude): Visit console.anthropic.com, generate an API key. Set: export ANTHROPIC_API_KEY=your_key

Perplexity: Visit api.perplexity.ai, generate an API key. Set: export PERPLEXITY_API_KEY=your_key

Grok (xAI): Visit x.ai/api, request an API key. Set: export XAI_API_KEY=your_key

Gmail: Create a Google Cloud project, enable Gmail API, download OAuth credentials (JSON). See: Google Gmail API Quickstart

Slack: Create a bot at api.slack.com, get bot token and channel ID.

Notion: Create an integration at developers.notion.com, get token and parent page ID.
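Once exported, the keys above can be read at runtime without hard-coding them in source. A small sketch using os.environ (require_key is an illustrative helper, not part of Memoria's API):

```python
import os

def require_key(name: str) -> str:
    """Return the named API key, failing fast with a clear error if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} before enabling cloud features")
    return value

# Demo only: normally the key comes from `export GEMINI_API_KEY=your_key`.
os.environ["GEMINI_API_KEY"] = "demo-key"
print(require_key("GEMINI_API_KEY"))
```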

Privacy and Security

- Local-First: All data is stored and processed locally with AES encryption (cryptography). No data leaves your device unless cloud APIs are enabled.
- Cloud Warnings: Methods using cloud APIs (e.g., generate_insight, question_answering) issue warnings:

  UserWarning: Using cloud provider sends data externally. Ensure compliance with privacy policies.

- Privacy Controls: Exclude sensitive sources/folders via PrivacyManager.set_privacy_rules.
- Audit Logging: Tracks memory accesses/additions in an encrypted log file (audit.log).
- GDPR-Inspired: Supports data export (Exporter.export_to_json) and deletion.
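The exact schema accepted by PrivacyManager.set_privacy_rules is not documented here, so the sketch below assumes one plausible rules dict (source names plus glob patterns for folders) and shows how such exclusions might be evaluated at ingestion time. Both the dict keys and the helper are hypothetical.

```python
from fnmatch import fnmatch

rules = {
    "exclude_sources": ["slack"],            # hypothetical key: skip whole sources
    "exclude_paths": ["~/Documents/tax/*"],  # hypothetical key: skip folder globs
}

def is_excluded(source: str, path: str, rules: dict) -> bool:
    """Return True if a candidate memory should be skipped at ingestion time."""
    if source in rules.get("exclude_sources", []):
        return True
    return any(fnmatch(path, pat) for pat in rules.get("exclude_paths", []))

print(is_excluded("slack", "", rules))                          # excluded source
print(is_excluded("files", "~/Documents/tax/2024.pdf", rules))  # excluded path
print(is_excluded("files", "~/Documents/notes.txt", rules))     # kept
```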

Usage Example

from datetime import datetime, timedelta
from cryptography.fernet import Fernet
from memoria import MemoryVault, SemanticSearcher, TimelineManager, Ingestor, Exporter, AIEnhancer

# Initialize with encryption key
key = Fernet.generate_key()
vault = MemoryVault(key=key)

# Add a manual memory
vault.add_memory("Meeting at 3 PM about EV flywheels", datetime.now(), "manual")

# Search memories
searcher = SemanticSearcher(vault)
results = searcher.query("EV flywheels")
print("Search Results:", results)

# Ingest files and browser history
ingestor = Ingestor(vault)
ingestor.start_monitoring(paths=["~/Documents"], sources=["files", "browser"])

# Ingest Gmail and Slack
ingestor.ingest_gmail("credentials.json")
ingestor.ingest_slack("your_slack_token", "channel_id")

# Timeline playback
timeline = TimelineManager(vault)
playback = timeline.get_playback(datetime.now() - timedelta(days=1), datetime.now())
print("Playback:", playback)

# Advanced AI features
enhancer = AIEnhancer(vault)
insight = timeline.generate_insight(playback, provider='gemini', api_key='your_gemini_key', model='gemini-1.5-pro')
print("Insight:", insight)
answer = enhancer.question_answering("When is the meeting?", playback[0]['content'], provider='local')
print("Answer:", answer)
clusters = enhancer.topic_modeling([e['content'] for e in playback], provider='openai', api_key='your_openai_key')
print("Clusters:", clusters)
recommendation = enhancer.context_aware_recommendation(playback, provider='local')
print("Recommendation:", recommendation)

# Export to Notion
exporter = Exporter(vault)
exporter.export_to_notion("your_notion_token", "parent_page_id")

# Stop monitoring
ingestor.stop_monitoring()

API Reference

MemoryVault

- __init__(db_path: str = 'memoria.db', key: bytes): Initialize the encrypted database.
- add_memory(content: str, timestamp: datetime, source: str): Add an encrypted memory.
- get_memories(ids: list[int]) -> list[dict]: Retrieve decrypted memories by ID.

SemanticSearcher

- __init__(vault: MemoryVault): Link to vault.
- query(natural_query: str, top_k: int = 10) -> list[dict]: Search memories with natural language.

TimelineManager

- __init__(vault: MemoryVault): Link to vault.
- get_playback(start_time: datetime, end_time: datetime) -> list[dict]: Get chronological events.
- generate_insight(events: list[dict], provider: str = 'local', api_key: str = None, model: str = None) -> str: Summarize events (local or cloud).

Ingestor

- __init__(vault: MemoryVault): Link to vault.
- start_monitoring(paths: list[str], sources: list[str]): Monitor files/browser.
- ingest_from_source(source_type: str, data): Ingest custom data.
- ingest_gmail(credentials_file: str, labels: list[str], max_results: int): Ingest Gmail emails.
- ingest_slack(token: str, channel_id: str, limit: int): Ingest Slack messages.
- ingest_browser_history(): Ingest browser history.
- stop_monitoring(): Stop file monitoring.

AIEnhancer

- __init__(vault: MemoryVault): Link to vault.
- sentiment_analysis(text: str, provider: str, api_key: str, model: str) -> dict: Analyze sentiment.
- entity_extraction(text: str, provider: str, api_key: str, model: str) -> list[str]: Extract entities.
- question_answering(question: str, context: str, provider: str, api_key: str, model: str) -> str: Answer questions.
- topic_modeling(texts: list[str], provider: str, api_key: str, model: str) -> dict: Cluster texts into topics.
- context_aware_recommendation(memories: list[dict], provider: str, api_key: str, model: str) -> str: Suggest actions.

PrivacyManager

- __init__(vault: MemoryVault, log_file: str): Initialize with an audit log file.
- set_privacy_rules(rules: dict): Set exclusion rules.
- audit_access_log() -> list[str]: Retrieve audit logs.
- log_action(action: str, details: str): Log an action.

Exporter

- __init__(vault: MemoryVault): Link to vault.
- export_to_json(filepath: str): Export memories to JSON.
- export_to_notion(token: str, parent_page_id: str): Export to Notion.
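The JSON schema produced by export_to_json is not specified above. This stdlib sketch assumes one plausible per-record shape (content, ISO timestamp, source) to show what downstream tools might consume; the helper is illustrative, not Memoria's implementation.

```python
import json
from datetime import datetime

# Assumed record shape: one dict per decrypted memory.
memories = [
    {"content": "Meeting at 3 PM about EV flywheels",
     "timestamp": datetime(2024, 6, 1, 15, 0).isoformat(),
     "source": "manual"},
]

def export_to_json_sketch(memories: list[dict], filepath: str) -> None:
    """Write memory records to a JSON file (illustrative shape only)."""
    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(memories, f, indent=2)

export_to_json_sketch(memories, "memories.json")
with open("memories.json", encoding="utf-8") as f:
    print(json.load(f)[0]["source"])  # "manual"
```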

Extensibility

Custom Ingestors

Create a subclass of Ingestor for new sources. Example for WhatsApp:

from datetime import datetime

from memoria.ingestor import Ingestor
from whatsapp import WhatsApp

class WhatsAppIngestor(Ingestor):
    def ingest_whatsapp(self, token: str, phone_id: str, limit: int = 10):
        whatsapp = WhatsApp(token, phone_id)
        messages = whatsapp.get_messages(limit=limit)
        for msg in messages:
            self.vault.add_memory(msg['body'], datetime.fromisoformat(msg['timestamp']), f"whatsapp:{msg['id']}")

Custom AI Providers

Extend AIEnhancer or TimelineManager.generate_insight to add new providers (e.g., Mistral). Example:

import requests

def custom_insight(self, events, provider='mistral', api_key=None, model=None):
    prompt = f"Summarize: {events}"
    response = requests.post(
        'https://api.mixtral.ai/v1/completions',
        json={'prompt': prompt},
        headers={'Authorization': f'Bearer {api_key}'},
    )
    return response.json()['text']

Testing

- Install pytest: pip install pytest
- Run tests: pytest tests/
- Tests cover core functionality, cloud APIs (mocked), rate-limit handling, and advanced features (e.g., question answering, topic modeling).
- Rate limits are handled with exponential backoff (3 retries).
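The backoff behavior can be sketched in a few lines. The retry count (3) matches the description above; the base delay and the RuntimeError standing in for a provider's rate-limit exception are illustrative assumptions, not Memoria's actual internals.

```python
import time

def with_backoff(call, retries: int = 3, base_delay: float = 1.0):
    """Retry `call` up to `retries` times, doubling the delay after each
    rate-limit failure (1s, 2s with the defaults before giving up)."""
    for attempt in range(retries):
        try:
            return call()
        except RuntimeError:  # stand-in for a provider's rate-limit error
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Simulate a provider that fails twice, then succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # "ok" after two retries
```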

Limitations and Future Enhancements

Current Limitations:

- Local models (e.g., T5, DistilBERT) are less powerful than cloud APIs.
- Gmail/Slack ingestion requires manual OAuth/token setup.
- Audit logging is basic and lacks advanced encryption.
- Multimodal support (images) is cloud-only (extend with local OCR).

Future Enhancements:

- Add local multimodal support (e.g., Hugging Face vision models).
- Implement streaming for large datasets.
- Enhance audit logging with encryption and DB integration.
- Support more cloud providers (e.g., Mistral, Cohere).

Contributing

- Fork the repo on GitHub.
- Follow PEP 8 for code style.
- Add tests in tests/ for new features.
- Submit pull requests with clear descriptions.

License

MIT License. See the LICENSE file.
