Forward anything to Telegram, get a tagged, linked, deduplicated Obsidian note back. A single-tenant LLM capture bot for your second brain.

Engram

Forward anything to Telegram. Get a tagged, linked, deduplicated Obsidian note back.

An engram is the physical trace a memory leaves in the brain — the durable scar left behind after an experience. Engram is a single-tenant Telegram bot that does the same thing for your chat stream: drop in a tweet, voice memo, PDF, YouTube link, or photo of a whiteboard, and Claude classifies it, summarises it, tags it, finds related notes already in your Obsidian vault, and writes a Markdown file with proper frontmatter and [[backlinks]]. The forgettable river of messages becomes durable, indexed memory.

Built as a personal second-brain pipeline; published in case it's useful to anyone else.

Deploy on Railway


What it does

  • Capture anything — text, URLs, photos, voice notes, PDFs, Word docs (.docx/.doc), plain-text/code files, forwarded posts, YouTube links. Media groups (multi-photo posts) are debounced and stitched into a single note.
  • Read the contents, not just the message — Claude vision OCR for photos, OpenAI Whisper for voice, PyPDF/python-docx for documents, YouTube Transcript API for videos, plain HTTP fetcher for web pages.
  • AI routing — Claude (Sonnet) picks a folder, writes a title, summarises the body, generates tags, and proposes up to 5 related notes from your existing vault. Falls back to an "Other" bucket and an inbox queue when confidence is low.
  • Smart dedupe — incoming notes that match by URL, title, or semantic similarity (cosine on OpenAI embeddings) get appended to the existing note instead of creating a duplicate.
  • Vault-grounded Q&A — /ask runs a hybrid keyword + embedding retrieval over your notes and answers with Claude, with multi-turn follow-ups via Telegram reply threads.
  • Manual override — every capture shows an inline-keyboard folder picker; misroutes are one tap away. /redo, /edit, /undo, and /relink cover the rest.
  • Single-tenant by design — an ALLOWED_USER_IDS allowlist gates every handler. Nobody else who finds your bot can use it.
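Telegram delivers each photo of an album as a separate message that shares a media_group_id, which is why media groups need debouncing. The following is a minimal sketch of that idea, not Engram's actual code: the class name MediaGroupBuffer, the on_flush callback, and the one-second quiet period are all assumptions made for illustration.

```python
import asyncio
from collections import defaultdict


class MediaGroupBuffer:
    """Collect items that share a group id and flush them once after a quiet period."""

    def __init__(self, on_flush, delay: float = 1.0):
        self._on_flush = on_flush        # hypothetical async callback: (group_id, items)
        self._delay = delay              # hypothetical quiet period, in seconds
        self._items = defaultdict(list)
        self._timers: dict[str, asyncio.Task] = {}

    def add(self, group_id: str, item) -> None:
        """Buffer one item and restart the timer for its group (call inside a running loop)."""
        self._items[group_id].append(item)
        timer = self._timers.get(group_id)
        if timer is not None:
            timer.cancel()               # a newer item arrived, so wait again
        self._timers[group_id] = asyncio.create_task(self._flush_later(group_id))

    async def _flush_later(self, group_id: str) -> None:
        await asyncio.sleep(self._delay)
        # No await between these two pops and the sleep above finishing,
        # so a late add() starts a fresh group instead of cancelling this flush.
        self._timers.pop(group_id, None)
        items = self._items.pop(group_id, [])
        await self._on_flush(group_id, items)
```

Each call to add restarts the timer for that group, so the callback fires once with every buffered item after messages stop arriving.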

How it works

Telegram message
     │
     ├─ photo?       → Claude vision OCR
     ├─ voice/audio? → OpenAI Whisper
     ├─ PDF / DOCX / DOC / text? → text extraction
     └─ URLs?        → page fetch · YouTube transcript
     ↓
Claude enrichment: title · summary · tags · folder · related notes · confidence
     ↓
Dedupe check (URL → title → semantic) → append to existing OR create new
     ↓
<vault>/<Category>/<Title>.md   with YAML frontmatter + [[backlinks]] + attachments/

Requirements

  • Python 3.11+
  • uv (or plain pip if you prefer)
  • A Telegram bot token from @BotFather
  • An Anthropic API key
  • (optional) An OpenAI API key for voice transcription, semantic dedupe, /relink, and semantic ranking in /search and /ask
  • An existing Obsidian vault (or any folder you want filled with .md files)

Setup

git clone https://github.com/mishablank/Engram.git
cd Engram
uv sync
cp .env.example .env
# edit .env — see Configuration below
uv run python -m engram.bot

Running uv pip install -e . also installs an engram console entry point.

One-click deploy (Railway)

Click the Deploy on Railway button above. Railway will build the project, prompt you for the env vars below, and run uv run python -m engram.bot as a long-lived process.

The catch: your Obsidian vault is local, but Railway runs in the cloud. Two ways to make this work:

  1. Recommended — attach a Railway volume mounted at e.g. /data, set BASE_DIR=/data, and use Obsidian Sync, Syncthing, or rclone to mirror that volume into your local Obsidian vault. The bot writes to the cloud copy; your desktop reads it via sync.
  2. Quick test — point BASE_DIR at the container filesystem and treat it as ephemeral. Notes survive restarts but vanish if Railway redeploys without a volume. Fine for kicking the tyres, not for real use.

If you don't want any of that, just run it on your laptop or a home server. See "Running it as a daemon" below.

Configuration

All config is via environment variables (loaded from .env if present). See .env.example.

| Variable | Required | Purpose |
| --- | --- | --- |
| TELEGRAM_BOT_TOKEN | yes | Bot token from @BotFather |
| ANTHROPIC_API_KEY | yes | Used for enrichment, vision OCR, /ask |
| ALLOWED_USER_IDS | yes | Comma-separated Telegram numeric IDs. Only these users can talk to the bot. Find yours via @userinfobot. |
| BASE_DIR | yes | Absolute path to your Obsidian vault root |
| OPENAI_API_KEY | no | Enables Whisper voice transcription, semantic dedupe, /relink, and semantic ranking inside /search and /ask |
| CATEGORIES | no | Comma-separated folder names. Default: AI,Crypto,Startups/YC,Personal,Health,Reading,Other |
| LOG_FILE | no | Defaults to ~/.engram.log (rotated, 5 MB × 3 backups) |

Commands

| Command | Effect |
| --- | --- |
| /start | Show usage hint with the current category list |
| /search <query> | Hybrid (keyword + embedding) search across titles, tags, and bodies |
| /ask <question> | RAG over your vault. Reply to my answer to continue the thread (up to 6 turns) |
| /inbox | List notes flagged for review (low-confidence routing) |
| /review | Walk pending notes one at a time with move / mark-reviewed / delete buttons |
| /relink [folder] | Refresh related-note backlinks. No arg = last capture; with arg = entire folder |
| /redo | Reply with /redo to regenerate a capture using the higher-quality Opus model |
| /edit <text> | Replace the source of the last capture and re-enrich |
| /undo | Delete the last capture in this chat |
| /refresh | Rescan the vault index (also runs automatically every 10 minutes) |

The plain message path: send a message → tap a folder button → done. Send a photo without a caption and the bot OCRs it first so it can route by content.
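As a rough illustration of what "hybrid (keyword + embedding)" ranking can mean for /search and /ask, the sketch below blends a keyword-overlap score with cosine similarity. The function names, the equal 0.5 weighting, and the whitespace tokenisation are assumptions; the README does not say how Engram combines the two signals.

```python
import math


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that appear in the text (naive whitespace tokens)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is empty or all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(query, query_vec, notes, alpha=0.5, top_k=5):
    """Rank notes given as (title, text, embedding) tuples.

    alpha is a hypothetical weight: 1.0 is keyword-only, 0.0 is embedding-only.
    """
    scored = []
    for title, text, vec in notes:
        score = alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec)
        scored.append((score, title))
    scored.sort(reverse=True)  # highest blended score first
    return [title for _, title in scored[:top_k]]
```

Without an OpenAI key there are no embeddings, which corresponds to running this with alpha set to 1.0.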

What a note looks like

---
title: "Notes on bitter-lesson scaling laws"
created: 2026-05-11T18:32:04
source: https://example.com/post
source_type: article
tags: [scaling-laws, rich-sutton, ai]
forwarded_from: "@somechannel"
forwarded_at: 2026-05-11T18:30:00
---

Sutton argues that the only methods that consistently win across decades
are those that scale with compute and data — search and learning — and that
hand-tuned domain knowledge tends to be a local optimum at best.

## Related
- [[The bitter lesson, revisited]]
- [[Compute overhang and capability surprises]]

![[attachments/2026-05-11_18-32-04-1.jpg]]
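A note in this shape can be produced with plain string formatting. The function below is a hypothetical sketch (render_note and its parameters are not taken from the package) that writes the frontmatter, the summary, a Related section of [[backlinks]], and attachment embeds in the same order as the example above.

```python
from datetime import datetime


def render_note(title, created, summary, tags, source=None, source_type=None,
                related=(), attachments=()):
    """Render a Markdown note with YAML frontmatter, backlinks, and embeds."""
    safe_title = title.replace('"', '\\"')  # keep the quoted YAML string valid
    lines = [
        "---",
        f'title: "{safe_title}"',
        f"created: {created.isoformat(timespec='seconds')}",
    ]
    if source:
        lines.append(f"source: {source}")
    if source_type:
        lines.append(f"source_type: {source_type}")
    lines.append(f"tags: [{', '.join(tags)}]")
    lines += ["---", "", summary.strip(), ""]
    if related:
        lines.append("## Related")
        lines += [f"- [[{name}]]" for name in related]
        lines.append("")
    # Obsidian embed syntax for files stored under attachments/.
    lines += [f"![[attachments/{name}]]" for name in attachments]
    return "\n".join(lines).rstrip() + "\n"
```

For example, render_note("Notes", datetime(2026, 5, 11, 18, 32, 4), "Body.", ["ai"]) returns a string that begins with the frontmatter block and ends with the body text.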

Running it as a daemon

The bot is a long-lived process. Some lightweight options:

  • macOS (launchd): drop a ~/Library/LaunchAgents/com.you.engram.plist that runs uv run python -m engram.bot with KeepAlive=true.
  • Linux (systemd user unit): a one-screen ~/.config/systemd/user/engram.service with ExecStart=… and Restart=on-failure, then systemctl --user enable --now engram.
  • Quick and dirty: tmux new -d -s engram "uv run python -m engram.bot".

Logs go to LOG_FILE (default ~/.engram.log).

Development

uv sync
uv run pytest -v

15 test modules cover the bot handlers, vault indexing, embeddings, dedupe, link enrichment, vision/whisper/youtube/pdf adapters, and the inbox/review flow. pytest-asyncio is in auto mode. CI runs on push and PR against Python 3.11 / 3.12 / 3.13.

Roadmap

  • Local-model support — swap Claude / OpenAI for Ollama or llama.cpp so the bot can run end-to-end without paid API keys. Embeddings first (cheapest win), then enrichment. Tracked in #1 — help welcome.

Security model

This bot is single-tenant on purpose. It does exactly one thing to keep you safe:

  • Every handler checks update.effective_user.id against ALLOWED_USER_IDS before doing anything. Unauthorised users get a flat "Unauthorized." reply.

That's it. There is no per-user vault, no row-level auth, no rate limiting. Don't share your bot token. Don't add other users to ALLOWED_USER_IDS unless you want them writing into the same vault you do.
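The allowlist check described above can be pictured as a decorator around each handler. This is a sketch, not the project's code: the restricted name is hypothetical, and the update object is assumed to expose effective_user.id and message.reply_text in the style of python-telegram-bot's async handlers.

```python
import functools


def restricted(allowed_ids):
    """Wrap an async (update, context) handler so only allowlisted users reach it."""
    def decorator(handler):
        @functools.wraps(handler)
        async def wrapper(update, context):
            user = update.effective_user
            if user is None or user.id not in allowed_ids:
                # Same flat reply for every rejected user; nothing else runs.
                if update.message is not None:
                    await update.message.reply_text("Unauthorized.")
                return None
            return await handler(update, context)
        return wrapper
    return decorator
```

Applying the check in one decorator, rather than inside each handler body, makes it harder to add a new command and forget the gate.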

.env is gitignored. Treat your TELEGRAM_BOT_TOKEN and ANTHROPIC_API_KEY like passwords — if either leaks, rotate immediately (BotFather → /revoke, Anthropic console → revoke key).

License

MIT © 2026 Mikhail Blank
