Convert almost anything to Obsidian-flavored Markdown for a knowledge graph.
Any2MD
Free, open-source CLI that converts almost anything — local files (PDF, DOCX, XLSX, images…) and online links (YouTube, Reddit, GitHub, arXiv, Wikipedia, Hacker News, Stack Overflow, Twitter/X, web articles) — into Obsidian-flavored Markdown for a knowledge graph. Every input is summarized. No external APIs, no API keys, ever.
Install
One command, anywhere (recommended — isolated, no venv to manage):
pipx install any2md-cli
Then just run it:
any2md
The first run asks one thing — where to save your .md files — and then gets out of the
way. Summaries run locally: if Ollama is running it's used automatically,
otherwise a built-in zero-setup extractive summarizer is used. Nothing else to configure.
From source (dev)
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
Usage
One-shot
any2md convert https://github.com/karpathy/nanoGPT
any2md convert ~/notes/paper.pdf -o ~/ObsidianVault/inbox
any2md convert https://arxiv.org/abs/1706.03762 --depth high --provider extractive
any2md convert --batch links.txt # one target per line
Re-converting the same link refreshes the existing note instead of making a duplicate (tracking
params like utm_* are stripped, so the same article always maps to one note). Pages that extract
to nothing — paywalled or JavaScript-only — are skipped with a warning rather than written as
empty notes.
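The de-duplication described above can be sketched as a URL canonicalization step. This is a minimal illustration and not Any2MD's actual code: the function name `canonical_url` and the `TRACKING_PREFIXES` tuple are hypothetical, and only the `utm_*` family is stripped because it is the only one the README names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only utm_* is documented; other tracking parameters are left untouched.
TRACKING_PREFIXES = ("utm_",)


def canonical_url(url: str) -> str:
    """Return the URL with tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with the filtered query string; everything else is unchanged.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

Two links that differ only in their `utm_*` parameters then produce the same string, which could serve as the key for finding an existing note.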
Interactive REPL
any2md # opens the REPL
Inside the REPL, paste a URL or file path to convert it. Commands:
| Command | Effect |
|---|---|
| `/output <dir>` | set output folder |
| `/provider <name>` | set summarizer: extractive (default) · ollama · none |
| `/depth` | how much to keep: low · medium · high · raw (◀ ▶ live picker) |
| `/batch <file>` | submit every line in a file |
| `/jobs` | list jobs + status |
| `/last` | path of the last written .md |
| `/open [last]` | open the output folder (or the last note) in your file viewer |
| `/rename <name>` | rename the file you just made (slug auto-cleaned) |
| `/help` · `/quit` | help / exit |
While a conversion runs you get a live spinner with an estimated time (it learns your real timings per source) and a rotating tip. Drag a file straight into the terminal to convert it.
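The README does not say how the time estimate is learned. One plausible approach is an exponential moving average of observed durations per source type, sketched below. The class name `TimingEstimator`, the `alpha` weight, and the source labels are all hypothetical.

```python
from typing import Dict, Optional


class TimingEstimator:
    """Keeps a per-source running estimate of conversion time in seconds."""

    def __init__(self, alpha: float = 0.3) -> None:
        # alpha controls how strongly the newest timing moves the estimate.
        self.alpha = alpha
        self.estimates: Dict[str, float] = {}

    def record(self, source: str, seconds: float) -> None:
        previous = self.estimates.get(source)
        if previous is None:
            # First observation for this source becomes the estimate.
            self.estimates[source] = seconds
        else:
            self.estimates[source] = self.alpha * seconds + (1 - self.alpha) * previous

    def estimate(self, source: str) -> Optional[float]:
        # None means no timing has been recorded for this source yet.
        return self.estimates.get(source)
```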
Config
any2md config set output ~/ObsidianVault/inbox
any2md config set provider extractive
any2md config show
Precedence: CLI flag > env var (ANY2MD_OUTPUT_DIR, ANY2MD_PROVIDER, …) > ~/.any2md/config.toml > default.
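The precedence chain can be illustrated with a small resolver. This is a sketch under stated assumptions rather than Any2MD's implementation: `resolve_setting` is a hypothetical helper, and the config file is represented as an already-parsed mapping instead of being read from `~/.any2md/config.toml`.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    env_var: str,
    default: str,
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    file_config: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a value in the order: CLI flag, env var, config file, default."""
    if cli_value is not None:
        return cli_value
    if env is not None and env_var in env:
        return env[env_var]
    if file_config is not None and key in file_config:
        return file_config[key]
    return default
```

For example, `resolve_setting("provider", "ANY2MD_PROVIDER", "extractive", env=os.environ)` would return the environment value when it is set and fall through to `"extractive"` otherwise.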
Summarizers (all free, offline)
- extractive (default): pure-Python, TextRank-style. Zero setup, no network.
- ollama: local model via OLLAMA_URL (default http://localhost:11434) and OLLAMA_MODEL (default llama3.2). If unreachable, falls back to extraction only.
- none: extraction only, no summary.
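To make "TextRank-style" concrete, here is a minimal pure-Python sketch of an extractive summarizer. It is not the code Any2MD ships: the function names, the naive sentence splitter, and the parameter defaults are assumptions chosen for illustration. Sentences are scored by a PageRank-like iteration over a word-overlap similarity graph, and the top-scoring ones are returned in document order.

```python
import math
import re
from typing import List, Set


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a: Set[str], b: Set[str]) -> float:
    # Shared-word count normalised by the log of each sentence's word count.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def summarize(text: str, max_sentences: int = 3,
              damping: float = 0.85, iterations: int = 30) -> List[str]:
    sentences = split_sentences(text)
    if len(sentences) <= max_sentences:
        return sentences
    words = [set(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    n = len(sentences)
    # Symmetric weight matrix; a sentence has no edge to itself.
    weights = [[similarity(words[i], words[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    # Take the highest-scoring sentences, then restore document order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]
```

A sentence that shares few words with the rest of the text receives little score from its neighbours and tends to be dropped first.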
Serve mode (HTTP)
any2md serve --port 8000
Routes:
# submit a conversion → returns {"id": "..."}
curl -X POST localhost:8000/convert -H 'Content-Type: application/json' \
-d '{"target":"https://github.com/karpathy/nanoGPT"}'
curl localhost:8000/jobs/<id> # status + progress
curl localhost:8000/jobs/<id>/download # the rendered .md
Set ANY2MD_TOKEN to gate access — clients then send Authorization: Bearer <token>.
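The same routes can be called from Python with the standard library. The sketch below assumes the server is running locally on port 8000 as in the `any2md serve` example. The helper names `build_request`, `submit`, and `download` are hypothetical. The status route is left out because the README does not describe the shape of its JSON response.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: matches `any2md serve --port 8000`


def build_request(path: str, token: Optional[str] = None,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    headers = {}
    data = None
    if token:
        # Only needed when the server was started with ANY2MD_TOKEN set.
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    # urllib issues a POST when data is present and a GET otherwise.
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


def submit(target: str, token: Optional[str] = None) -> str:
    """POST /convert and return the job id from the {"id": "..."} response."""
    with urllib.request.urlopen(build_request("/convert", token, {"target": target})) as resp:
        return json.load(resp)["id"]


def download(job_id: str, token: Optional[str] = None) -> str:
    """GET /jobs/<id>/download and return the rendered Markdown as text."""
    with urllib.request.urlopen(build_request(f"/jobs/{job_id}/download", token)) as resp:
        return resp.read().decode("utf-8")
```

A caller would typically run `submit(...)`, wait until the job has finished, and then call `download(...)` with the returned id.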
Deploy
Docker
docker build -t any2md .
docker run -p 8000:8000 -e ANY2MD_TOKEN=secret -v "$PWD/data:/data" any2md
Railway
Push the repo; Railway builds the Dockerfile and runs any2md serve on $PORT
(see railway.toml). Set ANY2MD_TOKEN and ANY2MD_PROVIDER=extractive in the dashboard.
No API keys required — the stack is fully free/offline.
Develop
pytest -q # tests (no live network)
ruff check . # lint
See CONTRIBUTING.md for the full workflow (TDD, fixtures, adding a source). CI runs the suite + lint on every push and PR.
Publish to PyPI (maintainer)
pipx install any2md-cli works once the package is on PyPI. To cut a release:
python -m build # builds dist/*.whl and dist/*.tar.gz
twine upload dist/* # needs your PyPI account / API token
Bump __version__ in any2md/__init__.py first (pyproject.toml reads it dynamically).
File details
Details for the file any2md_cli-0.1.0.tar.gz.
File metadata
- Download URL: any2md_cli-0.1.0.tar.gz
- Upload date:
- Size: 64.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 68eb5ee4ed5c7b17a55f0c3443980003b9c37aa6bf603ea880ab492f9b21a0c5 |
| MD5 | a804995c49cf64c56b50efb3d2221544 |
| BLAKE2b-256 | 55192029574326fe5036699386ee579220d832549de3481220baba61c567b7ac |
File details
Details for the file any2md_cli-0.1.0-py3-none-any.whl.
File metadata
- Download URL: any2md_cli-0.1.0-py3-none-any.whl
- Upload date:
- Size: 51.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3fab4bd972286ba3a1c3f280ace681ea29d7e64082317a9b3b8102e92f2896da |
| MD5 | 4ceb4db466615ba3871690a7d98dc814 |
| BLAKE2b-256 | 41f9ace3c494cc144ba297ad0a6caa88fffebbec7a6578dc11ca2e50c5e48905 |