# mdcore

Markdown CORE AI - Classification, Organisation, Retrieval & Entry for your personal markdown knowledge base
mdcore is a local, LLM-agnostic knowledge base engine for anyone with a folder of markdown notes. It reads and writes your vault intelligently: retrieve context on demand, or ingest new knowledge with automatic classification and routing, all from the terminal or a TUI.
PyPI: markdowncore-ai | CLI: mdcore | Version: 1.0.7
## What It Does
**Retrieval (`mdcore search`)** - Ask a question or give a topic. mdcore searches your vault semantically, stitches together the most relevant chunks, and synthesises a coherent, cited briefing. Output lands in `<vault>/mdcore-output/` - ready to copy into any LLM conversation.

**Ingestion (`mdcore ingest`)** - Feed any document into mdcore - an LLM session summary, a research note, a strategy doc, an article. It classifies the content against your existing vault, routes it to the right folder, detects conflicts with existing notes, generates a proposal, and writes only after your explicit approval.
Both flows work fully local with Ollama. No subscription LLM API calls. No always-on server.
Installation
# Recommended
uv tool install markdowncore-ai
# With TUI
uv tool install "markdowncore-ai[gui]"
# pipx
pipx install markdowncore-ai
### Ollama models (local inference)

```bash
ollama pull nomic-embed-text   # embeddings
ollama pull qwen3.5:4b         # classification, routing, proposals
ollama pull phi4-mini          # synthesis (fast, non-thinking)
```
### First run

```bash
mdcore init    # interactive setup -> writes ~/.mdcore/config.yaml
mdcore index   # scan and index your vault
```
## Quick Start

```bash
# Search your vault
mdcore search "kubernetes ingress routing"
# -> synthesised briefing written to <vault>/mdcore-output/
# -> copy contents, paste into Claude / ChatGPT / Gemini

# Ingest a document
mdcore ingest --file my-session-summary.md
# -> classifies, routes to right folder, proposes changes -> approve to write

# Launch TUI
mdcore gui
```
## Commands

```bash
mdcore init                       # Interactive setup wizard
mdcore index                      # Delta index - scan, diff, confirm, index
mdcore index --force              # Wipe everything and reindex from scratch
mdcore search <topic>             # Retrieve + synthesise briefing (Flow A)
mdcore search <topic> --raw       # Retrieve raw excerpts, skip synthesis
mdcore search <topic> --verbose   # Show similarity scores
mdcore ingest                     # Paste document - classify, route, propose (Flow B)
mdcore ingest --file <path>       # Ingest from file
mdcore map                        # Generate vault folder map for routing
mdcore map --repair               # Remove stale folder entries
mdcore gui                        # Launch TUI (requires [gui] extra)
mdcore status                     # Index health, drift warnings
mdcore eval [topic]               # Retrieval quality checklist
mdcore config                     # Open config in editor
mdcore config --validate          # Validate config
```
## Multiple vaults / config profiles

```bash
mdcore search "istio auth" --config ~/.mdcore/config-work.yaml
mdcore search "career goals" --config ~/.mdcore/config-personal.yaml
mdcore search "topic" --models ~/.mdcore/models-aggregator.yaml
```
## Backends
mdcore supports local and API-backed models. Mix and match per use case.
| Backend | LLM | Embeddings | Extra needed |
|---|---|---|---|
| Ollama (local) | any pulled model | nomic-embed-text, bge-m3 | none |
| Gemini | gemini-2.5-flash-lite | models/gemini-embedding-001 | none (bundled) |
| OpenAI | gpt-4o-mini | text-embedding-3-small | [openai] |
| Anthropic | claude-haiku-4-5 | use Ollama or OpenAI | [anthropic] |
| Aggregator | free-tier key pool | free-tier key pool | [aggregator] |
uv tool install "markdowncore-ai[openai]"
uv tool install "markdowncore-ai[anthropic]"
uv tool install "markdowncore-ai[all]" # every backend
### Aggregator backend
The aggregator backend routes calls through llm-aggregator - a local, SQLite-backed key pool that round-robins free-tier API keys with automatic 429 cooldown. No api_key is needed in the mdcore config.
Install separately (not on PyPI):

```bash
pip install git+https://github.com/piyush-tyagi-13/llm-aggregator

# or if installed via uv tool:
uv tool install markdowncore-ai --with "llm-aggregator @ git+https://github.com/piyush-tyagi-13/llm-aggregator"
```
```yaml
llm:
  backend: aggregator
  aggregator_category: general_purpose   # key pool category (optional)
  aggregator_rotate_every: 5             # requests per key before rotation

embeddings:
  backend: aggregator
  aggregator_category: general_purpose
```
## Hardware guidance
| Hardware | LLM | Embeddings |
|---|---|---|
| Apple M2 16GB+ | qwen3.5:4b | nomic-embed-text |
| i5 + RTX 4070 | qwen3:8b | bge-m3 |
| Low-end / no GPU | gemini-2.5-flash-lite or gpt-4o-mini | models/gemini-embedding-001 |
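For the low-end row, a plausible all-API setup pairing the Gemini LLM with Gemini embeddings could look like this - same caveat: field names follow the Configuration table below, values are illustrative:

```yaml
# ~/.mdcore/config.yaml (excerpt) - illustrative only
llm:
  backend: gemini
  model: gemini-2.5-flash-lite
  api_key: <your-gemini-key>

embeddings:
  backend: gemini
  api_model: models/gemini-embedding-001
  api_key: <your-gemini-key>
```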
## Configuration
Config lives at ~/.mdcore/config.yaml. Generated by mdcore init.
| Section | Key fields | Purpose |
|---|---|---|
| vault | path, owner_name | Vault root; owner name for multi-person vaults |
| embeddings | backend, api_model / local_model, api_key | Embedding model |
| llm | backend, model, api_key, synthesise_model | Primary LLM + synthesis model |
| indexer | chunk_size, heading_aware_splitting | Chunking strategy |
| retriever | top_k, similarity_threshold | Retrieval tuning |
| ingester | similarity_threshold_high/low | Classification thresholds |
| writer | append_position, backup | Write behaviour + backups |
See config.yaml.example for the full annotated reference.
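To make the table concrete, here is a sketch of a complete config assembled from the sections above. Section and key names come from the table; every value is illustrative rather than a shipped default (the 0.82/0.65 thresholds mirror the ambiguous range described under ingest below), so treat config.yaml.example as the authority:

```yaml
# ~/.mdcore/config.yaml - illustrative sketch, not shipped defaults
vault:
  path: ~/notes
  owner_name: Alex                  # only matters for multi-person vaults

embeddings:
  backend: ollama
  local_model: nomic-embed-text

llm:
  backend: ollama
  model: qwen3.5:4b                 # classification, routing, proposals
  synthesise_model: phi4-mini       # search synthesis

indexer:
  chunk_size: 800                   # hypothetical value
  heading_aware_splitting: true

retriever:
  top_k: 8                          # hypothetical value
  similarity_threshold: 0.6         # hypothetical value

ingester:
  similarity_threshold_high: 0.82
  similarity_threshold_low: 0.65

writer:
  append_position: end              # hypothetical value
  backup: true
```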
### Separate models config

Keep model choices in a separate ~/.mdcore/models.yaml - useful for switching backends without touching the main config. Values here override the llm and embeddings sections of config.yaml.
```yaml
# ~/.mdcore/models.yaml
llm:
  backend: aggregator
  aggregator_category: general_purpose

embeddings:
  backend: ollama
  local_model: nomic-embed-text
```
Pass it explicitly with --models:

```bash
mdcore search "topic" --models ~/.mdcore/models-work.yaml
mdcore ingest --file note.md --models ~/.mdcore/models-cheap.yaml
```
## Where LLM Calls Happen

### mdcore search (Flow A)
| Phase | LLM? | Notes |
|---|---|---|
| Keyword pre-filter | No | BM25 scoring |
| Vector search | No | Embedding lookup |
| Chunk assembly | No | Pure text |
| Synthesis | Yes - synthesise_model | Skip with --raw for zero LLM calls |
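Because synthesis is the only LLM step in Flow A, search cost can be tuned independently of ingest by pointing synthesise_model at a small, fast model - a sketch using the Ollama models recommended above:

```yaml
# ~/.mdcore/config.yaml (excerpt)
llm:
  backend: ollama
  model: qwen3.5:4b             # used by ingest for classification/routing
  synthesise_model: phi4-mini   # the only model search invokes
```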
### mdcore ingest (Flow B)
| Phase | LLM? | Condition |
|---|---|---|
| Embedding + search | No | Always |
| Classification | Conditional - llm.model | Only in ambiguous similarity range (0.65-0.82) |
| Folder routing | Yes - llm.model | NEW files only |
| Proposal | Yes - llm.model | Always, before write |
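The conditional classification row is presumably governed by the ingester thresholds from the Configuration table. Assuming the high/low keys map onto the 0.65-0.82 ambiguous range above (an inference from the field names, not documented behaviour), narrowing that band reduces LLM calls during ingest:

```yaml
# ~/.mdcore/config.yaml (excerpt) - semantics inferred, verify against config.yaml.example
ingester:
  similarity_threshold_high: 0.82   # at/above: confident match, no classification call (assumed)
  similarity_threshold_low: 0.65    # at/below: clearly new, no classification call (assumed)
```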
mdcore map and mdcore index make no LLM calls.
## Observability

Token usage is logged after every call to ~/.mdcore/logs/:

```
INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401
```
LangSmith tracing (optional) - add to ~/.mdcore/config.yaml:

```yaml
llm:
  langsmith_api_key: <your-key>
  langsmith_project: mdcore
```
mdcore - Markdown CORE AI v1.0.7