
mdcore

Markdown CORE AI - Classification, Organisation, Retrieval & Entry

mdcore is a local, LLM-agnostic knowledge base engine for anyone with a folder of markdown notes. It reads and writes your vault intelligently: retrieve context on demand, or ingest new knowledge with automatic classification and routing, all from the terminal or a TUI.

PyPI: markdowncore-ai | CLI: mdcore | Version: 1.0.2


Screenshots

Screenshots of the home, search, index, and status views.


What It Does

Retrieval (mdcore search) - Ask a question or give a topic. mdcore searches your vault semantically, stitches the most relevant chunks, and synthesises a coherent cited briefing. Output lands in <vault>/mdcore-output/ - ready to copy into any LLM conversation.

Ingestion (mdcore ingest) - Feed any document into mdcore - an LLM session summary, a research note, a strategy doc, an article. It classifies the content against your existing vault, routes it to the right folder, detects conflicts with existing notes, generates a proposal, and writes only after your explicit approval.

Both flows work fully local with Ollama. No subscription LLM API calls. No always-on server.
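The "stitching" step in mdcore search can be pictured as assembling retrieved chunks, each cited back to its source note, into one briefing body. A minimal sketch; the function and the chunk fields (`source`, `text`) are illustrative, not mdcore's actual internals:

```python
def stitch_chunks(chunks):
    """Assemble retrieved chunks into one cited briefing body.

    Each chunk is a dict with 'source' (note path) and 'text';
    this shape is hypothetical, for illustration only.
    """
    sections = []
    for i, chunk in enumerate(chunks, start=1):
        # Cite each excerpt back to the note it came from.
        sections.append(f"[{i}] {chunk['source']}\n{chunk['text'].strip()}")
    return "\n\n".join(sections)

briefing = stitch_chunks([
    {"source": "k8s/ingress.md", "text": "Ingress routes by host and path."},
    {"source": "k8s/services.md", "text": "Services expose pods internally."},
])
```

The numbered citations are what make the output paste-ready for a follow-up LLM conversation: the model can refer back to `[1]`, `[2]` when answering.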


Installation

# Recommended
uv tool install markdowncore-ai

# With TUI
uv tool install "markdowncore-ai[gui]"

# pipx
pipx install markdowncore-ai

Ollama models (local inference)

ollama pull nomic-embed-text   # embeddings
ollama pull qwen3.5:4b         # classification, routing, proposals
ollama pull phi4-mini          # synthesis (fast, non-thinking)

First run

mdcore init     # interactive setup -> writes ~/.mdcore/config.yaml
mdcore index    # scan and index your vault

Quick Start

# Search your vault
mdcore search "kubernetes ingress routing"
# -> synthesised briefing written to <vault>/mdcore-output/
# -> copy contents, paste into Claude / ChatGPT / Gemini

# Ingest a document
mdcore ingest --file my-session-summary.md
# -> classifies, routes to right folder, proposes changes -> approve to write

# Launch TUI
mdcore gui

Commands

mdcore init                        # Interactive setup wizard
mdcore index                       # Delta index - scan, diff, confirm, index
mdcore index --force               # Wipe everything and reindex from scratch
mdcore search <topic>              # Retrieve + synthesise briefing (Flow A)
mdcore search <topic> --raw        # Retrieve raw excerpts, skip synthesis
mdcore search <topic> --verbose    # Show similarity scores
mdcore ingest                      # Paste document - classify, route, propose (Flow B)
mdcore ingest --file <path>        # Ingest from file
mdcore map                         # Generate vault folder map for routing
mdcore map --repair                # Remove stale folder entries
mdcore gui                         # Launch TUI (requires [gui] extra)
mdcore status                      # Index health, drift warnings
mdcore eval [topic]                # Retrieval quality checklist
mdcore config                      # Open config in editor
mdcore config --validate           # Validate config

Multiple vaults / config profiles

mdcore search "istio auth"     --config ~/.mdcore/config-work.yaml
mdcore search "career goals"   --config ~/.mdcore/config-personal.yaml
mdcore search "topic"          --models ~/.mdcore/models-aggregator.yaml

Backends

mdcore supports local and API-backed models. Mix and match per use case.

| Backend | LLM | Embeddings | Extra needed |
|---|---|---|---|
| Ollama (local) | any pulled model | nomic-embed-text, bge-m3 | none |
| Gemini | gemini-2.5-flash-lite | models/gemini-embedding-001 | none (bundled) |
| OpenAI | gpt-4o-mini | text-embedding-3-small | [openai] |
| Anthropic | claude-haiku-4-5 | use Ollama or OpenAI | [anthropic] |
| Aggregator | free-tier key pool | free-tier key pool | [aggregator] |

uv tool install "markdowncore-ai[openai]"
uv tool install "markdowncore-ai[anthropic]"
uv tool install "markdowncore-ai[all]"    # every backend

Aggregator backend

aggregator routes calls through llm-aggregator - a local SQLite-backed key pool that round-robins free-tier API keys with automatic 429 cooldown. No api_key needed in mdcore config.

Install separately (not on PyPI):

pip install git+https://github.com/piyush-tyagi-13/llm-aggregator
# or if installed via uv tool:
uv tool install markdowncore-ai --with "llm-aggregator @ git+https://github.com/piyush-tyagi-13/llm-aggregator"

Then configure it in ~/.mdcore/config.yaml:

llm:
  backend: aggregator
  aggregator_category: fast       # key pool category (optional)
  aggregator_rotate_every: 5      # requests per key before rotation

embeddings:
  backend: aggregator
  aggregator_category: embeddings
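The key-pool behaviour described above, round-robin rotation with a cooldown after a 429, can be sketched roughly as follows. This is an illustrative model only, not llm-aggregator's actual implementation (which is SQLite-backed):

```python
import time

class KeyPool:
    """Round-robin over API keys, skipping keys cooling down after a 429.

    Illustrative sketch only; llm-aggregator's real state lives in SQLite.
    """
    def __init__(self, keys, cooldown_seconds=60.0):
        self.keys = list(keys)
        self.cooldown = cooldown_seconds
        self.cooling_until = {}   # key -> timestamp when it is usable again
        self.index = 0

    def next_key(self):
        # Try each key at most once per call, in round-robin order.
        for _ in range(len(self.keys)):
            key = self.keys[self.index % len(self.keys)]
            self.index += 1
            if self.cooling_until.get(key, 0.0) <= time.monotonic():
                return key
        raise RuntimeError("all keys are cooling down")

    def report_429(self, key):
        # Put the rate-limited key on cooldown before it is handed out again.
        self.cooling_until[key] = time.monotonic() + self.cooldown

pool = KeyPool(["key-a", "key-b"])
first = pool.next_key()     # round-robin starts at "key-a"
pool.report_429("key-b")    # key-b rate limited; skipped until cooldown ends
second = pool.next_key()    # skips key-b, wraps back to "key-a"
```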

Hardware guidance

| Hardware | LLM | Embeddings |
|---|---|---|
| Apple M2 16GB+ | qwen3.5:4b | nomic-embed-text |
| i5 + RTX 4070 | qwen3:8b | bge-m3 |
| Low-end / no GPU | gemini-2.5-flash-lite or gpt-4o-mini | models/gemini-embedding-001 |

Configuration

Config lives at ~/.mdcore/config.yaml. Generated by mdcore init.

| Section | Key fields | Purpose |
|---|---|---|
| vault | path, owner_name | Vault root, owner name for multi-person vaults |
| embeddings | backend, api_model / local_model, api_key | Embedding model |
| llm | backend, model, api_key, synthesise_model | Primary LLM + synthesis model |
| indexer | chunk_size, heading_aware_splitting | Chunking strategy |
| retriever | top_k, similarity_threshold | Retrieval tuning |
| ingester | similarity_threshold_high/low | Classification thresholds |
| writer | append_position, backup | Write behaviour + backups |
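The indexer's heading_aware_splitting option implies chunk boundaries that respect markdown headings rather than cutting mid-section. A rough sketch of what such a splitter might do (hypothetical, not mdcore's actual algorithm):

```python
def split_by_headings(markdown_text, chunk_size=400):
    """Split markdown at headings, then cap each section at chunk_size chars.

    Hypothetical sketch of heading-aware chunking, for illustration only.
    """
    sections, current = [], []
    for line in markdown_text.splitlines():
        # Start a new section whenever a heading begins and we have content.
        if line.startswith("#") and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    # Cap oversized sections so embeddings stay within model limits.
    chunks = []
    for section in sections:
        for start in range(0, len(section), chunk_size):
            chunks.append(section[start:start + chunk_size])
    return chunks

chunks = split_by_headings("# A\nalpha\n# B\nbeta")
```

Keeping a heading together with its body text tends to give the embedding model more coherent units than fixed-size windows alone.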

See config.yaml.example for the full annotated reference.

Separate models config

Keep model choices in a separate ~/.mdcore/models.yaml - useful for switching backends without touching main config. Values here override llm and embeddings sections in config.yaml.

# ~/.mdcore/models.yaml
llm:
  backend: aggregator
  aggregator_category: fast

embeddings:
  backend: ollama
  local_model: nomic-embed-text

Pass explicitly with --models:

mdcore search "topic" --models ~/.mdcore/models-work.yaml
mdcore ingest --file note.md --models ~/.mdcore/models-cheap.yaml
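The override behaviour can be modelled as a recursive per-section merge in which any key present in models.yaml wins. An illustrative sketch, not mdcore's actual config loader:

```python
def merge_config(base, overrides):
    """Return base with overrides applied recursively; override values win.

    Illustrative model of how a models.yaml might override config.yaml.
    """
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections (llm, embeddings) key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

config = {"llm": {"backend": "ollama", "model": "qwen3.5:4b"},
          "vault": {"path": "~/notes"}}
models = {"llm": {"backend": "aggregator", "aggregator_category": "fast"}}
effective = merge_config(config, models)
```

Note that under this model, keys not mentioned in models.yaml (like vault.path, or llm.model above) pass through from config.yaml unchanged.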

Where LLM Calls Happen

mdcore search (Flow A)

| Phase | LLM? | Notes |
|---|---|---|
| Keyword pre-filter | No | BM25 scoring |
| Vector search | No | Embedding lookup |
| Chunk assembly | No | Pure text |
| Synthesis | Yes - synthesise_model | Skip with --raw for zero LLM calls |
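The LLM-free vector-search phase boils down to ranking indexed chunks by cosine similarity against the query embedding, subject to the retriever's top_k and similarity_threshold settings. A self-contained sketch; the data shapes are hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, top_k=3, similarity_threshold=0.5):
    """Rank indexed chunks by similarity, keep the top_k above threshold.

    Sketch of the LLM-free vector-search phase; not mdcore's internals.
    """
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index]
    scored = [(s, c) for s, c in scored if s >= similarity_threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

# Toy 2-dimensional "embeddings" standing in for real model output.
index = [("ingress note", [1.0, 0.0]), ("cooking note", [0.0, 1.0])]
results = retrieve([0.9, 0.1], index, top_k=2, similarity_threshold=0.5)
```

The similarity_threshold cut is what keeps irrelevant chunks (the "cooking note" here) out of the briefing even when top_k would allow them.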

mdcore ingest (Flow B)

| Phase | LLM? | Condition |
|---|---|---|
| Embedding + search | No | Always |
| Classification | Conditional - llm.model | Only in ambiguous similarity range (0.65-0.82) |
| Folder routing | Yes - llm.model | NEW files only |
| Proposal | Yes - llm.model | Always before write |
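The conditional classification step can be pictured as a similarity-band decision: clear matches and clear misses are decided without an LLM, and only the ambiguous middle band triggers a model call. An illustrative sketch using the 0.65-0.82 band from the table; the decision labels are hypothetical:

```python
def classify_route(similarity, high=0.82, low=0.65):
    """Decide routing from vault similarity; only the ambiguous band needs an LLM.

    Thresholds mirror the 0.65-0.82 band above; labels and logic are illustrative.
    """
    if similarity >= high:
        return "append-to-existing-note"   # confident match, no LLM call
    if similarity < low:
        return "new-file"                  # clearly new content, no LLM call
    return "ask-llm"                       # ambiguous: defer to llm.model

decisions = [classify_route(s) for s in (0.9, 0.7, 0.4)]
```

This is also where the ingester's similarity_threshold_high/low config fields plug in: widening the band trades more LLM calls for fewer misroutes.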

mdcore map and mdcore index make no LLM calls.


Observability

Token usage logged after every call to ~/.mdcore/logs/:

INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401
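Those log lines are easy to aggregate with a small parser. The line format below is taken from the example above; treating it as stable is an assumption:

```python
import re

# Matches lines like:
#   INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401
LINE = re.compile(
    r"tokens \[(?P<model>[^\]]+)\] in=(?P<inp>\d+) out=(?P<out>\d+) total=(?P<total>\d+)"
)

def parse_usage(line):
    """Extract (model, input, output, total) token counts from one log line."""
    m = LINE.search(line)
    if not m:
        return None
    return (m.group("model"), int(m.group("inp")),
            int(m.group("out")), int(m.group("total")))

usage = parse_usage("INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401")
```

Summing the last field across a log file gives a rough per-model token budget without any external tooling.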

LangSmith tracing (optional) - add to ~/.mdcore/config.yaml:

llm:
  langsmith_api_key: <your-key>
  langsmith_project: mdcore
