
mdcore

Markdown CORE AI - Classification, Organisation, Retrieval & Entry

mdcore is a local, LLM-agnostic knowledge base engine for anyone with a folder of markdown notes. It reads and writes your vault intelligently: retrieve context on demand and ingest new knowledge with automatic classification and routing, all from the terminal or a TUI.

PyPI: markdowncore-ai | CLI: mdcore | Version: 1.0.4


Screenshots

[Screenshots: mdcore home · mdcore search · mdcore index · mdcore status]


What It Does

Retrieval (mdcore search) - Ask a question or give a topic. mdcore searches your vault semantically, stitches together the most relevant chunks, and synthesises a coherent, cited briefing. Output lands in <vault>/mdcore-output/ - ready to copy into any LLM conversation.

Ingestion (mdcore ingest) - Feed any document into mdcore - an LLM session summary, a research note, a strategy doc, an article. It classifies the content against your existing vault, routes it to the right folder, detects conflicts with existing notes, generates a proposal, and writes only after your explicit approval.

Both flows run fully locally with Ollama. No subscription LLM API calls. No always-on server.


Installation

# Recommended
uv tool install markdowncore-ai

# With TUI
uv tool install "markdowncore-ai[gui]"

# pipx
pipx install markdowncore-ai

Ollama models (local inference)

ollama pull nomic-embed-text   # embeddings
ollama pull qwen3.5:4b         # classification, routing, proposals
ollama pull phi4-mini          # synthesis (fast, non-thinking)

First run

mdcore init     # interactive setup -> writes ~/.mdcore/config.yaml
mdcore index    # scan and index your vault

Quick Start

# Search your vault
mdcore search "kubernetes ingress routing"
# -> synthesised briefing written to <vault>/mdcore-output/
# -> copy contents, paste into Claude / ChatGPT / Gemini

# Ingest a document
mdcore ingest --file my-session-summary.md
# -> classifies, routes to right folder, proposes changes -> approve to write

# Launch TUI
mdcore gui

Commands

mdcore init                        # Interactive setup wizard
mdcore index                       # Delta index - scan, diff, confirm, index
mdcore index --force               # Wipe everything and reindex from scratch
mdcore search <topic>              # Retrieve + synthesise briefing (Flow A)
mdcore search <topic> --raw        # Retrieve raw excerpts, skip synthesis
mdcore search <topic> --verbose    # Show similarity scores
mdcore ingest                      # Paste document - classify, route, propose (Flow B)
mdcore ingest --file <path>        # Ingest from file
mdcore map                         # Generate vault folder map for routing
mdcore map --repair                # Remove stale folder entries
mdcore gui                         # Launch TUI (requires [gui] extra)
mdcore status                      # Index health, drift warnings
mdcore eval [topic]                # Retrieval quality checklist
mdcore config                      # Open config in editor
mdcore config --validate           # Validate config

Multiple vaults / config profiles

mdcore search "istio auth"     --config ~/.mdcore/config-work.yaml
mdcore search "career goals"   --config ~/.mdcore/config-personal.yaml
mdcore search "topic"          --models ~/.mdcore/models-aggregator.yaml

Backends

mdcore supports local and API-backed models. Mix and match per use case.

| Backend        | LLM                   | Embeddings                  | Extra needed   |
|----------------|-----------------------|-----------------------------|----------------|
| Ollama (local) | any pulled model      | nomic-embed-text, bge-m3    | none           |
| Gemini         | gemini-2.5-flash-lite | models/gemini-embedding-001 | none (bundled) |
| OpenAI         | gpt-4o-mini           | text-embedding-3-small      | [openai]       |
| Anthropic      | claude-haiku-4-5      | use Ollama or OpenAI        | [anthropic]    |
| Aggregator     | free-tier key pool    | free-tier key pool          | [aggregator]   |

uv tool install "markdowncore-ai[openai]"
uv tool install "markdowncore-ai[anthropic]"
uv tool install "markdowncore-ai[all]"    # every backend

Aggregator backend

aggregator routes calls through llm-aggregator - a local SQLite-backed key pool that round-robins free-tier API keys with automatic 429 cooldown. No api_key needed in mdcore config.
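To make the rotation behaviour concrete, here is a minimal, self-contained sketch of a round-robin key pool with 429 cooldown. It is illustrative only: the class and method names are hypothetical, and the real llm-aggregator is SQLite-backed and differs in detail.

```python
import time
from collections import deque

class KeyPool:
    """Illustrative round-robin key pool with 429 cooldown.

    Hypothetical sketch - not llm-aggregator's actual API, which
    persists state in SQLite rather than in memory.
    """

    def __init__(self, keys, cooldown_seconds=60, rotate_every=5):
        self.pool = deque(keys)
        self.cooldown = {}              # key -> monotonic time it becomes usable
        self.cooldown_seconds = cooldown_seconds
        self.rotate_every = rotate_every  # mirrors aggregator_rotate_every
        self.uses = 0

    def next_key(self):
        now = time.monotonic()
        for _ in range(len(self.pool)):
            key = self.pool[0]
            if self.cooldown.get(key, 0) <= now:
                self.uses += 1
                if self.uses >= self.rotate_every:
                    self.pool.rotate(-1)   # quota for this key used up: move on
                    self.uses = 0
                return key
            self.pool.rotate(-1)           # key is cooling down: skip it
        raise RuntimeError("all keys are cooling down after 429 responses")

    def report_429(self, key):
        # Park the key until its cooldown expires, then start fresh rotation.
        self.cooldown[key] = time.monotonic() + self.cooldown_seconds
        self.uses = 0

pool = KeyPool(["key-a", "key-b"], rotate_every=2)
```

Each key serves `rotate_every` requests before the pool advances; a 429 parks the offending key and the pool falls through to the next usable one.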

Install separately (not on PyPI):

pip install git+https://github.com/piyush-tyagi-13/llm-aggregator
# or if installed via uv tool:
uv tool install markdowncore-ai --with "llm-aggregator @ git+https://github.com/piyush-tyagi-13/llm-aggregator"
Then enable it in ~/.mdcore/config.yaml:

llm:
  backend: aggregator
  aggregator_category: general_purpose       # key pool category (optional)
  aggregator_rotate_every: 5      # requests per key before rotation

embeddings:
  backend: aggregator
  aggregator_category: general_purpose

Hardware guidance

| Hardware         | LLM                                  | Embeddings                  |
|------------------|--------------------------------------|-----------------------------|
| Apple M2, 16GB+  | qwen3.5:4b                           | nomic-embed-text            |
| i5 + RTX 4070    | qwen3:8b                             | bge-m3                      |
| Low-end / no GPU | gemini-2.5-flash-lite or gpt-4o-mini | models/gemini-embedding-001 |

Configuration

Config lives at ~/.mdcore/config.yaml. Generated by mdcore init.

| Section    | Key fields                                | Purpose                                        |
|------------|-------------------------------------------|------------------------------------------------|
| vault      | path, owner_name                          | Vault root, owner name for multi-person vaults |
| embeddings | backend, api_model / local_model, api_key | Embedding model                                |
| llm        | backend, model, api_key, synthesise_model | Primary LLM + synthesis model                  |
| indexer    | chunk_size, heading_aware_splitting       | Chunking strategy                              |
| retriever  | top_k, similarity_threshold               | Retrieval tuning                               |
| ingester   | similarity_threshold_high/low             | Classification thresholds                      |
| writer     | append_position, backup                   | Write behaviour + backups                      |

See config.yaml.example for the full annotated reference.
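For orientation, a minimal config sketch using the fields from the table above. All values are illustrative placeholders, except the 0.65/0.82 ingester thresholds, which echo the ambiguity range described under Flow B below; consult config.yaml.example for real defaults.

```yaml
# ~/.mdcore/config.yaml - illustrative sketch, not the shipped defaults
vault:
  path: ~/notes
  owner_name: alice

embeddings:
  backend: ollama
  local_model: nomic-embed-text

llm:
  backend: ollama
  model: qwen3.5:4b
  synthesise_model: phi4-mini

indexer:
  chunk_size: 800
  heading_aware_splitting: true

retriever:
  top_k: 8
  similarity_threshold: 0.65

ingester:
  similarity_threshold_high: 0.82
  similarity_threshold_low: 0.65

writer:
  append_position: end
  backup: true
```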

Separate models config

Keep model choices in a separate ~/.mdcore/models.yaml - useful for switching backends without touching the main config. Values here override the llm and embeddings sections in config.yaml.

# ~/.mdcore/models.yaml
llm:
  backend: aggregator
  aggregator_category: general_purpose

embeddings:
  backend: ollama
  local_model: nomic-embed-text

Pass explicitly with --models:

mdcore search "topic" --models ~/.mdcore/models-work.yaml
mdcore ingest --file note.md --models ~/.mdcore/models-cheap.yaml
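The override behaviour can be pictured as a recursive overlay of the models file onto the main config. This is a hypothetical sketch, assuming unspecified keys in a section are kept rather than replaced; mdcore's actual merge semantics may differ.

```python
def merge_config(base: dict, override: dict) -> dict:
    """Recursively overlay `override` onto `base`, returning a new dict.

    Illustrative only - not mdcore's actual merge function.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)  # descend into section
        else:
            merged[key] = value                             # override wins
    return merged

# config.yaml sections (example values)
config = {
    "llm": {"backend": "ollama", "model": "qwen3.5:4b"},
    "embeddings": {"backend": "ollama", "local_model": "nomic-embed-text"},
}
# models.yaml overrides (as in the example above)
models = {"llm": {"backend": "aggregator", "aggregator_category": "general_purpose"}}

effective = merge_config(config, models)
```

Here the llm backend flips to aggregator while the embeddings section is untouched.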

Where LLM Calls Happen

mdcore search (Flow A)

| Phase              | LLM?                   | Notes                              |
|--------------------|------------------------|------------------------------------|
| Keyword pre-filter | No                     | BM25 scoring                       |
| Vector search      | No                     | Embedding lookup                   |
| Chunk assembly     | No                     | Pure text                          |
| Synthesis          | Yes (synthesise_model) | Skip with --raw for zero LLM calls |
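The non-LLM phases above can be sketched in a few lines: a keyword pre-filter narrows the candidate set, then cosine similarity over embeddings ranks what remains. Both functions are illustrative stand-ins (the pre-filter here is a crude term-overlap check, not real BM25, and none of this is mdcore's actual code).

```python
import math

def keyword_prefilter(query: str, chunks: list[str]) -> list[str]:
    """Crude stand-in for BM25 pre-filtering: keep chunks sharing any query term."""
    terms = set(query.lower().split())
    return [c for c in chunks if terms & set(c.lower().split())]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and keep the best k."""
    ranked = sorted(chunk_vecs, key=lambda c: cosine(query_vec, chunk_vecs[c]),
                    reverse=True)
    return ranked[:k]

chunks = ["ingress routing with nginx", "career goals 2025", "kubernetes ingress tls"]
survivors = keyword_prefilter("kubernetes ingress routing", chunks)
```

Only the surviving chunks are embedded-and-ranked, which is why everything before synthesis costs zero LLM calls.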

mdcore ingest (Flow B)

| Phase              | LLM?                    | Condition                                      |
|--------------------|-------------------------|------------------------------------------------|
| Embedding + search | No                      | Always                                         |
| Classification     | Conditional (llm.model) | Only in ambiguous similarity range (0.65-0.82) |
| Folder routing     | Yes (llm.model)         | NEW files only                                 |
| Proposal           | Yes (llm.model)         | Always before write                            |
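The conditional classification step amounts to a threshold gate on the best similarity score. A sketch, assuming high scores mean a confident match and low scores mean a clearly new file (the function and outcome names are hypothetical, not mdcore's API; only the 0.65/0.82 range comes from the table above):

```python
def ingest_decision(best_similarity: float,
                    low: float = 0.65, high: float = 0.82) -> str:
    """Decide how to handle an ingested document from its best match score.

    Illustrative gate - mirrors the similarity_threshold_high/low config keys.
    """
    if best_similarity >= high:
        return "match-existing-note"   # confident match: no classifier call
    if best_similarity < low:
        return "new-file"              # clearly new: routed by llm.model
    return "classify-with-llm"         # ambiguous: llm.model is consulted

ingest_decision(0.70)   # falls in the ambiguous range, so the LLM classifier runs
```

Only documents landing between the two thresholds cost a classification call.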

mdcore map and mdcore index make no LLM calls.


Observability

Token usage is logged to ~/.mdcore/logs/ after every call:

INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401
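Since the log lines follow a fixed shape, they are easy to aggregate. A small illustrative parser (not part of mdcore) that sums total tokens per model:

```python
import re

# Matches lines like: INFO llm - tokens [model-name] in=312 out=89 total=401
LINE = re.compile(
    r"tokens \[(?P<model>[^\]]+)\] in=(?P<i>\d+) out=(?P<o>\d+) total=(?P<t>\d+)"
)

def total_tokens(log_lines) -> dict[str, int]:
    """Sum the total= field per model across token-usage log lines."""
    totals: dict[str, int] = {}
    for line in log_lines:
        m = LINE.search(line)
        if m:
            totals[m["model"]] = totals.get(m["model"], 0) + int(m["t"])
    return totals

usage = total_tokens([
    "INFO llm - tokens [gemini-2.5-flash-lite] in=312 out=89 total=401",
    "INFO llm - tokens [gemini-2.5-flash-lite] in=100 out=20 total=120",
])
```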

LangSmith tracing (optional) - add to ~/.mdcore/config.yaml:

llm:
  langsmith_api_key: <your-key>
  langsmith_project: mdcore

mdcore - Markdown CORE AI v1.0.4

