Context portability and knowledge ingestion tool — bridge your markdown vault to any subscription LLM

ctxkit — Context Portability and Knowledge Ingestion Tool

Version: 1.2.0 | Status: Implementation

A local, LLM-agnostic context delivery and knowledge sync tool for users who live across multiple subscription LLMs.

What It Does

ctxkit sits between your personal markdown knowledge base and whichever subscription LLM you are using at any given moment (Claude, ChatGPT, Gemini, or any other). It does two things:

Flow A — Context Retrieval: Given a topic, it retrieves relevant chunks from your vault, synthesises them into a coherent, cited briefing (using a local LLM), and writes the result to <vault>/ctxkit-output/. You open the file, copy it, and paste it as the opening context of your LLM conversation. Zero API calls to your subscription LLM. Zero cost beyond your existing subscription.

Flow B — Knowledge Ingestion: Given a session summary from a completed LLM conversation, it classifies the content against your existing knowledge base, detects conflicts, generates a proposal, and — only after your explicit approval — writes the changes to your vault and reindexes.

Documentation

Getting Started — installation, first index, first search, first ingest, daily workflow
Config Reference — every config field documented with defaults, valid values, and tuning guidance
Architecture — component design and data flow
Retrieval & Eval Guide — symptom-to-fix guide for tuning retrieval quality

Installation

macOS — Homebrew

brew tap piyush-tyagi-13/ctxkit
brew install ctxkit

Any platform — shell script (installs uv + ctxkit)

curl -fsSL https://raw.githubusercontent.com/piyush-tyagi-13/context-portability-tool/master/install/install.sh | bash

Manual (if you already have uv or pipx)

uv tool install ctxkit-ai       # preferred
# or
pipx install ctxkit-ai

Ollama models (for local inference)

ollama pull nomic-embed-text   # embeddings
ollama pull qwen3.5:4b         # primary LLM — classification + proposals
ollama pull phi4-mini          # synthesis — fast, non-thinking

After install

ctxkit init       # interactive setup wizard
ctxkit deps install   # install any backend packages not yet present
ctxkit index      # index your vault

Commands

ctxkit init                           # Interactive setup wizard — create config
ctxkit index                          # Scan vault, show diff, confirm, index delta
ctxkit search <topic>                 # Synthesise briefing → write to <vault>/ctxkit-output/ (Flow A)
ctxkit search <topic> --raw           # Raw excerpts only — skip synthesis
ctxkit search <topic> --verbose       # Show chunk scores alongside results
ctxkit ingest                         # Accept session summary, classify, propose (Flow B)
ctxkit ingest --file summary.md       # Ingest from a file
ctxkit status                         # Show index health and drift warnings
ctxkit eval [topic]                   # Run quality evaluation checklist
ctxkit config                         # Open config file in editor
ctxkit config --validate              # Validate config and report errors
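
The "scan vault, show diff, index delta" behaviour of `ctxkit index` can be pictured as a comparison between a stored manifest and a fresh scan. The following is a minimal sketch of that idea, not ctxkit's actual code: the function names, the use of SHA-256 digests, and the manifest shape (relative path mapped to digest) are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def scan_vault(vault: Path) -> dict[str, str]:
    """Map each markdown file (path relative to the vault) to its digest."""
    return {
        p.relative_to(vault).as_posix(): file_digest(p)
        for p in sorted(vault.rglob("*.md"))
    }


def compute_delta(manifest: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare a stored manifest with a fresh scan (hypothetical helper)."""
    added = sorted(set(current) - set(manifest))
    removed = sorted(set(manifest) - set(current))
    # A file counts as modified when it exists in both but its digest changed.
    modified = sorted(
        name
        for name in set(current) & set(manifest)
        if current[name] != manifest[name]
    )
    return {"added": added, "modified": modified, "removed": removed}
```

Only the files listed under `added` and `modified` would need re-embedding, and entries under `removed` would be dropped from the index, which is what makes a delta index cheaper than a full rebuild.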

Multiple config profiles

ctxkit search "istio auth" --config ~/.ctxkit/config-technical.yaml
ctxkit search "career goals" --config ~/.ctxkit/config-personal.yaml

Quick Start

# 1. Configure (interactive wizard)
ctxkit init
# → asks for vault path, owner name, LLM backend, models
# → detects Ollama + pulled models, gives hardware-appropriate suggestions
# → writes ~/.ctxkit/config.yaml

# 2. Index your vault
ctxkit index

# 3. Retrieve context for an LLM conversation
ctxkit search "Bruno ingress path adaptor"
# → writes <vault>/ctxkit-output/2026-04-25-bruno-ingress-path-adaptor.md
# → open file, copy contents → paste into Claude/ChatGPT/Gemini

# 4. After an LLM session, ingest the summary
ctxkit ingest --file my-session-summary.md
# → review the proposal → approve
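
The output filename in step 3 looks like a date followed by a slug of the topic. A minimal sketch of one way to derive such a name is shown here; the function name and the exact slug rules are assumptions inferred from that single example, not taken from ctxkit's source.

```python
import re
from datetime import date


def briefing_filename(topic: str, day: date) -> str:
    """Build '<ISO date>-<slug>.md' from a search topic (hypothetical helper)."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"{day.isoformat()}-{slug}.md"


print(briefing_filename("Bruno ingress path adaptor", date(2026, 4, 25)))
```

Under these assumed rules the call reproduces the filename shown in the Quick Start.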

Architecture

YOUR MARKDOWN VAULT
        │
        ▼
   ctxkit core
   ┌──────────┐  ┌──────────┐  ┌────────────┐
   │ Indexer  │  │Retriever │  │  Ingester  │
   └──────────┘  └──────────┘  └────────────┘
   ┌──────────┐  ┌──────────┐  ┌────────────┐
   │  Writer  │  │LLM Layer │  │VectorStore │
   └──────────┘  └──────────┘  └────────────┘
        │
   (copy-paste by user)
        │
        ▼
ANY SUBSCRIPTION LLM (Claude · ChatGPT · Gemini · Others)

ctxkit never talks to your subscription LLM directly. It prepares context (Flow A) and processes output from it (Flow B). The user is the bridge.

LLM calls in Flow A (retrieval): one local call to synthesise_model (default: phi4-mini via Ollama) to reformat raw excerpts into a coherent briefing. No calls to your subscription LLM. Skip with --raw to make Flow A fully LLM-free.
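
To make the `--raw` path concrete, here is a rough sketch of how candidate chunks might be filtered and assembled without any LLM call. The `Chunk` type, both function names, and the bracketed citation format are hypothetical; only the `top_k` and `similarity_threshold` names come from the config reference.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str   # note path relative to the vault
    text: str     # excerpt body
    score: float  # similarity to the query; higher means closer


def select_chunks(chunks: list[Chunk], top_k: int, similarity_threshold: float) -> list[Chunk]:
    """Keep chunks at or above the threshold, best first, at most top_k."""
    kept = [c for c in chunks if c.score >= similarity_threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


def raw_briefing(chunks: list[Chunk]) -> str:
    """Join excerpts with a source line above each one (assumed format)."""
    return "\n\n".join(f"[{c.source}]\n{c.text}" for c in chunks)
```

In the default (non-raw) flow, text like the output of `raw_briefing` would be what gets handed to the local synthesis model for rewriting into a briefing.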

LLM calls in Flow B (ingestion): made only when classification is ambiguous, that is, when the similarity score falls between the `ingester` low and high thresholds. Clear matches (updates to an existing note) and clear new-file cases require no LLM call.
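
The two-threshold rule can be sketched as a small routing function. This is an illustration under stated assumptions: the `Route` enum and `route` function are hypothetical names, and whether the boundaries are inclusive is a guess, since the README does not say.

```python
from enum import Enum


class Route(Enum):
    UPDATE_EXISTING = "update_existing"  # clear match: no LLM call
    NEW_FILE = "new_file"                # clear miss: no LLM call
    ASK_LLM = "ask_llm"                  # ambiguous: defer to the primary LLM


def route(score: float, low: float, high: float) -> Route:
    """Pick an ingestion path from a similarity score and two thresholds."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if score >= high:  # assumed inclusive
        return Route.UPDATE_EXISTING
    if score <= low:   # assumed inclusive
        return Route.NEW_FILE
    return Route.ASK_LLM
```

Widening the gap between the two thresholds sends more summaries to the LLM; narrowing it makes more decisions purely on similarity.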

Configuration Reference

See config.yaml.example for the full annotated config. Key sections:

| Section | Key fields | Purpose |
|---|---|---|
| `vault` | `path`, `owner_name` | Vault root path, owner identity for multi-person vaults |
| `indexer` | `chunk_size`, `heading_levels` | Chunking strategy and quality filters |
| `embeddings` | `backend`, `local_model` | Local (Ollama) or API-backed embeddings |
| `retriever` | `top_k`, `similarity_threshold` | Candidate retrieval, assembly, signposting |
| `ingester` | `similarity_threshold_high/low` | Classification thresholds, conflict detection |
| `writer` | `append_position`, `backup` | Append position, frontmatter injection, backups |
| `llm` | `model`, `synthesise_model` | Primary LLM (classify/propose) + synthesis model (search) |
| `cli` | `theme`, `verbose` | Terminal UI behaviour |
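
As a rough picture of what validating such a config could involve, the sketch below checks a few of the fields listed above on a plain dictionary. It is not how `ctxkit config --validate` works internally (the project uses Pydantic models). The specific rules, the error wording, and the expansion of `similarity_threshold_high/low` into two separate keys are assumptions for illustration.

```python
def validate_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; empty means no issues found."""
    errors = []

    if not cfg.get("vault", {}).get("path"):
        errors.append("vault.path is required")

    top_k = cfg.get("retriever", {}).get("top_k")
    if not isinstance(top_k, int) or top_k < 1:
        errors.append("retriever.top_k must be a positive integer")

    ingester = cfg.get("ingester", {})
    low = ingester.get("similarity_threshold_low")    # assumed key name
    high = ingester.get("similarity_threshold_high")  # assumed key name
    numeric = isinstance(low, (int, float)) and isinstance(high, (int, float))
    if not (numeric and 0 <= low < high <= 1):
        errors.append("ingester thresholds must satisfy 0 <= low < high <= 1")

    return errors


# Hypothetical values, chosen only to exercise the checks.
example = {
    "vault": {"path": "~/notes"},
    "retriever": {"top_k": 8},
    "ingester": {"similarity_threshold_low": 0.4, "similarity_threshold_high": 0.8},
}
print(validate_config(example))
```

Collecting every problem into a list, rather than stopping at the first one, lets a user fix a config in a single pass.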

Hardware Tiers

| Hardware | LLM Model | Embedding Model |
|---|---|---|
| Apple M2 Air 16GB | `qwen3.5:4b` | `nomic-embed-text` |
| i5 + RTX 4070 | `qwen3:8b` | `bge-m3` |
| Low-end / no GPU | `gpt-4o-mini` / `claude-haiku-4-5` | `text-embedding-3-small` |

Project Structure

ctxkit/
├── cli/commands.py              # Typer commands, Rich rendering
├── core/
│   ├── indexer/                 # VaultScanner, ManifestManager, TextSplitter, ...
│   ├── retriever/               # KeywordPreFilter, VectorSearcher, ChunkStitcher, ...
│   ├── ingester/                # ClassificationEngine, ConflictDetector, ...
│   └── writer/                  # BackupManager, FrontmatterInjector, FileWriter, ...
├── llm/llm_layer.py             # classify() and propose() — single LLM abstraction
├── store/vector_store.py        # ChromaDB wrapper — 4 operations
├── config/                      # Pydantic models + YAML loader
└── utils/                       # Logging, file utilities

What ctxkit Is Not

Not a chatbot or RAG question-answering agent
Not an API wrapper around subscription LLMs
Not a note-taking application
Not an always-on background service
Never writes anything without your explicit approval

ctxkit — Context Portability and Knowledge Ingestion Tool v1.2.0

