intentic Knowledge Engine — context-efficient knowledge serving for AI agents

ike — intentic Knowledge Engine


Make your knowledge base and source code queryable by any AI tool. One command.

pip install intentic-ike[mcp]
ike init --kb-root ./docs

ike scans your docs, fixes quality issues, and generates config files so that Claude Code, Cursor, Codex, Windsurf, and every other AI coding tool can query your knowledge base with precision. With the [source] extra, ike also indexes source code: routing via PageRank, fetching at the symbol level.

Before ike: Your AI tool reads entire files, wastes context, gets confused. After ike: Your AI tool asks for exactly the section or function it needs. 80-95% fewer tokens.

# Without ike: 17,500 tokens (4 full files dumped into context)
$ cat company/vision.md company/icp.md company/positioning.md company/messaging.md | wc -w

# With ike: 3,088 tokens (2 targeted sections)
$ ike route "ICP definition"
→ company/icp-definition.md#ideal-customer-profile-icp (2,836 tokens)
$ ike fetch company/icp-definition.md --section ideal-customer-profile-icp

Why ike Exists

I run multiple AI coding tools across several repos. 81 knowledge documents. Strategy, ICP definitions, architecture decisions, operational workflows. My AI tools already had access to all of it through a structured knowledge base. The problem wasn't access. It was efficiency.

Every time an AI tool needed context, it loaded entire files. Full documents dumped into the context window when only one section was relevant. The right knowledge existed, but there was no way to deliver the right amount of context at the right time to the right agent.

When an AI tool needed my ICP definition, it read the entire file. 7,306 tokens. The relevant section is 807 tokens. 89% waste. Across every knowledge lookup in every session across every tool, that adds up fast. Context window spent on content the AI doesn't need is context window unavailable for reasoning.

ike solved this. I measured the difference across my real knowledge base:

| Scenario | Without ike | With ike | Savings |
|---|---|---|---|
| Signal analysis (4 company files) | 17,500 tokens | 3,088 tokens | 82% |
| Ripple analysis (2 architecture files) | 7,715 tokens | 1,334 tokens | 83% |
| Architecture lookup (1 section) | 4,825 tokens | 259 tokens | 95% |
| Vision section (from 7K doc) | 7,306 tokens | 807 tokens | 89% |

Real queries. Real knowledge base. Not a synthetic benchmark.

What ike actually does

ike sits between your AI tools and your documentation. Instead of reading files, your AI tool asks ike what's relevant, gets back a ranked list with token counts, and fetches only the sections it needs.

Two steps. Route first (~100 tokens), then fetch what matters.

Route: "What do you have about deployment?"
→ engineering/architecture.md#deployment (842 tokens)

Fetch: "Give me that section."
→ The actual content, nothing else.

The AI sees the token cost before loading anything. It decides what's worth the context budget. No full-file dumps.
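The decision the AI makes at this point can be sketched as a small budget loop. This is an illustrative sketch of the agent-side logic, not ike's internals; the chunk fields mirror the `ike route` output shown in this README, and the scores and token counts are sample values.

```python
# Illustrative sketch of the agent-side decision: greedily take the
# highest-scoring route results that still fit a context budget.
# Not ike's internals; chunk fields mirror `ike route` output.
def select_within_budget(chunks, budget_tokens):
    chosen, spent = [], 0
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if spent + chunk["token_count"] <= budget_tokens:
            chosen.append(chunk)
            spent += chunk["token_count"]
    return chosen, spent

chunks = [
    {"file_path": "company/icp-definition.md",
     "section_id": "ideal-customer-profile-icp",
     "score": 3.0, "token_count": 2836},
    {"file_path": "company/vision.md",
     "section_id": "vision",
     "score": 1.2, "token_count": 7306},
]
picked, spent = select_within_budget(chunks, budget_tokens=4000)
# Only the 2,836-token ICP section fits a 4,000-token budget.
```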

For source code, ike parses your codebase into an AST, builds a dependency graph between files, ranks them by importance using PageRank, and lets your AI tool fetch a single function by name. Not the file. The function.

The setup

One command. ike init --kb-root ./docs. ike scans your Markdown files, builds a search index, and generates config files that Claude Code, Cursor, Codex, Windsurf, and 10+ other tools auto-discover. No manual wiring.

Docs need headers (H2+) and ideally YAML frontmatter. If they don't have frontmatter, ike doctor --yes infers it from the content. Title from H1, domain from directory, summary from first paragraph. Non-destructive, metadata only.
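Those inference heuristics are simple enough to sketch. The following is an illustrative approximation (title from the first H1, domain from the parent directory, summary from the first body paragraph), not ike's actual implementation:

```python
# Illustrative sketch of the frontmatter inference heuristics above,
# not ike's actual code: title from the first H1, domain from the
# parent directory, summary from the first body paragraph.
from pathlib import PurePosixPath

def infer_frontmatter(path, text):
    lines = text.splitlines()
    title = next((l[2:].strip() for l in lines if l.startswith("# ")),
                 PurePosixPath(path).stem)
    domain = PurePosixPath(path).parent.name or "general"
    summary = next((l.strip() for l in lines
                    if l.strip() and not l.startswith("#")), "")
    return {"title": title, "domain": domain, "summary": summary}

doc = "# Authentication Architecture\n\nOAuth2 + JWT auth flow for the API gateway.\n"
fm = infer_frontmatter("engineering/auth.md", doc)
# → title "Authentication Architecture", domain "engineering",
#   summary "OAuth2 + JWT auth flow for the API gateway."
```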

Why this matters

Context window is the bottleneck. Not model intelligence, not speed, not cost per token. When an AI tool runs out of context, it starts dropping information or hallucinating. Every token wasted on irrelevant content is a token unavailable for reasoning.

80-95% savings per lookup means your AI tools can work with your entire knowledge base instead of running out of context after two files. Write your knowledge once, query it from any AI tool.

Quick Start

Option A: Let your AI tool do it

Open your AI editor and paste this prompt:

Copy this prompt into Claude Code, Cursor, Codex, or any AI coding tool
I need to set up ike (intentic Knowledge Engine) to make this repository's
documentation queryable by AI tools.

ike is a CLI + MCP server that indexes Markdown files and serves them via
2-step retrieval: route (find relevant sections, ~100 tokens) then fetch
(load specific content). This is much more efficient than reading entire
files. 80-95% token savings.

Repo: https://github.com/pedrams/ike
Docs: https://github.com/pedrams/ike/blob/main/docs/build-with-ai.md

Please do the following:

1. INSTALL: Run `pip install intentic-ike[mcp]`
   - Requires Python 3.12+
   - The [mcp] extra includes the MCP server for IDE integration

2. FIND MY DOCS: Look for the directory containing my Markdown documentation.
   Common locations: ./docs, ./knowledge, ./wiki, or the repo root.
   List the .md files you find so I can confirm.

3. INITIALIZE: Run `ike init --kb-root <path-to-docs>`
   This will:
   - Scan all .md files and report quality (ready/fixable/needs review)
   - Generate .mcp.json (MCP server config for Claude Code, Cursor, etc.)
   - Generate AGENTS.md (universal AI tool discovery file)
   If .mcp.json already exists, ike merges. It won't overwrite your config.

4. FIX DOCS: Run `ike doctor --yes`
   This auto-adds missing YAML frontmatter (title, domain, summary) to docs
   that need it. Only metadata is added. Document content is never changed.
   If you prefer to review each fix: run `ike doctor` without --yes.

5. FIX CROSS-REFERENCES: Run `ike doctor --cross-refs`
   This analyzes orphaned documents (no incoming links) and suggests which
   docs should link to each other based on shared domains and keywords.
   For each suggestion:
   - Read both documents to understand their relationship
   - Add a `## Related` section at the bottom with Markdown links,
     or add inline links where they fit naturally in the content
   - Only add links that make semantic sense; skip weak matches.
   After adding links, run `ike lint` to verify orphan count decreased.

6. VERIFY: Run these commands and show me the output:
   - `ike route "test"` (should return matching sections with token counts)
   - `ike lint` (should show remaining quality issues, if any)
   - `cat .mcp.json` (should show the MCP server config)
   - `cat AGENTS.md` (should show the AI tool discovery file)

7. DONE: Tell me to restart my AI tool so the MCP connection activates.

If any step fails:
- "ike: command not found" → pip install didn't work, try: python -m ike --help
- "ike serve fails with ImportError" → install with MCP: pip install intentic-ike[mcp]
- "route returns no results" → run: ike index --rebuild
- "doctor changes too much" → use: ike doctor (interactive) instead of --yes

Option B: Manual setup (4 commands)

pip install intentic-ike[mcp]       # 1. Install
cd ~/my-repo
ike init --kb-root ./docs           # 2. Scan docs, generate .mcp.json + AGENTS.md
ike doctor --yes                    # 3. Fix missing frontmatter
ike route "test"                    # 4. Verify
# Restart your AI tool

How It Works

2-step retrieval: Route first (~100 tokens), then fetch only what you need.

# Step 1: Find relevant sections
$ ike route "deployment strategy"
{
  "chunks": [
    {"file_path": "engineering/architecture.md",
     "section_id": "deployment", "score": 3.0, "token_count": 842}
  ]
}

# Step 2: Load only what you need
$ ike fetch engineering/architecture.md --section deployment
## Deployment
CI/CD pipeline deploys to Hetzner Cloud...

The AI tool decides what to load based on token counts. No context wasted.

Source Code Intelligence

Index source code for symbol-level retrieval. Same 2-step flow, but for code.

pip install intentic-ike[source]        # adds tree-sitter + networkx
ike index --source ./src                # index source code via AST
ike route "auth middleware" --source     # PageRank-weighted code routing
ike fetch src/auth.py --symbol AuthService.validate_token  # exact method
ike symbols --file auth.py              # list all symbols

How it works:

  1. tree-sitter parses source files into AST, extracts functions/classes/methods as chunks
  2. A name-based Def/Ref graph links files that share symbol names (same language only)
  3. PageRank ranks files by importance. Defining files score higher than files that merely reference them
  4. At query time: score = 0.7 * pagerank + 0.3 * keyword_match
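Steps 3-4 can be illustrated with a toy power-iteration PageRank over a three-file Def/Ref graph. ike itself uses NetworkX; this pure-Python version and the file names are illustrative, and only the 0.7/0.3 weights come from the formula above.

```python
# Toy sketch of steps 3-4: power-iteration PageRank over a tiny Def/Ref
# graph, then the combined score. ike uses NetworkX; this pure-Python
# version just illustrates the idea.
def pagerank(edges, nodes, damping=0.85, iters=50):
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            for d in out[n]:
                nxt[d] += damping * rank[n] / len(out[n])
        rank = nxt
    return rank

# auth.py defines symbols that api.py and cli.py merely reference,
# so the Def/Ref edges point from the referencing files to auth.py.
nodes = ["auth.py", "api.py", "cli.py"]
edges = [("api.py", "auth.py"), ("cli.py", "auth.py")]
pr = pagerank(edges, nodes)

def score(path, keyword_match):
    return 0.7 * pr[path] + 0.3 * keyword_match
```

The defining file ends up with the highest PageRank, which is exactly the "defining files score higher" behavior described in step 3.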

Language support:

| Tier | Languages | Extraction |
|---|---|---|
| Tier 1 | Python, TypeScript, JavaScript, Go, Rust | AST-based, qualified names (Class.method), decorators |
| Tier 2 | 25+ languages (Java, C/C++, Ruby, PHP, Swift, Kotlin, Scala, Lua, R, …) | Text-chunking fallback (~50 lines/chunk) |
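The Tier 2 fallback is plain fixed-size chunking. A minimal sketch of a ~50-line chunker (illustrative, not ike's exact splitting logic):

```python
# Minimal sketch of a fixed-size line chunker like the Tier 2 fallback
# (~50 lines per chunk). Illustrative only, not ike's exact logic.
def chunk_lines(text, lines_per_chunk=50):
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]

source = "\n".join(f"line {i}" for i in range(120))
chunks = chunk_lines(source)
# 120 lines → chunks of 50, 50, and 20 lines.
```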

MCP tools: route(query, source=True), fetch(file_path, symbol="Class.method"), query(text, source=True)

AI Tool Compatibility

ike generates discovery files for every major AI coding tool:

| AI Tool | How it discovers ike | Generated by |
|---|---|---|
| Claude Code | .mcp.json + AGENTS.md | ike init |
| Cursor | .mcp.json + AGENTS.md | ike init |
| Codex (OpenAI) | AGENTS.md | ike init |
| Windsurf | .mcp.json + AGENTS.md | ike init |
| Claude Desktop | .mcp.json | ike init |
| OpenCode | AGENTS.md | ike init |
| Hermes | .mcp.json + AGENTS.md | ike init |
| VS Code + Copilot | .mcp.json | ike init |
| Zed | .mcp.json | ike init |
| Aider | AGENTS.md | ike init |

MCP tools (for MCP-capable editors): route, fetch, query, lint, with source and symbol parameters for code.
CLI commands (for everything else): ike route, ike fetch, ike query, ike lint, ike symbols.

Commands

Setup

| Command | What it does |
|---|---|
| ike init --kb-root ./docs | Scan docs, generate .mcp.json + AGENTS.md |
| ike doctor --yes | Auto-fix missing frontmatter |
| ike doctor | Interactive frontmatter fix (review each) |
| ike doctor --cross-refs | Suggest missing cross-references for orphans |
| ike serve | Start MCP server (stdio transport) |
| ike migrate-mcp | Migrate old .mcp.json to portable format |

Query

| Command | What it does |
|---|---|
| ike route "query" | Find relevant sections (~100 tokens response) |
| ike route "query" --source | Find relevant source code (PageRank-weighted) |
| ike fetch path/file.md | Load entire file |
| ike fetch path/file.md --section id | Load specific section |
| ike fetch path/file.py --symbol Class.method | Load specific function/method |
| ike query "text" --depth deep | Route + fetch in one step |
| ike symbols --file pattern | List indexed source symbols |
| ike index --source ./src | Index source code via tree-sitter AST |

Maintenance

| Command | What it does |
|---|---|
| ike lint | Check for missing frontmatter, broken refs, orphans |
| ike lint --freshness 30 | Also flag docs older than 30 days |
| ike list | List all indexed sections |
| ike index --rebuild | Rebuild search index |

Global options

ike --kb-root /path/to/kb route "query"   # custom KB root
export IKE_KB_ROOT=/path/to/kb            # or via env var
ike --version                             # show version
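The resolution order implied above can be sketched as a small fallback chain. Illustrative only: the flag-over-env precedence is a conventional assumption, not documented ike behavior.

```python
# Illustrative sketch of KB-root resolution: an explicit --kb-root flag,
# then the IKE_KB_ROOT environment variable, then a default. The
# flag-over-env precedence here is an assumption, not ike's documented
# behavior.
import os

def resolve_kb_root(flag_value=None, default="./docs"):
    return flag_value or os.environ.get("IKE_KB_ROOT") or default

os.environ["IKE_KB_ROOT"] = "/path/to/kb"
# Env var wins over the default; an explicit flag wins over both.
```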

Context Efficiency

Tested against a real 81-doc knowledge base:

| Scenario | Without ike | With ike | Savings |
|---|---|---|---|
| Signal analysis (4 company files) | 17,500 tokens | 3,088 tokens | 82% |
| Ripple analysis (2 architecture files) | 7,715 tokens | 1,334 tokens | 83% |
| Architecture lookup (1 section) | 4,825 tokens | 259 tokens | 95% |
| Vision section (from 7K doc) | 7,306 tokens | 807 tokens | 89% |

Every token saved is a token available for reasoning.

Document Format

ike works with any Markdown. For best results, docs should have YAML frontmatter:

---
title: "Authentication Architecture"
domain: "engineering"
summary: "OAuth2 + JWT auth flow for the API gateway."
---

Don't have frontmatter? Run ike doctor --yes. It infers title from H1, domain from directory path, summary from the first paragraph.

Install

pip install intentic-ike          # CLI only
pip install intentic-ike[mcp]     # CLI + MCP server
pip install intentic-ike[source]  # CLI + source code indexing (tree-sitter + networkx)
pip install intentic-ike[mcp,source]  # everything

Requires Python 3.12+. Works on Linux, macOS, Windows (WSL).

MCP Server Configuration

ike init generates .mcp.json automatically. Or configure manually:

{
  "mcpServers": {
    "ike": {
      "command": "ike",
      "args": ["serve"],
      "env": { "IKE_KB_ROOT": "/path/to/your/docs" }
    }
  }
}
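As noted in the Quick Start, ike init merges into an existing .mcp.json rather than overwriting it. That merge-not-overwrite behavior can be sketched as a guarded dict update (illustrative; not ike's actual code):

```python
# Illustrative sketch of a non-destructive .mcp.json merge: add the ike
# server entry without touching servers that are already configured.
# Not ike's actual implementation.
import json

def merge_mcp_config(existing_json, kb_root):
    config = json.loads(existing_json) if existing_json else {}
    servers = config.setdefault("mcpServers", {})
    servers.setdefault("ike", {
        "command": "ike",
        "args": ["serve"],
        "env": {"IKE_KB_ROOT": kb_root},
    })
    return json.dumps(config, indent=2)

existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
merged = json.loads(merge_mcp_config(existing, "/path/to/docs"))
# Both the pre-existing "other" server and the new "ike" entry survive.
```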

Architecture

AI Tool (Claude Code, Cursor, Codex, ...)
        |
   ike CLI / MCP Server
        |
   Engine (facade)
        |
   ┌───────┬───────┬─────────┬────────┬────────┬────────────┬───────────┐
 Parser   Index   Fetcher   Writer   Linter   CodeParser   CodeGraph
markdown-it  SQLite WAL                       tree-sitter   NetworkX
  • Plugin CLI. Commands auto-discovered from ike/commands/*_cmd.py
  • Thread-safe. SQLite with per-thread connections (safe for MCP thread pool)
  • Lazy MCP. fastmcp only imported when ike serve runs
  • Lazy source. tree-sitter-language-pack and networkx only imported when --source is used
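The lazy-import bullets above follow a standard Python pattern: defer the heavy import into the function that needs it. A generic sketch, where colorsys stands in for fastmcp or tree-sitter (this is not ike's code):

```python
# Generic sketch of the lazy-import pattern noted above: heavy optional
# dependencies are imported inside the function that needs them, so
# plain CLI commands never pay the import cost. colorsys stands in for
# a heavy dependency here; this is not ike's actual code.
import sys

def serve():
    import colorsys  # deferred: only imported when serve() actually runs
    return "serving"

assert "colorsys" not in sys.modules   # nothing heavy at module load time
serve()
assert "colorsys" in sys.modules       # loaded only on first use
```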

Development

git clone https://github.com/pedrams/ike && cd ike
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,mcp,source]"
pytest                                  # 287 tests, ~10s

License

MIT
