A lightweight knowledge graph engine for codebases with CLI, API, and MCP support

Code Knowledge Graph Engine

A lightweight, production-grade knowledge graph engine for codebases. It parses Python source files into a graph of code relationships (functions, classes, imports, calls), keeps that graph up to date as files change, and exposes it through a CLI, a REST API, and an MCP (Model Context Protocol) server for AI-assisted development.

🎯 Goals

  • Reduce LLM token usage: Retrieve only relevant code relationships instead of entire files
  • Always-updated graph: Auto-update on file changes with built-in watcher
  • Plug-and-play: Easy integration into any project with a single command
  • Multiple interfaces: CLI, REST API, and MCP server for AI agents
  • Extensible: Simple architecture for enterprise customization

🏗️ Architecture

┌───────────────────────────────────────────────────────────────┐
│                     Code Knowledge Graph                      │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │
│  │     CLI     │    │   FastAPI   │    │ MCP Server  │        │
│  │   (typer)   │    │   Server    │    │   (stdio)   │        │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘        │
│         │                  │                  │               │
│         └──────────────────┼──────────────────┘               │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │  Graph Retriever  │                        │
│                  │  (query engine)   │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │  Knowledge Graph  │                        │
│                  │  (nodes + edges)  │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Graph Builder   │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Python Parser   │                        │
│                  │    (AST-based)    │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   File Watcher    │                        │
│                  │    (watchdog)     │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Python Files    │                        │
│                  └───────────────────┘                        │
│                                                               │
└───────────────────────────────────────────────────────────────┘

📦 Installation

From source

# Clone the repository
git clone https://github.com/yourusername/code-knowledge-graph.git
cd code-knowledge-graph

# Install with pip
pip install .

Development installation

pip install -e ".[dev]"

🚀 Quick Start

1. Build the knowledge graph

# Build graph for current directory
kg build

# Build graph for specific directory
kg build /path/to/project

# Exclude certain directories
kg build . --exclude "venv,node_modules,.git"

2. Query dependencies

# Query dependencies for a function
kg query my_function

# Query with custom depth
kg query my_function --depth 3

# Save output to file
kg query my_function --output deps.json

3. Start the watcher (auto-update)

# Watch current directory
kg watch

# Watch specific directory
kg watch /path/to/project
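To illustrate what "auto-update on file changes" involves, here is a minimal polling sketch that detects added, removed, and modified Python files by comparing modification times. It is a stand-in for illustration only: the project itself uses watchdog, which is event-driven, and the function names below are hypothetical rather than part of the package.

```python
import os


def snapshot(root):
    """Map each .py file under root to its last-modified time in nanoseconds."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                mtimes[path] = os.stat(path).st_mtime_ns
    return mtimes


def diff_snapshots(before, after):
    """Return (added, removed, modified) path lists between two snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    modified = sorted(p for p in after if p in before and after[p] != before[p])
    return added, removed, modified
```

A watcher loop would call snapshot() periodically and re-parse only the files reported by diff_snapshots(), which is the idea behind updating the graph incrementally instead of rebuilding it.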

4. Check status

kg status

5. Search nodes

# Search for nodes
kg search "user"

# Filter by type
kg search "user" --node-type function

🌐 API Server

Start the server

kg serve --host 0.0.0.0 --port 8000

API Endpoints

Health Check

GET /health

Get Dependencies

GET /dependencies?target=function_name&depth=5

Get Full Graph

GET /graph

Get Statistics

GET /stats

Search Nodes

GET /search?query=search_term&node_type=function

Get Function Calls

GET /functions/{function_name}/calls

Get File Imports

GET /files/{file_path}/imports

Example using curl

# Get dependencies
curl "http://localhost:8000/dependencies?target=my_function"

# Get stats
curl http://localhost:8000/stats

# Search
curl "http://localhost:8000/search?query=user"
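The same requests can be made from Python with only the standard library. This is a minimal sketch, assuming the server was started with `kg serve --port 8000` on the local machine and that every endpoint replies with JSON; `build_url` and `get_json` are illustrative helper names, not part of the package.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumption: server running locally on port 8000


def build_url(path, **params):
    """Join BASE_URL, an endpoint path, and optional query parameters."""
    # Parameters left as None are dropped so optional filters can be omitted.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the body (assumes a JSON response)."""
    with urlopen(build_url(path, **params), timeout=10) as response:
        return json.load(response)
```

For example, `get_json("/dependencies", target="my_function", depth=3)` requests the same URL as the first curl command with an explicit depth, and `get_json("/search", query="user", node_type="function")` mirrors the search endpoint.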

🔌 MCP Server (Model Context Protocol)

The MCP server lets AI tools such as Windsurf and Claude Desktop query the knowledge graph directly over stdio.

Start MCP server

kg-mcp --graph-path storage/graph.json

MCP Methods

dependencies

Get upstream and downstream dependencies for a target.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "dependencies",
  "params": {
    "target": "my_function",
    "depth": 5
  }
}

search

Search for nodes in the graph.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "search",
  "params": {
    "query": "user",
    "node_type": "function"
  }
}

stats

Get graph statistics.

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "stats"
}

graph

Get the complete graph.

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "graph"
}
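The four requests above share one shape, so a client can build them with a small helper. The sketch below only constructs and serializes the messages. Sending one message per line is an assumption about how the server frames stdio traffic, and the helper names are hypothetical.

```python
import json


def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request dict in the shape used by the examples above."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    return request


def encode_line(request):
    """Serialize one request as a single line (newline framing is an assumption)."""
    return json.dumps(request, separators=(",", ":")) + "\n"
```

A client could write `encode_line(make_request(1, "dependencies", {"target": "my_function", "depth": 5}))` to the standard input of a `kg-mcp` process and read the reply from its standard output.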

Integration with AI Tools

Windsurf Configuration

Add to your Windsurf config:

{
  "mcpServers": {
    "knowledge-graph": {
      "command": "kg-mcp",
      "args": ["--graph-path", "storage/graph.json"]
    }
  }
}

📊 Graph Schema

Node Types

  • file: Represents a Python source file
  • function: Represents a function or method
  • class: Represents a class

Edge Types

  • calls: Function A calls Function B
  • imports: File A imports File B
  • defines: File defines Function/Class
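As a rough illustration of how these node and edge types can be derived from source code, the sketch below uses Python's ast module on a single file. It is not the package's parser: the ID format is borrowed from the example graph in this README, imports are mapped to file paths naively, and calls are resolved only by bare name within the same file.

```python
import ast


def extract_graph(source, file_path):
    """Return (nodes, edges) for one Python source string."""
    tree = ast.parse(source)
    file_id = f"file:{file_path}"
    nodes = {file_id: {"id": file_id, "type": "file", "name": file_path,
                       "file_path": file_path, "line_number": 1}}
    edges = []
    functions = {}  # function name -> (node id, AST node)

    def add_definition(kind, item):
        node_id = f"{kind}:{item.name}:{file_path}"
        nodes[node_id] = {"id": node_id, "type": kind, "name": item.name,
                          "file_path": file_path, "line_number": item.lineno}
        edges.append({"source": file_id, "target": node_id, "type": "defines"})
        return node_id

    # First pass: record definitions and imports.
    for item in ast.walk(tree):
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions[item.name] = (add_definition("function", item), item)
        elif isinstance(item, ast.ClassDef):
            add_definition("class", item)
        elif isinstance(item, ast.Import):
            for alias in item.names:
                # Naive assumption: module "a.b" lives in file "a/b.py".
                target = "file:" + alias.name.replace(".", "/") + ".py"
                edges.append({"source": file_id, "target": target, "type": "imports"})

    # Second pass: link each function to same-file functions it calls by bare name.
    for caller_id, func in functions.values():
        for item in ast.walk(func):
            if isinstance(item, ast.Call) and isinstance(item.func, ast.Name):
                callee = functions.get(item.func.id)
                if callee is not None:
                    edges.append({"source": caller_id, "target": callee[0], "type": "calls"})
    return nodes, edges
```

Method calls such as `os.getcwd()` are skipped here because their targets cannot be resolved from one file alone. A real builder would need cross-file name resolution to cover them.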

Example Graph

{
  "nodes": {
    "file:main.py": {
      "id": "file:main.py",
      "type": "file",
      "name": "main.py",
      "file_path": "main.py",
      "line_number": 1
    },
    "function:process_data:main.py": {
      "id": "function:process_data:main.py",
      "type": "function",
      "name": "process_data",
      "file_path": "main.py",
      "line_number": 10
    }
  },
  "edges": [
    {
      "source": "file:main.py",
      "target": "function:process_data:main.py",
      "type": "defines"
    }
  ]
}
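Because the graph is plain JSON, it can be inspected without the CLI. The following sketch assumes the file has exactly the "nodes" and "edges" layout shown above; the function names are illustrative and not part of the package.

```python
import json
from collections import defaultdict


def load_graph(path="storage/graph.json"):
    """Read a graph file, assuming the nodes/edges layout shown above."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def index_edges(graph):
    """Group edge targets by (source id, edge type) for quick lookups."""
    index = defaultdict(list)
    for edge in graph["edges"]:
        index[(edge["source"], edge["type"])].append(edge["target"])
    return index


def defined_names(graph, file_id):
    """List the names of functions and classes that a file node defines."""
    index = index_edges(graph)
    return [graph["nodes"][target]["name"] for target in index[(file_id, "defines")]]
```

With the example graph, `defined_names(graph, "file:main.py")` yields the single name `process_data`.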

🧪 Testing

# Run tests
pytest

# Run with coverage
pytest --cov=core --cov=api --cov=cli --cov=mcp

🔧 Configuration

Storage

The graph is stored in JSON format at storage/graph.json by default. You can customize this path:

kg build . --output /custom/path/graph.json
kg query my_function --graph-path /custom/path/graph.json

Exclude Directories

When building the graph, certain directories are excluded by default:

  • venv, env, .git, __pycache__, .pytest_cache, node_modules

You can customize this:

kg build . --exclude "custom_dir,another_dir"
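Directory exclusion of this kind is commonly implemented by pruning the walk itself, so excluded trees are never entered. The sketch below shows that approach with os.walk, using the default list above. It is an assumption about the general technique, not a copy of what `kg build` does, and the names are hypothetical.

```python
import os

# Defaults copied from the list above.
DEFAULT_EXCLUDES = frozenset(
    {"venv", "env", ".git", "__pycache__", ".pytest_cache", "node_modules"}
)


def parse_exclude(value):
    """Turn a comma-separated option value into a set of directory names."""
    return {part.strip() for part in value.split(",") if part.strip()}


def iter_python_files(root, exclude=DEFAULT_EXCLUDES):
    """Yield .py files under root, skipping any directory named in exclude."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place matters for large trees such as node_modules: filtering results after the walk would still visit every file inside them.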

🎨 Use Cases

1. Code Navigation

Quickly find what functions a specific function calls, or what functions call it.

kg query authenticate_user

2. Impact Analysis

Before refactoring, understand the downstream impact of changing a function.

kg query process_payment --depth 10
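One way to picture a depth-limited impact query is a breadth-first walk over `calls` edges in reverse, collecting every function that reaches the target within a given number of hops. The sketch below works on an edge list in the schema from this README. Whether `--depth` is interpreted exactly this way by `kg query` is an assumption, and `impacted_by` is a hypothetical name.

```python
from collections import deque


def impacted_by(edges, target, max_depth):
    """Return {node_id: hops} for callers that reach target within max_depth hops."""
    # Reverse adjacency: callee id -> list of caller ids.
    callers = {}
    for edge in edges:
        if edge["type"] == "calls":
            callers.setdefault(edge["target"], []).append(edge["source"])

    distances = {}
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for caller in callers.get(node, []):
            if caller not in seen:  # the seen set also guards against call cycles
                seen.add(caller)
                distances[caller] = depth + 1
                queue.append((caller, depth + 1))
    return distances
```

A small depth lists only direct callers, while a large one such as 10 approximates the full set of code paths that a change could affect.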

3. Code Review

Understand the relationships in a new codebase.

kg build /path/to/new/project
kg status

4. AI-Assisted Development

Provide AI agents with structured context about code relationships instead of raw files.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "dependencies",
  "params": {"target": "main"}
}

🔒 Security

  • The graph stores file paths and code structure, not actual code content
  • No external network calls
  • All operations are local

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

  • Built with Python AST for reliable parsing
  • Uses watchdog for file system monitoring
  • FastAPI for the REST API
  • Typer for the CLI
  • MCP protocol for AI agent integration

📞 Support

For issues, questions, or suggestions, please open an issue on GitHub.
