A lightweight knowledge graph engine for codebases with CLI, API, and MCP support

Code Knowledge Graph Engine

A lightweight, production-grade knowledge graph engine for codebases. Build and maintain an always-updated graph of code relationships (functions, classes, imports, calls) and query it via CLI, API, or MCP (Model Context Protocol) for AI-assisted development.

🎯 Goals

  • Reduce LLM token usage: Retrieve only relevant code relationships instead of entire files
  • Always-updated graph: Auto-update on file changes with built-in watcher
  • Plug-and-play: Easy integration into any project with a single command
  • Multiple interfaces: CLI, REST API, and MCP server for AI agents
  • Extensible: Simple architecture for enterprise customization

🏗️ Architecture

┌───────────────────────────────────────────────────────────────┐
│                     Code Knowledge Graph                      │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │
│  │     CLI     │    │   FastAPI   │    │ MCP Server  │        │
│  │   (typer)   │    │   Server    │    │   (stdio)   │        │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘        │
│         │                  │                  │               │
│         └──────────────────┼──────────────────┘               │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │  Graph Retriever  │                        │
│                  │  (query engine)   │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │  Knowledge Graph  │                        │
│                  │  (nodes + edges)  │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Graph Builder   │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Python Parser   │                        │
│                  │    (AST-based)    │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   File Watcher    │                        │
│                  │    (watchdog)     │                        │
│                  └─────────┬─────────┘                        │
│                            │                                  │
│                  ┌─────────▼─────────┐                        │
│                  │   Python Files    │                        │
│                  └───────────────────┘                        │
│                                                               │
└───────────────────────────────────────────────────────────────┘
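
To make the parser and builder layers of the diagram concrete, here is a minimal sketch of how an AST-based parser can pull definitions, imports, and call names out of Python source using the standard library `ast` module. It is an illustration only, not this project's parser: `extract_symbols` and the shape of the dictionary it returns are hypothetical.

```python
import ast


def extract_symbols(source: str) -> dict:
    """Collect definitions, imports, and call names from Python source.

    Hypothetical helper for illustration; not part of the kg package.
    """
    tree = ast.parse(source)
    found = {"functions": [], "classes": [], "imports": [], "calls": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            found["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            found["imports"].append(node.module or ".")
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):         # plain call: foo()
                found["calls"].append(func.id)
            elif isinstance(func, ast.Attribute):  # attribute call: obj.foo()
                found["calls"].append(func.attr)
    return found


sample = '''
import os
from json import dumps

class Loader:
    def load(self, path):
        return os.path.exists(path)

def process_data(path):
    return dumps(Loader().load(path))
'''

symbols = extract_symbols(sample)
print(symbols)
```

A graph builder would then turn each definition into a node and each import or call into an edge. Resolving a call name to the exact function it refers to is the hard part and is deliberately left out of this sketch.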

📦 Installation

From source

# Clone the repository
git clone https://github.com/yourusername/code-knowledge-graph.git
cd code-knowledge-graph

# Install with pip
pip install .

Development installation

pip install -e ".[dev]"

🚀 Quick Start

1. Build the knowledge graph

# Build graph for current directory
kg build

# Build graph for specific directory
kg build /path/to/project

# Exclude certain directories
kg build . --exclude "venv,node_modules,.git"

2. Query dependencies

# Query dependencies for a function
kg query my_function

# Query with custom depth
kg query my_function --depth 3

# Save output to file
kg query my_function --output deps.json
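
The `--depth` option bounds how far the query walks from the target. As a rough sketch of that idea (not the tool's actual query engine), a depth-limited breadth-first search over `calls` edges could look like the following. The function name `dependencies`, the `(source, target)` tuple format for edges, and the `callers`/`callees` split are assumptions made for this example.

```python
from collections import deque


def dependencies(edges, target, depth=5):
    """Return (callers, callees) within `depth` hops of `target`.

    edges: iterable of (source, target) pairs, e.g. "A calls B" as ("A", "B").
    Hypothetical helper for illustration; not the kg implementation.
    """
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, set()).add(dst)
        backward.setdefault(dst, set()).add(src)

    def walk(adjacency):
        seen = set()
        queue = deque([(target, 0)])
        while queue:
            node, dist = queue.popleft()
            if dist == depth:
                continue  # do not expand past the depth limit
            for nxt in adjacency.get(node, ()):
                if nxt not in seen and nxt != target:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return seen

    return walk(backward), walk(forward)


call_edges = [
    ("cli", "main"),
    ("main", "process_data"),
    ("process_data", "load"),
    ("load", "open_file"),
]

callers, callees = dependencies(call_edges, "process_data", depth=1)
print(sorted(callers), sorted(callees))  # ['main'] ['load']

callers, callees = dependencies(call_edges, "process_data", depth=2)
print(sorted(callers), sorted(callees))  # ['cli', 'main'] ['load', 'open_file']
```

The `seen` set keeps the walk finite when functions call each other recursively.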

3. Start the watcher (auto-update)

# Watch current directory
kg watch

# Watch specific directory
kg watch /path/to/project

4. Check status

kg status

5. Search nodes

# Search for nodes
kg search "user"

# Filter by type
kg search "user" --node-type function

🌐 API Server

Start the server

kg serve --host 0.0.0.0 --port 8000

API Endpoints

Health Check

GET /health

Get Dependencies

GET /dependencies?target=function_name&depth=5

Get Full Graph

GET /graph

Get Statistics

GET /stats

Search Nodes

GET /search?query=search_term&node_type=function

Get Function Calls

GET /functions/{function_name}/calls

Get File Imports

GET /files/{file_path}/imports

Example using curl

# Get dependencies
curl "http://localhost:8000/dependencies?target=my_function"

# Get stats
curl http://localhost:8000/stats

# Search
curl "http://localhost:8000/search?query=user"
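
The same requests can be made from Python with only the standard library. The sketch below builds URLs for the endpoints listed above; `build_url` and `get_json` are hypothetical helper names, and the assumption that every endpoint returns a JSON body has not been checked against the server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # matches the `kg serve` example above


def build_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    # Drop parameters left as None so optional filters can be omitted.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def get_json(path, **params):
    """Send a GET request and decode the body as JSON (assumed format).

    Requires a running server; not called in this example.
    """
    with urlopen(build_url(path, **params), timeout=10) as response:
        return json.load(response)


print(build_url("/dependencies", target="my_function", depth=3))
# http://localhost:8000/dependencies?target=my_function&depth=3
print(build_url("/search", query="user", node_type="function"))
# http://localhost:8000/search?query=user&node_type=function
print(build_url("/stats"))
# http://localhost:8000/stats
```

With the server running, `get_json("/stats")` would fetch and decode the statistics endpoint.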

🔌 MCP Server (Model Context Protocol)

The MCP server lets MCP-capable AI tools, such as Windsurf and Claude Desktop, query the knowledge graph directly.

Start MCP server

kg-mcp --graph-path storage/graph.json

MCP Methods

dependencies

Get upstream and downstream dependencies for a target.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "dependencies",
  "params": {
    "target": "my_function",
    "depth": 5
  }
}

search

Search for nodes in the graph.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "search",
  "params": {
    "query": "user",
    "node_type": "function"
  }
}

stats

Get graph statistics.

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "stats"
}

graph

Get the complete graph.

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "graph"
}
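
All four requests share the same JSON-RPC 2.0 envelope, so a client can build them with one small helper. This is a sketch under assumptions: `make_request` and `parse_response` are hypothetical names, and the response handling follows the general JSON-RPC 2.0 convention (`result` on success, `error` on failure) rather than anything verified against this server.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC ids only need to be unique per request


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request string for one of the methods above."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params  # omitted for stats and graph
    return json.dumps(message)


def parse_response(line):
    """Return the result of a JSON-RPC 2.0 response, or raise on error."""
    reply = json.loads(line)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("message", "unknown error"))
    return reply["result"]


first = make_request("dependencies", {"target": "my_function", "depth": 5})
second = make_request("stats")
print(first)
print(second)
```

How messages are framed on stdio (for example, one JSON object per line) is not stated here, so the sketch stops at producing and parsing the JSON text.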

Integration with AI Tools

Windsurf Configuration

Add to your Windsurf config:

{
  "mcpServers": {
    "knowledge-graph": {
      "command": "kg-mcp",
      "args": ["--graph-path", "storage/graph.json"]
    }
  }
}

📊 Graph Schema

Node Types

  • file: Represents a Python source file
  • function: Represents a function or method
  • class: Represents a class

Edge Types

  • calls: Function A calls Function B
  • imports: File A imports File B
  • defines: File defines Function/Class

Example Graph

{
  "nodes": {
    "file:main.py": {
      "id": "file:main.py",
      "type": "file",
      "name": "main.py",
      "file_path": "main.py",
      "line_number": 1
    },
    "function:process_data:main.py": {
      "id": "function:process_data:main.py",
      "type": "function",
      "name": "process_data",
      "file_path": "main.py",
      "line_number": 10
    }
  },
  "edges": [
    {
      "source": "file:main.py",
      "target": "function:process_data:main.py",
      "type": "defines"
    }
  ]
}
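
Because the graph is plain JSON, it can be inspected with a few lines of Python. The sketch below loads the example above and lists what a file defines. `defined_in` is a hypothetical helper, and the `file:<path>` ID pattern is inferred from the example rather than from a documented rule.

```python
import json

# The example graph from above; in practice this would be read from
# the graph file (storage/graph.json by default).
graph = json.loads("""
{
  "nodes": {
    "file:main.py": {
      "id": "file:main.py", "type": "file", "name": "main.py",
      "file_path": "main.py", "line_number": 1
    },
    "function:process_data:main.py": {
      "id": "function:process_data:main.py", "type": "function",
      "name": "process_data", "file_path": "main.py", "line_number": 10
    }
  },
  "edges": [
    {"source": "file:main.py",
     "target": "function:process_data:main.py",
     "type": "defines"}
  ]
}
""")


def defined_in(graph, file_path):
    """Names of the functions and classes that a file defines."""
    file_id = f"file:{file_path}"  # assumed ID pattern, inferred from the example
    return [
        graph["nodes"][edge["target"]]["name"]
        for edge in graph["edges"]
        if edge["type"] == "defines" and edge["source"] == file_id
    ]


print(defined_in(graph, "main.py"))  # ['process_data']
```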

🧪 Testing

# Run tests
pytest

# Run with coverage
pytest --cov=core --cov=api --cov=cli --cov=mcp

🔧 Configuration

Storage

The graph is stored in JSON format at storage/graph.json by default. You can customize this path:

kg build . --output /custom/path/graph.json
kg query my_function --graph-path /custom/path/graph.json

Exclude Directories

When building the graph, certain directories are excluded by default:

  • venv, env, .git, __pycache__, .pytest_cache, node_modules

You can customize this:

kg build . --exclude "custom_dir,another_dir"
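
For intuition, directory exclusion during a build can be sketched with `os.walk` by pruning excluded names before descending. The helper names below are hypothetical, and two behaviours are assumptions rather than documented facts: that exclusion matches on the directory name at any depth, and that a custom `--exclude` value replaces the defaults instead of extending them.

```python
import os
import tempfile

DEFAULT_EXCLUDES = frozenset(
    {"venv", "env", ".git", "__pycache__", ".pytest_cache", "node_modules"}
)


def parse_exclude(value):
    """Split a comma-separated --exclude value into a set of names."""
    return {part.strip() for part in value.split(",") if part.strip()}


def iter_python_files(root, exclude=DEFAULT_EXCLUDES):
    """Yield .py files under root, skipping excluded directory names."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in filenames:
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)


with tempfile.TemporaryDirectory() as root:
    # Create a small throwaway project tree to scan.
    for rel in ("src/app.py", "venv/lib.py", "custom_dir/x.py"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    def relative(paths):
        return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in paths)

    with_defaults = relative(iter_python_files(root))
    with_custom = relative(
        iter_python_files(root, parse_exclude("custom_dir,another_dir"))
    )

print(with_defaults)  # ['custom_dir/x.py', 'src/app.py']
print(with_custom)    # ['src/app.py', 'venv/lib.py']
```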

🎨 Use Cases

1. Code Navigation

Quickly find what functions a specific function calls, or what functions call it.

kg query authenticate_user

2. Impact Analysis

Before refactoring, understand the downstream impact of changing a function.

kg query process_payment --depth 10

3. Code Review

Understand the relationships in a new codebase.

kg build /path/to/new/project
kg status

4. AI-Assisted Development

Provide AI agents with structured context about code relationships instead of raw files.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "dependencies",
  "params": {"target": "main"}
}

🔒 Security

  • The graph stores file paths and code structure, not actual code content
  • No external network calls
  • All operations are local

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

  • Built with Python AST for reliable parsing
  • Uses watchdog for file system monitoring
  • FastAPI for the REST API
  • Typer for the CLI
  • MCP protocol for AI agent integration

📞 Support

For issues, questions, or suggestions, please open an issue on GitHub.
