Context Graph Connector (CGC)
Programmatic data access layer for LLM agents — navigate context, don't memorize it.
CGC connects AI agents to your databases, files, and vector stores. Instead of stuffing everything into the context window, your agent explores data on-demand: discovering schemas, sampling rows, chunking documents, extracting knowledge graphs, and querying across sources.
Features
- Connect to PostgreSQL, MySQL, SQLite, filesystems, Qdrant, Pinecone, pgvector, MongoDB
- Discover schemas and structure automatically
- Sample data to understand what's in each table or file
- Chunk large documents into LLM-friendly pieces
- Search across files and databases with pattern matching
- Extract knowledge graphs from text and structured data (GLiNER + GLiREL + 17 industry packs)
- Store extracted triplets in Neo4j, PostgreSQL AGE, or KuzuDB
- Query graph sinks with Cypher
- MCP server for Claude Desktop / Claude Code / Cursor / Windsurf
- HTTP API for integration with any tool or platform
- CLI for quick tasks and scripting
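To make the "discover, then sample" idea from the feature list concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not CGC's implementation; the function names `discover` and `sample` are chosen here for illustration, and the in-memory `users` table is made-up data.

```python
import sqlite3


def discover(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each user table to its column names."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


def sample(conn: sqlite3.Connection, table: str, n: int = 5) -> list[dict]:
    """Return up to n rows as dicts. Pass only table names found by discover()."""
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (n,))
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)", [("Ada",), ("Grace",), ("Linus",)]
)

schema = discover(conn)
preview = sample(conn, "users", n=2)
print(schema)
print(preview)
```

An agent that sees only `schema` and a two-row `preview` can decide which query to run next without loading the whole table into its context window.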
Quick Start
Install
pip install context-graph-connector
With graph extraction (requires ML models):
pip install context-graph-connector[extraction]
With everything:
pip install context-graph-connector[all]
CLI
# Discover what's in a database
cgc discover sqlite ./mydata.db
# Sample rows from a table
cgc sample sqlite ./mydata.db users --n 10
# Run a SQL query
cgc sql sqlite ./mydata.db "SELECT * FROM orders WHERE total > 100"
# Extract knowledge graph from text
cgc extract "Steve Jobs co-founded Apple with Steve Wozniak in 1976"
# Extract from a file and store in a graph database
cgc extract-file ./report.pdf --sink kuzudb://./my_graph
# Chunk a large PDF for processing
cgc chunk filesystem ./docs report.pdf --strategy tokens:2000
# Start the HTTP API server
cgc serve
# Start the MCP server (for Claude integration)
cgc mcp
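The `--strategy tokens:2000` argument in the chunk command above suggests a "name:size" strategy with a token budget per chunk. The sketch below shows one way such a strategy could work. It assumes whitespace-separated words stand in for tokens, which is a simplification; `parse_strategy` and `chunk_by_tokens` are hypothetical helpers, not CGC functions.

```python
def parse_strategy(spec: str) -> tuple[str, int]:
    """Split a 'name:size' spec such as 'tokens:2000' into its parts."""
    name, _, size = spec.partition(":")
    return name, int(size)


def chunk_by_tokens(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


name, size = parse_strategy("tokens:3")
chunks = chunk_by_tokens("one two three four five six seven", max_tokens=size)
print(chunks)
```

A real tokenizer counts subword tokens rather than words, so actual chunk boundaries would differ from this approximation.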
MCP Integration
Claude Code (VS Code):
claude mcp add cgc -- cgc mcp
Claude Desktop / Cursor / Windsurf — add to your config:
{
"mcpServers": {
"cgc": {
"command": "cgc",
"args": ["mcp"]
}
}
}
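If a client config already lists other MCP servers, the `cgc` entry should be merged in rather than pasted over the file. The following sketch shows that merge on plain dictionaries; `add_cgc_server` and the `other` server entry are illustrative names, and where each client stores its config file is left out because it differs by client.

```python
import json


def add_cgc_server(config: dict) -> dict:
    """Return a copy of config with the cgc MCP server entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["cgc"] = {"command": "cgc", "args": ["mcp"]}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_cgc_server(existing)
print(json.dumps(updated, indent=2))
```

The function copies the top-level and `mcpServers` dictionaries, so the input config is left unchanged and existing servers are preserved.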
Python API
import asyncio
from cgc import Connector
from cgc.adapters.sql import SqlAdapter
async def main():
    connector = Connector()
    # Add a data source
    connector.add_source(SqlAdapter("mydb", "sqlite:///data.db"))
    # Discover schema (awaited, so it must run inside a coroutine)
    schema = await connector.discover("mydb")
    # Sample 5 rows from the users table
    rows = await connector.sample("mydb", "users", 5)
    # Extract triplets from free text
    triplets = connector.extract_triplets("Elon Musk founded SpaceX in 2002")
    return schema, rows, triplets
schema, rows, triplets = asyncio.run(main())
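Extracted triplets eventually land in a graph sink that is queried with Cypher. As a rough sketch of that step, the code below turns a subject/predicate/object triplet into a parameterized Cypher `MERGE` statement. The `Triplet` dataclass here is a stand-in and may not match CGC's own `Triplet` type; the `Entity` label and the `to_cypher` helper are also assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Stand-in triplet shape; CGC's real type may differ."""
    subject: str
    predicate: str
    obj: str


def to_cypher(triplet: Triplet) -> tuple[str, dict[str, str]]:
    """Build a MERGE statement plus parameters for one triplet."""
    # Relationship types cannot be passed as Cypher parameters,
    # so the predicate is reduced to a safe identifier first.
    rel = re.sub(r"[^A-Za-z0-9]+", "_", triplet.predicate).strip("_").upper()
    if not rel:
        raise ValueError("predicate has no usable characters")
    if rel[0].isdigit():
        rel = f"REL_{rel}"
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{rel}]->(o)"
    )
    return query, {"subject": triplet.subject, "object": triplet.obj}


query, params = to_cypher(Triplet("Elon Musk", "founded", "SpaceX"))
print(query)
print(params)
```

Using `MERGE` instead of `CREATE` keeps repeated extractions from duplicating nodes and relationships, and passing names as parameters avoids injecting text into the query string.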
Optional Dependencies
CGC has a minimal core with optional extras for specific integrations:
| Extra | What it adds |
|---|---|
| `extraction` | GLiNER, GLiREL, spaCy, sentence-transformers (knowledge graph extraction) |
| `postgres` | asyncpg, pgvector (PostgreSQL support) |
| `mysql` | aiomysql (MySQL support) |
| `vector` | qdrant-client, pinecone-client, pymongo, motor (vector DB support) |
| `graph` | kuzu (embedded graph database) |
| `all` | Everything above |
| `dev` | pytest, ruff, mypy (development) |
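Because the extras are optional, a script may want to check which ones are importable before using the matching adapter. This is a small sketch with the standard library's `importlib.util.find_spec`. The mapping covers only a few extras, lists one representative module per extra, and is an assumption made for illustration rather than something CGC exposes.

```python
from importlib.util import find_spec

# Assumed top-level import names for one package from each listed extra.
EXTRA_MODULES = {
    "postgres": ["asyncpg"],
    "mysql": ["aiomysql"],
    "graph": ["kuzu"],
}


def is_available(module: str) -> bool:
    """True if a top-level module can be imported in this environment."""
    return find_spec(module) is not None


def installed_extras(extras: dict[str, list[str]] = EXTRA_MODULES) -> dict[str, bool]:
    """Report, per extra, whether all of its listed modules are importable."""
    return {
        extra: all(is_available(module) for module in modules)
        for extra, modules in extras.items()
    }


status = installed_extras()
print(status)
```

The result depends on the local environment, so the printed values vary from machine to machine.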
Architecture
cgc/
├── connector.py # Main interface — Connector class
├── core/ # Types: Schema, Query, Chunk, Triplet, Graph
├── adapters/
│ ├── sql.py # PostgreSQL, MySQL, SQLite
│ ├── filesystem.py # Local files (PDF, DOCX, CSV, etc.)
│ ├── vector/ # Qdrant, Pinecone, pgvector, MongoDB
│ └── graph/ # Neo4j, PostgreSQL AGE, KuzuDB (sinks)
├── discovery/ # Schema inference, relationship detection
│ ├── extractor.py # Triplet extraction pipeline
│ ├── gliner.py # GLiNER NER integration
│ ├── glirel.py # GLiREL relation extraction
│ ├── router.py # Industry pack routing (E5 embeddings)
│ ├── industry_packs.py # 17 domain-specific extraction configs
│ └── structured.py # Hub-and-spoke structured data extraction
├── cli/ # Typer CLI
├── api/ # FastAPI HTTP server
├── mcp/ # Model Context Protocol server
├── session/ # Session tracking
└── security/ # API key auth, rate limiting
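The tree describes `structured.py` as "hub-and-spoke" extraction without further detail. One plausible reading, offered here only as an assumption, is that each row becomes a hub node identified by its key, and every other non-null column becomes a spoke from that hub. The helper `row_to_triplets` and the sample row are hypothetical.

```python
def row_to_triplets(table: str, row: dict, key: str = "id") -> list[tuple[str, str, str]]:
    """Turn one row into (hub, column, value) triplets, skipping nulls and the key."""
    hub = f"{table}:{row[key]}"
    return [
        (hub, column, str(value))
        for column, value in row.items()
        if column != key and value is not None
    ]


# Made-up row for illustration.
row_triplets = row_to_triplets(
    "users", {"id": 7, "name": "Ada", "email": None, "city": "London"}
)
print(row_triplets)
```

Under this reading, joining rows across tables amounts to linking hubs that share a spoke value, such as a foreign key.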
Documentation
| Document | Description |
|---|---|
| API Reference | HTTP API endpoints |
| CLI Reference | Command-line interface |
| MCP Reference | Model Context Protocol for Claude |
| Security Guide | API keys, rate limiting, data protection |
| Technical Details | Architecture and internals |
Contributing
Contributions are welcome, but this project is maintained on a best-effort basis, so PRs may not be reviewed immediately. See CONTRIBUTING.md for guidelines.
License
Apache 2.0 — see LICENSE for details.