Programmatic data access layer for LLM agents - navigate context, don't memorize it

Context Graph Connector (CGC)

License: Apache 2.0 | Python 3.10+

CGC connects AI agents to your databases, files, and vector stores. Instead of stuffing everything into the context window, your agent explores data on demand: discovering schemas, sampling rows, chunking documents, extracting knowledge graphs, and querying across sources.


Features

  • Connect to PostgreSQL, MySQL, SQLite, filesystems, Qdrant, Pinecone, pgvector, MongoDB
  • Discover schemas and structure automatically
  • Sample data to understand what's in each table or file
  • Chunk large documents into LLM-friendly pieces
  • Search across files and databases with pattern matching
  • Extract knowledge graphs from text and structured data (GLiNER + GLiREL + 17 industry packs)
  • Store extracted triplets in Neo4j, PostgreSQL AGE, or KuzuDB
  • Query graph sinks with Cypher
  • MCP server for Claude Desktop / Claude Code / Cursor / Windsurf
  • HTTP API for integration with any tool or platform
  • CLI for quick tasks and scripting
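
To make the extract-then-store flow from the list above more concrete, here is a minimal sketch that turns (subject, relation, object) triplets into parameterized Cypher MERGE statements. It is illustrative only: the `Triplet` tuple, the `Entity` label, and the `relation_type` and `triplet_to_cypher` helpers are hypothetical names, not CGC's API, and CGC's graph sinks may model the data differently.

```python
import re
from typing import NamedTuple


class Triplet(NamedTuple):
    """Hypothetical triplet shape; CGC's real Triplet type may differ."""
    subject: str
    relation: str
    obj: str


def relation_type(relation: str) -> str:
    """Normalize free text into a conservative Cypher relationship type.

    Relationship types are typically not parameterizable, so the name is
    restricted to letters, digits, and underscores before interpolation.
    """
    name = re.sub(r"[^A-Za-z0-9]+", "_", relation).strip("_").upper()
    if not name or name[0].isdigit():
        raise ValueError(f"cannot derive a relationship type from {relation!r}")
    return name


def triplet_to_cypher(t: Triplet) -> tuple[str, dict[str, str]]:
    """Return a MERGE statement and its parameters for one triplet."""
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{relation_type(t.relation)}]->(o)"
    )
    return query, {"subject": t.subject, "object": t.obj}


query, params = triplet_to_cypher(Triplet("Steve Jobs", "co-founded", "Apple"))
print(query)
print(params)
```

Entity names travel as query parameters, while the relationship type is sanitized and interpolated, which keeps arbitrary extracted text out of the query string.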

Quick Start

Install

pip install context-graph-connector

With graph extraction (requires ML models):

pip install "context-graph-connector[extraction]"

With everything:

pip install "context-graph-connector[all]"

CLI

# Discover what's in a database
cgc discover sqlite ./mydata.db

# Sample rows from a table
cgc sample sqlite ./mydata.db users --n 10

# Run a SQL query
cgc sql sqlite ./mydata.db "SELECT * FROM orders WHERE total > 100"

# Extract knowledge graph from text
cgc extract "Steve Jobs co-founded Apple with Steve Wozniak in 1976"

# Extract from a file and store in a graph database
cgc extract-file ./report.pdf --sink kuzudb://./my_graph

# Chunk a large PDF for processing
cgc chunk filesystem ./docs report.pdf --strategy tokens:2000

# Start the HTTP API server
cgc serve

# Start the MCP server (for Claude integration)
cgc mcp
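
The `--strategy tokens:2000` argument in the chunk command above reads as a strategy name plus a size. As a rough illustration of token-budget chunking (not CGC's implementation), the sketch below parses such a spec and splits text on whitespace. `parse_strategy` and `chunk_by_tokens` are hypothetical helpers, and a real model tokenizer would count tokens differently from a whitespace split.

```python
def parse_strategy(spec: str) -> tuple[str, int]:
    """Split a 'name:size' spec such as 'tokens:2000' into its parts."""
    name, _, size = spec.partition(":")
    if not name or not size.isdigit() or int(size) == 0:
        raise ValueError(f"expected 'name:size' with a positive size, got {spec!r}")
    return name, int(size)


def chunk_by_tokens(text: str, max_tokens: int) -> list[str]:
    """Group whitespace-separated words into chunks of at most max_tokens.

    Whitespace splitting is a stand-in for a model tokenizer.
    """
    words = text.split()
    return [
        " ".join(words[i : i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


name, size = parse_strategy("tokens:4")
chunks = chunk_by_tokens("one two three four five six seven eight nine", size)
print(name, chunks)
```

Fixed-size chunks are the simplest option; strategies that respect paragraph or section boundaries usually give an agent more coherent pieces to read.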

MCP Integration

Claude Code (VS Code):

claude mcp add cgc -- cgc mcp

Claude Desktop / Cursor / Windsurf — add to your config:

{
  "mcpServers": {
    "cgc": {
      "command": "cgc",
      "args": ["mcp"]
    }
  }
}
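
If you would rather script the config change, the sketch below merges the `cgc` entry into an existing JSON config without disturbing other servers. The config file location differs per client and is not specified here, so the path is passed in; `add_cgc_server` is a hypothetical helper, and the demo runs against a temporary file.

```python
import json
import tempfile
from pathlib import Path


def add_cgc_server(config_path: Path) -> dict:
    """Add the cgc MCP server entry to a JSON config, keeping other entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    servers = config.setdefault("mcpServers", {})
    servers["cgc"] = {"command": "cgc", "args": ["mcp"]}

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already lists another server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other"}}}', encoding="utf-8")
    merged = add_cgc_server(path)
    print(sorted(merged["mcpServers"]))
```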

Python API

import asyncio

from cgc import Connector
from cgc.adapters.sql import SqlAdapter


async def main():
    connector = Connector()

    # Add a data source
    connector.add_source(SqlAdapter("mydb", "sqlite:///data.db"))

    # Discover schema
    schema = await connector.discover("mydb")

    # Sample data
    rows = await connector.sample("mydb", "users", 5)

    # Extract triplets
    triplets = connector.extract_triplets("Elon Musk founded SpaceX in 2002")


# discover() and sample() are coroutines, so run them inside an event loop
asyncio.run(main())

Optional Dependencies

CGC has a minimal core with optional extras for specific integrations:

Extra        What it adds
extraction   GLiNER, GLiREL, spaCy, sentence-transformers (knowledge graph extraction)
postgres     asyncpg, pgvector (PostgreSQL support)
mysql        aiomysql (MySQL support)
vector       qdrant-client, pinecone-client, pymongo, motor (vector DB support)
graph        kuzu (embedded graph database)
all          Everything above
dev          pytest, ruff, mypy (development)
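
To see which extras are usable in the current environment, a small check along these lines can help. It is a sketch, not part of CGC: the mapping from extras to import names is an assumption based on the packages listed above (a distribution name such as pinecone-client is not always the import name), and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumed top-level import names for the packages behind each extra.
EXTRA_MODULES = {
    "extraction": ["gliner", "glirel", "spacy", "sentence_transformers"],
    "postgres": ["asyncpg", "pgvector"],
    "mysql": ["aiomysql"],
    "vector": ["qdrant_client", "pinecone", "pymongo", "motor"],
    "graph": ["kuzu"],
}


def missing_modules(modules: list[str]) -> list[str]:
    """Return the top-level modules that cannot be found on this interpreter."""
    return [name for name in modules if find_spec(name) is None]


for extra, modules in EXTRA_MODULES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{extra:<12}{status}")
```

`find_spec` only looks a module up without importing it, so the check stays fast even when the heavy ML packages are installed.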

Architecture

cgc/
├── connector.py          # Main interface — Connector class
├── core/                 # Types: Schema, Query, Chunk, Triplet, Graph
├── adapters/
│   ├── sql.py            # PostgreSQL, MySQL, SQLite
│   ├── filesystem.py     # Local files (PDF, DOCX, CSV, etc.)
│   ├── vector/           # Qdrant, Pinecone, pgvector, MongoDB
│   └── graph/            # Neo4j, PostgreSQL AGE, KuzuDB (sinks)
├── discovery/            # Schema inference, relationship detection
│   ├── extractor.py      # Triplet extraction pipeline
│   ├── gliner.py         # GLiNER NER integration
│   ├── glirel.py         # GLiREL relation extraction
│   ├── router.py         # Industry pack routing (E5 embeddings)
│   ├── industry_packs.py # 17 domain-specific extraction configs
│   └── structured.py     # Hub-and-spoke structured data extraction
├── cli/                  # Typer CLI
├── api/                  # FastAPI HTTP server
├── mcp/                  # Model Context Protocol server
├── session/              # Session tracking
└── security/             # API key auth, rate limiting

Documentation

Document            Description
API Reference       HTTP API endpoints
CLI Reference       Command-line interface
MCP Reference       Model Context Protocol for Claude
Security Guide      API keys, rate limiting, data protection
Technical Details   Architecture and internals

Contributing

Contributions are welcome, but this project is maintained on a best-effort basis, so PRs may not be reviewed immediately. See CONTRIBUTING.md for guidelines.

License

Apache 2.0 — see LICENSE for details.

Download files

Source Distribution

context_graph_connector-0.6.0.tar.gz (156.8 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7
  • Publisher: ci.yml on anthonylee991/cgc

Algorithm     Hash digest
SHA256        e4287bf9a5064fc2a53dbcf74b500a02beab17a78ac100d86756703d43adc89e
MD5           e1516c87a0e50e6d76c7a4df301ab113
BLAKE2b-256   e971ff56a84e19f3f937a2bc67e1d01fbc9ba25b0b8c3c11816af9aeece58ab1

Built Distribution

context_graph_connector-0.6.0-py3-none-any.whl (151.0 kB, Python 3)

  • Publisher: ci.yml on anthonylee991/cgc

Algorithm     Hash digest
SHA256        ec08927aa6c22ec6ab34b816a774372e575bd07aae53cc98da3a71a0273ccf8c
MD5           ad2c5ef63e975b65415404bc22cc2ccd
BLAKE2b-256   29c046b0158738c1dcf3ea2da6e27ce7b15d1c5d0e0ef77890463ca3e8d384c1

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
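
The SHA256 digests listed above can be checked against a downloaded file with the standard library. The sketch below is one way to do it; `sha256_of` and `verify` are hypothetical helper names, and the archive is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest published for the 0.6.0 source distribution.
EXPECTED = "e4287bf9a5064fc2a53dbcf74b500a02beab17a78ac100d86756703d43adc89e"
archive = Path("context_graph_connector-0.6.0.tar.gz")  # assumed download location
if archive.exists():
    print("match" if verify(archive, EXPECTED) else "MISMATCH")
```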
