Programmatic data access layer for LLM agents - navigate context, don't memorize it

Context Graph Connector (CGC)

Requires Python 3.10+.

CGC connects AI agents to your databases, files, and vector stores. Instead of stuffing everything into the context window, your agent explores data on-demand: discovering schemas, sampling rows, chunking documents, extracting knowledge graphs, and querying across sources.
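To make the "discover, then sample" idea concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not CGC's implementation; the helper names `discover_schema`, `sample_rows`, and `demo_connection` are hypothetical and exist only for this illustration.

```python
import sqlite3


def discover_schema(conn: sqlite3.Connection) -> dict[str, list[tuple[str, str]]]:
    """Map each user table to its (column name, declared type) pairs."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


def sample_rows(conn: sqlite3.Connection, table: str, n: int = 5) -> list[tuple]:
    """Return up to n rows, refusing table names that were not discovered."""
    if table not in discover_schema(conn):
        raise ValueError(f"unknown table: {table!r}")
    return conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (n,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database so the sketch runs anywhere."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        [("ada",), ("grace",), ("linus",)],
    )
    return conn
```

An agent following this pattern would read the small schema summary first and pull a handful of rows only for the tables that look relevant, rather than loading whole tables into its context window.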


Features

  • Connect to PostgreSQL, MySQL, SQLite, filesystems, Qdrant, Pinecone, pgvector, MongoDB
  • Discover schemas and structure automatically
  • Sample data to understand what's in each table or file
  • Chunk large documents into LLM-friendly pieces
  • Search across files and databases with pattern matching
  • Extract knowledge graphs from text and structured data (GLiNER + GLiREL + 17 industry packs)
  • Store extracted triplets in Neo4j, PostgreSQL AGE, or KuzuDB
  • Query graph sinks with Cypher
  • MCP server for Claude Desktop / Claude Code / Cursor / Windsurf
  • HTTP API for integration with any tool or platform
  • CLI for quick tasks and scripting
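As an illustration of the chunking feature listed above, the sketch below splits text into pieces of at most a fixed number of tokens. It assumes whitespace-separated words are a good enough stand-in for tokens; CGC's own chunker and tokenizer may work differently, and `chunk_by_tokens` is a hypothetical name.

```python
def chunk_by_tokens(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    # Step through the word list in fixed-size windows and rejoin each window.
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

For example, `chunk_by_tokens("a b c d e", 2)` yields three chunks, the last one holding the single leftover word.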

Quick Start

Install

pip install context-graph-connector

With graph extraction (requires ML models):

pip install "context-graph-connector[extraction]"

With everything:

pip install "context-graph-connector[all]"

CLI

# Discover what's in a database
cgc discover sqlite ./mydata.db

# Sample rows from a table
cgc sample sqlite ./mydata.db users --n 10

# Run a SQL query
cgc sql sqlite ./mydata.db "SELECT * FROM orders WHERE total > 100"

# Extract knowledge graph from text
cgc extract "Steve Jobs co-founded Apple with Steve Wozniak in 1976"

# Extract from a file and store in a graph database
cgc extract-file ./report.pdf --sink kuzudb://./my_graph

# Chunk a large PDF for processing
cgc chunk filesystem ./docs report.pdf --strategy tokens:2000

# Start the HTTP API server
cgc serve

# Start the MCP server (for Claude integration)
cgc mcp
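For scripting, the commands above can be driven from Python with `subprocess`. This is a minimal sketch that assumes `cgc` is on your `PATH`; `cgc_argv` and `run_cgc` are hypothetical helpers, and the output format of each command is not assumed here.

```python
import subprocess


def cgc_argv(*args: object) -> list[str]:
    """Build the argument vector for a cgc invocation (no shell involved)."""
    return ["cgc", *(str(arg) for arg in args)]


def run_cgc(*args: object) -> str:
    """Run cgc and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        cgc_argv(*args),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With these helpers, `run_cgc("sample", "sqlite", "./mydata.db", "users", "--n", 10)` mirrors the sampling command shown above. Passing a list of arguments rather than a shell string avoids quoting problems with SQL queries.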

MCP Integration

Claude Code (VS Code):

claude mcp add cgc -- cgc mcp

Claude Desktop / Cursor / Windsurf — add to your config:

{
  "mcpServers": {
    "cgc": {
      "command": "cgc",
      "args": ["mcp"]
    }
  }
}
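If your config file already lists other MCP servers, the entry above has to be merged in rather than pasted over them. The sketch below shows one way to do that; the config file location differs per client and is left to the caller, and `with_cgc_server` and `update_config_file` are hypothetical helper names.

```python
import json
from pathlib import Path


def with_cgc_server(config: dict) -> dict:
    """Return a copy of config with the cgc MCP server entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["cgc"] = {"command": "cgc", "args": ["mcp"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the cgc entry into the JSON config at path, creating it if absent."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(with_cgc_server(config), indent=2) + "\n")
```

Existing servers and unrelated settings are preserved, and the input dictionary is not modified.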

Python API

import asyncio

from cgc import Connector
from cgc.adapters.sql import SqlAdapter


async def main():
    connector = Connector()

    # Add a data source
    connector.add_source(SqlAdapter("mydb", "sqlite:///data.db"))

    # Discover schema
    schema = await connector.discover("mydb")

    # Sample data
    rows = await connector.sample("mydb", "users", 5)

    # Extract triplets (called without await, as in the original example)
    triplets = connector.extract_triplets("Elon Musk founded SpaceX in 2002")


# discover() and sample() are coroutines, so they need a running event loop
asyncio.run(main())
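To show how extracted triplets could end up in a Cypher-speaking graph sink, here is a rough sketch that turns a subject/predicate/object triple into a parameterized `MERGE` statement. The `Triplet` dataclass below is a hypothetical stand-in, not CGC's own `Triplet` type, and the `Entity` node label is likewise an assumption made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical triple shape; CGC's real Triplet type may differ."""
    subject: str
    predicate: str
    object: str


def relationship_type(predicate: str) -> str:
    """Normalize a predicate into a safe Cypher relationship type."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", predicate).strip("_").upper()
    if not cleaned or cleaned[0].isdigit():
        raise ValueError(f"cannot derive a relationship type from {predicate!r}")
    return cleaned


def to_cypher(triplet: Triplet) -> tuple[str, dict[str, str]]:
    """Build a MERGE query plus its parameters for one triplet."""
    # Relationship types cannot be passed as Cypher parameters, so the
    # sanitized name is interpolated; node names stay parameterized.
    rel = relationship_type(triplet.predicate)
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{rel}]->(o)"
    )
    return query, {"subject": triplet.subject, "object": triplet.object}
```

Using `MERGE` rather than `CREATE` keeps repeated extractions from duplicating nodes and edges.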

Optional Dependencies

CGC has a minimal core with optional extras for specific integrations:

| Extra | What it adds |
|---|---|
| extraction | GLiNER, GLiREL, spaCy, sentence-transformers (knowledge graph extraction) |
| postgres | asyncpg, pgvector (PostgreSQL support) |
| mysql | aiomysql (MySQL support) |
| vector | qdrant-client, pinecone-client, pymongo, motor (vector DB support) |
| graph | kuzu (embedded graph database) |
| all | Everything above |
| dev | pytest, ruff, mypy (development) |
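To check which extras are usable in the current environment, you can probe for one representative module per extra. This is only a sketch: the mapping below picks one import name per extra from the packages listed above, and it is an assumption that the presence of that module implies the whole extra is installed.

```python
from importlib.util import find_spec

# One representative import name per extra (an assumption for this sketch).
EXTRA_PROBES = {
    "extraction": "gliner",
    "postgres": "asyncpg",
    "mysql": "aiomysql",
    "vector": "qdrant_client",
    "graph": "kuzu",
}


def is_importable(module: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return find_spec(module) is not None


def available_extras() -> dict[str, bool]:
    """Map each extra name to whether its probe module is installed."""
    return {extra: is_importable(module) for extra, module in EXTRA_PROBES.items()}
```

Calling `available_extras()` before running an extraction or connecting to a vector store lets a script fail early with a clear message instead of hitting an `ImportError` midway.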

Architecture

cgc/
├── connector.py          # Main interface — Connector class
├── core/                 # Types: Schema, Query, Chunk, Triplet, Graph
├── adapters/
│   ├── sql.py            # PostgreSQL, MySQL, SQLite
│   ├── filesystem.py     # Local files (PDF, DOCX, CSV, etc.)
│   ├── vector/           # Qdrant, Pinecone, pgvector, MongoDB
│   └── graph/            # Neo4j, PostgreSQL AGE, KuzuDB (sinks)
├── discovery/            # Schema inference, relationship detection
│   ├── extractor.py      # Triplet extraction pipeline
│   ├── gliner.py         # GLiNER NER integration
│   ├── glirel.py         # GLiREL relation extraction
│   ├── router.py         # Industry pack routing (E5 embeddings)
│   ├── industry_packs.py # 17 domain-specific extraction configs
│   └── structured.py     # Hub-and-spoke structured data extraction
├── cli/                  # Typer CLI
├── api/                  # FastAPI HTTP server
├── mcp/                  # Model Context Protocol server
├── session/              # Session tracking
└── security/             # API key auth, rate limiting

Documentation

| Document | Description |
|---|---|
| API Reference | HTTP API endpoints |
| CLI Reference | Command-line interface |
| MCP Reference | Model Context Protocol for Claude |
| Security Guide | API keys, rate limiting, data protection |
| Technical Details | Architecture and internals |

Contributing

Contributions are welcome, but this project is maintained on a best-effort basis, so PRs may not be reviewed immediately. See CONTRIBUTING.md for guidelines.

License

Apache 2.0 — see LICENSE for details.
