MCP server for extracting entities from text chunks and creating entity graphs in Neo4j. Supports 100+ LLM providers via LiteLLM.

MCP Neo4j Entity Graph Server

MCP server for extracting entities from graph nodes and creating entity graphs in Neo4j.

Supports 100+ LLM providers via LiteLLM (OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, etc.)

Features

Multi-provider LLM support: Use any LLM via LiteLLM (OpenAI, Claude, Gemini, etc.)
Structured output: Uses JSON schema for reliable entity extraction
Direct graph creation: Entities created directly in Neo4j (no intermediate files)
Schema-driven: Define what entities/relationships to extract
Provenance tracking: EXTRACTED_FROM relationships link entities to source nodes
High parallelism: Default 20 concurrent extractions (configurable)
Batched writes: Optimized Neo4j writes (batch every 10 chunks by default)
Incremental: Only processes nodes without prior extraction (unless force=true)
Key normalization: Entity keys are normalized (lowercase) for better matching
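
As a rough illustration of the key normalization feature, the sketch below lowercases keys so that differently cased mentions collapse into one entity. The source states only that keys are lowercased; the whitespace handling, the merge rule, and the names `normalize_key` and `dedupe_by_key` are assumptions for illustration, not the package's API.

```python
def normalize_key(value: str) -> str:
    """Lowercase a key and collapse runs of whitespace (whitespace handling is an assumption)."""
    return " ".join(value.split()).lower()


def dedupe_by_key(entities: list[dict], key_property: str = "name") -> dict[str, dict]:
    """Merge entities whose normalized keys match; properties seen later win."""
    merged: dict[str, dict] = {}
    for entity in entities:
        key = normalize_key(entity[key_property])
        # Store the normalized key back on the entity so all copies agree.
        merged.setdefault(key, {}).update({**entity, key_property: key})
    return merged
```

With this sketch, `"Aspirin"` and `" aspirin "` would end up as a single `aspirin` entry.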

Supported Models

Models must support structured output (JSON schema). Tested models include:

Provider	Models
OpenAI	`gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-4o`, `gpt-4o-mini`
Anthropic	`claude-sonnet-4-20250514`, `claude-3-5-sonnet-20241022`
Google	`gemini/gemini-2.5-pro`, `gemini/gemini-2.5-flash`, `gemini/gemini-1.5-pro`
Azure OpenAI	`azure/gpt-4o`, `azure/gpt-4o-mini`
AWS Bedrock	`bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0`

Note: If a model doesn't support structured output, you'll get a clear error message with suggestions.

Tools

`extract_entities_from_graph`

Extracts entities from source nodes and creates entity graph directly in Neo4j.

Parameters:

Parameter	Default	Description
`schema_json`	required	Path to JSON schema file or inline JSON string
`source_label`	"Chunk"	Label of source nodes to extract from
`source_text_property`	"text"	Property containing text to extract from
`force`	false	If true, reprocess all nodes
`parallel`	20	Concurrent extractions (reduce to 5-10 if hitting rate limits)
`batch_size`	10	Chunks to batch before writing to Neo4j
`model`	env var	LLM model to use (defaults to the `EXTRACTION_MODEL` environment variable)

Workflow:

Queries nodes with the specified label that have no prior extraction (all nodes if force=true): MATCH (n:{source_label}) WHERE NOT (n)<-[:EXTRACTED_FROM]-()
Extracts entities using LLM with structured output (parallel)
Batches results and writes to Neo4j (optimized transactions)
Creates EXTRACTED_FROM relationships for provenance
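
The workflow above can be sketched with a semaphore that caps concurrent extractions and a buffer that is flushed every `batch_size` results. This is a minimal sketch, not the server's implementation: `run_pipeline`, `extract`, and `write_batch` are hypothetical names, and the LLM call and the Neo4j write are passed in as plain async callables.

```python
import asyncio
from typing import Awaitable, Callable

Chunk = dict
Extraction = dict


async def run_pipeline(
    chunks: list[Chunk],
    extract: Callable[[Chunk], Awaitable[Extraction]],
    write_batch: Callable[[list[Extraction]], Awaitable[None]],
    parallel: int = 20,
    batch_size: int = 10,
) -> int:
    """Run at most `parallel` extractions at once; write every `batch_size` results."""
    semaphore = asyncio.Semaphore(parallel)

    async def bounded(chunk: Chunk) -> Extraction:
        async with semaphore:  # blocks when `parallel` calls are already in flight
            return await extract(chunk)

    tasks = [asyncio.create_task(bounded(chunk)) for chunk in chunks]
    pending: list[Extraction] = []
    written = 0
    for finished in asyncio.as_completed(tasks):
        pending.append(await finished)
        if len(pending) >= batch_size:
            await write_batch(pending)
            written += len(pending)
            pending = []
    if pending:  # flush the final partial batch
        await write_batch(pending)
        written += len(pending)
    return written
```

Writing as results complete, rather than after all extractions finish, is what keeps progress visible while the run is still going.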

Examples:

# Extract from Chunk nodes with default model
extract_entities_from_graph(schema_json="/path/to/schema.json")

# Use a specific model
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    model="claude-sonnet-4-20250514"
)

# Reduce parallelism if hitting rate limits
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    parallel=5
)

# Extract from Page nodes
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    source_label="Page",
    source_text_property="content"
)

# Force re-extraction of all nodes
extract_entities_from_graph(schema_json="/path/to/schema.json", force=True)

`convert_schema`

Converts data model output from the Data Modeling MCP to extraction schema format.

Parameters:

`modeling_output`: JSON output from the Data Modeling MCP server
`output_path`: Path to save the extraction schema JSON file

Outputs:

`{output_path}` - Extraction schema JSON
`{output_path}` with a `.py` extension - Generated Pydantic model with normalization validators (for example, `schema.py` alongside `schema.json`)

Schema Format

{
  "entity_types": [
    {
      "label": "Medication",
      "description": "A pharmaceutical drug or medication",
      "key_property": "name",
      "properties": [
        {"name": "medicationClass", "type": "STRING", "description": "Drug class"}
      ]
    }
  ],
  "relationship_types": [
    {
      "type": "TREATS",
      "description": "Drug treats a condition",
      "source_entity": "Medication",
      "target_entity": "MedicalCondition"
    }
  ]
}
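
Before running an extraction, it can help to check that every relationship refers to an entity type the schema actually declares. The sketch below does that for the format shown above; `undefined_entity_labels` is a hypothetical helper, and whether the server performs a similar check is not stated in the source. The example above is abbreviated and declares only `Medication`, so the check reports `MedicalCondition` until that type is added.

```python
import json

EXAMPLE_SCHEMA = json.loads("""
{
  "entity_types": [
    {"label": "Medication", "description": "A pharmaceutical drug or medication",
     "key_property": "name",
     "properties": [{"name": "medicationClass", "type": "STRING", "description": "Drug class"}]}
  ],
  "relationship_types": [
    {"type": "TREATS", "description": "Drug treats a condition",
     "source_entity": "Medication", "target_entity": "MedicalCondition"}
  ]
}
""")


def undefined_entity_labels(schema: dict) -> set[str]:
    """Return labels used by relationships but missing from entity_types."""
    defined = {entity["label"] for entity in schema.get("entity_types", [])}
    referenced: set[str] = set()
    for rel in schema.get("relationship_types", []):
        referenced.update((rel["source_entity"], rel["target_entity"]))
    return referenced - defined
```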

Environment Variables

Variable	Default	Description
`NEO4J_URI`	bolt://localhost:7687	Neo4j connection URI
`NEO4J_USERNAME`	neo4j	Neo4j username
`NEO4J_PASSWORD`	(required)	Neo4j password
`NEO4J_DATABASE`	neo4j	Neo4j database name
`EXTRACTION_MODEL`	gpt-5-mini	Default LLM model for extraction
`OPENAI_API_KEY`	-	Required for OpenAI models
`ANTHROPIC_API_KEY`	-	Required for Anthropic models
`GEMINI_API_KEY`	-	Required for Google Gemini models
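
The defaults in the table can be read as the following configuration loader. It is a sketch under assumptions: `Settings` and `load_settings` are hypothetical names, and the server's own startup code may resolve these variables differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str
    neo4j_username: str
    neo4j_password: str
    neo4j_database: str
    extraction_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Apply the documented defaults; NEO4J_PASSWORD has none and must be set."""
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise ValueError("NEO4J_PASSWORD is required")
    return Settings(
        neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        neo4j_username=env.get("NEO4J_USERNAME", "neo4j"),
        neo4j_password=password,
        neo4j_database=env.get("NEO4J_DATABASE", "neo4j"),
        extraction_model=env.get("EXTRACTION_MODEL", "gpt-5-mini"),
    )
```

Provider API keys are left out of the sketch because LiteLLM reads them from the environment itself.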

LLM Provider Configuration

LiteLLM supports 100+ providers. Set the appropriate API key for your provider:

OpenAI (default)

export OPENAI_API_KEY="sk-..."
export EXTRACTION_MODEL="gpt-5-mini"  # or gpt-4o-mini, gpt-4o

Anthropic Claude

export ANTHROPIC_API_KEY="sk-ant-..."
export EXTRACTION_MODEL="claude-sonnet-4-20250514"

Google Gemini

export GEMINI_API_KEY="..."
export EXTRACTION_MODEL="gemini/gemini-2.5-pro"

Azure OpenAI

export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-02-15-preview"
export EXTRACTION_MODEL="azure/your-deployment-name"

AWS Bedrock

export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION_NAME="us-east-1"
export EXTRACTION_MODEL="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0"

Local Models (Ollama)

export EXTRACTION_MODEL="ollama/llama3.1"
# Note: Local models may not support structured output

See LiteLLM docs for all providers.

Usage with Cursor

Add to your ~/.cursor/mcp.json:

{
  "mcpServers": {
    "neo4j-entity-graph": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-neo4j-entity-graph", "run", "mcp-neo4j-entity-graph"],
      "env": {
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USERNAME": "neo4j",
        "NEO4J_PASSWORD": "your-password",
        "OPENAI_API_KEY": "your-api-key",
        "EXTRACTION_MODEL": "gpt-5-mini"
      }
    }
  }
}

Rate Limits & Performance

Parallelism

The default of 20 concurrent extractions favors throughput, but it may exceed the rate limits of some providers or account tiers.

If you see rate limit errors, reduce the `parallel` parameter:

# For rate-limited accounts
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    parallel=5  # Reduce from default 20
)

Batch Size

Extractions are batched before writing to Neo4j (default: 10 chunks per batch). This reduces Neo4j transactions while maintaining progress visibility.

# Larger batches for better Neo4j performance
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    batch_size=20
)
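
To see how `batch_size` affects the number of writes, the arithmetic can be sketched as below. It assumes one Neo4j write transaction per batch, which the source implies but does not state; `write_transactions` is a hypothetical helper.

```python
import math


def write_transactions(num_chunks: int, batch_size: int = 10) -> int:
    """Number of batched writes needed, assuming one transaction per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(num_chunks / batch_size)


print(write_transactions(1000))      # 100 with the default batch_size=10
print(write_transactions(1000, 20))  # 50 with batch_size=20
print(write_transactions(25))        # 3: two full batches and one partial
```

Larger batches mean fewer transactions, at the cost of less frequent progress updates and more work lost if a single write fails.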

Usage Example

# 1. Convert schema from Data Modeling MCP
convert_schema(
    modeling_output='{"nodes": [...], "relationships": [...]}',
    output_path="/path/to/schema.json"
)
# Creates: schema.json + schema.py (Pydantic model)

# 2. Extract entities from all Chunk nodes
extract_entities_from_graph(
    schema_json="/path/to/schema.json"
)
# Default: parallel=20, batch_size=10, model=gpt-5-mini

# 3. Use a different model
extract_entities_from_graph(
    schema_json="/path/to/schema.json",
    model="claude-sonnet-4-20250514",
    parallel=10  # Claude may have stricter rate limits
)

Graph Schema

After extraction, your Neo4j database will contain:

(:Entity)-[:EXTRACTED_FROM]->(:Chunk)
(:Entity)-[relationship]->(:Entity)
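
A batched write that produces the first pattern could be built along these lines. This is a sketch, not the server's actual query: `build_entity_merge` is a hypothetical helper, matching source nodes by `elementId` (Neo4j 5) is an assumption, and labels are interpolated into the query text, so they are validated first.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_entity_merge(label: str, key_property: str, source_label: str = "Chunk") -> str:
    """Build a Cypher statement that merges a batch of entities and links them to their sources.

    Expects a $rows parameter: a list of maps with source_id, key, and properties.
    """
    for name in (label, key_property, source_label):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        "UNWIND $rows AS row\n"
        f"MATCH (src:{source_label}) WHERE elementId(src) = row.source_id\n"
        f"MERGE (e:{label} {{{key_property}: row.key}})\n"
        "SET e += row.properties\n"
        "MERGE (e)-[:EXTRACTED_FROM]->(src)"
    )
```

Using `MERGE` on the normalized key means an entity mentioned in several chunks becomes one node with several `EXTRACTED_FROM` relationships.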

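Example query to explore extracted entities (this assumes your chunks are linked to Document nodes through PART_OF relationships, which this server does not create):

// Count distinct entities of each type extracted from a document
MATCH (e)-[:EXTRACTED_FROM]->(c:Chunk)-[:PART_OF]->(d:Document {name: "my-document"})
RETURN labels(e)[0] AS type, count(DISTINCT e) AS count
ORDER BY count DESC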

Release history

Version	Released
0.4.0	Apr 29, 2026
0.3.0	Apr 28, 2026
0.2.0 (this version)	Jan 13, 2026
