Hybrid Graph-NLP Intelligence Platform
graphnlp-intel
graphnlp-intel is an open-source Python library and REST API that transforms unstructured documents into rich, interactive knowledge graphs using state-of-the-art NLP, relationship extraction, and GNN-based sentiment propagation.
Quickstart
Install the library and download the required spaCy model:
pip install graphnlp-intel
python -m spacy download en_core_web_sm
Run the pipeline in 6 lines of code:
from graphnlp import Pipeline
pipe = Pipeline(domain="finance")
result = pipe.run(["Goldman Sachs acquired a 5% stake in Microsoft for $2.3 billion."])
# Visualize, export, and summarize
result.graph.visualize("output.html") # Generates a Pyvis interactive HTML graph
result.export_json("output.json") # Exports D3-compatible JSON
print(result.summary()) # Output stats on nodes, edges, sentiment, and communities
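The exported file can be inspected with the standard library alone. The sketch below assumes the export follows the common D3 node-link shape: a top-level object with `nodes` (each carrying an `id`) and `links` (each carrying `source` and `target`). Those field names, the `predicate` values, and the inlined sample are assumptions rather than something this README confirms, so check a real `output.json` before relying on them.

```python
import json
from collections import Counter

# Hypothetical sample in the assumed D3 node-link shape (not real library output).
SAMPLE = """
{
  "nodes": [{"id": "Goldman Sachs"}, {"id": "Microsoft"}, {"id": "$2.3 billion"}],
  "links": [
    {"source": "Goldman Sachs", "target": "Microsoft", "predicate": "acquired_stake_in"},
    {"source": "Goldman Sachs", "target": "$2.3 billion", "predicate": "paid"}
  ]
}
"""


def node_degrees(d3_graph: dict) -> Counter:
    """Count how many links touch each node id."""
    degrees = Counter({node["id"]: 0 for node in d3_graph["nodes"]})
    for link in d3_graph["links"]:
        degrees[link["source"]] += 1
        degrees[link["target"]] += 1
    return degrees


d3_graph = json.loads(SAMPLE)
degrees = node_degrees(d3_graph)
print(degrees.most_common(1))
```

To run this against a real export, replace `json.loads(SAMPLE)` with `json.load(open("output.json"))` and adjust the field names if they differ.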
How it works
The system processes unstructured text through a 5-stage pipeline:
| Stage | Components |
|---|---|
| Ingestion | DocumentLoader, TextChunker, EmailParser |
| Extraction | NERExtractor, RelationExtractor, EmbeddingExtractor |
| Graph Build | GraphBuilder, CommunityDetector |
| GNN | GraphGNN |
| Output | Pyvis HTML, D3 JSON, Neo4j / Redis |
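To give a feel for what sentiment propagation in the GNN stage means, here is a deliberately simplified stand-in: each node blends its own score with the mean score of its neighbors for a few rounds. This is plain neighbor averaging, not the library's `GraphGNN`; the function name `propagate_sentiment` and the parameters `rounds` and `alpha` are hypothetical.

```python
from collections import defaultdict


def propagate_sentiment(edges, seed_scores, rounds=2, alpha=0.5):
    """Blend each node's score with the mean score of its neighbors.

    edges: iterable of (node_a, node_b) pairs, treated as undirected.
    seed_scores: dict of node -> initial sentiment; nodes without a seed start at 0.
    alpha: weight kept on the node's own score in each round.
    Only nodes that appear in at least one edge are scored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    scores = {node: seed_scores.get(node, 0.0) for node in neighbors}
    for _ in range(rounds):
        updated = {}
        for node, adjacent in neighbors.items():
            mean_neighbor = sum(scores[n] for n in adjacent) / len(adjacent)
            updated[node] = alpha * scores[node] + (1 - alpha) * mean_neighbor
        scores = updated
    return scores


edges = [("Goldman Sachs", "Microsoft"), ("Microsoft", "Azure")]
scores = propagate_sentiment(edges, {"Goldman Sachs": 1.0}, rounds=1)
print(scores)
```

After one round the positive seed on "Goldman Sachs" has leaked to its direct neighbor but not yet to "Azure"; more rounds spread it further. A trained GNN learns the mixing weights instead of using a fixed `alpha`.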
Standalone Extractor Usage
from graphnlp.extraction.ner import NERExtractor
from graphnlp.extraction.relations import RelationExtractor
ner = NERExtractor()
entities = ner.extract("Apple Inc reported revenue of $120 billion.")
rel_ext = RelationExtractor()
triples = rel_ext.extract("Apple Inc reported revenue of $120 billion.")
Standalone Graph Construction Usage
from graphnlp.graph.builder import GraphBuilder
from graphnlp.graph.community import CommunityDetector
import networkx as nx
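builder = GraphBuilder()
# triples and entities come from the extractor example above; embeddings_dict is a
# placeholder for the embeddings mapping, which this snippet does not define
graph = builder.build(triples, entities, embeddings_dict)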
detector = CommunityDetector()
communities = detector.detect(graph)
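The architecture listing further down mentions Louvain, which searches for a partition that maximizes modularity. The sketch below only computes that modularity score for a given partition of an undirected, unweighted graph, so the quality of a detected community split can be sanity-checked by hand. It is a standalone illustration, not the `CommunityDetector` implementation; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs with no duplicates.
    communities: list of node sets covering every node exactly once.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    membership = {node: idx for idx, group in enumerate(communities) for node in group}

    intra_edges = defaultdict(int)  # edges with both ends in the same community
    degree_sum = defaultdict(int)   # total degree of each community
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            intra_edges[membership[u]] += 1

    # Q = sum over communities of (intra-edge share - expected share).
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


# Two triangles joined by a single bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"),
         ("c", "d")]
q_split = modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}])   # 5/14, about 0.357
q_single = modularity(edges, [{"a", "b", "c", "d", "e", "f"}])    # 0.0
print(q_split, q_single)
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the kind of partition a modularity-based detector is expected to prefer.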
Domain adapters
Domain adapters supply contextual logic like schema mappings, preprocessing, and post-processing steps tailored to specific industries.
| Adapter | Entity Types | Use Case |
|---|---|---|
| finance | TICKER, ORG, AMOUNT, DATE | Parse fund records, expand ticker symbols, build COMPETITOR_OF graphs |
| email | PERSON, MERCHANT, MONEY | Strip HTML/headers, parse invoices, generate PAID_TO expense clusters |
| feedback | PRODUCT, SCORE, FEATURE | Normalize 5-star ratings, cluster feature complaints, link reviews |
| incidents | SERVICE, ERROR, SEV | Standardize P0/P1 flags, deduplicate logs, build AFFECTS topological graphs |
Using the Email Adapter
from graphnlp.adapters.base import get_adapter
from graphnlp.adapters.email import EmailAdapter
import networkx as nx
adapter = get_adapter("email")
# raw_email_string is the raw email text (headers and body), supplied by the caller
clean_text = adapter.preprocess(raw_email_string)
# Graph integration
g = nx.DiGraph()
g.add_edge("$234.56", "Amazon", predicate="paid_to")
spend_clusters = EmailAdapter.monthly_spend_summary(g)
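The README does not describe what `monthly_spend_summary` returns, so the following is only a rough illustration of the idea behind expense clustering: summing the amounts on `paid_to` edges per merchant. It works on plain `(amount, merchant, predicate)` tuples instead of a NetworkX graph, ignores dates entirely, and the function name `spend_by_merchant` is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def spend_by_merchant(edges):
    """Sum amounts over (amount, merchant, predicate) edges labelled 'paid_to'."""
    totals = defaultdict(Decimal)  # Decimal() starts each merchant at 0
    for amount, merchant, predicate in edges:
        if predicate != "paid_to":
            continue
        # Strip the currency symbol and thousands separators before parsing.
        totals[merchant] += Decimal(amount.replace("$", "").replace(",", ""))
    return dict(totals)


edges = [
    ("$234.56", "Amazon", "paid_to"),
    ("$1,000.00", "Amazon", "paid_to"),
    ("$12.99", "Netflix", "paid_to"),
    ("Amazon", "Seattle", "located_in"),  # ignored: not a payment edge
]
totals = spend_by_merchant(edges)
print(totals)
```

`Decimal` is used instead of `float` so that currency sums stay exact.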
Custom Adapter Implementation
from graphnlp.adapters.base import DomainAdapter
class HealthcareAdapter(DomainAdapter):
@property
def domain(self) -> str:
return "healthcare"
@property
def entity_types(self) -> list[str]:
return ["PATIENT", "SYMPTOM", "DRUG"]
def preprocess(self, text: str) -> str:
return text.replace("Pt.", "Patient")
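To try the adapter pattern without installing the library, the sketch below pairs the `HealthcareAdapter` above with a minimal stand-in base class. The stand-in `DomainAdapter` is hypothetical: it assumes the real base class requires `domain` and `entity_types` and lets `preprocess` default to a pass-through, which this README does not confirm.

```python
from abc import ABC, abstractmethod


class DomainAdapter(ABC):
    """Hypothetical stand-in for graphnlp.adapters.base.DomainAdapter."""

    @property
    @abstractmethod
    def domain(self) -> str: ...

    @property
    @abstractmethod
    def entity_types(self) -> list[str]: ...

    def preprocess(self, text: str) -> str:
        # Assumed default: pass the text through unchanged.
        return text


class HealthcareAdapter(DomainAdapter):
    @property
    def domain(self) -> str:
        return "healthcare"

    @property
    def entity_types(self) -> list[str]:
        return ["PATIENT", "SYMPTOM", "DRUG"]

    def preprocess(self, text: str) -> str:
        # Expand a common clinical abbreviation before extraction.
        return text.replace("Pt.", "Patient")


adapter = HealthcareAdapter()
cleaned = adapter.preprocess("Pt. reports mild fever.")
print(adapter.domain, adapter.entity_types, cleaned)
```

Declaring the two properties abstract means a subclass that forgets one of them fails at instantiation rather than deep inside a pipeline run.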
API Server
Deploy the multi-tenant REST API via Docker:
make docker-up
Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Check service health and system status. |
| POST | /v1/analyze | Submit documents for processing (sync or async). |
| GET | /v1/analyze/{job_id} | Poll status of an async analysis job. |
| GET | /v1/graph/{graph_id} | Retrieve D3.js-compatible graph JSON by ID. |
| GET | /v1/graph/{graph_id}/summary | Retrieve summarized stats of the graph. |
| POST | /v1/webhooks | Register a new webhook endpoint for async completion events. |
| GET | /v1/webhooks | List registered webhooks for the given tenant. |
Auth, Submit, and Poll
# Submit Sync
curl -X POST http://localhost:8000/v1/analyze \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"documents": ["Invoice 123 for $500 to AWS"], "domain": "finance", "async": false}'
# Submit Async
curl -X POST http://localhost:8000/v1/analyze \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"documents": ["Massive batch 1...", "Massive batch 2..."], "async": true}'
# Poll Async Status
curl -X GET http://localhost:8000/v1/analyze/job-1234 \
-H "Authorization: Bearer sk-your-api-key"
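The same submit-and-poll flow can be scripted with the Python standard library. The sketch below builds the `POST /v1/analyze` request from the curl examples and polls through an injected `fetch_status` callable, so it can be exercised without a running server. The response shape is an assumption: the `status` field and the terminal values `"completed"` and `"failed"` are not documented in this README, and the helper names are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"


def build_analyze_request(api_key, documents, domain=None, run_async=False):
    """Build (but do not send) the POST /v1/analyze request."""
    body = {"documents": documents, "async": run_async}
    if domain is not None:
        body["domain"] = domain
    return urllib.request.Request(
        f"{BASE_URL}/v1/analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def poll_job(fetch_status, job_id, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal state.

    fetch_status should return the decoded JSON body of GET /v1/analyze/{job_id}.
    The "status" field and its values are assumptions, not documented behavior.
    """
    for _ in range(max_attempts):
        job = fetch_status(job_id)
        if job.get("status") in {"completed", "failed"}:
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")


request = build_analyze_request("sk-your-api-key", ["Massive batch 1..."], run_async=True)
print(request.get_method(), request.full_url)
```

Against a live server, send the request with `urllib.request.urlopen(request)` and pass `poll_job` a `fetch_status` that performs the authenticated GET and decodes the JSON body.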
SDK Integration
Python SDK
pip install graphnlp-client
from graphnlp_client.client import GraphNLPClient
client = GraphNLPClient(api_key="sk-your-api-key", base_url="http://localhost:8000")
# Sync
result = client.analyze(["Azure bill $300"], domain="email")
print(result["graph_id"])
# Get Graph data
graph = client.get_graph(result["graph_id"])
TypeScript / JavaScript SDK
npm install graphnlp-client
import { GraphNLPClient } from 'graphnlp-client';
const client = new GraphNLPClient({ apiKey: 'sk-your-api-key' });
async function analyze() {
const result = await client.analyze(['Q4 earnings were up 12%'], { domain: 'finance' });
const graph = await client.getGraph(result.graph_id);
console.log(graph.nodes);
}
Webhooks
Register webhooks to receive JSON payloads upon async task completion.
curl -X POST http://localhost:8000/v1/webhooks \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{"url": "https://yourapp.com/hook", "events": ["graph.ready"], "secret": "wh_sec_123"}'
Webhook Payload Example
{
"event": "graph.ready",
"job_id": "job-1234",
"graph_id": "graph-5678",
"tenant_id": "tenant-abc",
"timestamp": "2026-04-18T10:00:00Z",
"signature": "sha256=d2b8b9a..."
}
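Receivers should verify the `signature` field before trusting a payload. The README does not say exactly what is signed, so the sketch below makes an explicit assumption: the signature is an HMAC-SHA256, keyed with the webhook secret, over the payload's JSON with the `signature` field removed, keys sorted, and no extra whitespace. If the dispatcher signs the raw request body instead, only the construction of `message` changes. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_payload(payload: dict, secret: str) -> str:
    """Return 'sha256=<hex>' computed over the payload without its 'signature' field."""
    unsigned = {k: v for k, v in payload.items() if k != "signature"}
    # Assumed canonical form: sorted keys, compact separators.
    message = json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify_payload(payload: dict, secret: str) -> bool:
    """Compare the received and recomputed signatures in constant time."""
    received = str(payload.get("signature", ""))
    expected = sign_payload(payload, secret)
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))


payload = {
    "event": "graph.ready",
    "job_id": "job-1234",
    "graph_id": "graph-5678",
    "tenant_id": "tenant-abc",
    "timestamp": "2026-04-18T10:00:00Z",
}
payload["signature"] = sign_payload(payload, "wh_sec_123")
print(verify_payload(payload, "wh_sec_123"))
```

`hmac.compare_digest` avoids leaking how many leading characters matched, which a plain `==` comparison can do through timing.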
Configuration
Configure the platform using config/default.yaml or environment variables:
# config/default.yaml
environment: production
neo4j:
uri: bolt://localhost:7687
redis:
url: redis://localhost:6379
api:
rate_limit_per_minute: 100
nlp:
ner_model: en_core_web_sm
embedding_model: all-MiniLM-L6-v2
# .env
GRAPHNLP_ENVIRONMENT=production
GRAPHNLP_NEO4J_URI=bolt://neo4j:7687
GRAPHNLP_NEO4J_USER=neo4j
GRAPHNLP_NEO4J_PASSWORD=supersecret
GRAPHNLP_REDIS_URL=redis://redis:6379
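The variable names above suggest a simple mapping from `GRAPHNLP_<SECTION>_<FIELD>` to the nested YAML keys. The sketch below illustrates that mapping under two assumptions that this README does not state: environment variables take precedence over `config/default.yaml`, and the first underscore after the prefix separates the section from the field. It is not the library's `config.py`; the function name is hypothetical and values are left as strings without type coercion.

```python
import copy


def apply_env_overrides(config: dict, environ: dict, prefix: str = "GRAPHNLP_") -> dict:
    """Return a copy of config with prefixed environment variables applied."""
    merged = copy.deepcopy(config)
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        key = name[len(prefix):].lower()
        section, _, field = key.partition("_")
        if field and isinstance(merged.get(section), dict):
            merged[section][field] = value  # e.g. NEO4J_URI -> neo4j.uri
        else:
            merged[key] = value             # e.g. ENVIRONMENT -> environment
    return merged


defaults = {
    "environment": "production",
    "neo4j": {"uri": "bolt://localhost:7687"},
    "redis": {"url": "redis://localhost:6379"},
    "api": {"rate_limit_per_minute": 100},
}
env = {
    "GRAPHNLP_NEO4J_URI": "bolt://neo4j:7687",
    "GRAPHNLP_NEO4J_USER": "neo4j",
    "GRAPHNLP_REDIS_URL": "redis://redis:6379",
    "PATH": "/usr/bin",  # unrelated variables are ignored
}
settings = apply_env_overrides(defaults, env)
print(settings["neo4j"], settings["redis"])
```

In a real deployment `environ` would be `os.environ`; passing a plain dict keeps the example deterministic.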
CLI Reference
Manage the platform using the built-in Typer CLI:
- graphnlp run --domain finance --file data.csv: Run pipeline on a local file.
- graphnlp serve --port 8000 --reload: Start the FastAPI server.
- graphnlp worker --concurrency 4: Start the Celery async worker.
- graphnlp generate-key -t my-tenant: Generate a new API key for the specified tenant.
Architecture
graphnlp-intel/
├── graphnlp/
│   ├── config.py          # Pydantic Settings
│   ├── pipeline.py        # Main Orchestrator
│   ├── ingestion/         # Loaders, Chunkers, Email Parsers
│   ├── extraction/        # NER, Relations, SBERT Embeddings
│   ├── graph/             # NetworkX Builder, PyG GNN, Diff, Louvain
│   ├── adapters/          # Domain-specific logic
│   ├── storage/           # Neo4j & Redis handlers
│   ├── api/               # FastAPI routes, Auth, Tenant Middleware
│   ├── queue/             # Celery workers & tasks
│   └── webhooks/          # HMAC Dispatcher
├── tests/
│   ├── unit/              # Isolated logic blocks
│   ├── integration/       # E2E API tests
│   └── fixtures/          # CSV/JSON samples
├── sdk/
│   ├── python/            # PyPI API wrapper
│   └── js/                # NPM API wrapper
├── docker/
│   ├── docker-compose.yml # Local orchestration
│   ├── Dockerfile         # API Container
│   └── Dockerfile.worker  # Celery Container
└── pyproject.toml         # Dependencies & metadata
Open Source Stack
We stand on the shoulders of giants.
| Component | Library |
|---|---|
| NLP Base | spacy |
| Deep Learning | torch |
| Graph Neural Nets | torch-geometric |
| Language Models | transformers |
| Sentence Embeddings | sentence-transformers |
| Graph Analytics | networkx |
| Async Queue | celery |
| Web Framework | fastapi |
| Configuration | pydantic |
| Caching & Rate Limits | redis.asyncio |
| Graph Persistence | neo4j (async driver) |
| CLI Generation | typer |
Roadmap
| Phase | Milestone | Expected |
|---|---|---|
| Phase 1 | Streaming Engine (Kafka integration, real-time diffing) | Q3 2026 |
| Phase 2 | Custom Model Fine-Tuning (LoRA automated pipeline) | Q4 2026 |
| Phase 3 | Visual Graph Dashboard (React SPA for interactive analytics) | Q1 2027 |
Custom Builds & Enterprise
| Tier | Price | Features |
|---|---|---|
| Open Source | Free | Apache 2.0 · Self-hosted · All adapters · CLI |
| Custom NER | $800–2,000 | Fine-tune NER · HF model delivery · Eval report |
| Hosted API | $2,500 + $400/mo | FEATURED · AWS/GCP/Azure deploy · Docker + TF · SDK |
| Enterprise | $8,000+ | Streaming · Dashboard · Alerting SLA · White-label |
Interested in Hosted API or Enterprise tiers? Get a quote on our site.
Contributing
We welcome contributions!
git clone https://github.com/samvardhan03/GraphNLP-Intel.git
cd GraphNLP-Intel
./setup_dev.sh
make test
License
This project is licensed under the Apache License 2.0.
@software{graphnlpintel2026,
author = {GraphNLP Team},
title = {graphnlp-intel: Hybrid Graph-NLP Intelligence Platform},
year = {2026},
url = {https://github.com/samvardhan03/GraphNLP-Intel}
}