
Hybrid Graph-NLP Intelligence Platform


๐Ÿ•ธ๏ธ graphnlp-intel


graphnlp-intel is an open-source Python library and REST API that transforms unstructured documents into rich, interactive knowledge graphs using state-of-the-art NLP, relationship extraction, and GNN-based sentiment propagation.

🚀 Quickstart

Install the library and download the required spaCy model:

pip install graphnlp-intel
python -m spacy download en_core_web_sm

Run the full pipeline in a few lines of code:

from graphnlp import Pipeline

pipe = Pipeline(domain="finance")
result = pipe.run(["Goldman Sachs acquired a 5% stake in Microsoft for $2.3 billion."])

# Visualize, export, and summarize
result.graph.visualize("output.html") # Generates a Pyvis interactive HTML graph
result.export_json("output.json")    # Exports D3 compatible JSON
print(result.summary())              # Output stats on nodes, edges, sentiment, and communities

🧠 How it works

The system processes unstructured text through a 5-stage pipeline:

 📄 Ingestion      🔍 Extraction         🕸️ Graph Build        🧠 GNN              📈 Output
 DocumentLoader → NERExtractor        → GraphBuilder       → GraphGNN          → Pyvis HTML /
 TextChunker      RelationExtractor     CommunityDetector                        D3 JSON /
 EmailParser      EmbeddingExtractor                                             Neo4j / Redis

Standalone Extractor Usage

from graphnlp.extraction.ner import NERExtractor
from graphnlp.extraction.relations import RelationExtractor

ner = NERExtractor()
entities = ner.extract("Apple Inc reported revenue of $120 billion.")

rel_ext = RelationExtractor()
triples = rel_ext.extract("Apple Inc reported revenue of $120 billion.")

Standalone Graph Construction Usage

from graphnlp.graph.builder import GraphBuilder
from graphnlp.graph.community import CommunityDetector

# `triples`, `entities`, and `embeddings_dict` come from the extraction stage above
builder = GraphBuilder()
graph = builder.build(triples, entities, embeddings_dict)

detector = CommunityDetector()
communities = detector.detect(graph)
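The graph-building and community steps can be reproduced independently of graphnlp using networkx directly. The sketch below builds a toy directed graph from hypothetical (subject, predicate, object) triples and partitions its undirected projection with a modularity-based community algorithm (the library is described as using Louvain; greedy modularity is used here as a stand-in):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical (subject, predicate, object) triples, shaped like a
# relation extractor's output
triples = [
    ("Goldman Sachs", "acquired_stake_in", "Microsoft"),
    ("Goldman Sachs", "paid", "$2.3 billion"),
    ("Apple Inc", "reported", "$120 billion"),
]

g = nx.DiGraph()
for subj, pred, obj in triples:
    g.add_edge(subj, obj, predicate=pred)

# Modularity-based community detection runs on the undirected projection
communities = greedy_modularity_communities(g.to_undirected())
print([sorted(c) for c in communities])
```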

🧩 Domain adapters

Domain adapters supply contextual logic like schema mappings, preprocessing, and post-processing steps tailored to specific industries.

| Adapter | Entity Types | Use Case |
|---|---|---|
| finance | TICKER, ORG, AMOUNT, DATE | Parse fund records, expand ticker symbols, build COMPETITOR_OF graphs |
| email | PERSON, MERCHANT, MONEY | Strip HTML/headers, parse invoices, generate PAID_TO expense clusters |
| feedback | PRODUCT, SCORE, FEATURE | Normalize 5-star ratings, cluster feature complaints, link reviews |
| incidents | SERVICE, ERROR, SEV | Standardize P0/P1 flags, deduplicate logs, build AFFECTS topology graphs |

Using the Email Adapter

from graphnlp.adapters.base import get_adapter
from graphnlp.adapters.email import EmailAdapter
import networkx as nx

adapter = get_adapter("email")
clean_text = adapter.preprocess(raw_email_string)  # raw_email_string holds the unparsed email source

# Graph integration
g = nx.DiGraph()
g.add_edge("$234.56", "Amazon", predicate="paid_to")
spend_clusters = EmailAdapter.monthly_spend_summary(g)

Custom Adapter Implementation

from graphnlp.adapters.base import DomainAdapter

class HealthcareAdapter(DomainAdapter):
    @property
    def domain(self) -> str:
        return "healthcare"
        
    @property
    def entity_types(self) -> list[str]:
        return ["PATIENT", "SYMPTOM", "DRUG"]
        
    def preprocess(self, text: str) -> str:
        return text.replace("Pt.", "Patient")

⚡ API Server

Deploy the multi-tenant REST API via Docker:

make docker-up

Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /health | Check service health and system status. |
| POST | /v1/analyze | Submit documents for processing (sync or async). |
| GET | /v1/analyze/{job_id} | Poll status of an async analysis job. |
| GET | /v1/graph/{graph_id} | Retrieve D3.js-compatible graph JSON by ID. |
| GET | /v1/graph/{graph_id}/summary | Retrieve summarized stats of the graph. |
| POST | /v1/webhooks | Register a new webhook endpoint for async completion events. |
| GET | /v1/webhooks | List registered webhooks for the given tenant. |

Auth, Submit, and Poll

# Submit Sync
curl -X POST http://localhost:8000/v1/analyze \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"documents": ["Invoice 123 for $500 to AWS"], "domain": "finance", "async": false}'

# Submit Async
curl -X POST http://localhost:8000/v1/analyze \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"documents": ["Massive batch 1...", "Massive batch 2..."], "async": true}'

# Poll Async Status
curl -X GET http://localhost:8000/v1/analyze/job-1234 \
  -H "Authorization: Bearer sk-your-api-key"

📦 SDK Integration

Python SDK

pip install graphnlp-client

from graphnlp_client.client import GraphNLPClient

client = GraphNLPClient(api_key="sk-your-api-key", base_url="http://localhost:8000")

# Sync
result = client.analyze(["Azure bill $300"], domain="email")
print(result["graph_id"])

# Get Graph data
graph = client.get_graph(result["graph_id"])

TypeScript / JavaScript SDK

npm install graphnlp-client

import { GraphNLPClient } from 'graphnlp-client';

const client = new GraphNLPClient({ apiKey: 'sk-your-api-key' });

async function analyze() {
  const result = await client.analyze(['Q4 earnings were up 12%'], { domain: 'finance' });
  const graph = await client.getGraph(result.graph_id);
  console.log(graph.nodes);
}

๐Ÿช Webhooks

Register webhooks to receive JSON payloads upon async task completion.

curl -X POST http://localhost:8000/v1/webhooks \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://yourapp.com/hook", "events": ["graph.ready"], "secret": "wh_sec_123"}'

Webhook Payload Example

{
  "event": "graph.ready",
  "job_id": "job-1234",
  "graph_id": "graph-5678",
  "tenant_id": "tenant-abc",
  "timestamp": "2026-04-18T10:00:00Z",
  "signature": "sha256=d2b8b9a..."
}
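The payload's signature field and the HMAC dispatcher in the architecture suggest an HMAC-SHA256 digest of the delivery body keyed by the registered secret (exactly which bytes are signed is an assumption; confirm against the dispatcher). A receiver can verify such a signature with the Python stdlib:

```python
import hashlib
import hmac
import json

def verify_signature(payload_bytes: bytes, signature_header: str, secret: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    expected = "sha256=" + hmac.new(
        secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Simulate a delivery signed with the secret registered above
body = json.dumps({"event": "graph.ready", "job_id": "job-1234"}).encode()
signature = "sha256=" + hmac.new(b"wh_sec_123", body, hashlib.sha256).hexdigest()
print(verify_signature(body, signature, "wh_sec_123"))  # True
```

Always compare digests with hmac.compare_digest rather than ==, so the check does not leak timing information.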

โš™๏ธ Configuration

Configure the platform using config/default.yaml or environment variables:

# config/default.yaml
environment: production
neo4j:
  uri: bolt://localhost:7687
redis:
  url: redis://localhost:6379
api:
  rate_limit_per_minute: 100
nlp:
  ner_model: en_core_web_sm
  embedding_model: all-MiniLM-L6-v2
# .env
GRAPHNLP_ENVIRONMENT=production
GRAPHNLP_NEO4J_URI=bolt://neo4j:7687
GRAPHNLP_NEO4J_USER=neo4j
GRAPHNLP_NEO4J_PASSWORD=supersecret
GRAPHNLP_REDIS_URL=redis://redis:6379
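The environment variables share a GRAPHNLP_ prefix, and config.py is described as Pydantic Settings, which conventionally maps prefixed variables onto setting fields. A stdlib-only sketch of that mapping (illustrative only, not the library's actual loader):

```python
import os

def load_settings(prefix: str = "GRAPHNLP_") -> dict[str, str]:
    """Collect prefixed environment variables into a flat settings dict."""
    return {
        key[len(prefix):].lower(): value
        for key, value in os.environ.items()
        if key.startswith(prefix)
    }

os.environ["GRAPHNLP_NEO4J_URI"] = "bolt://neo4j:7687"
os.environ["GRAPHNLP_REDIS_URL"] = "redis://redis:6379"
settings = load_settings()
print(settings["neo4j_uri"])  # bolt://neo4j:7687
```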

๐Ÿ› ๏ธ CLI Reference

Manage the platform using the built-in Typer CLI:

  • graphnlp run --domain finance --file data.csv : Run pipeline on a local file.
  • graphnlp serve --port 8000 --reload : Start the FastAPI server.
  • graphnlp worker --concurrency 4 : Start the Celery async worker.
  • graphnlp generate-key -t my-tenant : Generate a new API key for the specified tenant.

๐Ÿ—๏ธ Architecture

graphnlp-intel/
├── graphnlp/
│   ├── config.py              # Pydantic Settings
│   ├── pipeline.py            # Main Orchestrator
│   ├── ingestion/             # Loaders, Chunkers, Email Parsers
│   ├── extraction/            # NER, Relations, SBERT Embeddings
│   ├── graph/                 # NetworkX Builder, PyG GNN, Diff, Louvain
│   ├── adapters/              # Domain-specific logic
│   ├── storage/               # Neo4j & Redis handlers
│   ├── api/                   # FastAPI routes, Auth, Tenant Middleware
│   ├── queue/                 # Celery workers & tasks
│   └── webhooks/              # HMAC Dispatcher
├── tests/
│   ├── unit/                  # Isolated logic blocks
│   ├── integration/           # E2E API tests
│   └── fixtures/              # CSV/JSON samples
├── sdk/
│   ├── python/                # PyPI API wrapper
│   └── js/                    # NPM API wrapper
├── docker/
│   ├── docker-compose.yml     # Local orchestration
│   ├── Dockerfile             # API Container
│   └── Dockerfile.worker      # Celery Container
└── pyproject.toml             # Dependencies & metadata

📚 Open Source Stack

We stand on the shoulders of giants.

| Component | Library |
|---|---|
| NLP Base | spacy |
| Deep Learning | torch |
| Graph Neural Nets | torch-geometric |
| Language Models | transformers |
| Sentence Embeddings | sentence-transformers |
| Graph Analytics | networkx |
| Async Queue | celery |
| Web Framework | fastapi |
| Configuration | pydantic |
| Caching & Rate Limits | redis.asyncio |
| Graph Persistence | neo4j (async driver) |
| CLI Generation | typer |

๐Ÿ—บ๏ธ Roadmap

| Phase | Milestone | Expected |
|---|---|---|
| Phase 1 | Streaming Engine (Kafka integration, real-time diffing) | Q3 2026 |
| Phase 2 | Custom Model Fine-Tuning (LoRA automated pipeline) | Q4 2026 |
| Phase 3 | Visual Graph Dashboard (React SPA for interactive analytics) | Q1 2027 |

💼 Custom Builds & Enterprise

| Tier | Price | Features |
|---|---|---|
| Open Source | Free | Apache 2.0 · Self-hosted · All adapters · CLI |
| Custom NER | $800–2,000 | Fine-tune NER · HF model delivery · Eval report |
| Hosted API | $2,500 + $400/mo | FEATURED · AWS/GCP/Azure deploy · Docker + TF · SDK |
| Enterprise | $8,000+ | Streaming · Dashboard · Alerting SLA · White-label |

Interested in Hosted API or Enterprise tiers? Get a quote on our site.

๐Ÿค Contributing

We welcome contributions!

git clone https://github.com/samvardhan03/GraphNLP-Intel.git
cd GraphNLP-Intel
./setup_dev.sh
make test

📄 License

This project is licensed under the Apache License 2.0.

@software{graphnlpintel2026,
  author = {GraphNLP Team},
  title = {graphnlp-intel: Hybrid Graph-NLP Intelligence Platform},
  year = {2026},
  url = {https://github.com/samvardhan03/GraphNLP-Intel}
}
