
Transactional Graph + Vector retrieval system for InterSystems IRIS with hybrid search, openCypher, and GraphQL APIs


iris-vector-graph

Knowledge graph engine for InterSystems IRIS — vector search, openCypher, graph analytics, and PLAID multi-vector retrieval.

Requires Python 3.10+ and IRIS 2024.1+. License: MIT.


Install

pip install iris-vector-graph                # Core: just intersystems-irispython
pip install "iris-vector-graph[full]"        # Full: + FastAPI, GraphQL, numpy, networkx (quoted so zsh does not expand the brackets)
pip install "iris-vector-graph[plaid]"       # + sklearn for PLAID K-means build

ObjectScript Only (IPM)

zpm "install iris-vector-graph-core"

Pure ObjectScript with no Python dependency. Includes VecIndex, PLAIDSearch, PageRank, Subgraph, and GraphIndex. Works on any IRIS 2024.1+ instance, on all license tiers.


What It Does

| Capability | Description |
|---|---|
| VecIndex | RP-tree ANN vector search in pure ObjectScript with $vectorop SIMD. Annoy-style two-means splitting. |
| PLAID | Multi-vector retrieval (ColBERT-style): centroid scoring → candidate generation → exact MaxSim. Single server-side call. |
| HNSW | Native IRIS VECTOR index via kg_KNN_VEC. Sub-2ms search. |
| Cypher | openCypher parser/translator: MATCH, WHERE, RETURN, CREATE, DELETE, WITH, named paths, CALL subqueries. |
| Graph Analytics | PageRank, WCC, CDLP, PPR-guided subgraph in pure ObjectScript over ^KG globals. |
| FHIR Bridge | ICD-10→MeSH mapping via UMLS for clinical-to-KG integration. |
| GraphQL | Auto-generated schema from knowledge graph labels. |

Quick Start

Python

import iris
from iris_vector_graph.engine import IRISGraphEngine

conn = iris.connect(hostname='localhost', port=1972, namespace='USER', username='_SYSTEM', password='SYS')
engine = IRISGraphEngine(conn)
engine.initialize_schema()

Vector Search (VecIndex)

engine.vec_create_index("drugs", 384, "cosine")
engine.vec_insert("drugs", "metformin", embedding_vector)
engine.vec_build("drugs")

results = engine.vec_search("drugs", query_vector, k=5)
# [{"id": "metformin", "score": 0.95}, ...]
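Because VecIndex is an approximate (ANN) index, it can be useful to compare its results against an exact scan on a small sample. The following is a minimal sketch in plain Python, not part of the package: `brute_force_top_k` and `recall_at_k` are hypothetical helper names, and the toy vectors are illustrative only. It returns results in the same `{"id", "score"}` shape shown above.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def brute_force_top_k(vectors, query, k=5):
    """Exact top-k by cosine similarity over a dict of id -> vector."""
    scored = [{"id": vid, "score": cosine(vec, query)} for vid, vec in vectors.items()]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:k]

def recall_at_k(approx, exact):
    """Fraction of the exact top-k ids that the approximate result also returned."""
    exact_ids = {r["id"] for r in exact}
    if not exact_ids:
        return 1.0
    return len(exact_ids & {r["id"] for r in approx}) / len(exact_ids)

# Toy 2-dimensional data (illustrative only, not real embeddings).
toy = {"metformin": [1.0, 0.0], "insulin": [0.8, 0.6], "aspirin": [0.0, 1.0]}
exact = brute_force_top_k(toy, [1.0, 0.0], k=2)
```

To check an index, pass the output of `engine.vec_search(...)` as `approx` and the brute-force result over the same vectors as `exact`.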

PLAID Multi-Vector Search

# Build: Python K-means + ObjectScript inverted index
engine.plaid_build("colbert_idx", docs)  # docs = [{"id": "x", "tokens": [[f1,...], ...]}, ...]

# Search: single server-side call, pure $vectorop
results = engine.plaid_search("colbert_idx", query_tokens, k=10)
# [{"id": "doc_3", "score": 0.94}, ...]
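The final PLAID stage is exact MaxSim scoring. As a reference for what that score means, here is a small pure-Python sketch of ColBERT-style MaxSim: for each query token, take the best dot product against any document token, then sum over query tokens. It skips the centroid and candidate-generation stages entirely, and `maxsim` and `rank_docs` are hypothetical names, not the package API.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_tokens, doc_tokens):
    """Sum over query tokens of the maximum dot product against any doc token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

def rank_docs(docs, query_tokens, k=10):
    """Score every document exhaustively and return the top k, best first."""
    scored = [
        {"id": doc["id"], "score": maxsim(query_tokens, doc["tokens"])}
        for doc in docs
        if doc["tokens"]  # skip documents with no token vectors
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:k]

# Toy documents in the same shape plaid_build expects (illustrative values).
docs = [
    {"id": "doc_a", "tokens": [[1.0, 0.0], [0.0, 1.0]]},
    {"id": "doc_b", "tokens": [[1.0, 0.0]]},
]
ranked = rank_docs(docs, [[1.0, 0.0], [0.0, 1.0]], k=2)
```

Here `doc_a` matches both query tokens and scores 2.0, while `doc_b` matches only the first and scores 1.0.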

Cypher

-- Named paths
MATCH p = (a:Protein)-[r:INTERACTS_WITH]->(b:Protein)
WHERE a.id = 'TP53'
RETURN p, length(p), nodes(p), relationships(p)

-- Subqueries
MATCH (p:Protein)
CALL {
    WITH p
    MATCH (p)-[:INTERACTS_WITH]->(partner)
    RETURN count(partner) AS degree
}
RETURN p.id, degree

-- Vector search in Cypher
CALL ivg.vector.search('Gene', 'embedding', [0.1, 0.2, ...], 5) YIELD node, score
RETURN node, score

Graph Analytics

from iris_vector_graph.operators import IRISGraphOperators

ops = IRISGraphOperators(conn)

# Personalized PageRank
scores = ops.kg_PAGERANK(seed_entities=["MeSH:D011014"], damping=0.85)

# Subgraph extraction
subgraph = ops.kg_SUBGRAPH(seed_ids=["TP53", "MDM2"], k_hops=3)

# PPR-guided subgraph (prevents D^k blowup)
guided = ops.kg_PPR_GUIDED_SUBGRAPH(seed_ids=["TP53"], top_k=50, max_hops=5)

# Community detection
communities = ops.kg_CDLP()
components = ops.kg_WCC()
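For readers unfamiliar with Personalized PageRank, the sketch below shows the idea with a plain power iteration over an adjacency dict. It is not the package's ObjectScript implementation. The function name, the `max_iter` and `tol` parameters, and the choice to return the rank of dangling nodes to the seeds are all assumptions made for illustration.

```python
def personalized_pagerank(out_edges, seeds, damping=0.85, max_iter=50, tol=1e-9):
    """Power-iteration PPR. out_edges maps node -> list of target nodes."""
    seeds = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    nodes = set(out_edges) | {t for ts in out_edges.values() for t in ts} | set(seeds)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(max_iter):
        # Every node keeps (1 - damping) of its restart probability.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_edges.get(n, [])
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Assumption: dangling nodes hand their rank back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        delta = sum(abs(nxt[n] - rank[n]) for n in nodes)
        rank = nxt
        if delta < tol:  # stop early once the scores stop moving
            break
    return rank

# Toy graph (illustrative edges only).
graph = {"TP53": ["MDM2"], "MDM2": ["TP53", "CDKN1A"]}
scores = personalized_pagerank(graph, ["TP53"])
```

The scores sum to 1, and nodes closer to the seed rank higher, which is the property a PPR-guided subgraph can use to keep only the `top_k` most relevant nodes instead of expanding every hop.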

FHIR Bridge

# Load ICD-10→MeSH mappings from UMLS MRCONSO
# python scripts/ingest/load_umls_bridges.py --mrconso /path/to/MRCONSO.RRF

# Query: ICD codes → KG anchors
anchors = engine.get_kg_anchors(icd_codes=["J18.0", "E11.9"])
# → ["MeSH:D001996", "MeSH:D003924"]  (filtered to nodes in KG)
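The "filtered to nodes in KG" step can be pictured as a lookup followed by a membership check. The sketch below is a hypothetical in-memory stand-in, not the package code: `kg_anchors`, `bridges`, and `kg_nodes` are illustrative names, and the pairing of each ICD code with a MeSH ID is assumed from the order of the example above.

```python
def kg_anchors(icd_codes, bridges, kg_nodes):
    """Map ICD codes to MeSH ids and keep only ids that exist as KG nodes."""
    anchors = []
    for code in icd_codes:
        for mesh_id in bridges.get(code, []):
            if mesh_id in kg_nodes and mesh_id not in anchors:
                anchors.append(mesh_id)
    return anchors

# Assumed toy mapping; a real one would come from the fhir_bridges table.
bridges = {"J18.0": ["MeSH:D001996"], "E11.9": ["MeSH:D003924"]}
# A toy graph that contains only one of the two MeSH nodes.
kg_nodes = {"MeSH:D003924", "TP53"}

anchors = kg_anchors(["J18.0", "E11.9"], bridges, kg_nodes)
```

With this toy graph only `MeSH:D003924` survives the filter, because `MeSH:D001996` has no node to anchor to.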

ObjectScript Direct (no Python)

// VecIndex
Do ##class(Graph.KG.VecIndex).Create("myidx", 384, "cosine", 4, 50)
Do ##class(Graph.KG.VecIndex).InsertJSON("myidx", "doc1", "[0.1, 0.2, ...]")
Do ##class(Graph.KG.VecIndex).Build("myidx")
Set results = ##class(Graph.KG.VecIndex).SearchJSON("myidx", "[0.3, ...]", 10, 8)

// PLAID
Set results = ##class(Graph.KG.PLAIDSearch).Search("myidx", queryTokensJSON, 10, 4)

// Graph analytics
Do ##class(Graph.KG.Traversal).BuildKG()
Set ppr = ##class(Graph.KG.PageRank).RunJson(seedsJSON, 0.85, 50)
Set sub = ##class(Graph.KG.Subgraph).SubgraphJson(seedsJSON, 3, "")

Architecture

Global Structure

| Global | Purpose |
|---|---|
| ^KG | Knowledge graph: ("out",s,p,o), ("in",o,p,s), ("deg",s), ("label",label,s), ("prop",s,key) |
| ^NKG | Integer-encoded ^KG for Arno acceleration: (-1,sIdx,-(pIdx+1),oIdx) |
| ^VecIdx | VecIndex RP-tree ANN: centroids, tree nodes, leaf vectors |
| ^PLAID | PLAID multi-vector: centroids, packed doc tokens, inverted index |
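To make the subscript layout concrete, the sketch below models ^KG and ^NKG as Python dicts keyed by subscript tuples. It is only a mental model. The stored value of 1, the reading of "deg" as an out-degree counter, and the first-seen integer interning are assumptions, and the "label" and "prop" subscripts are left out.

```python
kg = {}    # stand-in for ^KG: subscript tuple -> value
nkg = {}   # stand-in for ^NKG
node_idx, pred_idx = {}, {}

def intern(table, key):
    """Assign the next integer index to key on first sight (assumed scheme)."""
    return table.setdefault(key, len(table))

def add_edge(s, p, o):
    """Write one edge to both directions of ^KG and to the integer ^NKG."""
    is_new = ("out", s, p, o) not in kg
    kg[("out", s, p, o)] = 1
    kg[("in", o, p, s)] = 1
    if is_new:
        # Assumption: "deg" counts outgoing edges of s.
        kg[("deg", s)] = kg.get(("deg", s), 0) + 1
    # Predicate indexes are stored as -(pIdx+1), as in the layout above.
    key = (-1, intern(node_idx, s), -(intern(pred_idx, p) + 1), intern(node_idx, o))
    nkg[key] = 1

def out_neighbors(s):
    """All (predicate, object) pairs for s, like a $Order walk over ("out",s)."""
    return sorted((k[2], k[3]) for k in kg if k[:2] == ("out", s))

add_edge("TP53", "INTERACTS_WITH", "MDM2")
add_edge("TP53", "INTERACTS_WITH", "CDKN1A")
```

Storing both "out" and "in" entries lets a traversal follow edges in either direction with a single ordered scan, at the cost of writing each edge twice.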

Schema (Graph_KG)

| Table | Purpose |
|---|---|
| nodes | Node registry (node_id PK) |
| rdf_edges | Edges (s, p, o_id) |
| rdf_labels | Node labels (s, label) |
| rdf_props | Node properties (s, key, val) |
| kg_NodeEmbeddings | HNSW vector index (id, emb VECTOR) |
| fhir_bridges | ICD-10→MeSH clinical code mappings |

ObjectScript Classes (iris-vector-graph-core)

| Class | Methods |
|---|---|
| Graph.KG.VecIndex | Create, Insert, InsertJSON, Build, Search, SearchJSON, SearchMultiJSON, SeededVectorExpand, Drop, Info |
| Graph.KG.PLAIDSearch | StoreCentroids, StoreDocTokens, BuildInvertedIndex, Search, Insert, Info, Drop |
| Graph.KG.PageRank | RunJson (PPR), PageRankGlobalJson (global) |
| Graph.KG.Algorithms | WCCJson, CDLPJson |
| Graph.KG.Subgraph | SubgraphJson, PPRGuidedJson |
| Graph.KG.Traversal | BuildKG, BuildNKG, BFSFastJson |
| Graph.KG.GraphIndex | InternNode, InternLabel, InsertIndex (dual ^KG+^NKG write) |

Performance

| Operation | Latency | Details |
|---|---|---|
| VecIndex search (1K vecs, 128-dim) | 4ms | RP-tree + $vectorop SIMD |
| HNSW search (143K vecs, 768-dim) | 1.7ms | Native IRIS VECTOR index |
| PLAID search (500 docs, 4 query tokens) | ~14ms | Centroid scoring + MaxSim |
| PPR (10K nodes) | 62ms | Pure ObjectScript, early termination |
| 1-hop neighbors | 0.3ms | $Order on ^KG |
| k=2 subgraph (10K nodes) | 1.8ms | BFS over ^KG |
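The table does not state how these latencies were measured. To take comparable measurements on your own hardware, a small timing harness along these lines may help. It is a generic sketch: `measure_latency_ms` and its `warmup` and `runs` parameters are hypothetical, and the workload shown is a placeholder.

```python
import statistics
import time

def measure_latency_ms(fn, warmup=5, runs=50):
    """Call fn repeatedly and report median and p95 wall-clock latency in ms."""
    for _ in range(warmup):
        fn()  # warm caches before timing
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "runs": runs,
    }

# Placeholder workload; swap in a real call to benchmark it.
stats = measure_latency_ms(lambda: sum(range(1000)))
```

For example, replacing the placeholder with `lambda: engine.vec_search("drugs", query_vector, k=5)` times the VecIndex search from the Quick Start, including client round-trip overhead.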

Changelog

v1.28.0 (2026-03-29)

  • Lightweight default install — base requires only intersystems-irispython
  • Optional extras: [full], [plaid], [dev], [ml], [visualization], [biodata]
  • IPM packages: iris-vector-graph-core (ObjectScript only) + iris-vector-graph (full)

v1.27.0

  • PLAID packed token storage — $ListBuild of $vector per document (53 $Order → 1 $Get)

v1.26.0

  • PLAID multi-vector retrieval — PLAIDSearch.cls pure ObjectScript + $vectorop
  • Python wrappers: plaid_build, plaid_search, plaid_insert, plaid_info, plaid_drop

v1.25.1

  • VecIndex Annoy-style two-means tree splitting (fixes degenerate trees on clustered embeddings)

v1.24.0

  • VecIndex nprobe recall fix (counts leaf visits, not branch points)
  • SearchMultiJSON, InsertBatchJSON batch APIs

v1.22.0

  • VecIndex SearchJSON/InsertJSON — eliminated xecute path (250ms → 4ms)

v1.21.0

  • VecIndex RP-tree ANN — vec_create_index, vec_insert, vec_build, vec_search

v1.20.0

  • Arno acceleration wrappers: khop(), ppr(), random_walk() with auto-detection

v1.19.0

  • ^NKG integer index for Arno acceleration
  • GraphIndex.cls, BenchSeeder.SeedRandom()

v1.18.0

  • FHIR-to-KG bridge: fhir_bridges table, get_kg_anchors(), UMLS MRCONSO ingest

v1.17.0

  • Cypher named path bindings (MATCH p = ... RETURN p, length(p), nodes(p), relationships(p))
  • Cypher CALL subqueries (independent CTE + correlated scalar)
  • kg_PPR_GUIDED_SUBGRAPH with ObjectScript fast path
  • Repo cleanup: 80+ stale files removed

Earlier versions →


License: MIT | Author: Thomas Dyar (thomas.dyar@intersystems.com)
