Transactional Graph + Vector retrieval system for InterSystems IRIS with hybrid search, openCypher, and GraphQL APIs

iris-vector-graph

Knowledge graph engine for InterSystems IRIS — temporal property graph, vector search, openCypher, graph analytics, and pre-aggregated analytics.

Install

pip install iris-vector-graph              # Core: just intersystems-irispython
pip install iris-vector-graph[full]        # Full: + FastAPI, GraphQL, numpy, networkx
pip install iris-vector-graph[plaid]       # + sklearn for PLAID K-means build

ObjectScript Only (IPM)

zpm "install iris-vector-graph-core"

Pure ObjectScript — VecIndex, PLAIDSearch, PageRank, Subgraph, GraphIndex, TemporalIndex. No Python. Works on any IRIS 2024.1+, all license tiers.

What It Does

Capability	Description
Temporal Graph	Bidirectional time-indexed edges — `^KG("tout"/"tin"/"bucket")`. O(results) window queries via B-tree traversal. 134K+ edges/sec ingest (RE2-TT benchmark).
Pre-aggregated Analytics	`^KG("tagg")` per-bucket COUNT/SUM/AVG/MIN/MAX and HLL COUNT DISTINCT. O(1) aggregation queries — 0.085ms for 1-bucket, 0.24ms for 24-hour window.
BM25Index	Pure ObjectScript Okapi BM25 lexical search — `^BM25Idx` globals, zero SQL tables. Automatic `kg_TXT` upgrade when `"default"` index exists. Cypher `CALL ivg.bm25.search(name, query, k)`. 0.3ms median search.
VecIndex	RP-tree ANN vector search — pure ObjectScript + `$vectorop` SIMD. Annoy-style two-means splitting.
IVFFlat	Inverted File flat vector index — Python k-means build (sklearn), pure ObjectScript query. Tunable `nprobe` recall/speed tradeoff. `nprobe=nlist` → exact search. Cypher `CALL ivg.ivf.search(name, vec, k, nprobe)`.
PLAID	Multi-vector retrieval (ColBERT-style) — centroid scoring → candidate gen → exact MaxSim. Single server-side call.
HNSW	Native IRIS VECTOR index via `kg_KNN_VEC`. Sub-2ms search.
Cypher	openCypher parser/translator — MATCH, WHERE, RETURN, CREATE, UNION, CASE WHEN, variable-length paths, `shortestPath()` / `allShortestPaths()`, CALL subqueries. Bolt 5.4 protocol (TCP + WebSocket) for standard driver connectivity.
Graph Analytics	PageRank, WCC, CDLP, PPR-guided subgraph — pure ObjectScript over `^KG` globals.
FHIR Bridge	ICD-10→MeSH mapping via UMLS for clinical-to-KG integration.
GraphQL	Auto-generated schema from knowledge graph labels.
Embedded Python	`EmbeddedConnection` — zero-boilerplate dbapi2 adapter for IRIS Language=python methods.

Quick Start

Python

import iris
from iris_vector_graph.engine import IRISGraphEngine

conn = iris.connect(hostname='localhost', port=1972, namespace='USER', username='_SYSTEM', password='SYS')
engine = IRISGraphEngine(conn)
engine.initialize_schema()

Inside IRIS (Language=python, no connection needed)

from iris_vector_graph.embedded import EmbeddedConnection
from iris_vector_graph.engine import IRISGraphEngine

engine = IRISGraphEngine(EmbeddedConnection())
engine.initialize_schema()

Graph Browser + Bolt Connectivity

A built-in Cypher server speaks the Bolt protocol, so standard graph tooling (drivers, visualization, LangChain) works out of the box:

IRIS_HOST=localhost IRIS_PORT=1972 IRIS_NAMESPACE=USER \
IRIS_USERNAME=_SYSTEM IRIS_PASSWORD=SYS \
python3 -m uvicorn iris_vector_graph.cypher_api:app --port 8000

Browser — http://localhost:8000/browser/ (force-directed graph visualization)
Bolt TCP — bolt://localhost:7687 (Python/Java/Go/.NET drivers, LangChain, cypher-shell)
HTTP API — http://localhost:8000/api/cypher (curl, httpie, REST clients)

Temporal Property Graph

Store and query time-stamped edges — service calls, events, metrics, log entries — with sub-millisecond window queries and O(1) aggregation.

Two edge APIs: structural vs. temporal

IVG has two distinct edge APIs that write to different storage and support different query patterns:

	`create_edge` / `bulk_create_edges`	`create_edge_temporal` / `bulk_create_edges_temporal`
Writes to	`Graph_KG.rdf_edges` SQL (durability) + `^KG("out",0,...)` globals (query, synchronous)	`^KG("tout"/"tin")` (time-ordered) + `^KG("out",0,...)` (adjacency)
Query via	`MATCH (a)-[:R]->(b)` — immediately visible, no `BuildKG()` needed	`get_edges_in_window()`, `get_temporal_aggregate()`, temporal Cypher `WHERE r.ts >= $start`; also visible in `MATCH (a)-[:R]->(b)`
Models	Structural relationship — "A is connected to B"	Event log — "A called B at time T with weight W"
Example	`(service:auth)-[:DEPENDS_ON]->(service:payment)`	`(service:auth)-[:CALLS_AT {ts: 1705000042, weight: 38ms}]->(service:payment)`

Use create_edge when the relationship is a permanent structural fact: schema dependencies, ontology hierarchies, entity co-occurrences, foreign key relationships.

Use create_edge_temporal when the relationship is a time-series event: service calls, metric emissions, log events, cost observations, anything you'll query by time window or aggregate over time.

The same node pair can have both: a structural DEPENDS_ON edge (created once) and thousands of temporal CALLS_AT events (one per call). Both are immediately visible in MATCH (a)-[r]->(b) — no rebuild required.

Deleting an edge:

engine.delete_edge("service:auth", "DEPENDS_ON", "service:payment")
# removes from rdf_edges SQL and kills ^KG("out",0,...) immediately

Note on bulk ingest: bulk_create_edges is optimized for high-volume ingest (535M edges validated) and intentionally skips the per-edge ^KG write for performance. Per the v1.50.1 changelog entry, it now calls BuildKG() once after the batch SQL insert, so bulk-inserted edges are visible to MATCH/BFS when the call returns; on earlier versions, call BuildKG() yourself at the end of the ingest session. bulk_create_edges_temporal does write ^KG immediately. create_edge (single) always writes immediately.

Ingest

import time

# Single edge
engine.create_edge_temporal(
    source="service:auth",
    predicate="CALLS_AT",
    target="service:payment",
    timestamp=int(time.time()),
    weight=42.7,            # latency_ms, metric value, or 1.0
)

# Bulk ingest — 134K+ edges/sec (RE2-TT benchmark, 535M edges validated)
edges = [
    {"s": "service:auth",    "p": "CALLS_AT",       "o": "service:payment", "ts": 1712000000, "w": 42.7},
    {"s": "service:payment", "p": "CALLS_AT",       "o": "db:postgres",     "ts": 1712000001, "w": 8.1},
    {"s": "service:auth",    "p": "EMITS_METRIC_AT","o": "metric:cpu",      "ts": 1712000000, "w": 73.2},
]
engine.bulk_create_edges_temporal(edges)

Window Queries

now = int(time.time())

# All calls from auth in the last 5 minutes
edges = engine.get_edges_in_window(
    source="service:auth",
    predicate="CALLS_AT",
    start=now - 300,
    end=now,
)
# [{"s": "service:auth", "p": "CALLS_AT", "o": "service:payment", "ts": 1712000042, "w": 38.2}, ...]

# Edge velocity — call count in last N seconds (reads pre-aggregated bucket, O(1))
velocity = engine.get_edge_velocity("service:auth", window_seconds=300)
# 847

# Burst detection — which nodes exceeded threshold in last N seconds
bursts = engine.find_burst_nodes(predicate="CALLS_AT", window_seconds=60, threshold=500)
# [{"id": "service:auth", "velocity": 1243}, {"id": "service:checkout", "velocity": 731}]
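Window queries stay cheap because edges are kept in timestamp order, so a range scan can seek to the window start and stop at the window end. The sketch below imitates that idea in plain Python with a sorted list and `bisect`. It is only an illustration, not the `^KG("tout")` implementation, and the `TimeIndex` class and its methods are hypothetical names.

```python
import bisect

class TimeIndex:
    """Toy time-ordered edge index (hypothetical; a sorted list stands in for a B-tree)."""

    def __init__(self):
        self._keys = []      # sorted (ts, source, predicate, target) tuples
        self._weights = {}   # key -> weight

    def insert(self, source, predicate, target, ts, weight=1.0):
        key = (ts, source, predicate, target)
        if key not in self._weights:
            bisect.insort(self._keys, key)
        self._weights[key] = weight

    def window(self, source, predicate, start, end):
        # Seek to the first key with ts >= start, then walk forward until ts > end.
        i = bisect.bisect_left(self._keys, (start,))
        out = []
        while i < len(self._keys) and self._keys[i][0] <= end:
            key = self._keys[i]
            ts, s, p, o = key
            if s == source and p == predicate:
                out.append({"s": s, "p": p, "o": o, "ts": ts, "w": self._weights[key]})
            i += 1
        return out

idx = TimeIndex()
idx.insert("service:auth", "CALLS_AT", "service:payment", 1712000042, 38.2)
idx.insert("service:payment", "CALLS_AT", "db:postgres", 1712000050, 8.1)
idx.insert("service:auth", "CALLS_AT", "service:payment", 1712000400, 41.0)
recent = idx.window("service:auth", "CALLS_AT", 1712000000, 1712000300)
```

In this toy version the scan touches only the edges inside the window, never the full edge set, which is the property the temporal index relies on.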

Pre-aggregated Analytics (O(1) per bucket)

now = int(time.time())

# Average latency for auth→payment calls in the last 5 minutes
avg_latency = engine.get_temporal_aggregate(
    source="service:auth",
    predicate="CALLS_AT",
    metric="avg",           # "count" | "sum" | "avg" | "min" | "max"
    ts_start=now - 300,
    ts_end=now,
)
# 41.3  (float, milliseconds)

# Count and extremes over the same window
count = engine.get_temporal_aggregate("service:auth", "CALLS_AT", "count", now-300, now)
p_min = engine.get_temporal_aggregate("service:auth", "CALLS_AT", "min", now-300, now)
p_max = engine.get_temporal_aggregate("service:auth", "CALLS_AT", "max", now-300, now)

# GROUP BY source — all services, CALLS_AT, last 5 minutes
groups = engine.get_bucket_groups(predicate="CALLS_AT", ts_start=now-300, ts_end=now)
# [
#   {"source": "service:auth",     "predicate": "CALLS_AT", "count": 847, "avg": 41.3, "min": 2.1, "max": 312.0},
#   {"source": "service:checkout", "predicate": "CALLS_AT", "count": 312, "avg": 28.7, "min": 1.4, "max": 189.0},
#   ...
# ]

# COUNT DISTINCT targets — fanout detection (16-register HLL, ~26% error, good for threshold detection)
distinct_targets = engine.get_distinct_count("service:auth", "CALLS_AT", now-3600, now)
# 14   (distinct services called by auth in last hour)
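The aggregation calls above are fast because the per-bucket statistics are maintained at write time, so a query only merges one small record per bucket. A minimal sketch of that pattern in plain Python follows. The 5-minute bucket width comes from the Architecture table; the `BucketAggregates` class and its method names are hypothetical and do not mirror the `^KG("tagg")` layout exactly.

```python
from collections import defaultdict

BUCKET_SECONDS = 300  # 5-minute buckets

def _new_cell():
    return {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}

class BucketAggregates:
    """Toy write-time aggregator (hypothetical names, not the IVG API)."""

    def __init__(self):
        self._cells = defaultdict(_new_cell)  # (bucket, source, predicate) -> cell

    def record(self, source, predicate, ts, weight):
        # Update the running statistics for the bucket this timestamp falls in.
        cell = self._cells[(ts // BUCKET_SECONDS, source, predicate)]
        cell["count"] += 1
        cell["sum"] += weight
        cell["min"] = min(cell["min"], weight)
        cell["max"] = max(cell["max"], weight)

    def aggregate(self, source, predicate, metric, ts_start, ts_end):
        # Merge one cell per bucket: cost depends on bucket count, not edge count.
        count, total = 0, 0.0
        lo, hi = float("inf"), float("-inf")
        for bucket in range(ts_start // BUCKET_SECONDS, ts_end // BUCKET_SECONDS + 1):
            cell = self._cells.get((bucket, source, predicate))
            if cell is None:
                continue
            count += cell["count"]
            total += cell["sum"]
            lo, hi = min(lo, cell["min"]), max(hi, cell["max"])
        if metric == "count":
            return count
        if count == 0:
            return None
        return {"sum": total, "avg": total / count, "min": lo, "max": hi}[metric]

agg = BucketAggregates()
agg.record("service:auth", "CALLS_AT", 0, 10.0)
agg.record("service:auth", "CALLS_AT", 299, 30.0)
agg.record("service:auth", "CALLS_AT", 300, 50.0)
```

One consequence of this design, at least in the sketch, is that window boundaries are effectively rounded to bucket boundaries, since individual edges are no longer consulted.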

Rich Edge Properties

# Attach arbitrary attributes to any temporal edge
engine.create_edge_temporal(
    source="service:auth",
    predicate="CALLS_AT",
    target="service:payment",
    timestamp=1712000000,
    weight=42.7,
    attrs={"trace_id": "abc123", "status": 200, "region": "us-east-1"},
)

# Retrieve attributes
attrs = engine.get_edge_attrs(
    ts=1712000000,
    source="service:auth",
    predicate="CALLS_AT",
    target="service:payment",
)
# {"trace_id": "abc123", "status": 200, "region": "us-east-1"}

NDJSON Import / Export

# Export temporal edges for a time window
engine.export_temporal_edges_ndjson(
    path="traces_2026-04-01.ndjson",
    start=1743465600,
    end=1743552000,
)

# Import — resume an ingest from a file
engine.import_graph_ndjson("traces_2026-04-01.ndjson")
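NDJSON stores one JSON object per line, which is what lets large exports be streamed and resumed line by line. The sketch below round-trips a few edges with the standard library only. The field names (`s`, `p`, `o`, `ts`, `w`) are borrowed from the bulk ingest example; the exact record layout written by `export_temporal_edges_ndjson` is not documented here, so treat the layout as an assumption.

```python
import io
import json

def write_ndjson(edges, fh):
    """Write one compact JSON object per line."""
    for edge in edges:
        fh.write(json.dumps(edge, separators=(",", ":")) + "\n")

def read_ndjson(fh):
    """Yield one dict per non-empty line, so large files never load fully into memory."""
    for line in fh:
        line = line.strip()
        if line:
            yield json.loads(line)

edges = [
    {"s": "service:auth", "p": "CALLS_AT", "o": "service:payment", "ts": 1712000000, "w": 42.7},
    {"s": "service:payment", "p": "CALLS_AT", "o": "db:postgres", "ts": 1712000001, "w": 8.1},
]

buf = io.StringIO()          # stands in for a file opened on disk
write_ndjson(edges, buf)
buf.seek(0)
restored = list(read_ndjson(buf))
```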

ObjectScript Direct

// Ingest
Do ##class(Graph.KG.TemporalIndex).InsertEdge("svc:auth","CALLS_AT","svc:pay",ts,42.7,"")

// Bulk ingest (JSON array)
Set n = ##class(Graph.KG.TemporalIndex).BulkInsert(edgesJSON)

// Query window — returns JSON array
Set result = ##class(Graph.KG.TemporalIndex).QueryWindow("svc:auth","CALLS_AT",tsStart,tsEnd)

// Pre-aggregated average latency
Set avg = ##class(Graph.KG.TemporalIndex).GetAggregate("svc:auth","CALLS_AT","avg",tsStart,tsEnd)

// GROUP BY source
Set groups = ##class(Graph.KG.TemporalIndex).GetBucketGroups("CALLS_AT",tsStart,tsEnd)

// COUNT DISTINCT targets (HLL)
Set n = ##class(Graph.KG.TemporalIndex).GetDistinctCount("svc:auth","CALLS_AT",tsStart,tsEnd)

Vector Search (VecIndex)

engine.vec_create_index("drugs", 384, "cosine")
engine.vec_insert("drugs", "metformin", embedding_vector)
engine.vec_build("drugs")

results = engine.vec_search("drugs", query_vector, k=5)
# [{"id": "metformin", "score": 0.95}, ...]

IVFFlat Vector Index

Inverted file index with flat (uncompressed) vector storage: Python k-means build, pure ObjectScript query. The nprobe parameter trades recall for speed; nprobe=nlist gives exact results.

# Build: reads kg_NodeEmbeddings, runs MiniBatchKMeans, stores ^IVF globals
result = engine.ivf_build("kg_idx", nlist=256, metric="cosine")
# {"nlist": 256, "indexed": 10000, "dim": 768}

# Search: finds nprobe nearest centroids, scores their cells
results = engine.ivf_search("kg_idx", query_vector, k=10, nprobe=32)
# [("NCIT:C12345", 0.97), ("NCIT:C67890", 0.94), ...]

# Lifecycle
info = engine.ivf_info("kg_idx")   # {"nlist":256,"dim":768,"indexed":10000,...}
engine.ivf_drop("kg_idx")

Cypher:

CALL ivg.ivf.search('kg_idx', $query_vec, 10, 32) YIELD node, score
RETURN node, score ORDER BY score DESC

Global storage: ^IVF(name, "cfg"|"centroid"|"list") — independent of ^KG, ^VecIdx, ^PLAID, ^BM25Idx.
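To make the nprobe tradeoff concrete, here is a small pure-Python sketch of the inverted-file idea: vectors are assigned to their most similar centroid, and a search scores only the cells belonging to the nprobe closest centroids. It is an illustration under simplifying assumptions, not the `Graph.KG.IVFIndex` code. The centroids are hand-picked rather than learned with k-means, and the function names are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def build_lists(vectors, centroids):
    """Assign each vector id to the inverted list of its most similar centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in vectors.items():
        best = max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
        lists[best].append(vid)
    return lists

def search(vectors, centroids, lists, query, k, nprobe):
    """Score only the vectors stored in the nprobe most similar cells."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: cosine(query, centroids[c]), reverse=True)
    candidates = [vid for c in ranked[:nprobe] for vid in lists[c]]
    scored = [(vid, cosine(query, vectors[vid])) for vid in candidates]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

vectors = {
    "a": [1.0, 0.1], "b": [0.9, 0.3], "c": [0.1, 1.0],
    "d": [0.2, 0.9], "e": [0.8, 0.6],
}
centroids = [[1.0, 0.0], [0.0, 1.0]]   # hand-picked for the example
lists = build_lists(vectors, centroids)
query = [0.6, 0.8]

approx = search(vectors, centroids, lists, query, k=2, nprobe=1)
exact = search(vectors, centroids, lists, query, k=2, nprobe=len(centroids))
```

With nprobe=1 the search never visits the cell holding "e", the true nearest neighbor, so it returns "d" and "c". With nprobe equal to the number of cells every vector is scored and the result matches brute force.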

PLAID Multi-Vector Search

# Build: Python K-means + ObjectScript inverted index
engine.plaid_build("colbert_idx", docs)  # docs = [{"id": "x", "tokens": [[f1,...], ...]}, ...]

# Search: single server-side call, pure $vectorop
results = engine.plaid_search("colbert_idx", query_tokens, k=10)
# [{"id": "doc_3", "score": 0.94}, ...]
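The final stage of ColBERT-style retrieval is MaxSim: each query token is matched to its best document token, and those best-match scores are summed. The sketch below shows that scoring rule alone in plain Python, without the centroid and candidate-generation stages. It uses unnormalized dot products; whether and how IVG normalizes its scores is not stated here, so the absolute values are illustrative only.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_tokens, doc_tokens):
    """Late interaction: best document-token match per query token, summed."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)

docs = [
    {"id": "doc_1", "tokens": [[1.0, 0.0], [0.0, 1.0]]},
    {"id": "doc_2", "tokens": [[1.0, 0.0], [1.0, 0.0]]},
]
query_tokens = [[1.0, 0.0], [0.0, 1.0]]

# doc_1 covers both query tokens; doc_2 covers only the first.
ranked = sorted(docs, key=lambda d: maxsim(query_tokens, d["tokens"]), reverse=True)
```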

Cypher

Temporal edge filtering (v1.42.0+)

-- Filter edges by timestamp — routes to ^KG("tout") B-tree, O(results)
MATCH (a)-[r:CALLS_AT]->(b)
WHERE r.ts >= $start AND r.ts <= $end
RETURN r.ts, r.weight
ORDER BY r.ts DESC

-- Temporal + property filter
MATCH (a:Service)-[r:CALLS_AT]->(b)
WHERE r.ts >= $start AND r.ts <= $end
  AND r.weight > 1000
RETURN a.id, b.id, r.ts, r.weight
ORDER BY r.weight DESC

-- Inbound direction — routes to ^KG("tin")
MATCH (b:Service)<-[r:CALLS_AT]-(a)
WHERE r.ts >= $start AND r.ts <= $end
RETURN a.id, b.id, r.ts

Sweet spot: Temporal Cypher is designed for trajectory-style queries (≤~50 edges, ordered output). For aggregation over large windows, use get_temporal_aggregate() / get_bucket_groups() — these are O(1) pre-aggregated and 400× faster.

-- Named paths
MATCH p = (a:Service)-[r:CALLS]->(b:Service)
WHERE a.id = 'auth'
RETURN p, length(p), nodes(p), relationships(p)

-- Variable-length paths
MATCH (a:Service)-[:CALLS*1..3]->(b:Service)
WHERE a.id = 'auth'
RETURN b.id

-- Shortest path between two nodes (v1.49.0+)
MATCH p = shortestPath((a {id: $from})-[*..8]-(b {id: $to}))
RETURN p, length(p), nodes(p), relationships(p)

-- All shortest paths — returns every minimum-length path
MATCH p = allShortestPaths((a {id: $from})-[*..8]-(b {id: $to}))
RETURN p

-- CASE WHEN
MATCH (n:Service)
RETURN n.id,
       CASE WHEN n.calls > 1000 THEN 'high' WHEN n.calls > 100 THEN 'medium' ELSE 'low' END AS load

-- UNION
MATCH (n:ServiceA) RETURN n.id
UNION
MATCH (n:ServiceB) RETURN n.id

-- Vector search in Cypher
CALL ivg.vector.search('Service', 'embedding', [0.1, 0.2, ...], 5) YIELD node, score
RETURN node, score
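The difference between shortestPath() and allShortestPaths() is easiest to see on a diamond graph, where two equally short routes exist. The changelog describes the engine's approach as BFS with multi-parent backtracking; the Python sketch below illustrates that general technique on an in-memory adjacency dict. It is not the `ShortestPathJson` code, and the function name and `max_hops` parameter are hypothetical.

```python
from collections import deque

def all_shortest_paths(adj, src, dst, max_hops=8):
    """Return every minimum-length path from src to dst (adj lists neighbors both ways)."""
    dist = {src: 0}
    parents = {src: []}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst or dist[node] == max_hops:
            continue  # do not expand past the target or the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)  # another equally short way in
    if dst not in dist:
        return []

    def unwind(node):
        # Walk parent links back to src, branching wherever a node has several parents.
        if node == src:
            return [[src]]
        return [path + [node] for p in parents[node] for path in unwind(p)]

    return unwind(dst)

diamond = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
paths = all_shortest_paths(diamond, "a", "d")
```

Keeping only the first parent found for each node would yield a single path, which corresponds to shortestPath(); recording every parent at the same distance is what allows all minimum-length paths to be reconstructed.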

Graph Analytics

from iris_vector_graph.operators import IRISGraphOperators

ops = IRISGraphOperators(conn)

# Personalized PageRank
scores = ops.kg_PAGERANK(seed_entities=["service:auth"], damping=0.85)

# K-hop subgraph
subgraph = ops.kg_SUBGRAPH(seed_ids=["service:auth"], k_hops=3)

# PPR-guided subgraph (prevents k^n blowup)
guided = ops.kg_PPR_GUIDED_SUBGRAPH(seed_ids=["service:auth"], top_k=50, max_hops=5)

# Community detection
communities = ops.kg_CDLP()
components  = ops.kg_WCC()
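Personalized PageRank differs from global PageRank in one respect: the random surfer always teleports back to the seed set, so scores measure proximity to the seeds. A compact power-iteration sketch in plain Python is shown below. It illustrates the standard algorithm only and makes no claim about how `kg_PAGERANK` is implemented; the function name, the `iterations` parameter, and the dangling-node handling are assumptions.

```python
def personalized_pagerank(adj, seeds, damping=0.85, iterations=50):
    """Power iteration where the teleport step always returns to the seed set."""
    nodes = set(adj) | {t for targets in adj.values() for t in targets}
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            targets = adj.get(node, [])
            if targets:
                share = damping * scores[node] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += damping * scores[node] / len(seeds)
        scores = nxt
    return scores

graph = {
    "service:auth": ["service:payment", "db:postgres"],
    "service:payment": ["db:postgres"],
    "service:other": ["service:auth"],
}
scores = personalized_pagerank(graph, seeds=["service:auth"])
```

In this example "service:other" ends with a score of zero: nothing links to it and it is not a seed, so no probability mass ever reaches it.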

FHIR Bridge

# Load ICD-10→MeSH mappings from UMLS MRCONSO
# python scripts/ingest/load_umls_bridges.py --mrconso /path/to/MRCONSO.RRF

anchors = engine.get_kg_anchors(icd_codes=["J18.0", "E11.9"])
# → ["MeSH:D001996", "MeSH:D003924"]  (filtered to nodes in KG)

Architecture

Global Structure

Global	Purpose
`^KG("out", s, p, o)`	Knowledge graph — outbound edges
`^KG("in", o, p, s)`	Knowledge graph — inbound edges
`^KG("tout", ts, s, p, o)`	Temporal index — outbound, ordered by timestamp
`^KG("tin", ts, o, p, s)`	Temporal index — inbound, ordered by timestamp
`^KG("bucket", bucket, s)`	Pre-aggregated edge count per 5-minute bucket
`^KG("tagg", bucket, s, p, key)`	Pre-aggregated COUNT/SUM/MIN/MAX/HLL per bucket
`^KG("edgeprop", ts, s, p, o, key)`	Rich edge attributes
`^NKG`	Integer-encoded `^KG` for Arno acceleration
`^VecIdx`	VecIndex RP-tree ANN
`^PLAID`	PLAID multi-vector
`^BM25Idx`	BM25 lexical search index

Schema (Graph_KG)

Table	Purpose
`nodes`	Node registry (node_id PK)
`rdf_edges`	Edges (s, p, o_id)
`rdf_labels`	Node labels (s, label)
`rdf_props`	Node properties (s, key, val)
`kg_NodeEmbeddings`	HNSW vector index (id, emb VECTOR)
`fhir_bridges`	ICD-10→MeSH clinical code mappings

ObjectScript Classes

Class	Key Methods
`Graph.KG.TemporalIndex`	InsertEdge, BulkInsert, QueryWindow, GetVelocity, FindBursts, GetAggregate, GetBucketGroups, GetDistinctCount, Purge
`Graph.KG.VecIndex`	Create, InsertJSON, Build, SearchJSON, SearchMultiJSON, InsertBatchJSON
`Graph.KG.PLAIDSearch`	StoreCentroids, BuildInvertedIndex, Search
`Graph.KG.PageRank`	RunJson, PageRankGlobalJson
`Graph.KG.Algorithms`	WCCJson, CDLPJson
`Graph.KG.Subgraph`	SubgraphJson, PPRGuidedJson
`Graph.KG.Traversal`	BuildKG, BuildNKG, BFSFastJson, ShortestPathJson
`Graph.KG.BulkLoader`	BulkLoad (`INSERT %NOINDEX %NOCHECK` + `%BuildIndices`)
`Graph.KG.BM25Index`	Build, Search, Insert, Drop, Info, SearchProc (`kg_BM25` stored procedure)
`Graph.KG.IVFIndex`	Build, Search, Drop, Info, SearchProc (`kg_IVF` stored procedure)
`Graph.KG.EdgeScan`	MatchEdges (`Graph_KG.MatchEdges` stored procedure), WriteAdjacency, DeleteAdjacency

Performance

Operation	Latency / throughput	Dataset / notes
Temporal edge ingest	134K edges/sec	RE2-TT 535M edges, Enterprise IRIS
Window query (selective)	0.1ms	O(results), B-tree traversal
GetAggregate (1 bucket, 5min)	0.085ms	50K-edge dataset
GetAggregate (288 buckets, 24hr)	0.160ms	O(buckets), not O(edges)
GetBucketGroups (3 sources, 1hr)	0.193ms
GetDistinctCount (1 bucket)	0.101ms	16-register HLL
VecIndex search (1K vecs, 128-dim)	4ms	RP-tree + `$vectorop` SIMD
HNSW search (143K vecs, 768-dim)	1.7ms	Native IRIS VECTOR index
PLAID search (500 docs, 4 tokens)	~14ms	Centroid scoring + MaxSim
BM25Index search (174 nodes, 3-term)	0.3ms	Pure ObjectScript `$Order` posting-list
PPR (10K nodes)	62ms	Pure ObjectScript
1-hop neighbors	0.3ms	`$Order` on `^KG`
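Two figures in the table can be re-derived with simple arithmetic. A 24-hour window over 5-minute buckets contains 288 buckets, and the roughly 26% error quoted for the 16-register HLL matches the usual HyperLogLog rule of thumb of 1.04 / sqrt(m) relative standard error for m registers. The helper names below are hypothetical.

```python
import math

def buckets_in_window(window_seconds, bucket_seconds=300):
    """Number of fixed-width buckets needed to cover a window."""
    return math.ceil(window_seconds / bucket_seconds)

def hll_relative_error(registers):
    """Typical relative standard error of a HyperLogLog sketch with m registers."""
    return 1.04 / math.sqrt(registers)

day_buckets = buckets_in_window(24 * 60 * 60)   # 288
hll16_error = hll_relative_error(16)            # 0.26
```

Since the error shrinks with the square root of the register count, quadrupling the registers to 64 would halve it to about 13%.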

Documentation

Changelog

v1.55.2 (2026-04-19)

fix: Bug 6 (final) — SQLCODE -400 on rdf_edges index creation now falls back to ALTER TABLE ADD INDEX; all standard indexes created even when Graph.KG.Edge class was never compiled

v1.55.1 (2026-04-19)

fix: Graph.KG.Edge/TestEdge persistent classes excluded from ObjectScript deploy (fix DDL table ownership conflict — Bug 6)
fix: conftest removes conflicting .cls before LoadDir
fix: apoc.meta.data() samples all nodes per label via JOIN on rdf_labels (no longer skips labels with no first-node properties)

v1.55.0 (2026-04-19)

feat: import_rdf/bulk_create_edges/create_edge_temporal/bulk_create_edges_temporal all accept graph= parameter
feat: USE GRAPH filtering now strict (exact graph_id match, no NULL leakage)
feat: UNIQUE constraint updated to (s,p,o_id,graph_id) allowing same triple in multiple named graphs
feat: db.schema.relTypeProperties() returns actual relationship property names
fix: import_rdf _ensure_node uses WHERE NOT EXISTS (no duplicate key errors)
fix: import_rdf edge INSERT scoped to graph_id in WHERE NOT EXISTS check
fix: graph_id column uses %EXACT for case-sensitive storage
test: 8 E2E tests proving fail-before/pass-after for all 5 FRs (spec 061)

v1.54.1 (2026-04-18)

fix: initialize_schema() idempotent — "already has index" suppressed (Bug 1)
fix: idx_props_val_ifind (iFind) and idx_edges_confidence (JSON_VALUE) now optional — graceful skip on Community (Bugs 2+3)
test: 6 new E2E schema init tests covering idempotency, required tables, optional indexes, core procedures (spec 060)

v1.54.0 (2026-04-18)

fix: materialize_inference respects named graphs — inferred triples use correct graph_id (spec 055)
fix: materialize_inference/retract_inference accept graph= parameter
feat: Cypher % (modulo → MOD) and ^ (power → POWER) operators (spec 056)
feat: FOREACH clause — FOREACH (x IN list | update_clause) (spec 057)
fix: EXISTS { (n)-[r]->(m) } with edge patterns now works; MATCH keyword optional inside EXISTS (spec 058)
feat: Pattern comprehension [(a)-[r]->(b) | proj] collecting edge projections (spec 059)

v1.53.1 (2026-04-18)

feat: engine.materialize_inference(rules="rdfs"|"owl") — transitive subClassOf/subPropertyOf closure, rdf:type inheritance, domain/range, OWL equivalentClass/inverseOf/TransitiveProperty/SymmetricProperty
feat: engine.retract_inference() — removes all inferred triples, restoring asserted-only graph
feat: import_rdf(path, infer="rdfs") — runs inference automatically after load
Inferred triples tagged qualifiers={"inferred":true} for easy exclusion

v1.53.0 (2026-04-18)

feat: Named graphs — create_edge(graph='name'), list_graphs(), drop_graph(name)
feat: USE GRAPH 'name' MATCH (a)-[r]->(b) Cypher syntax adds graph_id filter
feat: Schema migration — graph_id column added to rdf_edges (idempotent, run on initialize_schema)

v1.52.1 (2026-04-18)

feat: engine.import_rdf(path) — load Turtle (.ttl), N-Triples (.nt), N-Quads (.nq) into the graph
Format auto-detected from extension; streaming batch ingest; blank node synthetic IDs; language tags preserved

v1.52.0 (2026-04-18)

feat: ALL/ANY/NONE/SINGLE(x IN list WHERE ...) list predicate expressions
feat: [x IN list WHERE pred | proj] list comprehensions
feat: reduce(acc = init, x IN list | body) reduce expressions
feat: filter()/extract() legacy list functions as aliases
feat: Arithmetic operators +, -, *, / in Cypher expressions

v1.51.1 (2026-04-18)

feat: apoc.meta.data() returns proper schema columns — LangChain Neo4jGraph() connects without error
feat: apoc.meta.schema() returns schema summary

v1.51.0 (2026-04-18)

feat: keys(n) returns node property keys via rdf_props subquery
feat: range(start, end) and range(start, end, step) generate integer lists
feat: size(list) uses JSON_ARRAYLENGTH; head(), last(), tail(), isEmpty() implemented

v1.50.3 (2026-04-18)

Fix: initialize_schema() creates SQLUser.* views automatically — no more manual DEFAULT_SCHEMA workaround
Fix: initialize_schema() detects pre-compiled ObjectScript classes via %Dictionary — fast 0.2ms PPR path activates correctly instead of falling back to 1800ms Python path

v1.50.2 (2026-04-18)

Fix: MATCH (a)-[r]->(b) with unbound source falls back to rdf_edges SQL (avoids IRIS SqlProc 32KB string limit for large graphs with 88K+ edges)
MatchEdges is now only used when source node ID is bound — safe path for single-node traversal

v1.50.1 (2026-04-18)

Fix: bulk_create_edges now calls BuildKG() after batch SQL — bulk-inserted static edges immediately visible to MATCH/BFS
Fix: BuildKG() already uses shard-0 ^KG("out",0,...) layout (confirmed, no code change needed)

v1.50.0 (2026-04-18)

Unified edge store PR-A — MATCH (a)-[r]->(b) now returns both static and temporal edges (spec 048)
Graph.KG.EdgeScan — MatchEdges(sourceId, predicate, shard) SqlProc scans ^KG("out",0,...) globals
create_edge writes ^KG synchronously; delete_edge (new) kills ^KG entry synchronously
Cypher MATCH (a)-[r]->(b) routes to MatchEdges CTE — no SQL JOIN on rdf_edges
TemporalIndex and all traversal code updated to shard-0 layout
IVF index fixes: $vector("double"), JSON float arrays, leading-zero scores, VECTOR(DOUBLE) schema
Parser: negative float literals in list expressions now work

v1.49.0 (2026-04-18)

shortestPath() / allShortestPaths() openCypher syntax — fixes parse error reported by mindwalk (spec 047)
MATCH p = shortestPath((a {id:$from})-[*..8]-(b {id:$to})) RETURN p now works end-to-end
RETURN p → JSON {"nodes":[...],"rels":[...],"length":N}; RETURN length(p), nodes(p), relationships(p) all supported
allShortestPaths(...) returns all minimum-length paths (diamond graphs return both paths)
Graph.KG.Traversal.ShortestPathJson — pure ObjectScript BFS with multi-parent backtracking for all-paths support
Parser fix: [*..N] (dot-dot without leading integer) now parses correctly
Parser fix: bare -- undirected relationship pattern now parses correctly
Translator/engine fix: CREATE without RETURN clause no longer throws UnboundLocalError

v1.48.0 (2026-04-18)

IVFFlat vector index — Graph.KG.IVFIndex ObjectScript class + ^IVF globals (spec 046)
ivf_build(name, nlist, metric, batch_size) — Python MiniBatchKMeans build from kg_NodeEmbeddings; stores centroids + inverted lists as $vector in ^IVF globals
ivf_search(name, query, k, nprobe) — pure ObjectScript centroid scoring → cell scan → top-k; nprobe=nlist gives exact search
ivf_drop(name) / ivf_info(name) — lifecycle management
Graph_KG.kg_IVF SQL stored procedure — enables JSON_TABLE CTE pattern
Cypher CALL ivg.ivf.search(name, query_vec, k, nprobe) YIELD node, score
Translator fix: ORDER BY <alias> DESC now resolves SELECT-level aliases (e.g. count(r) AS deg) without Undefined error
cypher_api.py: Bolt TCP/WS sessions use dedicated IRIS connections (_make_engine) to prevent connection contention with HTTP handlers; threading.Lock on shared engine cache
test_bolt_server.py: fixed 2 TestBoltSessionHello tests using deprecated asyncio.get_event_loop().run_until_complete() → asyncio.run()

v1.47.0 (2026-04-10)

Bolt 5.4 protocol server — TCP (port 7687) + WebSocket (port 8000). Standard graph drivers (Python, Java, Go, .NET), LangChain, and visualization tools connect via bolt://
Graph browser — bundled at /browser/ with force-directed visualization, schema sidebar, :sysinfo
Cypher HTTP API — /api/cypher + Bolt-compatible transactional endpoints. API key auth via X-API-Key
System procedures — db.labels(), db.relationshipTypes(), db.schema.visualization(), dbms.queryJmx(), SHOW DATABASES/PROCEDURES/FUNCTIONS
Graph object encoding — RETURN n, r, m produces typed Node/Relationship structures for visualization
SQL audit — FETCH FIRST → TOP, DISTINCT TOP order, IN clause chunking at 499
Translator fixes — anonymous nodes, BM25 CTE literals, var-length min-hop, UNION ALL with LIMIT
Embedding fixes — probe false negative, string model loading
scripts/load_demo_data.py — canonical dataset loader (NCIT + HLA immunology + embeddings + BM25)
456 tests, 0 skipped

v1.46.0 (2026-04-07)

BM25Index — pure ObjectScript Okapi BM25 lexical search over ^BM25Idx globals. Zero SQL tables, no Enterprise license required.
Graph.KG.BM25Index.Build(name, propsCSV) — indexes all graph nodes by specified text properties; returns {"indexed":N,"avgdl":F,"vocab_size":V}
Graph.KG.BM25Index.Search(name, query, k) — Robertson BM25 scoring via $Order posting-list traversal; returns JSON [{"id":nodeId,"score":S},...]
Graph.KG.BM25Index.Insert(name, docId, text) — incremental document add/replace; updates IDF only for new document's terms (O(doc_length))
Graph.KG.BM25Index.Drop(name) — O(1) Kill of full index
Graph.KG.BM25Index.Info(name) — returns {"N":N,"avgdl":F,"vocab_size":V} or {} if not found
Python wrappers: engine.bm25_build(), bm25_search(), bm25_insert(), bm25_drop(), bm25_info()
kg_TXT automatic upgrade: _kg_TXT_fallback detects a "default" BM25 index and routes through BM25 instead of LIKE-based fallback
Cypher CALL ivg.bm25.search(name, $query, k) YIELD node, score — Stage CTE using Graph_KG.kg_BM25 SQL stored procedure
Translator fix: BM25 and PPR CTEs now use own column names in RETURN clause (BM25.node not BM25.node_id)
SC-002 benchmark: 0.3ms median search on 174-node community IRIS instance

v1.45.3 (2026-04-04)

translate_relationship_pattern: inline property filters on relationship nodes were silently dropped — MATCH (t)-[:R]->(c {id: 'x'}) returned all nodes instead of filtering. Fixed by applying source_node.properties and target_node.properties after JOIN construction.
vector_search: TO_VECTOR(?, DOUBLE, {dim}) now includes explicit dimension in query cast, resolving type mismatch on IRIS 2025.1 when column dimension is known
2 regression tests added (375 unit tests total)

v1.45.2 (2026-04-03)

embedded.py: auto-fixes sys.path shadowing — ensures /usr/irissys/lib/python is first so the embedded iris module takes priority over pip-installed intersystems_irispython
embedded.py: clear error message when shadowed iris (no iris.sql) is detected, naming the root cause
Documented the XD timeout constraint and embed_daemon pattern for long-running ML operations in embedded context
3 new tests covering path-fix and shadowing detection

v1.45.1 (2026-04-03)

embed_nodes: FK-safe delete — DELETE failure on kg_NodeEmbeddings (spurious FK error in embedded Python context) is silently ignored; INSERT proceeds correctly
vector_search: uses VECTOR_COSINE(TO_VECTOR(col), ...) so it works on both native VECTOR columns AND VARCHAR-stored vectors (e.g. DocChunk.VectorChunk from fhir-017)

v1.45.0 (2026-04-03)

embed_nodes(model, where, text_fn, batch_size, force, progress_callback) — incremental node embedding over Graph_KG.nodes with SQL WHERE filter, custom text builder, and per-call model override. Unblocks mixed-ontology graphs (embed only KG8 nodes without re-embedding NCIT's 200K nodes).
vector_search(table, vector_col, query_embedding, top_k, id_col, return_cols, score_threshold) — search any IRIS VECTOR column, not just kg_NodeEmbeddings. Works on DocChunk tables, RAG corpora, custom HNSW indexes.
multi_vector_search(sources, query_embedding, top_k, fusion='rrf') — unified search across multiple IRIS VECTOR tables with RRF fusion. Returns source_table per result. Powers hybrid KG+FHIR document search.
validate_vector_table(table, vector_col) — returns {dimension, row_count} for any IRIS VECTOR column.

v1.44.0 (2026-04-03)

SQL Table Bridge — map existing IRIS SQL tables as virtual graph nodes/edges with zero data copy
engine.map_sql_table(table, id_column, label) — register any IRIS table as a Cypher-queryable node set; no ETL, no data movement
engine.map_sql_relationship(source, predicate, target, target_fk=None, via_table=None) — FK and M:M join relationships traversable via Cypher
engine.attach_embeddings_to_table(label, text_columns, force=False) — overlay HNSW vector search on existing table rows
engine.list_table_mappings(), remove_table_mapping(), reload_table_mappings() — mapping lifecycle management
Cypher MATCH (n:MappedLabel) routes to registered SQL table with WHERE pushdown — O(SQL query), not O(copy)
Mixed queries: MATCH (p:MappedPatient)-[:HAS_DOC]->(d:NativeDocument) spans both mapped and native nodes seamlessly
SQL mapping wins over native Graph_KG.nodes rows for the same label (FR-016)
TableNotMappedError raised with helpful message when attach_embeddings_to_table is called on unregistered label

v1.43.0 (2026-04-03)

EmbeddedConnection and EmbeddedCursor now importable directly from iris_vector_graph (top-level)
IRISGraphEngine(iris.sql) — accepts iris.sql module directly; auto-wraps in EmbeddedConnection (no manual wrapper needed inside IRIS Language=python methods)
load_obo(encoding=, encoding_errors='replace') — handles UTF-8 BOM and Latin-1 bytes from IRIS-written files; fixes NCIT.obo loading edge case
load_obo / load_networkx accept progress_callback=lambda n_nodes, n_edges: ... — called every 10K items; enables progress reporting for large ontologies (NCIT.obo: 200K+ concepts)
Verified: temporal Cypher (WHERE r.ts >= $start AND r.ts <= $end) works end-to-end via EmbeddedConnection path

v1.42.0 (2026-04-03)

Cypher temporal edge filtering: WHERE r.ts >= $start AND r.ts <= $end routes MATCH patterns to ^KG("tout") B-tree — O(results), not O(total edges)
r.ts and r.weight accessible in RETURN and ORDER BY on temporal edges
Inbound direction (b)<-[r:P]-(a) WHERE r.ts >= $start routes to ^KG("tin")
r.ts without WHERE filter → NULL + query-level warning (prevents accidental full scans)
r.weight > expr in WHERE applies as post-filter on temporal result set
Uses IRIS-compatible derived table subquery (not WITH CTE) — works on protocol 65 xDBC
w → weight canonical field name in temporal CTE (consistent with v1.41.0 API aliases)
Sweet spot: trajectory queries ≤50 edges. For aggregation, use get_temporal_aggregate().

v1.41.0 (2026-04-03)

get_edges_in_window() now returns source/target/predicate/timestamp/weight aliases alongside s/o/p/ts/w — backward compatible
get_edges_in_window(direction="in") — query inbound edges by target node (uses ^KG("tin"))
create_edge_temporal(..., upsert=True) and bulk_create_edges_temporal(..., upsert=True) — skip write if edge already exists at that timestamp
purge_before(ts) — delete all temporal edges older than ts, with ^KG("tagg") and ^KG("bucket") cleanup
Graph.KG.TemporalIndex.PurgeBefore(ts) and QueryWindowInbound(target, predicate, ts_start, ts_end) ObjectScript methods

v1.40.0 (2026-04-02)

iris_vector_graph.embedded.EmbeddedConnection — dbapi2 adapter for IRIS Language=python methods
Zero-boilerplate: IRISGraphEngine(EmbeddedConnection()) works inside IRIS identically to external iris.connect()
commit()/rollback() are intentional no-ops (IRIS manages transactions in embedded context)
START TRANSACTION/COMMIT/ROLLBACK via cursor.execute() silently dropped (avoids <COMMAND> in wgproto jobs)
fetchmany(), rowcount, description fully implemented

v1.39.0 (2026-04-01)

Pre-aggregated temporal analytics: ^KG("tagg") COUNT/SUM/AVG/MIN/MAX at O(1)
GetAggregate, GetBucketGroups, GetDistinctCount ObjectScript methods
get_temporal_aggregate(), get_bucket_groups(), get_distinct_count() Python wrappers
16-register HyperLogLog COUNT DISTINCT (SHA1, ~26% error — suitable for fanout threshold detection)
Benchmark: 134K–157K edges/sec sustained across RE2-TT/RE2-OB/RE1-TT (535M edges total)

v1.38.0

Rich edge properties: ^KG("edgeprop", ts, s, p, o, key) — arbitrary typed attributes per temporal edge
get_edge_attrs(), create_edge_temporal(attrs={...})
NDJSON import/export: import_graph_ndjson(), export_graph_ndjson(), export_temporal_edges_ndjson()
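
For readers unfamiliar with NDJSON: it is one JSON object per line, which makes streaming import and export straightforward. The sketch below shows a generic round trip using only the standard library. The helper names write_ndjson and read_ndjson and the record fields are hypothetical; the actual record schema used by import_graph_ndjson() and export_graph_ndjson() is not shown in this changelog.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(records: Iterable[dict], fh: TextIO) -> int:
    """Write one compact JSON object per line; return the number of records."""
    n = 0
    for rec in records:
        fh.write(json.dumps(rec, separators=(",", ":")) + "\n")
        n += 1
    return n


def read_ndjson(fh: TextIO) -> Iterator[dict]:
    """Yield one dict per non-empty line."""
    for line in fh:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical temporal-edge records with per-edge attributes.
edges = [
    {"s": "svc:a", "p": "CALLS", "o": "svc:b", "ts": 100, "attrs": {"latency_ms": 12}},
    {"s": "svc:b", "p": "CALLS", "o": "svc:c", "ts": 160, "attrs": {}},
]
buf = io.StringIO()
written = write_ndjson(edges, buf)
buf.seek(0)
restored = list(read_ndjson(buf))
```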

v1.37.0

Temporal property graph: create_edge_temporal(), bulk_create_edges_temporal()
get_edges_in_window(), get_edge_velocity(), find_burst_nodes()
^KG("tout"/"tin"/"bucket") globals — bidirectional time-indexed edge store
Graph.KG.TemporalIndex ObjectScript class
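
The idea behind a bidirectional time-indexed edge store can be sketched in plain Python: keep one time-sorted list per source node and one per target node, so a window query is a range scan over an ordered structure. The class ToyTemporalIndex, the 300-second bucket width, and the shape of the bucket counter are assumptions for illustration; the real ^KG("tout"/"tin"/"bucket") subscript layout is not reproduced here.

```python
import bisect
from collections import defaultdict

BUCKET_SECONDS = 300  # assumed bucket width; the changelog does not state one


class ToyTemporalIndex:
    """Hypothetical in-memory analogue of an out/in/bucket edge index."""

    def __init__(self):
        self.tout = defaultdict(list)    # source -> sorted [(ts, predicate, target)]
        self.tin = defaultdict(list)     # target -> sorted [(ts, predicate, source)]
        self.bucket = defaultdict(int)   # (bucket_id, source) -> edge count

    def add(self, s, p, o, ts):
        # Every edge is written to both directions so either endpoint can be queried.
        bisect.insort(self.tout[s], (ts, p, o))
        bisect.insort(self.tin[o], (ts, p, s))
        self.bucket[(ts // BUCKET_SECONDS, s)] += 1

    def window(self, node, ts_start, ts_end, direction="out"):
        index = self.tout if direction == "out" else self.tin
        entries = index.get(node, [])
        # (ts,) sorts before any (ts, p, x), so these bounds are inclusive
        # of ts_start and ts_end for integer timestamps.
        lo = bisect.bisect_left(entries, (ts_start,))
        hi = bisect.bisect_left(entries, (ts_end + 1,))
        return entries[lo:hi]
```

Because each per-node list is ordered by timestamp, the cost of a window query depends on the number of matching edges plus a logarithmic seek, not on the total number of edges stored for the node.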

v1.35.0

UNION / UNION ALL in Cypher
EXISTS {} subquery predicates

v1.34.0

Variable-length paths: MATCH (a)-[:REL*1..5]->(b) via BFSFastJson bridge
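
A bounded breadth-first search is one way to picture what a *1..5 pattern asks for. The sketch below returns the nodes reachable within a hop range together with their shortest hop count. It is a simplification: openCypher variable-length matching enumerates paths rather than just reachable nodes, and the function name and adjacency-dict input are hypothetical, not the BFSFastJson interface.

```python
from collections import deque


def var_length_targets(adj, start, min_hops=1, max_hops=5):
    """Nodes reachable from start in min_hops..max_hops hops -> shortest hop count."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the upper bound
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {n: d for n, d in dist.items() if min_hops <= d <= max_hops}


# Hypothetical chain a -> b -> ... -> g (6 hops end to end).
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": ["f"], "f": ["g"]}
reachable = var_length_targets(chain, "a")
```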

v1.33.0

CASE WHEN / THEN / ELSE / END in Cypher RETURN and WHERE

v1.32.0

CAST functions: toInteger(), toFloat(), toString(), toBoolean()

v1.31.0

RDF 1.2 reification API: reify_edge(), get_reifications(), delete_reification()

v1.30.0

BulkLoader: INSERT %NOINDEX %NOCHECK + %BuildIndices — 46K rows/sec SQL ingest
RDF 1.2 reification schema DDL

v1.29.0

OBO ontology ingest: load_obo(), load_networkx()

v1.28.0

Lightweight install — base requires only intersystems-irispython
Optional extras: [full], [plaid], [dev], [ml], [visualization], [biodata]

v1.26.0–v1.27.0

PLAID multi-vector retrieval — PLAIDSearch.cls pure ObjectScript + $vectorop
PLAID packed token storage: 53 $Order calls → 1 $Get
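
The final stage of ColBERT-style retrieval, exact MaxSim, is simple to state: for each query token embedding, take the best dot product against any document token embedding, then sum those maxima. The sketch below shows only that scoring step in pure Python; the function names are hypothetical, and the centroid scoring and candidate generation stages that precede it in PLAID are omitted.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Sum over query tokens of the max similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: List[Vector], docs: Dict[str, List[Vector]]) -> List[Tuple[float, str]]:
    """Score every candidate document and sort best-first."""
    scored = [(maxsim(query_tokens, toks), doc_id) for doc_id, toks in docs.items()]
    return sorted(scored, reverse=True)


# Two query tokens; doc "A" covers both, doc "B" covers only the first.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {"A": [[1.0, 0.0], [0.0, 1.0]], "B": [[1.0, 0.0]]}
ranking = rank(query, candidates)
```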

v1.24.0–v1.25.1

VecIndex nprobe recall fix (counts leaf visits, not branch points)
Annoy-style two-means tree splitting (fixes degenerate trees)
Batch APIs: SearchMultiJSON, InsertBatchJSON
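
Two-means splitting picks the separating hyperplane from the data instead of at random: cluster the node's points into two groups, then split along the perpendicular bisector of the two centroids. The sketch below is an approximation of that idea in pure Python. It uses a deterministic farthest-point initialisation and a few Lloyd iterations, which is an assumption made so the example is reproducible; Annoy itself seeds from randomly chosen points, and VecIndex's ObjectScript code is not shown here.

```python
def _dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def two_means(points, iterations=10):
    """Return two centroids; init = first point and the point farthest from it."""
    c0 = list(points[0])
    c1 = list(max(points, key=lambda p: _dist2(p, c0)))
    for _ in range(iterations):
        left = [p for p in points if _dist2(p, c0) <= _dist2(p, c1)]
        right = [p for p in points if _dist2(p, c0) > _dist2(p, c1)]
        if not left or not right:
            break  # degenerate assignment; keep the current centroids
        c0, c1 = _mean(left), _mean(right)
    return c0, c1


def split_plane(points):
    """Hyperplane (normal, offset) bisecting the two centroids."""
    c0, c1 = two_means(points)
    normal = [a - b for a, b in zip(c0, c1)]
    midpoint = [(a + b) / 2 for a, b in zip(c0, c1)]
    offset = -sum(n * m for n, m in zip(normal, midpoint))
    return normal, offset


def side(point, normal, offset) -> bool:
    """True if the point falls on the c0 side of the hyperplane."""
    return sum(n * x for n, x in zip(normal, point)) + offset > 0


pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
normal, offset = split_plane(pts)
sides = [side(p, normal, offset) for p in pts]
```

Splitting between centroids tends to send each cluster to its own child, which avoids the lopsided trees that a poorly placed hyperplane can produce.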

v1.21.0–v1.22.1

VecIndex RP-tree ANN
SearchJSON/InsertJSON — eliminated xecute path (250ms → 4ms)

v1.20.0

Arno acceleration wrappers: khop(), ppr(), random_walk()

v1.19.0

^NKG integer index for Arno acceleration

v1.18.0

FHIR-to-KG bridge: fhir_bridges table, get_kg_anchors(), UMLS MRCONSO ingest

v1.17.0

Cypher named path bindings, CALL subqueries, PPR-guided subgraph

Earlier versions →

License: MIT | Author: Thomas Dyar (thomas.dyar@intersystems.com)

Release history

Version	Release date
1.55.3	Apr 19, 2026
1.55.2 (this version)	Apr 19, 2026
1.55.1	Apr 19, 2026
1.55.0	Apr 19, 2026
1.54.2	Apr 19, 2026
1.54.1	Apr 19, 2026
1.54.0	Apr 19, 2026
1.53.1	Apr 19, 2026
1.53.0	Apr 19, 2026
1.52.1	Apr 19, 2026
1.52.0	Apr 19, 2026
1.51.1	Apr 19, 2026
1.51.0	Apr 19, 2026
1.50.2	Apr 18, 2026
1.50.1	Apr 18, 2026
1.50.0	Apr 18, 2026
1.49.0	Apr 18, 2026
1.48.0	Apr 18, 2026
1.47.1	Apr 11, 2026
1.47.0	Apr 11, 2026
1.46.0	Apr 7, 2026
1.45.3	Apr 5, 2026
1.45.2	Apr 3, 2026
1.45.1	Apr 3, 2026
1.45.0	Apr 3, 2026
1.44.0	Apr 3, 2026
1.43.0	Apr 3, 2026
1.42.0	Apr 3, 2026
1.41.0	Apr 3, 2026
1.40.0	Apr 2, 2026
1.39.0	Apr 2, 2026
1.38.0	Apr 1, 2026
1.37.0	Apr 1, 2026
1.36.0	Mar 31, 2026
1.35.0	Mar 31, 2026
1.34.0	Mar 31, 2026
1.33.0	Mar 31, 2026
1.32.0	Mar 31, 2026
1.31.0	Mar 31, 2026
1.30.0	Mar 31, 2026
1.29.0	Mar 31, 2026
1.28.0	Mar 30, 2026
1.27.0	Mar 29, 2026
1.26.4	Mar 29, 2026
1.26.3	Mar 29, 2026
1.26.2	Mar 29, 2026
1.26.1	Mar 29, 2026
1.26.0	Mar 29, 2026
1.25.1	Mar 29, 2026
1.25.0	Mar 29, 2026
1.24.1	Mar 29, 2026
1.24.0	Mar 29, 2026
1.23.0	Mar 29, 2026
1.22.1	Mar 29, 2026
1.22.0	Mar 29, 2026
1.21.1	Mar 29, 2026
1.21.0	Mar 29, 2026
1.20.2	Mar 29, 2026
1.20.1	Mar 28, 2026
1.20.0	Mar 28, 2026
1.19.2	Mar 28, 2026
1.19.1	Mar 28, 2026
1.19.0	Mar 28, 2026
1.18.0	Mar 28, 2026
1.17.0	Mar 27, 2026
1.16.2	Mar 23, 2026
1.16.1	Mar 19, 2026
1.16.0	Mar 19, 2026
1.15.0	Mar 19, 2026
1.14.1	Mar 19, 2026
1.14.0	Mar 19, 2026
1.13.1	Mar 19, 2026
1.13.0	Mar 19, 2026
1.12.0	Mar 19, 2026
1.11.0	Mar 18, 2026
1.10.2	Mar 18, 2026
1.10.1	Mar 18, 2026
1.10.0	Mar 18, 2026
1.9.0	Feb 28, 2026
1.8.2	Feb 24, 2026
1.8.1	Feb 24, 2026
1.8.0	Feb 24, 2026
1.7.0	Feb 22, 2026
1.6.5	Feb 20, 2026
1.6.4	Feb 20, 2026
1.6.3	Feb 18, 2026
1.6.2	Feb 18, 2026
1.6.1	Feb 18, 2026
1.6.0	Feb 8, 2026
1.5.4	Feb 8, 2026
1.5.3	Feb 8, 2026
1.5.2	Feb 1, 2026
1.5.1	Feb 1, 2026
1.5.0	Feb 1, 2026
1.4.9	Feb 1, 2026
1.4.8	Feb 1, 2026
1.4.7	Feb 1, 2026
1.4.6	Feb 1, 2026
1.4.5	Feb 1, 2026
1.4.4	Feb 1, 2026
1.4.3	Feb 1, 2026
1.4.1	Feb 1, 2026
1.4.0	Feb 1, 2026
1.3.4	Feb 1, 2026
1.3.3	Jan 26, 2026
1.3.2	Jan 26, 2026
1.3.1	Jan 26, 2026
1.3.0	Jan 26, 2026
1.2.0	Jan 26, 2026
1.1.9	Dec 19, 2025
1.1.8	Dec 17, 2025
1.1.7	Nov 23, 2025
1.1.6	Nov 15, 2025
1.1.5	Nov 15, 2025
1.1.4	Nov 9, 2025
1.1.3	Nov 9, 2025
1.1.2	Nov 8, 2025
1.1.1	Nov 8, 2025
1.1.0	Nov 8, 2025
1.0.0	Nov 5, 2025

Download files

Download the file for your platform.

Source Distribution

iris_vector_graph-1.55.2.tar.gz (648.9 kB)

Uploaded Apr 19, 2026 Source

Built Distribution

iris_vector_graph-1.55.2-py3-none-any.whl (143.6 kB)

Uploaded Apr 19, 2026 Python 3

File details

Details for the file iris_vector_graph-1.55.2.tar.gz.

File metadata

Download URL: iris_vector_graph-1.55.2.tar.gz
Upload date: Apr 19, 2026
Size: 648.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for iris_vector_graph-1.55.2.tar.gz
Algorithm	Hash digest
SHA256	`68f54d18dca4603e8549321003eb4d356ba9e54f732933a23b7fe2b9a37052b7`
MD5	`a50ecb480c659350c4dbbcb5ffb40d10`
BLAKE2b-256	`ffbb0e044dac0e74816e4a60ef1cc2cd21c8926c3d8bab6f52356a2174aaa565`

File details

Details for the file iris_vector_graph-1.55.2-py3-none-any.whl.

File metadata

Download URL: iris_vector_graph-1.55.2-py3-none-any.whl
Upload date: Apr 19, 2026
Size: 143.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for iris_vector_graph-1.55.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`267ff62e76174fa707e98694a249b43e47f88c9902f7ba5ceecafa5620e856dc`
MD5	`012286c15725c86e4b2f88a9fb37235d`
BLAKE2b-256	`b79e268beae12ebae9dd5c0cb10fba95e8aaff4f86411ba5e4c5aaa51ca263d9`
