Skip to main content

High-performance embedded graph database with vector search

Project description

KiteDB for Python

KiteDB is a high-performance embedded graph database with built-in vector search. This package provides the Python bindings to the Rust core.

Features

  • ACID transactions with commit/rollback
  • Node and edge CRUD operations with properties
  • Labels, edge types, and property keys
  • Fluent traversal and pathfinding (BFS, Dijkstra, A*)
  • Vector embeddings with IVF and IVF-PQ indexes
  • Single-file storage format

Install

From PyPI

pip install kitedb

From source

# Install maturin (Rust extension build tool)
python -m pip install -U maturin

# Build and install in development mode
cd ray-rs
maturin develop --features python

# Or build a wheel
maturin build --features python --release
pip install target/wheels/kitedb-*.whl

Quick start (fluent API)

The fluent API provides a high-level, type-safe interface:

from kitedb import ray, node, edge, prop, optional

# Define your schema
User = node("user",
    key=lambda id: f"user:{id}",
    props={
        "name": prop.string("name"),
        "email": prop.string("email"),
        "age": optional(prop.int("age")),
    }
)

Knows = edge("knows", {
    "since": prop.int("since"),
})

# Open database
with ray("./social.kitedb", nodes=[User], edges=[Knows]) as db:
    # Insert nodes
    alice = db.insert(User).values(key="alice", name="Alice", email="alice@example.com").returning()
    bob = db.insert(User).values(key="bob", name="Bob", email="bob@example.com").returning()

    # Create edges
    db.link(alice, Knows, bob, since=2024)

    # Traverse
    friends = db.from_(alice).out(Knows).nodes().to_list()

    # Pathfinding
    path = db.shortest_path(alice).via(Knows).to(bob).dijkstra()

Quick start (low-level API)

For direct control, use the low-level Database class:

from kitedb import Database, PropValue

with Database("my_graph.kitedb") as db:
    db.begin()

    alice = db.create_node("user:alice")
    bob = db.create_node("user:bob")

    name_key = db.get_or_create_propkey("name")
    db.set_node_prop(alice, name_key, PropValue.string("Alice"))
    db.set_node_prop(bob, name_key, PropValue.string("Bob"))

    knows = db.get_or_create_etype("knows")
    db.add_edge(alice, knows, bob)

    db.commit()

    print("nodes:", db.count_nodes())
    print("edges:", db.count_edges())

Fluent traversal

from kitedb import TraverseOptions

friends = db.from_(alice).out(knows).to_list()

results = db.from_(alice).traverse(
    knows,
    TraverseOptions(max_depth=3, min_depth=1, direction="out", unique=True),
).to_list()

Concurrent Access

KiteDB supports concurrent read operations from multiple threads. Read operations don't block each other:

import threading
from concurrent.futures import ThreadPoolExecutor

# Multiple threads can read concurrently
def read_user(key):
    return db.get_node_by_key(key)

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(read_user, f"user:{i}") for i in range(100)]
    results = [f.result() for f in futures]

# Or with asyncio (reads run concurrently)
import asyncio

async def read_users():
    loop = asyncio.get_event_loop()
    tasks = [
        loop.run_in_executor(None, db.get_node_by_key, f"user:{i}")
        for i in range(100)
    ]
    return await asyncio.gather(*tasks)

Concurrency model:

  • Reads are concurrent: Multiple get_node_by_key(), get_neighbors(), traversals, etc. can run in parallel
  • Writes are exclusive: Write operations (create_node(), add_edge(), etc.) require exclusive access
  • Thread safety: The Database object is safe to share across threads

Note: Python's GIL is released during Rust operations, allowing true parallelism for I/O-bound database access.

Vector search

from kitedb import IvfIndex, IvfConfig, SearchOptions

index = IvfIndex(dimensions=128, config=IvfConfig(n_clusters=100))

training_data = [0.1] * (128 * 1000)
index.add_training_vectors(training_data, num_vectors=1000)
index.train()

index.insert(vector_id=1, vector=[0.1] * 128)

results = index.search(
    manifest_json='{"vectors": {...}}',
    query=[0.1] * 128,
    k=10,
    options=SearchOptions(n_probe=20),
)

for result in results:
    print(result.node_id, result.distance)

Documentation

https://ray-kwaf.vercel.app/docs

License

MIT License - see the main project LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kitedb-0.2.2.tar.gz (1.6 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kitedb-0.2.2-cp312-cp312-macosx_11_0_arm64.whl (1.4 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

File details

Details for the file kitedb-0.2.2.tar.gz.

File metadata

  • Download URL: kitedb-0.2.2.tar.gz
  • Upload date:
  • Size: 1.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.9.6

File hashes

Hashes for kitedb-0.2.2.tar.gz
Algorithm Hash digest
SHA256 9979c24c0fe67a7074c5ab3f9acd06deaead7c46897ec6debeea38a547643191
MD5 cb5f98327af13ff8f933dbe9728c52d1
BLAKE2b-256 38a232cf04799257551136406c4a50271ae69df1542a751523d23227557a1e72

See more details on using hashes here.

File details

Details for the file kitedb-0.2.2-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for kitedb-0.2.2-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 fbe67a7b1831f014dfa003898c02ad3b2cc425c7c585a77ef17f965159914419
MD5 8f8972ff94fba3f818891bf2f7c46b84
BLAKE2b-256 71ce7ea85d54a608009f5232fc53327a12a80193716de0bcea816034284f43a5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page