Python bindings for ArrowSpace (Rust) providing graph-based similarity search, signal graphs, and spectral methods for vector data.

pyarrowspace

Python bindings for arrowspace-rs.

arrowspace is a vector database backed by a graph representation and a key-value store. Its main target use cases are AI search (advanced vector similarity), graph characterisation analysis and search, and indexing of high-dimensional vectors. The design principles are described in this article.

For labs and tests, see the tests/ directory.

Installation

From PyPI:

pip install arrowspace

or use any other Python package installer.

If you have cargo installed, you can compile from source and use the package locally:

# quote the extra so shells such as zsh do not expand the brackets
pip install "maturin[patchelf]"
# quick (debug) build
maturin develop
# release build, needed for large datasets
maturin develop --release
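
After either install route, a quick way to confirm that the package is visible to the current interpreter is to query its installed version. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `arrowspace` is taken from the `pip install` command above.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("arrowspace"))
```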

Tests

Simple test:

python tests/test_0.py

Test with public QA dataset:

python tests/test_1_quora_questions.py

There are other tests, but they require downloading a dataset separately or fine-tuning the embeddings on a given dataset. Give it a try and let me know!

Simplest Example

from arrowspace import ArrowSpaceBuilder
import numpy as np

# np.ndarray is the array type; np.array is the constructor function
items: np.ndarray = np.array(
    [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.9, 0.1, 0.0]],
    dtype=np.float64
)

graph_params: dict = {
    "eps": 1.0,
    "k": 6,
    "topk": 3,
    "p": 2.0,
    "sigma": 1.0,
}

# Create an ArrowSpace instance, returning the computed
# signal graph and lambdas
aspace, gl = ArrowSpaceBuilder().build(graph_params, items)

# Search comparable items
# defaults: k = nitems, alpha = 0.9, beta = 0.1
query: np.ndarray = np.array(
    [0.05, 0.2, 0.25],
    dtype=np.float64
)

tau: float = 1.0
hits: list = aspace.search(query, gl, tau)

# Search returns a list of `(index, score)` tuples. For the
# code above, the expected result shows index 0 with the
# top score, i.e., it is the nearest item.

print(hits)
# [ (0, 0.989743318610787), (1, 0.7565344158360029), (2, 0.22151940739207396) ]
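
As a sanity check, the result can be compared against a plain NumPy cosine-similarity baseline. For this three-item example the scores printed above appear to coincide with ordinary cosine similarity between the query and each item. This is an observation about this toy input only, not a statement about how arrowspace computes scores in general. The sketch below does not use arrowspace; the helper name `cosine_hits` is hypothetical.

```python
import numpy as np

# Same toy data as in the example above.
items = np.array(
    [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.9, 0.1, 0.0]],
    dtype=np.float64,
)
query = np.array([0.05, 0.2, 0.25], dtype=np.float64)


def cosine_hits(items: np.ndarray, query: np.ndarray) -> list:
    """Return (index, cosine similarity) pairs, best match first."""
    # Dot product of every row with the query, divided by the product of norms.
    scores = items @ query / (np.linalg.norm(items, axis=1) * np.linalg.norm(query))
    # Indices ordered by descending score.
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]


baseline = cosine_hits(items, query)
print(baseline)
```

If the `hits` list from `aspace.search` is available, comparing it element by element with `baseline` (for example with `np.allclose` on the scores) shows whether the two rankings agree on your own data.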
