Python bindings for ArrowSpace (Rust) providing graph-based similarity search, signal graphs, and spectral methods for vector data.

pyarrowspace

Python bindings for arrowspace-rs. This is experimental software, currently intended for research use.

This is the starting repository for arrowspace. It is made public to showcase the Python interface, to collect feedback, and to publish some results of the tests run. Running it from this repository requires the arrowspace-rs Rust module in a sibling directory.

For labs and tests, see the tests/ directory.

Installation

From PyPI:

pip install arrowspace

or any other way of installing a Python library.

If you have cargo installed, you can compile the libraries involved (from crates.io):

pip install "maturin[patchelf]"
maturin develop
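
After either install route, a quick way to confirm that the package is visible to the current interpreter is to ask the import system for it. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the module name `arrowspace` is taken from the import in the example further down.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# `arrowspace` is the module name used by `from arrowspace import ...`
print("arrowspace installed:", is_installed("arrowspace"))
```

If this prints `False` after `maturin develop`, the build most likely went into a different virtual environment than the one running the check.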

Tests

Simple test:

python tests/test_0.py

Test with public QA dataset:

python tests/test_1_quora_questions.py

There are other tests, but they require downloading a dataset separately or fine-tuning the embeddings on a given dataset. Give it a try and let me know!

Simplest Example

from arrowspace import ArrowSpaceBuilder
import numpy as np

items: np.ndarray = np.array(
    [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.9, 0.1, 0.0]],
    dtype=np.float64
)

graph_params: dict = {
    "eps": 1.0,
    "k": 6,
    "topk": 3,
    "p": 2.0,
    "sigma": 1.0,
}

# Create an ArrowSpace instance, returning the computed
# signal graph and lambdas
aspace, gl = ArrowSpaceBuilder.build(graph_params, items)

# Search comparable items
# defaults: k = nitems, alpha = 0.9, beta = 0.1
query: np.ndarray = np.array(
    [0.05, 0.2, 0.25],
    dtype=np.float64
)

tau: float = 1.0
hits: list = aspace.search(query, gl, tau)

# Search returns a list of `(index, score)` tuples. For the
# code above, the expected result has index 0 with the top
# score, i.e., it is the nearest item.

print(hits)
# [ (0, 0.989743318610787), (1, 0.7565344158360029), (2, 0.22151940739207396) ]
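
Since the search result is a plain list of `(index, score)` tuples, it can be post-processed with ordinary Python. The sketch below is not part of the arrowspace API: it hard-codes the `hits` printed above so that it runs without the package installed, and the helper `top_items` and its `min_score` threshold of 0.5 are hypothetical names and values chosen for illustration.

```python
import numpy as np

# Same item vectors as in the example above
items = np.array(
    [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.9, 0.1, 0.0]],
    dtype=np.float64,
)

# Hits copied from the printed output above (not recomputed here)
hits = [
    (0, 0.989743318610787),
    (1, 0.7565344158360029),
    (2, 0.22151940739207396),
]


def top_items(hits, items, min_score=0.5):
    """Keep hits scoring at least `min_score`, best first, paired with their vectors."""
    kept = [(i, s) for i, s in hits if s >= min_score]
    kept.sort(key=lambda h: h[1], reverse=True)
    return [(i, s, items[i]) for i, s in kept]


for idx, score, vec in top_items(hits, items):
    print(idx, f"{score:.3f}", vec.tolist())
```

Pairing each index with its original vector makes it easier to inspect why an item ranked where it did. A suitable threshold depends on the data and on the `tau` value used in the search.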
