Store multidimensional vectors and quickly search nearest neighbors.

LuxDB

This is a simple database for multidimensional vectors. It adds persistence and asyncio-based network access to Hnswlib. The project contains the server and a simple client.

LuxDB is still under development: there will be breaking changes, and you will lose data if this database is your only copy of it. Don't use it for anything that you want to keep.

TODO

  • Sane storage backend (not pickle) (Might still need some polishing)
  • Language agnostic transport layer
  • Performance?
  • Rollbacks, transactions, ...
  • Authentication

(Lack of) Features

Persistence is achieved with ZODB; each index is stored separately in an OOBTree. The store can be created with a path, in which case a FileStorage is created there. You can also pass a Storage to the constructor of the store. For testing you can omit both path and storage, in which case the data is kept in memory only.

There is no authentication. You need to provide it through a proxy, or make sure that only trusted clients can reach the database.
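
One way to restrict access to trusted clients is a small TCP forwarder in front of the server that drops connections from unknown addresses. The following is only a sketch under assumptions: it filters by client IP (which is not authentication), and the networks, the listen address and the helper names are hypothetical values chosen for illustration. Only the port 8484 is taken from the usage examples.

```python
import asyncio
import ipaddress

# Hypothetical values, adjust them to your deployment.
ALLOWED_NETWORKS = [
    ipaddress.ip_network('127.0.0.0/8'),
    ipaddress.ip_network('10.0.0.0/8'),
]
UPSTREAM = ('127.0.0.1', 8484)  # where luxdb-server is assumed to listen
LISTEN = ('0.0.0.0', 9494)      # where clients connect to the forwarder


def is_allowed(peer_ip, networks=ALLOWED_NETWORKS):
    """Return True if peer_ip belongs to one of the allowed networks."""
    address = ipaddress.ip_address(peer_ip)
    return any(address in network for network in networks)


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle(client_reader, client_writer):
    """Forward an allowed client to the upstream server, drop all others."""
    peer_ip = client_writer.get_extra_info('peername')[0]
    if not is_allowed(peer_ip):
        client_writer.close()
        return
    upstream_reader, upstream_writer = await asyncio.open_connection(*UPSTREAM)
    await asyncio.gather(
        pipe(client_reader, upstream_writer),
        pipe(upstream_reader, client_writer),
        return_exceptions=True,
    )


async def main():
    server = await asyncio.start_server(handle, *LISTEN)
    async with server:
        await server.serve_forever()
```

Start the forwarder with asyncio.run(main()) and bind the LuxDB server itself to 127.0.0.1 so that it is reachable only through the forwarder.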

The feature set is therefore small: creating indexes, adding items, searching for nearest neighbors in the indexes, and storing the indexes on the file system.

Usage

Start the server, either with Docker:

docker run -p 8484:8484 registry.gitlab.com/sacovo/luxdb
docker run -p 8484:8484 registry.gitlab.com/sacovo/luxdb --loglevel=info --port 8484 --host 0.0.0.0

Or directly (after installing the dependencies in requirements.txt):

./luxdb-server --port 8484 --loglevel debug path/to/storage.db

The Docker container stores the database in /data/, so mount a volume or a host directory there to keep the data when the container is removed.
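
Mounting a host directory at /data/ could look like the following sketch, which builds the docker run command from Python. The directory name luxdb-data and the helper docker_command are assumptions made for illustration; the image name, the port and the /data path come from the text above, and -v is Docker's standard bind-mount flag.

```python
import shlex
from pathlib import Path

IMAGE = 'registry.gitlab.com/sacovo/luxdb'


def docker_command(data_dir, host_port=8484):
    """Build a `docker run` command that bind-mounts data_dir at /data."""
    host_path = Path(data_dir).resolve()
    return [
        'docker', 'run',
        '-p', f'{host_port}:8484',   # host port -> container port
        '-v', f'{host_path}:/data',  # host directory -> container /data
        IMAGE,
    ]


# 'luxdb-data' is a hypothetical directory name.
command = docker_command('luxdb-data')
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run(command, check=True) to start the container.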

The snippets contain example configurations and code that show how to use the client.

You can then use the client to connect to the server and add or retrieve data.

import asyncio

import numpy as np

from luxdb.client import connect


async def main(host='localhost', port=8484, max_elements=1000):
    # Connect to the server (host, port and max_elements are example values)
    async with connect(host, port) as client:
        name = 'my-index'
        # Create an index for 12 dimensional vectors
        await client.create_index(name, 'l2', 12)
        # Initialize the index with room for max_elements vectors
        await client.init_index(name, max_elements)
        # Add some data
        data = np.float32(np.random.random((1000, 12)))
        labels = np.arange(1000)
        await client.add_items(name, data, labels)
        # Search the nearest neighbors of data[0]
        found, distances = await client.query_index(name, data[0], k=5)
        # Or the nearest neighbors of all elements
        found, distances = await client.query_index(name, data, k=2)


asyncio.run(main())

For more usage examples, you can check the tests in tests/test_client.py.
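
Because Hnswlib searches approximately, it can help to have an exact reference to compare query results against. The following is a minimal sketch in plain Python that does not use LuxDB at all. It assumes that the 'l2' space reports squared Euclidean distances, as Hnswlib does, and the helper names are made up for illustration.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(data, labels, query, k):
    """Return labels and distances of the k vectors closest to query."""
    scored = ((squared_l2(vec, query), label) for vec, label in zip(data, labels))
    nearest = heapq.nsmallest(k, scored)  # sorted by distance, closest first
    found = [label for _, label in nearest]
    distances = [dist for dist, _ in nearest]
    return found, distances


# Same shape as the client example: 1000 random 12 dimensional vectors.
random.seed(0)
data = [[random.random() for _ in range(12)] for _ in range(1000)]
labels = list(range(1000))

# The closest neighbor of data[0] is data[0] itself, at distance 0.
found, distances = brute_force_knn(data, labels, data[0], k=5)
```

Comparing found with the labels returned by the server for the same query gives a rough idea of the recall of the index.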

Project structure

The project consists of a wrapper around a collection of hnswlib.Index objects, a server that performs modifications and lookups, and a client. Communication between the server and the client happens through Command objects.
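
To illustrate the idea of command objects, here is a small sketch of the pattern. It is not LuxDB's actual code: the class names, the fields and the dictionary-based store are hypothetical, and the real Command objects and the way they are sent over the network may look different.

```python
from dataclasses import dataclass


class InMemoryStore:
    """Stand-in for the index collection; keeps raw vectors in dicts."""

    def __init__(self):
        self.indexes = {}


@dataclass
class CreateIndex:
    """Hypothetical command that registers a new, empty index."""
    name: str
    space: str
    dim: int

    def execute(self, store):
        if self.name in store.indexes:
            raise KeyError(f'index {self.name!r} already exists')
        store.indexes[self.name] = {'space': self.space, 'dim': self.dim, 'items': {}}
        return True


@dataclass
class AddItems:
    """Hypothetical command that adds labeled vectors to an index."""
    name: str
    data: list
    labels: list

    def execute(self, store):
        index = store.indexes[self.name]
        for vector, label in zip(self.data, self.labels):
            if len(vector) != index['dim']:
                raise ValueError('vector has the wrong dimension')
            index['items'][label] = vector
        return len(index['items'])


def dispatch(store, command):
    """What a server could do with each command it receives."""
    return command.execute(store)


store = InMemoryStore()
created = dispatch(store, CreateIndex('my-index', 'l2', 3))
count = dispatch(store, AddItems('my-index', [[0, 0, 0], [1, 1, 1]], [10, 11]))
```

In this design the client only builds and sends commands, while all changes to the indexes happen in one place on the server.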

Download files

Source Distribution

luxdb-0.0.4.tar.gz (14.1 kB)

Uploaded Source

Built Distribution

luxdb-0.0.4-py3-none-any.whl (15.5 kB)

Uploaded Python 3

File details

Details for the file luxdb-0.0.4.tar.gz.

File metadata

  • Download URL: luxdb-0.0.4.tar.gz
  • Upload date:
  • Size: 14.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.6

File hashes

Hashes for luxdb-0.0.4.tar.gz
Algorithm Hash digest
SHA256 8756b62763bb43e5c26bcd55e754047e5d61ee4106424ca43fd1f33cb9710fb6
MD5 e9e11d6971e6fb2a9c252ee1e2efacce
BLAKE2b-256 f34229547bdb2bb5687ac91e0828d7187a93628db15261f321563da561b2b147

File details

Details for the file luxdb-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: luxdb-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 15.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.6

File hashes

Hashes for luxdb-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 a42dd4d0e21ad49ba6c1fc5a19b72c82544fb180c4f58e3a389d80a015082d1a
MD5 68f1676dd9e40d40ad36295f890c5d8b
BLAKE2b-256 ae3a3b6ce4028f6e5ed6aae23e36f0df471ab33a8ba423f1d13e328901491ba3