A similarity search library
simsity
Simsity is a Super Simple Similarities Service[tm].
It's all about building a neighborhood. Literally!
This simple library is partially inspired by a blog post by Max Woolf. You don't always need a full-fledged vector database; Polars and NumPy might be all you need. For those moments, simsity is all you need to build a neighborhood!
Install
You can install simsity from PyPI. The command below uses uv's pip interface; plain pip works as well.
uv pip install simsity
The goal of simsity is to be minimal, to make rapid prototyping easy, and to be "just enough" for medium-sized datasets. You will mainly interact with these two functions.
from simsity import create_index, load_index
As their names imply, you can use these to create an index or to load one from disk.
Quickstart
from simsity import create_index, load_index
from simsity.datasets import fetch_recipes
# Let's fetch some demo data
recipes = fetch_recipes()["text"].to_list()
# Let's use model2vec for embeddings
from model2vec import StaticModel
model = StaticModel.from_pretrained("minishlab/potion-base-8M")
# Populate the ANN vector index and use it.
index = create_index(recipes, model.encode)
texts, dists = index.query("pork")
# You can also query using vectors
v_pork = model.encode(["pork"])[0]
texts, dists = index.query_vector(v_pork)
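To make the vector query more concrete, here is a minimal sketch of a brute-force nearest-neighbor lookup in plain NumPy. It is not simsity's implementation: the helper names (toy_encode, nearest_texts), the bag-of-words embedding, and the choice of cosine distance are all assumptions made for illustration.

```python
import numpy as np

docs = ["pork chops with apple", "vegan lentil soup", "slow roasted pork belly"]

# Hypothetical toy vocabulary built from the demo documents themselves.
vocab = {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d.split()}))}


def toy_encode(texts):
    """Stand-in encoder: one bag-of-words row per text (not a real embedding model)."""
    out = np.zeros((len(texts), len(vocab)), dtype=np.float32)
    for row, text in enumerate(texts):
        for token in text.lower().split():
            if token in vocab:
                out[row, vocab[token]] += 1.0
    return out


def nearest_texts(vectors, texts, v, n=3):
    """Return the n texts closest to v by cosine distance (assumed metric)."""
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(v)
    sims = (vectors @ v) / np.maximum(norms, 1e-12)  # guard against zero vectors
    dists = 1.0 - sims
    order = np.argsort(dists)[:n]
    return [texts[i] for i in order], dists[order]


doc_vectors = toy_encode(docs)
found, found_dists = nearest_texts(doc_vectors, docs, toy_encode(["pork"])[0], n=2)
```

Scanning every stored vector like this is linear in the number of documents, which is usually acceptable for the medium-sized datasets this library targets.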
You can also provide a path and then you'll be able to store/load everything.
# Make an index with a path (encoder is any embedding callable, e.g. model.encode from above)
encoder = model.encode
index = create_index(recipes, encoder, path="demo")
# Load an index from a path
reloaded_index = load_index(path="demo", encoder=encoder)
texts, dists = reloaded_index.query("pork")
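What ends up under the path is not described here. As a rough mental model only, the sketch below writes texts and their vectors to a folder and reads them back with NumPy and JSON. The file names (vectors.npy, texts.json) and the helpers save_index and load_saved_index are hypothetical and do not describe simsity's actual on-disk format.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_index(path, texts, vectors):
    """Write texts and vectors to a folder (hypothetical layout)."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    np.save(folder / "vectors.npy", vectors)
    (folder / "texts.json").write_text(json.dumps(texts), encoding="utf-8")


def load_saved_index(path):
    """Read back what save_index wrote."""
    folder = Path(path)
    vectors = np.load(folder / "vectors.npy")
    texts = json.loads((folder / "texts.json").read_text(encoding="utf-8"))
    return texts, vectors


demo_texts = ["pork chops", "lentil soup"]
demo_vectors = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)

# Round-trip through a temporary folder so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    save_index(Path(tmp) / "demo", demo_texts, demo_vectors)
    loaded_texts, loaded_vectors = load_saved_index(Path(tmp) / "demo")
```

Note that load_index above takes the encoder again, which suggests the callable itself is not written to disk; only data like the texts and vectors in this sketch would be.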
That's it! Happy hacking!
Download files
File details
Details for the file simsity-0.8.0.tar.gz.
File metadata
- Download URL: simsity-0.8.0.tar.gz
- Upload date:
- Size: 334.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.27
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b66f580958735d893b29b1b823783487412bced471e5b9c59513e7ce6d6357be |
| MD5 | 8796b9ff5e33e429d425fb3b1b227287 |
| BLAKE2b-256 | bd469cf9435104849ff579d05dbd3c5105f1959fa0f346f1933d186b69c5f44f |
File details
Details for the file simsity-0.8.0-py3-none-any.whl.
File metadata
- Download URL: simsity-0.8.0-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.27
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fd69ed26384beeb74ee46716e62c10da98fde23c3064dc4b4db3b07008cf6eb1 |
| MD5 | 89adf852159c2dbd2890ba337d4fc801 |
| BLAKE2b-256 | a63a6a9b9b673087b55b43897365218b662973a2697c266f007178d85b29bf79 |
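If you download one of these files by hand, the listed digests can be checked locally. The following is a small sketch using Python's standard hashlib module; the helper names sha256_of and matches are illustrative, and the demo runs on a throwaway file rather than on the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file so the example runs without any download.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"simsity")
    expected = hashlib.sha256(b"simsity").hexdigest()
    ok = matches(sample, expected)
    bad = matches(sample, "0" * 64)
```

For a real check, pass the path of the downloaded simsity-0.8.0.tar.gz or wheel together with the SHA256 value listed for that file.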