
A similarity search library

simsity

Simsity is a Super Simple Similarities Service[tm].
It's all about building a neighborhood. Literally!

This simple library is partially inspired by this blogpost by Max Woolf. You don't always need a full-fledged vector database. Polars and numpy might be all you need. And for those moments, simsity is all you need to build a neighborhood!
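To make the "numpy might be all you need" point concrete, here is a minimal sketch of a brute-force nearest-neighbour search over a small matrix of vectors. It is not simsity's implementation. The function name top_k_cosine, the toy texts and the 2-d vectors are all made up for illustration, and the sketch assumes no vector is all zeros.

```python
import numpy as np


def top_k_cosine(query_vec, matrix, k=3):
    """Return (indices, distances) of the k rows of `matrix` closest to `query_vec`.

    Hypothetical helper for illustration; assumes no zero-length vectors.
    """
    # Normalise rows so that a dot product equals cosine similarity.
    matrix_norm = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    query_norm = query_vec / np.linalg.norm(query_vec)
    # Cosine distance = 1 - cosine similarity; smaller means more similar.
    dists = 1.0 - matrix_norm @ query_norm
    order = np.argsort(dists)[:k]
    return order, dists[order]


# Toy 2-d "embeddings" for three made-up texts.
texts = ["pork chops", "pork stew", "lemon cake"]
vectors = np.array([[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]])

idx, dists = top_k_cosine(np.array([1.0, 0.0]), vectors, k=2)
nearest = [texts[i] for i in idx]
```

Every query compares against every stored vector, which is usually fine for medium-sized datasets and is exactly the kind of workload where a dedicated vector database can be overkill.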

Install

You can install simsity via pip. The command below uses uv's pip interface; a plain pip install simsity works the same way.

uv pip install simsity

The goal of simsity is to be minimal, to make rapid prototyping very easy, and to be "just enough" for medium-sized datasets. You will mainly interact with these two functions.

from simsity import create_index, load_index

As their names imply, you can use these to create an index or to load one from disk.

Quickstart

from simsity import create_index, load_index
from simsity.datasets import fetch_recipes

# Let's fetch some demo data
recipes = fetch_recipes()["text"].to_list()

# Let's use model2vec for embeddings 
from model2vec import StaticModel
model = StaticModel.from_pretrained("minishlab/potion-base-8M")

# Populate the ANN vector index and use it. 
index = create_index(recipes, model.encode)
texts, dists = index.query("pork")

# You can also query using vectors
v_pork = model.encode(["pork"])[0]
texts, dists = index.query_vector(v_pork)
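In the quickstart the encoder is model.encode, which is called with a list of strings and returns one vector per string. Assuming create_index accepts any callable of that shape (the source does not state this explicitly), a dependency-free stand-in can be handy for quick experiments. The sketch below is hypothetical: toy_encoder and its dim parameter are made-up names, and hashed character trigrams are a crude substitute for real embeddings.

```python
import zlib

import numpy as np


def toy_encoder(texts, dim=64):
    """Hypothetical stand-in encoder: list of strings -> array of shape (len(texts), dim)."""
    out = np.zeros((len(texts), dim), dtype=np.float32)
    for row, text in enumerate(texts):
        # Pad so that short words still produce leading and trailing trigrams.
        padded = f"  {text.lower()} "
        for i in range(len(padded) - 2):
            trigram = padded[i : i + 3]
            # crc32 is stable across runs, so the same text always gets the same vector.
            bucket = zlib.crc32(trigram.encode("utf-8")) % dim
            out[row, bucket] += 1.0
    return out


vectors = toy_encoder(["pork", "pork chops", "lemon cake"])
```

If the assumption about the encoder contract holds, toy_encoder could be passed in place of model.encode in the quickstart. Expect much weaker neighbours than with a real embedding model.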

You can also pass a path to create_index. The index is then stored on disk, and load_index can load it again later.

# Make an index with a path (the encoder is model.encode from the quickstart)
encoder = model.encode
index = create_index(recipes, encoder, path="demo")

# Load an index from a path
reloaded_index = load_index(path="demo", encoder=encoder)
texts, dists = reloaded_index.query("pork")
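Note that load_index takes the encoder again, presumably because query strings still have to be embedded at query time and a Python callable is not something that gets written to disk. The sketch below illustrates that idea with a hypothetical on-disk layout. The save_store and load_store helpers and the file names texts.json and vectors.npy are assumptions for illustration, not simsity's actual storage format.

```python
import json
import tempfile
from pathlib import Path

import numpy as np


def save_store(path, texts, vectors):
    """Write texts and vectors side by side into a folder (hypothetical layout)."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "texts.json").write_text(json.dumps(texts), encoding="utf-8")
    np.save(folder / "vectors.npy", vectors)


def load_store(path):
    """Read back what save_store wrote; the encoder must be supplied separately."""
    folder = Path(path)
    texts = json.loads((folder / "texts.json").read_text(encoding="utf-8"))
    vectors = np.load(folder / "vectors.npy")
    return texts, vectors


# Round trip through a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    save_store(Path(tmp) / "demo", ["pork chops", "lemon cake"], np.eye(2))
    texts_back, vectors_back = load_store(Path(tmp) / "demo")
```

Only data ends up on disk in this sketch, which is why a reload step needs the embedding function handed to it again.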

That's it! Happy hacking!
