
Search through files with FTS5 and vectors, and get reranked results. Fast.


litesearch


Developer Guide

If you are new to nbdev, here are some useful pointers to get you started.

Install litesearch in Development mode

# make sure litesearch package is installed in development mode
$ pip install -e .

# make changes under nbs/ directory
# ...

# compile to have changes apply to litesearch
$ nbdev_prepare

Usage

Installation

Install latest from the GitHub repository:

$ pip install git+https://github.com/Karthik777/litesearch.git

or from PyPI:

$ pip install litesearch

Documentation

Documentation is hosted on this GitHub repository’s pages. Package-manager-specific guidelines are available on conda and PyPI.

Let’s set up some dependencies to make full use of litesearch.

from fastcore.all import *
from fastlite import *
import numpy as np
from litesearch import *

Let’s set up the database. It comes with usearch loaded, so you can run cosine distance calculations using SIMD (which makes them very fast).

db: Database = setup_db(':memory:')
embs = dict(v1=np.ones(512).tobytes(), v2=np.zeros(512).tobytes())
db.q('''select
    distance_cosine_f16(:v1,:v2) as diff,
    distance_cosine_f16(:v1,:v1) as same ''',embs)
[{'diff': 1.0, 'same': 0.0}]
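For reference, here is a minimal pure-Python sketch of what a cosine distance computes. It is not how usearch implements it (usearch uses SIMD kernels over packed bytes); the helper name `cosine_distance` and the handling of zero vectors are assumptions made for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption of this sketch: a zero vector has no direction,
        # so report the same distance as for orthogonal vectors.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

ones, zeros = [1.0] * 512, [0.0] * 512
same = cosine_distance(ones, ones)   # identical direction, distance near 0
diff = cosine_distance(ones, zeros)  # zero vector, distance 1.0 by convention
```

A distance near 0 means the vectors point the same way; larger values mean they are less alike.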

Many more functions are now available. Check out: https://unum-cloud.github.io/USearch/sqlite/index.html

Check out examples/01_simple_rag.ipynb for a full-fledged RAG example.

Let’s create a store and push some content in.

store = db.mk_store()
store.schema
'CREATE TABLE [content] (\n   [id] INTEGER PRIMARY KEY,\n   [content] TEXT NOT NULL,\n   [embedding] BLOB,\n   [metadata] TEXT,\n   [uploaded_at] FLOAT DEFAULT CURRENT_TIMESTAMP\n)'
txts = ['this is a text', "I'm hungry", "Let's play! shall we?"]
embs = [np.full(512, i) for i in range(3)]
rows = [dict(content=t, embedding=e) for t,e in zip(txts,embs)]
store.insert_all(rows)
<Table content (id, content, embedding, metadata, uploaded_at)>
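To make the store's layout concrete, the following sketch creates a table with the same schema using only the standard-library sqlite3 module and stores embeddings as raw bytes in the BLOB column. It does not use litesearch; packing the vectors as 32-bit floats with the array module is an assumption of this example, not necessarily the format litesearch expects.

```python
import sqlite3
from array import array

conn = sqlite3.connect(":memory:")
# Same schema as the one printed by store.schema above.
conn.execute("""CREATE TABLE [content] (
   [id] INTEGER PRIMARY KEY,
   [content] TEXT NOT NULL,
   [embedding] BLOB,
   [metadata] TEXT,
   [uploaded_at] FLOAT DEFAULT CURRENT_TIMESTAMP
)""")

txts = ['this is a text', "I'm hungry", "Let's play! shall we?"]
# One constant 512-dimensional vector per text, serialised to bytes.
rows = [(t, array('f', [float(i)] * 512).tobytes()) for i, t in enumerate(txts)]
conn.executemany("INSERT INTO content (content, embedding) VALUES (?, ?)", rows)

stored = conn.execute(
    "SELECT id, content, length(embedding) FROM content ORDER BY id"
).fetchall()
```

Each row gets an auto-assigned integer id, and the embedding comes back as an opaque byte string that a distance function can read directly.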

Cool, let’s search through this content.

litesearch provides a search method that reranks the results from both FTS and vector search using Reciprocal Rank Fusion (RRF).

You can always turn reranking off.

q,e='playing hungry',np.full(512,1).tobytes()
res = db.search(pre(q), e, columns=['id', 'content'], lim=2)
print(res)
[{'id': 2, 'content': "I'm hungry"}, {'id': 3, 'content': "Let's play! shall we?"}, {'id': 1, 'content': 'this is a text'}]
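As a rough illustration of Reciprocal Rank Fusion, the sketch below fuses two ranked lists of ids by summing 1 / (k + rank) for each id across the lists. This is not litesearch's implementation: the function name `rrf`, the constant k=60 (a commonly used default), and the two input rankings are hypothetical.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for ids it ranks higher.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

fts_hits = [2, 3]     # hypothetical full-text ranking of row ids
vec_hits = [2, 1, 3]  # hypothetical vector-search ranking of row ids
fused = rrf([fts_hits, vec_hits])
```

An id that appears near the top of both lists ends up ahead of one that ranks well in only a single list, which is why RRF works without having to compare raw FTS and vector scores.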
