Search through files with FTS5 and vector search, and get reranked results. Fast.
litesearch
Developer Guide
If you are new to using nbdev, here are some useful pointers to get you
started.
Install litesearch in Development mode
# make sure litesearch package is installed in development mode
$ pip install -e .
# make changes under nbs/ directory
# ...
# compile to have changes apply to litesearch
$ nbdev_prepare
Usage
Installation
Install latest from the GitHub repository:
$ pip install git+https://github.com/Karthik777/litesearch.git
or from conda:
$ conda install -c Karthik777 litesearch
or from PyPI:
$ pip install litesearch
Documentation
Documentation is hosted on this GitHub repository’s pages. Package-manager-specific guidelines are available on conda and PyPI.
Let’s set up some dependencies to make full use of litesearch:
from fastcore.all import *
from fastlite import *
import numpy as np
from litesearch import *
Let’s set up the database. It has the usearch extension loaded, so you can run cosine distance calculations with SIMD acceleration (which means fast, really fast):
db: Database = setup_db(':memory:')
embs = dict(v1=np.ones(512).tobytes(), v2=np.zeros(512).tobytes())
db.q('''select
distance_cosine_f16(:v1,:v2) as diff,
distance_cosine_f16(:v1,:v1) as same ''',embs)
[{'diff': 1.0, 'same': 0.0}]
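To make the numbers above easier to read, here is a minimal NumPy sketch of cosine distance (1 minus cosine similarity) over vectors stored as raw bytes. The helper name `cosine_distance` is hypothetical and is not part of litesearch or usearch. The sketch assumes the blobs hold float16 values and that a zero vector is reported at distance 1.0, as the `diff` value above suggests; the extension's exact edge-case handling may differ.

```python
import numpy as np

def cosine_distance(a: bytes, b: bytes, dtype=np.float16) -> float:
    """Cosine distance (1 - cosine similarity) between two byte-encoded vectors."""
    # Decode the blobs back into vectors; upcast to float32 to limit rounding error.
    x = np.frombuffer(a, dtype=dtype).astype(np.float32)
    y = np.frombuffer(b, dtype=dtype).astype(np.float32)
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    # Assumption: a zero vector has no direction, so report a distance of 1.0.
    if nx == 0 or ny == 0:
        return 1.0
    return float(1.0 - np.dot(x, y) / (nx * ny))

# Encode the vectors explicitly as float16 before turning them into bytes.
ones = np.ones(512, dtype=np.float16).tobytes()
zeros = np.zeros(512, dtype=np.float16).tobytes()

diff = cosine_distance(ones, zeros)  # all-ones vs. zero vector
same = cosine_distance(ones, ones)   # a vector against itself
print(diff, same)
```

Identical vectors land at (approximately) zero distance, while orthogonal vectors land at 1.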
Many more functions are now available. Check out: https://unum-cloud.github.io/USearch/sqlite/index.html
Check out examples/01_simple_rag.ipynb for a full-fledged RAG
example.
Let’s create a store and push some content in.
store = db.mk_store()
store.schema
'CREATE TABLE [content] (\n [id] INTEGER PRIMARY KEY,\n [content] TEXT NOT NULL,\n [embedding] BLOB,\n [metadata] TEXT,\n [uploaded_at] FLOAT DEFAULT CURRENT_TIMESTAMP\n)'
txts = ['this is a text', "I'm hungry", "Let's play! shall we?"]
embs = [np.full(512, i) for i in range(3)]
rows = [dict(content=t, embedding=e) for t,e in zip(txts,embs)]
store.insert_all(rows)
<Table content (id, content, embedding, metadata, uploaded_at)>
Cool, let’s search through this content.
litesearch provides a search method that reranks the results from both FTS and vector search using Reciprocal Rank Fusion (RRF).
Reranking is optional; you can always turn it off.
q,e='playing hungry',np.full(512,1).tobytes()
res = db.search(pre(q), e, columns=['id', 'content'], lim=2)
print(res)
[{'id': 2, 'content': "I'm hungry"}, {'id': 3, 'content': "Let's play! shall we?"}, {'id': 1, 'content': 'this is a text'}]
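The fusion step mentioned above can be sketched in plain Python. This is a generic illustration of Reciprocal Rank Fusion, not litesearch's actual implementation: the function name `rrf`, the constant `k=60` (a commonly used default), and the two input rankings are all assumptions made for the example.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse ranked id lists: each id scores the sum of 1 / (k + rank) over all lists."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; an id missing from a list simply adds nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    return sorted(scores, key=lambda d: -scores[d])

fts_hits = [2, 3]  # hypothetical top-2 ids from full-text search
vec_hits = [2, 1]  # hypothetical top-2 ids from vector search
fused = rrf([fts_hits, vec_hits])
print(fused)
```

An id that appears near the top of both lists accumulates the largest score. In this sketch, fusing two top-2 lists can return more than two distinct ids, because the result is the union of both lists.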