This package contains the toolkit for faiss-instant. It mainly helps to encode texts with Transformers and to build Faiss indexes automatically.

Faiss Instant

Build a Faiss service instantly. Faiss-instant simply loads an existing Faiss index (and the corresponding ID mapping) and provides the search service via POST requests.

New feature: Faiss-instant now also provides a toolkit for encoding texts into embeddings via SBERT models and indexing the embeddings into a Faiss ANN index. To use it, install the toolkit via

pip install faiss-instant

and try this example.

Usage

First, one needs to put the resource files (the ID mapping and the Faiss index, please refer to resources/README.md) under the folder ./resources:

make download  # Downloads example resource files. The example index is an SQ index (QT_8bit_uniform) built on a 10K-document version of the NQ corpus, encoded with dpr-single-nq-base. Other indices are available under https://public.ukp.informatik.tu-darmstadt.de/kwang/faiss-instant/.

Then, one needs to start the faiss-instant service via docker:

docker pull kwang2049/faiss-instant  # Or `make pull`; or `make build` to build the docker image
docker run --detach --rm -it -p 5001:5000 -v ${PWD}/resources:/opt/faiss-instant/resources --name faiss-instant kwang2049/faiss-instant  # Or `make run`; note that this maps the volume ./resources to /opt/faiss-instant/resources in the container

Finally, do the query:

bash query_example.sh  # curl 'localhost:5001/search' -X POST -d '{"k": 5, "vectors":  [[0.31800827383995056, -0.19993115961551666, -0.029884858056902885, ...]]}'

This will return the mappings from document IDs to the corresponding scores:

[{"2426246":106.54305267333984,"4944584":107.05268096923828,"6195536":106.5833511352539,"6398884":107.19760131835938,"8077664":107.86164093017578}]
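
The same query can be issued from Python. The following is a minimal sketch using only the standard library, assuming the service is reachable at localhost:5001 as in the docker command above and that the response has the shape of the example output (one ID-to-score mapping per query vector). The helper names build_search_request and rank_hits are hypothetical and not part of faiss-instant, and the JSON Content-Type header is an assumption borrowed from the reload examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # assumption: port mapping from the docker run example


def build_search_request(vectors, k=5, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /search endpoint."""
    body = json.dumps({"k": k, "vectors": vectors}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},  # assumption, not taken from query_example.sh
        method="POST",
    )


def rank_hits(response_text, descending=True):
    """Turn the JSON response into one sorted list of (doc_id, score) per query."""
    per_query = json.loads(response_text)
    return [
        sorted(hits.items(), key=lambda item: item[1], reverse=descending)
        for hits in per_query
    ]


if __name__ == "__main__":
    # Toy vector for illustration; a real query must match the index dimension.
    request = build_search_request([[0.1, 0.2, 0.3]], k=5)
    # Sending requires a running service:
    # with urllib.request.urlopen(request) as response:
    #     print(rank_hits(response.read().decode("utf-8")))
    print(request.full_url, request.get_method())
```

Whether a higher or a lower score means a better match depends on the metric of the loaded index, so the descending flag is left to the caller.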

Whenever the resources are updated, one needs to reload them:

curl 'localhost:5001/reload' -X GET  # Or `make reload`

Advanced

Multiple Indices

One can keep multiple indices in the resource folder. To load a particular one (actually a pair of files, index_name.index and index_name.txt; here the index name is 'ivf-32-sq-QT_8bit_uniform'):

curl -d '{"index_name":"ivf-32-sq-QT_8bit_uniform", "use_gpu":true}' -H "Content-Type: application/json" -X POST 'http://localhost:5001/reload'

To view the available indices under the resource folder and the index loaded, one can run:

curl -X GET 'http://localhost:5001/index_list'
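
Before starting the container, the same kind of listing can be approximated locally. This is a sketch only, assuming that an index is loadable when both index_name.index and index_name.txt exist in the resource folder, as described above; list_index_names is a hypothetical helper and does not reproduce the output format of /index_list.

```python
import tempfile
from pathlib import Path


def list_index_names(resource_dir):
    """Return names that have both <name>.index and <name>.txt in resource_dir."""
    folder = Path(resource_dir)
    index_names = {path.stem for path in folder.glob("*.index")}
    mapping_names = {path.stem for path in folder.glob("*.txt")}
    return sorted(index_names & mapping_names)


if __name__ == "__main__":
    # Demo on a throwaway folder instead of ./resources.
    with tempfile.TemporaryDirectory() as tmp:
        for filename in ("demo.index", "demo.txt", "orphan.index"):
            (Path(tmp) / filename).touch()
        # "orphan" is skipped because it has no ID mapping file.
        print(list_index_names(tmp))
```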

To load a specified index:

curl -d '{"index_name":"ivf-32-sq-QT_8bit_uniform"}' -H "Content-Type: application/json" -X POST 'http://localhost:5001/reload'
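
The reload calls can also be built in Python. A minimal sketch with the standard library, assuming the same localhost:5001 address as in the curl commands; build_reload_request is a hypothetical helper. It mirrors the two forms shown in this document: a plain GET when no arguments are given, and a JSON POST when index_name or use_gpu is set.

```python
import json
import urllib.request


def build_reload_request(index_name=None, use_gpu=None, base_url="http://localhost:5001"):
    """Build a /reload request: plain GET without arguments, JSON POST otherwise."""
    payload = {}
    if index_name is not None:
        payload["index_name"] = index_name
    if use_gpu is not None:
        payload["use_gpu"] = use_gpu
    url = f"{base_url}/reload"
    if not payload:
        return urllib.request.Request(url, method="GET")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_reload_request("ivf-32-sq-QT_8bit_uniform")
    # Sending requires a running service:
    # urllib.request.urlopen(request)
    print(request.get_method(), request.full_url)
```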

Use GPU

Note that Faiss supports only a subset of the index types on GPU: https://github.com/facebookresearch/faiss/wiki/Faiss-on-the-GPU#implemented-indexes. For PQ, it cannot support a large m such as 384.

One can also use GPU to accelerate the search. To achieve that, one needs to use the GPU version:

docker pull kwang2049/faiss-instant-gpu  # The current image supports only CUDA 10.2 or higher

And then start the GPU-version container:

docker run --runtime=nvidia -e CUDA_VISIBLE_DEVICES=0 --detach --rm -it -p 5001:5000 -v ${PWD}/resources:/opt/faiss-instant/resources --name faiss-instant-gpu kwang2049/faiss-instant-gpu  # Or `make run-gpu`

This will split the index and load it onto all available GPUs (in this example only gpu:0 is used). To load a specified index and place it on GPU, one can run:

curl -d '{"index_name":"ivf-32-sq-QT_8bit_uniform", "use_gpu":true}' -H "Content-Type: application/json" -X POST 'http://localhost:5001/reload'

Reconstruct

To get back the original vector for a given ID, without running a search, run:

curl -X 'GET' 'http://localhost:5001/reconstruct?id=1'  # This example returns the vector by its ID='1'
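
In Python, the ID is best passed through URL encoding so that IDs containing special characters survive the query string. A minimal sketch, assuming the same localhost:5001 address; build_reconstruct_request is a hypothetical helper, and the response format is not shown in this document, so the sketch does not parse it.

```python
import urllib.parse
import urllib.request


def build_reconstruct_request(doc_id, base_url="http://localhost:5001"):
    """Build a GET request for /reconstruct with the ID URL-encoded."""
    query = urllib.parse.urlencode({"id": doc_id})
    return urllib.request.Request(f"{base_url}/reconstruct?{query}", method="GET")


if __name__ == "__main__":
    request = build_reconstruct_request("1")
    # Sending requires a running service:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
    print(request.full_url)
```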

Explain

To compute the similarity score between a given query vector and a support vector by its ID:

bash explain_example.sh
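
The request format used by explain_example.sh is not shown here, so the sketch below does not call the service. It only illustrates, in plain Python, two scores that a Faiss index commonly reports between a query vector and a stored vector: the inner product and the squared L2 distance. Which one applies depends on the metric the index was built with, which is an assumption here; the function names are hypothetical.

```python
def inner_product(query, support):
    """Inner product of two equal-length vectors (higher means more similar)."""
    if len(query) != len(support):
        raise ValueError("vectors must have the same dimension")
    return sum(q * s for q, s in zip(query, support))


def squared_l2(query, support):
    """Squared Euclidean distance of two equal-length vectors (lower means closer)."""
    if len(query) != len(support):
        raise ValueError("vectors must have the same dimension")
    return sum((q - s) ** 2 for q, s in zip(query, support))


if __name__ == "__main__":
    query = [1.0, 2.0]
    support = [3.0, 4.0]  # e.g. a vector obtained from the reconstruct endpoint
    print(inner_product(query, support), squared_l2(query, support))
```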

Philosophy

Faiss-instant provides only the search service and relies on existing Faiss indices. With the volume mapping, there is no need to upload index files to the docker service at all. The result is a minimal and efficient Faiss system for search.

For creating index files (and also benchmarking ANN methods), please refer to kwang2049/benchmarking-ann.

Reference

plippe/faiss-web-service

