An integration of Qdrant ANN vector database backend with txtai

qdrant-txtai

txtai simplifies building AI-powered semantic search applications using Transformers. It leverages neural embeddings and their properties to encode high-dimensional data in a lower-dimensional space, making it possible to find similar objects based on the proximity of their embeddings.

Implementing such applications in real-world use cases requires storing the embeddings efficiently, namely in a vector database like Qdrant. It offers not only a powerful engine for neural search, but also allows setting up a whole cluster if your data no longer fits on a single machine. It is production-grade and can be launched easily with Docker.

Combining the simplicity of txtai with Qdrant's performance enables you to build production-ready semantic search applications much faster than before.

Installation

The library can be installed with pip as follows:

pip install qdrant-txtai

Usage

Running the txtai application with Qdrant as the vector storage requires launching a Qdrant instance. That can be done easily with Docker:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant:latest

The txtai application can be configured either programmatically or by providing the configuration in a YAML file.

Programmatically

from txtai.embeddings import Embeddings

embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "qdrant_txtai.ann.qdrant.Qdrant",
})
embeddings.index([(0, "Correct", None), (1, "Not what we hoped", None)])
result = embeddings.search("positive", 1)
print(result)
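With no content storage configured, search returns a list of (id, score) tuples ordered by descending similarity. A stdlib-only sketch of post-processing such results (the scores below are made up for illustration, not real model output):

```python
# Illustrative result shape: (id, score) tuples, best match first
# (these values are invented for this sketch).
results = [(0, 0.65), (1, 0.12)]

# Map ids back to the documents indexed earlier and keep only
# hits above a similarity threshold.
documents = {0: "Correct", 1: "Not what we hoped"}
hits = [(documents[uid], score) for uid, score in results if score >= 0.5]
print(hits)  # [('Correct', 0.65)]
```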

Via YAML configuration

# app.yml
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant

Then start the API with that configuration and query it:

CONFIG=app.yml uvicorn "txtai.api:app"
curl -X GET "http://localhost:8000/search?query=positive"
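The same request can be issued from Python; a small stdlib sketch that builds the query URL used by the curl call above (the actual fetch is commented out, since it needs the API server running):

```python
from urllib.parse import urlencode

# Build the search request URL against the txtai API
# (localhost:8000 is the uvicorn default used above).
base = "http://localhost:8000/search"
url = f"{base}?{urlencode({'query': 'positive'})}"
print(url)  # http://localhost:8000/search?query=positive

# With a running instance, the results could then be fetched with:
# import urllib.request, json
# results = json.load(urllib.request.urlopen(url))
```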

Configuration properties

qdrant-txtai allows you to configure both the connection details and some internal properties of the vector collection, which may impact both speed and accuracy. Please refer to the Qdrant docs if you are interested in the meaning of each property.

The example below presents all the available options when connecting to a Qdrant server:

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant
  metric: l2 # allowed values: l1 / l2 / cosine / ip
  qdrant:
    url: qdrant.host
    port: 6333
    grpc_port: 6334
    prefer_grpc: true
    collection: CustomCollectionName
    https: true # for Qdrant Cloud
    api_key: XYZ # for Qdrant Cloud
    search_params:
      hnsw_ef: 100
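Since txtai accepts its configuration as a plain dictionary, the YAML above maps directly to a programmatic equivalent. A sketch mirroring the same keys (the values are the placeholders from the example, not real credentials):

```python
# Programmatic equivalent of the YAML configuration above.
config = {
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "qdrant_txtai.ann.qdrant.Qdrant",
    "metric": "l2",  # allowed values: l1 / l2 / cosine / ip
    "qdrant": {
        "url": "qdrant.host",
        "port": 6333,
        "grpc_port": 6334,
        "prefer_grpc": True,
        "collection": "CustomCollectionName",
        "https": True,     # for Qdrant Cloud
        "api_key": "XYZ",  # for Qdrant Cloud (placeholder value)
        "search_params": {"hnsw_ef": 100},
    },
}

# This dict would then be passed to txtai, e.g.:
# from txtai.embeddings import Embeddings
# embeddings = Embeddings(config)
print(sorted(config["qdrant"]))
```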

Local in-memory/disk-persisted mode

Starting from version 1.1.1, the Qdrant Python client supports a local in-memory/disk-persisted mode. That's a good choice for test scenarios and quick experiments in which you do not plan to store lots of vectors. In such cases, spinning up a Docker container may not even be required.

In-memory storage

If you want transient storage, for example in automated tests launched during your CI/CD pipeline, using Qdrant local mode with in-memory storage might be the preferred option.

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant
  metric: l2 # allowed values: l1 / l2 / cosine / ip
  qdrant:
    location: ':memory:'
    prefer_grpc: true

On disk storage

However, if you prefer to keep the vectors between runs of your application, it might be better to use on-disk storage and pass the path that should be used to persist the data.

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant
  metric: l2 # allowed values: l1 / l2 / cosine / ip
  qdrant:
    path: '/home/qdrant/storage_local'
    prefer_grpc: true
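The two local modes differ only in where the data lives, so it can be convenient to switch between them per environment. A hedged stdlib sketch (QDRANT_PERSIST_PATH is a variable name invented for this example, not part of qdrant-txtai):

```python
import os

# Choose the Qdrant local-mode location from the environment:
# persist to a path when QDRANT_PERSIST_PATH is set (hypothetical
# variable for this sketch), otherwise stay fully in memory.
def qdrant_location() -> str:
    return os.environ.get("QDRANT_PERSIST_PATH", ":memory:")

config = {
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "qdrant_txtai.ann.qdrant.Qdrant",
    "qdrant": {"location": qdrant_location()},
}
print(config["qdrant"]["location"])
```

This keeps CI runs transient while allowing a developer machine to opt into persistence without touching the code.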
