qdrant-txtai

txtai simplifies building AI-powered semantic search applications using Transformers. It uses neural embeddings to encode high-dimensional data in a lower-dimensional space and lets you find similar objects based on the proximity of their embeddings.
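To make "proximity" concrete, here is a minimal sketch in plain Python that ranks a few toy vectors by cosine similarity. The vectors and the `rank_by_similarity` helper are invented for illustration. Real embeddings come from a Transformer model and have hundreds of dimensions, and this is not a description of how txtai or Qdrant compute similarity internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, vectors):
    """Return (id, score) pairs sorted from most to least similar (hypothetical helper)."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional "embeddings", invented for this example.
toy_vectors = {
    "cat": [1.0, 0.0, 0.0],
    "kitten": [0.9, 0.1, 0.0],
    "car": [0.0, 0.0, 1.0],
}
ranking = rank_by_similarity([1.0, 0.0, 0.0], toy_vectors)
print([key for key, _ in ranking])
```

A vector database performs the same kind of ranking, but over many vectors and with an index, so that it does not have to compare the query against every stored item.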

Implementing such an application in real-world use cases requires storing the embeddings efficiently, though, namely in a vector database like Qdrant. Qdrant offers not only a powerful engine for neural search, but also lets you set up a whole cluster if your data no longer fits on a single machine. It is production-grade and can be launched easily with Docker.

Combining the ease of use of txtai with Qdrant's performance enables you to build production-ready semantic search applications much faster than before.

Installation

The library can be installed with pip as follows:

pip install qdrant-txtai
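One way to confirm that the installation succeeded is to ask Python for the installed version. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, not part of qdrant-txtai.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("qdrant-txtai"))
```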

Usage

Running a txtai application with Qdrant as the vector storage requires a running Qdrant instance. One can be launched easily with Docker:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant:v0.10.2
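Before starting the txtai application, it can help to check that the published ports accept connections. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, and it tests TCP reachability only, not whether Qdrant itself is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

Assuming the container was started with the port mappings shown above, `is_port_open("localhost", 6333)` and `is_port_open("localhost", 6334)` should both return `True` once Qdrant is accepting connections.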

A txtai application can be run either programmatically or by providing configuration in a YAML file.

Programmatically

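from txtai.embeddings import Embeddings

# Use a sentence-transformers model for vectors and Qdrant as the ANN backend
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "backend": "qdrant_txtai.ann.qdrant.Qdrant",
})
# Index (id, text, tags) tuples
embeddings.index([(0, "Correct", None), (1, "Not what we hoped", None)])
# Retrieve the single closest match for the query
result = embeddings.search("positive", 1)
print(result)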
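The search call returns document ids rather than the texts themselves. The sketch below shows one way to map ids back to the indexed texts, assuming the result is a list of `(id, score)` tuples, which is what txtai returns when content storage is not enabled. `attach_texts` is a hypothetical helper, and the sample result, including its score, is invented so that the snippet runs without a Qdrant instance.

```python
def attach_texts(results, documents):
    """Map (id, score) search results back to the indexed texts."""
    texts = {doc_id: text for doc_id, text, _ in documents}
    return [(texts[doc_id], score) for doc_id, score in results]


documents = [(0, "Correct", None), (1, "Not what we hoped", None)]
# Stand-in for the output of embeddings.search("positive", 1); the score is invented.
sample_result = [(0, 0.5)]
print(attach_texts(sample_result, documents))  # prints [('Correct', 0.5)]
```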

Via YAML configuration

# app.yml
embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant

CONFIG=app.yml uvicorn "txtai.api:app"
curl -X GET "http://localhost:8000/search?query=positive"
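The same request can be issued from Python. The sketch below mirrors the curl call using only the standard library. The helper names are hypothetical, the `limit` query parameter is assumed to be accepted by the txtai search endpoint, and the response is assumed to be JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, limit=None):
    """Build the /search URL with a properly encoded query string."""
    params = {"query": query}
    if limit is not None:
        params["limit"] = limit  # assumed optional parameter
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


def search(base_url, query, limit=None):
    """Call the search endpoint and decode the JSON response."""
    with urlopen(build_search_url(base_url, query, limit)) as response:
        return json.load(response)


print(build_search_url("http://localhost:8000", "positive"))
```

With the uvicorn server from the command above running, `search("http://localhost:8000", "positive")` would perform the actual request.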

Configuration properties

qdrant-txtai allows you to configure both the connection details and some internal properties of the vector collection, which may impact both speed and accuracy. Please refer to the Qdrant docs if you are interested in the meaning of each property.

The example below presents all the available options:

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  backend: qdrant_txtai.ann.qdrant.Qdrant
  metric: l2 # allowed values: l2 / cosine / ip
  qdrant:
    host: qdrant.host
    port: 6333
    grpc_port: 6334
    prefer_grpc: true
    collection: CustomCollectionName
    https: true # for Qdrant Cloud
    api_key: XYZ # for Qdrant Cloud
    hnsw:
      m: 8
      ef_construct: 256
      full_scan_threshold:
      ef_search: 512
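For the programmatic route, the same options can be expressed as a Python dictionary. The sketch below assumes that the nested `qdrant` and `hnsw` keys are read from a dict exactly as they are from YAML. `build_config` is a hypothetical helper that also rejects metric values outside the three listed above. `full_scan_threshold` is omitted because the YAML example leaves its value blank.

```python
ALLOWED_METRICS = {"l2", "cosine", "ip"}


def build_config(path, metric="l2", **qdrant_options):
    """Build an Embeddings configuration dict mirroring the YAML layout (hypothetical helper)."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}, got {metric!r}")
    return {
        "path": path,
        "backend": "qdrant_txtai.ann.qdrant.Qdrant",
        "metric": metric,
        "qdrant": dict(qdrant_options),
    }


config = build_config(
    "sentence-transformers/all-MiniLM-L6-v2",
    metric="cosine",
    host="localhost",
    port=6333,
    collection="CustomCollectionName",
    hnsw={"m": 8, "ef_construct": 256, "ef_search": 512},
)
print(config["qdrant"]["hnsw"])
```

The resulting dictionary could then be passed to `Embeddings(config)` in the same way as in the programmatic example.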

Release history

2.0.0 (May 18, 2024)
1.0.1 (Apr 12, 2023)
1.0.0 (Feb 10, 2023)
0.11.7 (Jan 18, 2023)
0.11.3 (Jan 18, 2023)
0.11.1 (Jan 18, 2023), this version
0.11.0 (Jan 18, 2023)
0.10.3 (Oct 10, 2022)

Download files

Source Distribution

qdrant_txtai-0.11.1.tar.gz (8.0 kB)

Uploaded Jan 18, 2023 Source

Built Distribution

qdrant_txtai-0.11.1-py3-none-any.whl (8.3 kB)

Uploaded Jan 18, 2023 Python 3

Hashes for qdrant_txtai-0.11.1.tar.gz
Algorithm	Hash digest
SHA256	`49346247be425feca5e9d47ca200e814e61d811870ab74839b7352cc8eb9b6db`
MD5	`ff7a7694e3fe459d504afac9fc6384b1`
BLAKE2b-256	`0723f0e90e05247f680456e432d03bc00ce0734b16d407e1e87337668e53bff0`

Hashes for qdrant_txtai-0.11.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3a1d350ae59b868662fabb79c6af1e9d36416e34ce8227f5e08e07fc60bb05fd`
MD5	`2edc468c00876631d4b0d3bd742cbdca`
BLAKE2b-256	`b81ad40cce9170f6254a049343832c02293746aeecf5c2b4be74f30cee7572fb`