A package that makes searching easier in Python, using embeddings and semantic similarity

QUESE

"Quese" lets you easily implement a search algorithm, based on embeddings and semantic similarity, in your Python apps. The module provides a function called search_by_embeddings(), with several parameters to customize the search process.

INSTALLATION

You can install "quese" with pip:

pip install quese

EXAMPLE WITH BY

from quese import search_by_embeddings

data_ = [
    {
        "title": "UX Designer",
        "tags": "Designer"
    },
    {
        "title": "Senior Accountant",
        "tags": "Accountant"
    },
    {
        "title": "Product Manager",
        "tags": "Management"
    }
]

results = search_by_embeddings(data=data_, query="Manager", by="title")
# results is a list of the dictionaries whose title is semantically similar to the query "Manager".
# In this case, that is the last dictionary: "Product Manager".
print(results)

PARAMS

data:

The first param. It is REQUIRED and must be a list of dictionaries.
Type: list of dictionaries

query:

The second param. It is also REQUIRED, and it is the query you want to search for.
Type: string

by:

The third param. It is REQUIRED only if you don't pass the "template" param, and it is the key of your dictionaries whose value you want to search in.
For example, if you are searching a list of products, your "by" param could be the "name" prop of each product.
Type: string

template:

It is REQUIRED only if you don't pass the "by" param. It is similar to "by", but lets you search by a customized string built for each dictionary in your data list.
For example, if you are searching a list of products, your "template" param could be a string like this: "{name}, seller: {seller}". Notice that you have to write your props between "{}", as shown in the example with the props "name" and "seller".
Type: string
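
The relationship between "by" and "template" can be sketched in plain Python. This is only an illustration of the behavior described above, not the package's actual code: the helper name build_search_text is hypothetical, it assumes the template is expanded with str.format() semantics, and giving "template" priority when both are passed is also an assumption.

```python
def build_search_text(item, by=None, template=None):
    """Return the string that would be compared against the query.

    Hypothetical helper: it illustrates the documented rule that at
    least one of "by" or "template" has to be given.
    """
    if template is not None:
        # Fill each "{prop}" placeholder with the matching dictionary value.
        return template.format(**item)
    if by is not None:
        # Use a single prop of the dictionary as the searchable text.
        return str(item[by])
    raise ValueError('pass either "by" or "template"')


product = {"name": "Desk lamp", "seller": "Acme"}
print(build_search_text(product, by="name"))
print(build_search_text(product, template="{name}, seller: {seller}"))
```

With this sketch, the first call yields "Desk lamp" and the second yields "Desk lamp, seller: Acme", so a template lets several props contribute to the text that is matched.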

accuracy:

Optional. It is the minimum similarity that a dictionary must have with the query to be considered a result.
The default value is 0.4, which works well with almost all models. If you want to change it, we don't recommend very high or very low values; the range 0.3-0.6 should be enough.
Type: float between 0 and 1
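
To make the role of "accuracy" concrete, here is a minimal sketch of threshold filtering. It assumes the similarity score is the cosine similarity between embedding vectors, which is common with sentence-transformers models but is not confirmed by this description. The tiny 2-dimensional vectors, the names cosine_similarity and filter_by_accuracy, and the use of ">=" for the comparison are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_by_accuracy(items, vectors, query_vector, accuracy=0.4):
    """Keep the items whose vector scores at least `accuracy` against the query."""
    return [
        item
        for item, vector in zip(items, vectors)
        if cosine_similarity(vector, query_vector) >= accuracy
    ]


# Toy "embeddings" (assumed values, not produced by a real model).
items = ["Product Manager", "UX Designer"]
vectors = [[0.9, 0.1], [0.1, 0.9]]
query_vector = [1.0, 0.0]

print(filter_by_accuracy(items, vectors, query_vector))
```

In this toy setup only "Product Manager" passes the default threshold. Lowering the threshold admits weaker matches and raising it drops all but the closest ones, which is why moderate values are suggested above.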

model:

Optional. It is the embedding model you want to use.
The default model is 'sentence-transformers/all-MiniLM-L6-v2'. You can use another model, such as 'sentence-transformers/all-mpnet-base-v2', but be careful: if the model doesn't work with sentence-transformers, this package will not work with it.
Type: string

EXAMPLE WITH TEMPLATE

from quese import search_by_embeddings

data_ = [
    {
        "title": "UX Designer",
        "tags": "Designer"
    },
    {
        "title": "Senior Accountant",
        "tags": "Accountant"
    },
    {
        "title": "Product Manager",
        "tags": "Management"
    }
]

results = search_by_embeddings(data=data_, query="Manager", template="{title}, {tags}")
# results is a list of the dictionaries whose title and tags are semantically similar to the query "Manager".
# In this case, that is the last dictionary: "Product Manager".
print(results)
