Package that makes searching in Python easier through embeddings and semantic similarity

These details have not been verified by PyPI

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

QUESE

"Quese" lets you easily implement a search algorithm, based on embeddings and semantic similarity, in your Python apps. The module provides a function called search_by_embeddings(), with several parameters to customize the search process.

INSTALLATION

You can install "quese" with pip:

pip install quese

EXAMPLE WITH BY

from quese import search_by_embeddings

data_ = [
    {
        "title": "UX Designer",
        "tags": "Designer"
    },
    {
        "title": "Senior Accountant",
        "tags": "Accountant"
    },
    {
        "title": "Product Manager",
        "tags": "Management"
    }
]

results = search_by_embeddings(data=data_, query="Manager", by="title")
# results is a list of the dictionaries whose title is semantically similar to the query "Manager"; in this case, the last dictionary ("Product Manager").
print(results)

PARAMS

data:

The first parameter. It is REQUIRED and must be a list of dictionaries.

query:

The second parameter. It is also REQUIRED and represents the query you want to search for.
Type: string

by:

The third parameter. It is REQUIRED only if you don't pass the "template" parameter, and it names the key of your dictionaries whose value is compared with the query.
For example, if you want to search a list of products, your "by" parameter could be the "name" key of each product.
Type: string
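
Since "by" names a key that is looked up in every dictionary, it can help to check the data before searching. The helper below is a hypothetical pre-check and not part of quese; how quese itself reacts to a missing key is not documented here.

```python
# Hypothetical helper (not provided by quese): find dictionaries that lack the key
# you plan to pass as "by".
def indices_missing_key(data, key):
    # Return the positions of the dictionaries that do not contain `key`.
    return [i for i, item in enumerate(data) if key not in item]


products = [
    {"name": "Desk lamp", "price": 20},
    {"price": 35},  # no "name" key
]

print(indices_missing_key(products, "name"))
```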

template:

REQUIRED only if you don't pass the "by" parameter. It is similar to "by", but lets you search by a customized string built for each dictionary in your data list.
For example, if you want to search a list of products, your "template" parameter could be a string like this: "{name}, seller: {seller}". Notice that you have to wrap each key in "{}", as the example does with "name" and "seller".
Type: string
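
As a rough illustration of what a template means, the sketch below expands the example template against each dictionary with Python's built-in str.format. This is an assumption about the behavior, not quese's actual implementation, and render_template is a hypothetical helper name.

```python
# Hypothetical helper: NOT part of quese. It only illustrates how a template
# such as "{name}, seller: {seller}" could map each dictionary to one string.
def render_template(data, template):
    # str.format(**item) replaces each "{key}" with item["key"];
    # it raises KeyError if a dictionary lacks one of the keys.
    return [template.format(**item) for item in data]


products = [
    {"name": "Desk lamp", "seller": "Acme"},
    {"name": "Office chair", "seller": "Globex"},
]

texts = render_template(products, "{name}, seller: {seller}")
print(texts)
```

Each resulting string is what would then be compared with the query, under the assumption above.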

accuracy:

Optional. It represents the minimum similarity a dictionary must have with the query to be considered a result.
The default value is 0.4, which works well with almost all models. If you want to change it, we don't recommend very high or very low values; the range 0.3-0.6 should be enough.
Type: float between 0 and 1
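
To make the threshold concrete, here is a minimal sketch that assumes "accuracy" is compared against a cosine similarity score between the query embedding and each item embedding. This document does not state which similarity measure quese uses, so cosine similarity, the toy 3-dimensional vectors, and the helper names are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_by_accuracy(items, item_vectors, query_vector, accuracy=0.4):
    # Hypothetical helper (not quese's code): keep only the items whose
    # similarity to the query reaches the threshold.
    return [
        item
        for item, vec in zip(items, item_vectors)
        if cosine_similarity(vec, query_vector) >= accuracy
    ]


# Toy vectors standing in for real embeddings (made up for this sketch).
items = ["UX Designer", "Product Manager"]
item_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
query_vector = [0.0, 1.0, 0.0]

kept = filter_by_accuracy(items, item_vectors, query_vector, accuracy=0.4)
print(kept)
```

Raising the threshold toward 1 returns fewer, closer matches; lowering it returns more, looser ones.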

model:

Optional. It represents the embedding model you want to use.
The default model is 'sentence-transformers/all-MiniLM-L6-v2'. You can use another model, such as 'sentence-transformers/all-mpnet-base-v2', but take care: if the model doesn't work with sentence-transformers, this package will not work with it.
Type: string

EXAMPLE WITH TEMPLATE

from quese import search_by_embeddings

data_ = [
    {
        "title": "UX Designer",
        "tags": "Designer"
    },
    {
        "title": "Senior Accountant",
        "tags": "Accountant"
    },
    {
        "title": "Product Manager",
        "tags": "Management"
    }
]

results = search_by_embeddings(data=data_, query="Manager", template="{title}, {tags}")
# results is a list of the dictionaries whose title and tags are semantically similar to the query "Manager"; in this case, the last dictionary ("Product Manager").
print(results)

Project details


Release history

0.1.2 (this version): Aug 31, 2023
0.1.1: Aug 31, 2023
0.1: Aug 31, 2023

Download files


Source Distribution

quese-0.1.2.tar.gz (3.9 kB)

Uploaded Aug 31, 2023 Source

Hashes for quese-0.1.2.tar.gz

Algorithm	Hash digest
SHA256	`e6e5fa32872dbd5209a65b35590262b35a9f042fd6b25af52e7be73c699de8ad`
MD5	`5b47f7d92227282167f73d2d836c6f18`
BLAKE2b-256	`3b6452c337925df94f1dfb71f80007c3a03a8b170e26dfa2515f6cdd6f3c7dc9`