Yet Another Sentence Embedding Library
The goal of this library is to make it easy to transform a list of sentences, or a collection of sentence sets, into a matrix of embeddings (e.g. one row per sentence). Embeddings can be computed at the sentence/document level, or sentence embeddings can be combined into one embedding per group.
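To make the two levels concrete, here is a minimal sketch in plain Python. It is not YASE's implementation: it assumes that a grouped embedding is the element-wise mean of its members' sentence embeddings, and the helper name `group_embeddings` and the toy vectors are hypothetical.

```python
from collections import defaultdict


def group_embeddings(sentence_embeddings, group_ids):
    """Average the sentence embeddings that share a group id.

    Assumption: a grouped embedding is the element-wise mean of its members.
    Returns a dict mapping group id -> embedding (list of floats).
    """
    if len(sentence_embeddings) != len(group_ids):
        raise ValueError("need exactly one group id per sentence embedding")
    buckets = defaultdict(list)
    for vector, group in zip(sentence_embeddings, group_ids):
        buckets[group].append(vector)
    return {
        group: [sum(column) / len(vectors) for column in zip(*vectors)]
        for group, vectors in buckets.items()
    }


# Toy data: four 3-dimensional sentence embeddings belonging to two groups.
sentence_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
    [0.0, 0.0, 4.0],
]
group_ids = ["a", "a", "b", "b"]

grouped = group_embeddings(sentence_embeddings, group_ids)
print(grouped)
```

The sentence-level matrix has one row per sentence (four rows here), while the grouped result has one vector per group (two here).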
Such an embedding matrix can easily be queried with kd-trees (see the notebook in examples) to find the document in the training data that is most similar to a queried sentence. It can also be used to cluster document groups, such as ad campaigns, based solely on their text.
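As an illustration of the kd-tree lookup, the following is a small self-contained sketch using only the standard library. It is not taken from the notebook: the functions `build_kdtree` and `nearest`, the 2-dimensional vectors, and the document labels are all hypothetical, and Euclidean distance is an assumed choice of metric.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a kd-tree from (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, (vector, label)) for the stored point closest to query."""
    if node is None:
        return best
    vector, _ = node["point"]
    dist = math.dist(vector, query)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


# Toy "training" embeddings, each paired with a document label.
documents = [
    ([0.9, 0.1], "doc about cats"),
    ([0.1, 0.9], "doc about cars"),
    ([0.8, 0.3], "doc about kittens"),
    ([0.2, 0.7], "doc about trucks"),
]
tree = build_kdtree(documents)

query = [0.88, 0.12]  # embedding of the queried sentence
distance, (vector, label) = nearest(tree, query)
print(label, round(distance, 3))
```

In practice a library implementation such as `scipy.spatial.KDTree` would likely replace this hand-written tree. Note also that kd-trees tend to lose their speed advantage as the number of dimensions grows.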
The quality of the results can be tested on a handcrafted evaluation dataset by checking how well the sentence embeddings cluster around the natural clusters formed by the existing ad campaigns.
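One simple way to put a number on "how well the embeddings cluster" is nearest-neighbour label agreement: for each sentence, check whether its most similar other sentence belongs to the same campaign. The sketch below is an assumed metric for illustration, not necessarily the one YASE uses, and the function names, vectors, and campaign labels are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def neighbor_agreement(embeddings, labels):
    """Fraction of items whose most similar other item carries the same label."""
    hits = 0
    for i, vector in enumerate(embeddings):
        best_j = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine_similarity(vector, embeddings[j]),
        )
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)


# Toy evaluation set: sentence embeddings with their campaign labels.
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [0.8, 0.6]]
labels = ["shoes", "shoes", "travel", "travel", "travel"]

score = neighbor_agreement(embeddings, labels)
print(score)
```

A score near 1.0 suggests that sentences from the same campaign sit close together. In this toy set the last sentence lies closer to the "shoes" vectors than to its own campaign, so it counts as a miss.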
(Gensim) Weighted Sentence Embeddings with a Gensim model
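import gensim.downloader as model_api
import yase
# Assumption: the original snippet calls `embedding` without importing it;
# it is taken here to be a submodule of yase.
from yase import embedding

# Placeholder input: a list of raw sentences
data = ["The cat sat on the mat.", "Dogs chase cats."]
# Load pretrained gensim model
model = model_api.load("glove-wiki-gigaword-300")
# Tokenize list of sentences
tokens = yase.tokenize(data, lower=True, split=True)
# Get word weights for higher quality embeddings
weights = yase.getWordWeights(data, "tf-idf")
# Create sentence embeddings from tokens
my_embeddings = embedding.sentenceEmbedding(tokens, model, weights)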
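The snippet above does not show how the word weights enter the embedding. A common approach, sketched here under stated assumptions, is a weighted average of word vectors with weights derived from inverse document frequency. This is not YASE's code: `idf_weights`, `weighted_sentence_embedding`, the smoothed IDF formula, and the toy 2-dimensional vectors are all hypothetical stand-ins.

```python
import math
from collections import Counter


def idf_weights(tokenized_sentences):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(tokenized_sentences)
    df = Counter(word for sentence in tokenized_sentences for word in set(sentence))
    return {word: math.log((1 + n) / (1 + count)) + 1 for word, count in df.items()}


def weighted_sentence_embedding(tokens, vectors, weights):
    """Weighted average of the word vectors of the known tokens in one sentence."""
    known = [t for t in tokens if t in vectors]
    if not known:
        return None  # no token has a vector
    total = sum(weights.get(t, 1.0) for t in known)
    dim = len(vectors[known[0]])
    return [
        sum(weights.get(t, 1.0) * vectors[t][i] for t in known) / total
        for i in range(dim)
    ]


# Toy corpus and toy 2-dimensional word vectors (stand-ins for a real model).
sentences = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
vectors = {
    "the": [1.0, 1.0],
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "sat": [0.5, 0.5],
    "ran": [0.2, 0.8],
}

weights = idf_weights(sentences)
matrix = [weighted_sentence_embedding(s, vectors, weights) for s in sentences]
for row in matrix:
    print([round(x, 3) for x in row])
```

Because a repeated token is summed once per occurrence, term frequency is accounted for implicitly, and rare words such as "dog" pull the sentence vector harder than a word like "the" that appears in every sentence.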
Running unit tests
python -m unittest discover tests
File details
Details for the file YASE-1.0.1.tar.gz.
File metadata
- Download URL: YASE-1.0.1.tar.gz
- Upload date:
- Size: 23.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 884e99689a161ed5a35b13a4b5935d29f83a10838d34bfd3a1197b9aa78f19a4
MD5 | e3313a823231f16c466d264ae2fa0a21
BLAKE2b-256 | 0bed870c43059abeac4128fe58e54e5a69d2255ebdef3c03f2b05eaba7da2c67
File details
Details for the file YASE-1.0.1-py3-none-any.whl.
File metadata
- Download URL: YASE-1.0.1-py3-none-any.whl
- Upload date:
- Size: 21.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.1 CPython/3.11.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | bb804e2dd00c2f544a9dfff452c53e7b1e32afbb31715a631ae6bbb68c45b53e
MD5 | cfb5df8d7539f75ab8469a15f72718a1
BLAKE2b-256 | 9b5ad0e88feb1c72d3064777fa52af2f2cb55ad76622c726d331906f6a15dd9f