SISTER (SImple SenTence EmbeddeR)
Installation
pip install sister
Basic Usage
import sister
sentence_embedding = sister.MeanEmbedding(lang="en")
sentence = "I am a dog."
vector = sentence_embedding(sentence)
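As a rough illustration of what a mean embedding computes, the sketch below averages per-word vectors into one sentence vector. It is not sister's implementation: the toy vectors, the naive tokenization, the choice to skip unknown words, and the helper name `mean_embedding` are all assumptions made for this example, and real fastText vectors have far more dimensions.

```python
from typing import Dict, List

# Toy 3-dimensional word vectors, invented for illustration only.
# Real fastText vectors are much larger (typically 300 dimensions).
TOY_VECTORS: Dict[str, List[float]] = {
    "i": [1.0, 0.0, 0.0],
    "am": [0.0, 1.0, 0.0],
    "a": [0.0, 0.0, 1.0],
    "dog": [1.0, 1.0, 1.0],
}


def mean_embedding(
    sentence: str, vectors: Dict[str, List[float]], dim: int = 3
) -> List[float]:
    """Average the vectors of the known tokens in a sentence (hypothetical helper)."""
    # Naive tokenization: lowercase, drop periods, split on whitespace.
    tokens = sentence.lower().replace(".", "").split()
    # Assumption for this sketch: words without a vector are skipped.
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension together.
    return [sum(column) / len(known) for column in zip(*known)]


vector = mean_embedding("I am a dog.", TOY_VECTORS)
print(vector)  # [0.5, 0.5, 0.5]
```

The result has the same dimensionality as the word vectors, regardless of sentence length.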
Supported languages.
- English
- Japanese
- French
To support a new language, implement a Tokenizer (inheriting from sister.tokenizers.Tokenizer) and add the URL of the fastText pre-trained model to word_embedders.get_fasttext() (see the list of model URLs).
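A tokenizer for a new language might look roughly like the sketch below. The `Tokenizer` base class here is a local stand-in so the example runs on its own; in practice you would inherit from sister.tokenizers.Tokenizer. The `tokenize` method name and signature are assumptions, and `GermanTokenizer` is a hypothetical name, so check the sister source for the interface it actually expects.

```python
import re
from typing import List


class Tokenizer:
    """Local stand-in for sister.tokenizers.Tokenizer (interface assumed)."""

    def tokenize(self, sentence: str) -> List[str]:
        raise NotImplementedError


class GermanTokenizer(Tokenizer):
    """Hypothetical tokenizer that splits text into words and punctuation marks."""

    # \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
    _pattern = re.compile(r"\w+|[^\w\s]")

    def tokenize(self, sentence: str) -> List[str]:
        return self._pattern.findall(sentence)


tokens = GermanTokenizer().tokenize("Ich bin ein Hund.")
print(tokens)  # ['Ich', 'bin', 'ein', 'Hund', '.']
```

Languages without whitespace between words, such as Japanese, need a morphological analyzer instead of a regular expression.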
BERT models are supported for en, fr, and ja (as of 2020-06-29): ALBERT for English, CamemBERT for French, and BERT for Japanese.
To use them, install sister with the bert extra: pip install 'sister[bert]'
import sister
bert_embedding = sister.BertEmbedding(lang="en")
sentence = "I am a dog."
vector = bert_embedding(sentence)
You can also pass a list of sentences in a single call, which is more efficient than embedding them one at a time.
import sister
bert_embedding = sister.BertEmbedding(lang="en")
sentences = ["I am a dog.", "I want to be a cat."]
vectors = bert_embedding(sentences)
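A common next step is to compare two sentence vectors with cosine similarity. The sketch below is plain Python and does not call sister; the short `vectors` list is made-up stand-in data, and it assumes the embedder returns one equal-length numeric vector per sentence. `cosine_similarity` is a helper written for this example, not part of sister.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # A zero vector has no direction; return 0.0 instead of dividing by zero.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Stand-in data: replace with the vectors returned by the embedder.
vectors = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
similarity = cosine_similarity(vectors[0], vectors[1])
print(round(similarity, 3))  # 0.5
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions.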