Sentiment analysis library for the Russian language
Dostoevsky
Install
Please note that Dostoevsky supports only Python 3.6+.
$ pip install dostoevsky
Social network model [FastText]
This model was trained on the RuSentiment dataset and achieves an F1 score of up to ~0.71.
Hyperparameters used for training:
epoch = 10
lr = 0.21909
dim = 64
minCount = 1
wordNgrams = 3
minn = 2
maxn = 5
bucket = 259929
dsub = 2
loss = one-vs-all
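Most of the names above match fastText's command-line flags, so one way to attempt a comparable training run is to pass them to `fasttext supervised`. The sketch below only assembles the argument list. It assumes the stock fastText CLI is what you would run, and the file names `train.txt` and `model` are hypothetical placeholders, not files shipped with Dostoevsky. fastText documents `dsub` among its quantization options, so the sketch keeps it apart from the training flags.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "epoch": 10,
    "lr": 0.21909,
    "dim": 64,
    "minCount": 1,
    "wordNgrams": 3,
    "minn": 2,
    "maxn": 5,
    "bucket": 259929,
    "loss": "one-vs-all",
}

# fastText lists dsub as a quantization option, so it is kept separate here.
QUANTIZATION = {"dsub": 2}


def build_training_command(input_path, output_prefix, params):
    """Return an argv list for `fasttext supervised` (paths are caller-supplied)."""
    command = ["fasttext", "supervised", "-input", input_path, "-output", output_prefix]
    for name, value in params.items():
        command += [f"-{name}", str(value)]
    return command


# Hypothetical file names, used only for illustration.
command = build_training_command("train.txt", "model", HYPERPARAMETERS)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run` once fastText is installed and a labelled training file exists.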
Usage
First, download the binary model:
$ dostoevsky download fasttext-social-network-model
Then you can use the sentiment analyzer:
from dostoevsky.tokenization import RegexTokenizer
from dostoevsky.models import FastTextSocialNetworkModel
tokenizer = RegexTokenizer()
tokens = tokenizer.split('всё очень плохо') # [('всё', None), ('очень', None), ('плохо', None)]
model = FastTextSocialNetworkModel(tokenizer=tokenizer)
messages = [
    'привет',
    'я люблю тебя!!',
    'малолетние дебилы'
]
results = model.predict(messages, k=2)
for message, sentiment in zip(messages, results):
    # привет -> {'speech': 1.0000100135803223, 'skip': 0.0020607432816177607}
    # я люблю тебя!! -> {'positive': 0.9886782765388489, 'skip': 0.005394937004894018}
    # малолетние дебилы -> {'negative': 0.9525841474533081, 'neutral': 0.13661839067935944}
    print(message, '->', sentiment)
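Each result in the sample output is a dictionary that maps a label to a score, and the scores for one message do not sum to 1, so each is best read on its own. A minimal sketch of reducing such a dictionary to a single label follows. The helper `top_label` and its `threshold` default of 0.5 are hypothetical and not part of Dostoevsky; the scores are copied from the sample output above.

```python
def top_label(sentiment, threshold=0.5):
    """Return the highest-scoring label, or None if its score is below threshold."""
    if not sentiment:
        return None
    label, score = max(sentiment.items(), key=lambda item: item[1])
    return label if score >= threshold else None


# Scores copied from the sample output above.
sample_results = {
    'привет': {'speech': 1.0000100135803223, 'skip': 0.0020607432816177607},
    'я люблю тебя!!': {'positive': 0.9886782765388489, 'skip': 0.005394937004894018},
    'малолетние дебилы': {'negative': 0.9525841474533081, 'neutral': 0.13661839067935944},
}

for message, sentiment in sample_results.items():
    print(message, '->', top_label(sentiment))
```

Returning `None` below the threshold lets the caller decide how to treat low-confidence messages instead of forcing a label onto them.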