Sentiment analysis pipeline for texts in multiple languages.
Project description
multi-language-sentiment
A pipeline for sentiment analysis for texts with unknown language.
We use the lingua-language-detector to detect the language of each text sample and then run the sample through a sentiment analysis pipeline appropriate for that language.
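The detect-then-dispatch idea can be sketched in plain Python. This is only an illustration and not the package's actual code: `route_by_language`, `toy_detect` and `toy_handlers` are hypothetical names, and the toy detector stands in for a real one such as lingua's `detect_language_of`.

```python
from collections import defaultdict


def route_by_language(texts, detect, handlers):
    """Group texts by detected language, run each group through its handler,
    and return the results in the original input order."""
    groups = defaultdict(list)
    for index, text in enumerate(texts):
        groups[detect(text)].append(index)

    results = [None] * len(texts)
    for language, indices in groups.items():
        handler = handlers.get(language)
        if handler is None:
            continue  # unsupported language: leave the result as None
        outputs = handler([texts[i] for i in indices])
        for i, output in zip(indices, outputs):
            results[i] = output
    return results


# Toy stand-ins (assumptions) so the sketch runs without downloading models.
def toy_detect(text):
    return "fi" if "ä" in text else "en"


toy_handlers = {
    "en": lambda batch: [{"label": "positive", "score": 1.0} for _ in batch],
    "fi": lambda batch: [{"label": "negative", "score": 1.0} for _ in batch],
}

routed = route_by_language(
    ["This is a positive sentence", "Tämä on ikävä juttu"],
    toy_detect,
    toy_handlers,
)
print(routed)
```

Grouping by language before calling a handler means each per-language model sees its texts as one batch, while the index bookkeeping keeps the output aligned with the input.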
Usage
Basic usage for analysing a list of sentences:
import multi_language_sentiment
texts = ["This is a positive sentence", "Tämä on ikävä juttu"]
sentiments = multi_language_sentiment.sentiment(texts)
print(sentiments)
This should print
[{'label': 'positive', 'score': 0.89024418592453}, {'label': 'negative', 'score': 0.8899219632148743}]
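The example output suggests that results come back as one dictionary per input text, in input order. Assuming that holds, a small sketch of pairing results with their texts and filtering by label could look like this. The helper name `pair_with_texts` is hypothetical, and the sample data is copied from the output above so the block runs without the package installed.

```python
texts = ["This is a positive sentence", "Tämä on ikävä juttu"]

# Sample results copied from the example output above.
sentiments = [
    {"label": "positive", "score": 0.89024418592453},
    {"label": "negative", "score": 0.8899219632148743},
]


def pair_with_texts(texts, sentiments):
    """Attach each text to its result, assuming both lists share the same order."""
    return [{"text": text, **result} for text, result in zip(texts, sentiments)]


rows = pair_with_texts(texts, sentiments)

# Keep only the texts that were labelled negative.
negative_texts = [row["text"] for row in rows if row["label"] == "negative"]
print(negative_texts)
```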
Supported languages
The module currently supports the following languages by default: English, Japanese, Arabic, German, Spanish, French, Chinese, Indonesian, Hindi, Italian, Malay, Portuguese, Swedish, and Finnish.
For other languages, you must supply a path to a HuggingFace sentiment analysis pipeline. To supply a pipeline for a new language, use the models parameter:
import multi_language_sentiment
from lingua import Language
texts = ["This is a positive sentence", "Tämä on ikävä juttu"]
models = {Language.FINNISH: "fergusq/finbert-finnsentiment"}
sentiments = multi_language_sentiment.sentiment(texts, models=models)
Technical details
Note that the pipeline splits each text sample into chunks of at most 512 characters. The sentiments of the chunks are aggregated by adding up the scores for each label and taking the label with the largest total.
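The chunk-and-aggregate step can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes plain fixed-width character slicing and per-label summation, and the names `split_into_chunks` and `aggregate` are hypothetical. Whether the package normalises the summed score is not stated, so the sketch returns the raw total.

```python
from collections import defaultdict

MAX_CHARS = 512  # maximum chunk length stated in the text above


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Cut text into consecutive slices of at most max_chars characters."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def aggregate(chunk_results):
    """Sum the scores per label and return the label with the largest total."""
    totals = defaultdict(float)
    for result in chunk_results:
        totals[result["label"]] += result["score"]
    best_label = max(totals, key=totals.get)
    return {"label": best_label, "score": totals[best_label]}


chunks = split_into_chunks("a" * 1100)
print([len(chunk) for chunk in chunks])

# Hypothetical per-chunk results: one strongly positive chunk, two mildly negative ones.
chunk_results = [
    {"label": "positive", "score": 0.75},
    {"label": "negative", "score": 0.5},
    {"label": "negative", "score": 0.5},
]
overall = aggregate(chunk_results)
print(overall)
```

Under this scheme a long text with many mildly negative chunks can outweigh a single strongly positive chunk, because totals rather than averages are compared.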
Hashes for multi_language_sentiment-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | e96c417811065fc5e05e783e1fd8117a15e269a1190085f1cf1005626da5d25c
MD5 | a35acd42183e40132ccf39316df7bc86
BLAKE2b-256 | 74e3d5abd32bf27be9ea3ec1ee0e8d64b0f9cccaa3ea48fca4a06d5151a251fb
Hashes for multi_language_sentiment-0.1.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a890212e50a15fa9d002cf262839aff08f7631277f1fdb44a0f185af3f08de18
MD5 | 9eeccab84dcc579343fe584ce045ac91
BLAKE2b-256 | 744f59f75bd5831d6d4e7e73f013744e662ed9898fc67f71f31249ca4bb5155c