A Transformer-based library for Sentiment Analysis in Spanish
PySentimiento: Sentiment Analysis in Spanish
A simple Transformer-based library for Sentiment Analysis in Spanish (some other languages coming soon!).
Install it with pip:
pip install pysentimiento
Then start using it:
from pysentimiento import SentimentAnalyzer
analyzer = SentimentAnalyzer()
analyzer.predict("Qué gran jugador es Messi")
# returns 'POS'
analyzer.predict("Esto es pésimo")
# returns 'NEG'
analyzer.predict("Qué es esto?")
# returns 'NEU'
analyzer.predict_probas("Dónde estamos?")
# returns {'NEG': 0.10235335677862167,
# 'NEU': 0.8503277897834778,
# 'POS': 0.04731876030564308}
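The dictionary returned by predict_probas can be post-processed in plain Python. As a minimal sketch (the helper name pick_label, the 0.5 threshold and the 'NEU' fallback are illustrative assumptions, not part of pysentimiento), the following picks the most likely label and falls back to a neutral label when no class is confident enough:

```python
def pick_label(probas, threshold=0.5, fallback="NEU"):
    """Return the most probable label, or `fallback` if its score is below `threshold`.

    `probas` is a dict shaped like the output of analyzer.predict_probas(...).
    The function name, threshold and fallback are hypothetical choices.
    """
    label, score = max(probas.items(), key=lambda item: item[1])
    return label if score >= threshold else fallback


# Probabilities copied from the example output above
probas = {
    "NEG": 0.10235335677862167,
    "NEU": 0.8503277897834778,
    "POS": 0.04731876030564308,
}
print(pick_label(probas))  # prints NEU
```

A threshold like this is only one possible policy; whether it helps depends on how the downstream application treats uncertain predictions.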
You can also use the pretrained models directly with the transformers library:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("finiteautomata/beto-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("finiteautomata/beto-sentiment-analysis")
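The snippet above only loads the tokenizer and the model. As a hedged sketch of how a prediction could be obtained from there: a sequence classification model outputs one logit per class, and a softmax turns those logits into probabilities. The helper names below are illustrative, and the label names are read from model.config.id2label rather than hard-coded, since the label order is not stated here.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_probas(logits, id2label):
    """Map each class index to its label name and probability."""
    return {id2label[i]: p for i, p in enumerate(softmax(logits))}


def predict_probas(text, model_name="finiteautomata/beto-sentiment-analysis"):
    """Hypothetical helper; needs torch and transformers, downloads the model on first call."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()
    return logits_to_probas(logits, model.config.id2label)
```

Loading the model inside the function keeps the sketch short; in real use the tokenizer and model would be loaded once and reused across calls.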
Trained models so far
Instructions for developers
- First, download the TASS 2020 data to data/tass2020 (you have to register to download the dataset). Labels must be placed under data/tass2020/test1.1/labels
- Run the script to train the models:
python bin/train.py "dccuchile/bert-base-spanish-wwm-cased" models/beto-sentiment-analysis/ --epochs 3
- Upload the models to Hugging Face's Model Hub
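Before launching training, it can help to confirm that the dataset is where the steps above expect it. The sketch below checks the two directories named in the instructions and builds the training command as an argument list. The function names are illustrative and not part of the repository; the paths and arguments are the ones given above.

```python
import sys
from pathlib import Path


def missing_tass_paths(root="."):
    """Return the expected dataset directories that do not exist under `root`."""
    expected = [
        Path(root) / "data" / "tass2020",
        Path(root) / "data" / "tass2020" / "test1.1" / "labels",
    ]
    return [p for p in expected if not p.is_dir()]


def train_command(base_model="dccuchile/bert-base-spanish-wwm-cased",
                  output_dir="models/beto-sentiment-analysis/",
                  epochs=3):
    """Build the argument list for the training script shown in the steps above."""
    return [sys.executable, "bin/train.py", base_model, output_dir,
            "--epochs", str(epochs)]


if __name__ == "__main__":
    missing = missing_tass_paths()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        # Pass this list to subprocess.run(..., check=True) to start training
        print(" ".join(train_command()))
```

Run from the repository root, this either lists the missing directories or prints the command to execute.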
TODO:
- Upload some other models
- Train in other languages
- Write a brief paper with a description
Suggestions and bugfixes
Please use the repository issue tracker to report bugs and make suggestions (new models, other datasets, other languages, etc.).