A simple tool for Vietnamese Sentiment Analysis

A Simple Tool For Sentiment Analysis

Sentivi is a simple tool for sentiment analysis that wraps scikit-learn and PyTorch Transformers models (for more specialized purposes, using the native libraries directly is recommended). It is designed to provide an easy and fast pipeline for training and evaluating several classification algorithms.

Documentation: https://sentivi.readthedocs.io/en/latest/index.html

Classifiers

  • Decision Tree
  • Gaussian Naive Bayes
  • Gaussian Process
  • Nearest Centroid
  • Support Vector Machine
  • Stochastic Gradient Descent
  • Character Convolutional Neural Network
  • Multi-Layer Perceptron
  • Long Short Term Memory
  • Text Convolutional Neural Network
  • Transformer
  • Ensemble
  • Lexicon-based

Text Encoders

  • One-hot
  • Bag of Words
  • Term Frequency - Inverse Document Frequency
  • Word2Vec
  • Transformer Tokenizer (for Transformer classifier only)
  • WordPiece
  • SentencePiece
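To make the simpler encoders concrete, here is a minimal pure-Python sketch of bag-of-words and binary (one-hot style) encoding. It is not Sentivi's implementation: the function names `build_vocabulary`, `bag_of_words`, and `one_hot` are hypothetical, tokenization is plain whitespace splitting, and reading "one-hot" as a binary presence vector is an assumption.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in order of first appearance."""
    vocabulary = {}
    for document in documents:
        for token in document.lower().split():
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


def bag_of_words(document, vocabulary):
    """Count how often each vocabulary token occurs in the document."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for token, count in counts.items():
        if token in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[token]] = count
    return vector


def one_hot(document, vocabulary):
    """Binary presence vector: 1 if the token occurs at least once, else 0."""
    return [1 if count > 0 else 0 for count in bag_of_words(document, vocabulary)]


# Hypothetical two-sentence corpus, for illustration only.
corpus = ["hàng tốt", "hàng không tốt không đẹp"]
vocabulary = build_vocabulary(corpus)
print(bag_of_words(corpus[1], vocabulary))  # [1, 1, 2, 1]
print(one_hot(corpus[1], vocabulary))       # [1, 1, 1, 1]
```

TF-IDF follows the same layout but reweights each count by how rare the token is across the corpus.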

Install

  • Install legacy version from PyPI:

    pip install sentivi
    
  • Install latest version from source:

    git clone https://github.com/vndee/sentivi
    cd sentivi
    pip install .
    

Example

from sentivi import Pipeline
from sentivi.data import DataLoader, TextEncoder
from sentivi.classifier import SVMClassifier
from sentivi.text_processor import TextProcessor

text_processor = TextProcessor(methods=['word_segmentation', 'remove_punctuation', 'lower'])

pipeline = Pipeline(DataLoader(text_processor=text_processor, n_grams=3),
                    TextEncoder(encode_type='one-hot'),
                    SVMClassifier(num_labels=3))

train_results = pipeline(train='./data/dev.vi', test='./data/dev_test.vi')
print(train_results)

pipeline.save('./weights/pipeline.sentivi')
_pipeline = Pipeline.load('./weights/pipeline.sentivi')

predict_results = _pipeline.predict(['hàng ok đầu tuýp có một số không vừa ốc siết. chỉ được một số đầu thôi .cần '
                                    'nhất đầu tuýp 14 mà không có. không đạt yêu cầu của mình sử dụng',
                                    'Son đẹpppp, mùi hương vali thơm nhưng hơi nồng, chất son mịn, màu lên chuẩn, '
                                    'đẹppppp'])
print(predict_results)
print(f'Decoded results: {_pipeline.decode_polarity(predict_results)}')

Take a look at more examples in example/.
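The example above passes `n_grams=3` to `DataLoader`. How Sentivi builds n-grams internally is not shown here, so the following is only a generic sketch of word-level n-gram extraction; the helper name `word_ngrams` is hypothetical and not part of the library.

```python
def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    # When there are fewer than n tokens, the range is empty and so is the result.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "chất son mịn màu".split()
print(word_ngrams(tokens, 3))  # ['chất son mịn', 'son mịn màu']
```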

Pipeline Serving

Sentivi uses FastAPI to serve pipelines. Run a web service as follows:

# serving.py
from sentivi import Pipeline, RESTServiceGateway

pipeline = Pipeline.load('./weights/pipeline.sentivi')
server = RESTServiceGateway(pipeline).get_server()

Then start the service from a shell:

# pip install uvicorn python-multipart
uvicorn serving:server --host 127.0.0.1 --port 8000

Access the Swagger UI at http://127.0.0.1:8000/docs or ReDoc at http://127.0.0.1:8000/redoc. For example, you can use curl to send POST requests:

curl --location --request POST 'http://127.0.0.1:8000/get_sentiment/' \
     --form 'text=Son đẹpppp, mùi hương vali thơm nhưng hơi nồng'

# response
{ "polarity": 2, "label": "#POS" }
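The same request can be sent from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand, mirroring what `curl --form` sends. The helper names `build_sentiment_request` and `get_sentiment` are hypothetical, and the code assumes the service is running locally as shown above and returns JSON in the shape of the sample response.

```python
import json
import urllib.request
import uuid


def build_sentiment_request(text, url="http://127.0.0.1:8000/get_sentiment/"):
    """Build a multipart/form-data POST request carrying a single 'text' field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="text"\r\n'
        "\r\n"
        f"{text}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def get_sentiment(text, url="http://127.0.0.1:8000/get_sentiment/"):
    """Send the request and parse the JSON response (requires a running server)."""
    request = build_sentiment_request(text, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the service is up:
# result = get_sentiment("Son đẹpppp, mùi hương vali thơm nhưng hơi nồng")
# print(result["polarity"], result["label"])
```

Keeping request construction separate from sending makes it possible to inspect the payload without a live server.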

Deploy using Docker

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7

COPY . /app

ENV PYTHONPATH=/app
ENV APP_MODULE=serving:server
ENV WORKERS_PER_CORE=0.75
ENV MAX_WORKERS=6
ENV HOST=0.0.0.0
ENV PORT=80

RUN pip install -r requirements.txt

Then build and run the image:

docker build -t sentivi .
docker run -d -p 8000:80 sentivi

Future Releases

  • Lexicon-based
  • CharCNN
  • Ensemble learning methods
  • Model serving (Back-end and Front-end)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sentivi-1.1.0.tar.gz (19.8 kB)

Uploaded Source

Built Distribution

sentivi-1.1.0-py3-none-any.whl (27.2 kB)

Uploaded Python 3

File details

Details for the file sentivi-1.1.0.tar.gz.

File metadata

  • Download URL: sentivi-1.1.0.tar.gz
  • Upload date:
  • Size: 19.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.2

File hashes

Hashes for sentivi-1.1.0.tar.gz
Algorithm Hash digest
SHA256 71b43371c57e8084fcb4f57c53dbfa02bda63be411c40d93371cdfd1e36eec40
MD5 2e37d65ab61f090a65444ba9f403fa24
BLAKE2b-256 03ca80fe70642399520921071e50bd9dbf36110fe31bc6e82d4f83dbf8fef5a5

File details

Details for the file sentivi-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: sentivi-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 27.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.2

File hashes

Hashes for sentivi-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 46fe10d7d06d523a585b8e764b365c5360b575457478784086ab3f717e035694
MD5 ccb52206f75095600529827ca201d708
BLAKE2b-256 b39b0235f071321d1ac7d3f4552639ef1dee8d3fd4c6559e7b2abf63fe0d1239
