
Traditional Chinese sentiment analysis tool based on BERT.

Project description

senti_c (sentiment analysis toolkit for traditional Chinese)

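Introduction

This toolkit is a sentiment analysis package for Traditional Chinese. It supports three types of analysis: sentence sentiment classification, aspect term extraction, and aspect sentiment classification. It also provides functions for fine-tuning the models on other data.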

Requirements

  • Python 3.8

Installation

1. pip

pip install senti_c 

2. From source

git clone https://github.com/hsinmin/senti_c
cd senti_c
python3 setup.py install
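
After either installation route, it can help to confirm that the interpreter actually sees the package. The sketch below uses only the standard library; the helper name `describe_install` is illustrative and not part of senti_c, and it assumes the import name matches the `senti_c` name used in the examples in this README.

```python
from importlib import metadata, util


def describe_install(package: str) -> str:
    """Return a short status line for an installed (or missing) package."""
    # find_spec locates the package without importing it, so no model is loaded.
    if util.find_spec(package) is None:
        return f"{package}: not importable"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        version = "unknown version"
    return f"{package}: importable ({version})"


print(describe_install("senti_c"))
```

If the line reports "not importable", the install most likely went into a different Python environment than the one running the check.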

Features

1. Sentence sentiment classification: prediction

from senti_c import SentenceSentimentClassification

sentence_classifier = SentenceSentimentClassification()

test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = sentence_classifier.predict(test_data, run_split=True, aggregate_strategy=False)  # adjust the parameters as needed
  • The result is shown below:

[Figure: screenshot of the sentence classification output]
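
In practice the input sentences often come from a file rather than a hard-coded list. The following is a minimal sketch of that, assuming one review per line in a UTF-8 text file. `load_sentences` and `classify_file` are hypothetical helpers, not part of senti_c; the only senti_c call used is `predict` with the keyword arguments from the example above. The structure of the returned value is not documented here, so the sketch passes it through unchanged.

```python
from pathlib import Path
from typing import Any, List


def load_sentences(path: str) -> List[str]:
    """Read one review per line, dropping blank lines and surrounding whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def classify_file(classifier: Any, path: str) -> Any:
    """Run classifier.predict on every non-empty line of a UTF-8 text file."""
    sentences = load_sentences(path)
    if not sentences:
        raise ValueError(f"no sentences found in {path}")
    # Same keyword arguments as the prediction example in this README.
    return classifier.predict(sentences, run_split=True, aggregate_strategy=False)


# Usage (assumes senti_c is installed; "reviews.txt" is a placeholder path):
# from senti_c import SentenceSentimentClassification
# result = classify_file(SentenceSentimentClassification(), "reviews.txt")
```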

2. Sentence sentiment classification: fine-tuning the model

from senti_c import SentenceSentimentModel

sentence_classifier = SentenceSentimentModel()
sentence_classifier.train(data_dir="./data/sentence_data", output_dir="test_fine_tuning_sent")  # adjust the parameters as needed

3. Aspect sentiment analysis: prediction

from senti_c import AspectSentimentAnalysis

aspect_classifier = AspectSentimentAnalysis()

test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = aspect_classifier.predict(test_data, output_result="all")  # adjust the parameters as needed
  • The results are shown below:

[Figures: four screenshots of the aspect sentiment analysis output]
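
For a large collection of reviews, it may be preferable to call `predict` on smaller chunks instead of one long list. This is a sketch under assumptions: `batched` and `predict_in_batches` are hypothetical helpers, the batch size of 32 is arbitrary, and only the `predict(..., output_result="all")` call from the example above is taken from senti_c. Because the shape of the returned value is not described here, the sketch keeps one result per batch rather than trying to merge them.

```python
from typing import Any, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def predict_in_batches(classifier: Any, texts: Sequence[str], size: int = 32) -> List[Any]:
    """Call classifier.predict once per batch and collect the per-batch results."""
    return [
        classifier.predict(batch, output_result="all")
        for batch in batched(texts, size)
    ]


# Usage (assumes senti_c is installed):
# from senti_c import AspectSentimentAnalysis
# results = predict_in_batches(AspectSentimentAnalysis(), test_data, size=16)
```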

4. Aspect sentiment analysis: fine-tuning the model

from senti_c import AspectSentimentModel

aspect_classifier = AspectSentimentModel()
aspect_classifier.train(data_dir="./data/aspect_data", output_dir="test_fine_tuning_aspect")  # adjust the parameters as needed
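
Both fine-tuning entry points take a `data_dir` and an `output_dir`, so a few path checks before starting a run can catch simple mistakes early. The wrapper below is a sketch, not part of senti_c: `fine_tune` is a hypothetical helper, and refusing to reuse a non-empty output directory is a conservative choice made here, not documented behaviour of `train`.

```python
from pathlib import Path
from typing import Any


def fine_tune(model: Any, data_dir: str, output_dir: str) -> Path:
    """Validate the paths, then call model.train(data_dir=..., output_dir=...)."""
    data_path = Path(data_dir)
    if not data_path.is_dir():
        raise FileNotFoundError(f"data directory not found: {data_dir}")
    if not any(data_path.iterdir()):
        raise ValueError(f"data directory is empty: {data_dir}")

    out_path = Path(output_dir)
    # Assumption: avoid mixing a new run with files left by an earlier one.
    if out_path.is_dir() and any(out_path.iterdir()):
        raise FileExistsError(f"output directory is not empty: {output_dir}")

    model.train(data_dir=data_dir, output_dir=output_dir)
    return out_path


# Usage (assumes senti_c is installed):
# from senti_c import AspectSentimentModel
# fine_tune(AspectSentimentModel(), "./data/aspect_data", "test_fine_tuning_aspect")
```

The same wrapper works with `SentenceSentimentModel`, since only the `train` call with these two keyword arguments is used.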

Examples

For a demo of these features, see the function_demo file in the examples folder.

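Data

This study collected Google reviews from the restaurant and hotel domains and annotated them for sentence sentiment classification and for aspect sentiment analysis (aspect labels and sentiment labels).

See the data folder for the data formats.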

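Citation

1. Thesis:
凃育婷 (2020). 基於順序遷移學習開發繁體中文情感分析工具 [Developing a Traditional Chinese sentiment analysis tool based on sequential transfer learning]. Master's thesis, Graduate Institute of Information Management, National Taiwan University, Taipei.

2. Lab: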
Business Analytics and Economic Impact Research Lab
Department of Information Management
National Taiwan University
http://www.im.ntu.edu.tw/~lu/index.htm

Acknowledgements

This package is built on transformers, the open-source library from the Hugging Face team.
