
Traditional Chinese sentiment analysis tool based on BERT.

Project description

senti_c (sentiment analysis toolkit for traditional Chinese)

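Introduction

senti_c is a sentiment analysis toolkit for Traditional Chinese. It supports three types of analysis: sentence sentiment classification, aspect term extraction, and aspect sentiment classification. It also provides functions for fine-tuning the models on other data.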

Requirements

  • Python 3.8

Installation

1. pip

pip install senti_c

2. From source

git clone https://github.com/hsinmin/senti_c
cd senti_c
python3 setup.py install
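
After either installation route, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of senti_c.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "senti_c" is the distribution name used in the pip command above.
    version = installed_version("senti_c")
    if version is None:
        print("senti_c is not installed in this environment")
    else:
        print(f"senti_c {version} is installed")
```

importlib.metadata ships with Python 3.8 and later, which matches the requirement listed above.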

Features

1. Sentence sentiment classification: prediction

from senti_c import SentenceSentimentClassification

sentence_classifier = SentenceSentimentClassification()

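# Traditional Chinese input: "I really like this place! Absolutely amazing!" and "This waiter is very unfriendly..."
test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = sentence_classifier.predict(test_data, run_split=True, aggregate_strategy=False)  # adjust the parameters as needed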
  • The results are as follows:

[Image: prediction results]
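
The predict call above takes a list of strings. When the reviews live in a text file rather than in code, a small loader can build that list. This is a sketch under the assumption of one review per line in a UTF-8 file; the helper name load_reviews and the file name reviews.txt are hypothetical and not part of senti_c.

```python
import tempfile
from pathlib import Path
from typing import List


def load_reviews(path: str) -> List[str]:
    """Read one review per line (UTF-8), strip whitespace, and drop blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


if __name__ == "__main__":
    # Write a small hypothetical input file into a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "reviews.txt"
        sample.write_text(
            "我很喜歡這家店!超級無敵棒!\n\n這個服務生很不親切...\n",
            encoding="utf-8",
        )
        test_data = load_reviews(str(sample))
        print(test_data)
        # The list can then be passed to the classifier shown above, e.g.
        # sentence_classifier.predict(test_data, run_split=True, aggregate_strategy=False)
```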

2. Sentence sentiment classification: fine-tuning the model

from senti_c import SentenceSentimentModel

sentence_classifier = SentenceSentimentModel()
sentence_classifier.train(data_dir="./data/sentence_data", output_dir="test_fine_tuning_sent")  # adjust the parameters as needed

3. Aspect sentiment analysis: prediction

from senti_c import AspectSentimentAnalysis

aspect_classifier = AspectSentimentAnalysis()

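# Traditional Chinese input: "I really like this place! Absolutely amazing!" and "This waiter is very unfriendly..."
test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = aspect_classifier.predict(test_data, output_result="all")  # adjust the parameters as needed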
  • The results are as follows:

[Images: prediction results, four images]

4. Aspect sentiment analysis: fine-tuning the model

from senti_c import AspectSentimentModel

aspect_classifier = AspectSentimentModel()
aspect_classifier.train(data_dir="./data/aspect_data", output_dir="test_fine_tuning_aspect")  # adjust the parameters as needed
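
Both fine-tuning calls read their training data from data_dir, so a run can fail late if that path is wrong. A minimal sketch of a check to run first is shown here, assuming only that data_dir should be an existing directory containing at least one file. It does not validate the file format, which the package defines in its data folder, and the helper name list_data_files is hypothetical.

```python
from pathlib import Path
from typing import List


def list_data_files(data_dir: str) -> List[str]:
    """Return the sorted file names in `data_dir`, raising if it is missing or empty."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"data_dir does not exist: {root}")
    files = sorted(p.name for p in root.iterdir() if p.is_file())
    if not files:
        raise ValueError(f"data_dir contains no files: {root}")
    return files


if __name__ == "__main__":
    # Same path as in the sentence-level fine-tuning example above.
    try:
        print(list_data_files("./data/sentence_data"))
    except (FileNotFoundError, ValueError) as err:
        print(f"Not ready for fine-tuning: {err}")
```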

Examples

For demos of these features, see the function_demo file in the examples folder.

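Data

This study collected restaurant and hotel reviews from Google reviews and annotated them for sentence sentiment classification and aspect sentiment analysis (aspect labels and sentiment labels).

For the data formats, see the data folder.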

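Citation

1. Thesis:
凃育婷 (2020). 基於順序遷移學習開發繁體中文情感分析工具 [Developing a Traditional Chinese sentiment analysis tool based on sequential transfer learning]. Master's thesis, Graduate Institute of Information Management, National Taiwan University, Taipei.

2. Lab: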
Business Analytics and Economic Impact Research Lab
Department of Information Management
National Taiwan University
http://www.im.ntu.edu.tw/~lu/index.htm

Acknowledgements

This package is built on transformers, open-sourced by the Hugging Face team.
