Traditional Chinese sentiment analysis tool based on BERT.
senti_c (sentiment analysis toolkit for traditional Chinese)
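Introduction
This toolkit is a sentiment analysis package for Traditional Chinese. It supports three types of analysis: sentence sentiment classification, aspect term extraction, and aspect sentiment classification. It also provides functions for fine-tuning the models on other data.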
Requirements
- Python 3.8
Installation
- First download the PyTorch 1.x package that matches your operating system from https://pytorch.org/.
1. pip
pip install senti_c
2. From source
git clone https://github.com/hsinmin/senti_c
cd senti_c
python3 setup.py install
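As a quick sanity check after installation, the sketch below reports whether a PyTorch 1.x build is visible to the current interpreter. It reads package metadata through the standard library only and imports neither torch nor senti_c. The helper names `major_version` and `torch_status` are illustrative and are not part of senti_c.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.13.1+cu117'."""
    return int(version.split(".")[0])


def torch_status() -> str:
    """Describe the installed torch distribution without importing it."""
    try:
        version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return "torch is not installed"
    if major_version(version) == 1:
        return f"torch {version} found (1.x, as the installation notes require)"
    return f"torch {version} found, but the installation notes ask for 1.x"


print(torch_status())
```

Whether senti_c also works with later PyTorch releases is not stated here, so the second message is a warning rather than an error.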
功能介紹
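1. Sentence sentiment classification: prediction
from senti_c import SentenceSentimentClassification
sentence_classifier = SentenceSentimentClassification()
# Example inputs: "I really like this place! Absolutely fantastic!" and "This waiter is very unfriendly..."
test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = sentence_classifier.predict(test_data, run_split=True, aggregate_strategy=False)  # adjust the parameters as needed
- The result is as follows: [output figure not reproduced]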
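Real review data often contains empty or whitespace-only entries. The sketch below cleans the input before calling `predict` with the same arguments as the prediction example above. The helper names `clean_texts` and `classify_sentences` are hypothetical, and because the structure of the returned value is not documented here, the wrapper passes it through unchanged.

```python
from typing import Any, Iterable, List, Optional


def clean_texts(texts: Iterable[Optional[str]]) -> List[str]:
    """Drop None and whitespace-only entries and strip surrounding whitespace."""
    return [t.strip() for t in texts if t is not None and t.strip()]


def classify_sentences(texts: Iterable[Optional[str]], classifier: Any = None) -> Any:
    """Clean the input, then call predict() with the README's example arguments.

    Returns None when nothing is left to classify; otherwise returns whatever
    predict() returns, unchanged.
    """
    cleaned = clean_texts(texts)
    if not cleaned:
        return None
    if classifier is None:
        # Imported lazily so the helper can be exercised without loading a model.
        from senti_c import SentenceSentimentClassification
        classifier = SentenceSentimentClassification()
    return classifier.predict(cleaned, run_split=True, aggregate_strategy=False)
```

With senti_c installed, `classify_sentences(test_data)` should behave like the direct call, apart from the cleaning step. Passing an existing classifier avoids constructing a new one on every call.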
2. Sentence sentiment classification: fine-tuning the model
from senti_c import SentenceSentimentModel
sentence_classifier = SentenceSentimentModel()
sentence_classifier.train(data_dir="./data/sentence_data", output_dir="test_fine_tuning_sent")  # adjust the parameters as needed
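3. Aspect sentiment analysis: prediction
from senti_c import AspectSentimentAnalysis
aspect_classifier = AspectSentimentAnalysis()
# Example inputs: "I really like this place! Absolutely fantastic!" and "This waiter is very unfriendly..."
test_data = ["我很喜歡這家店!超級無敵棒!", "這個服務生很不親切..."]
result = aspect_classifier.predict(test_data, output_result="all")  # adjust the parameters as needed
- The result is as follows: [output figure not reproduced]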
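For a large review set it may be convenient to call `predict` on smaller chunks. The sketch below is a generic batching helper, not something senti_c provides: `batched` and `predict_in_batches` are hypothetical names, and the default batch size of 32 is arbitrary. Since the structure of the prediction result is not documented here, the per-batch results are collected in a list rather than merged.

```python
from typing import Any, Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def predict_in_batches(
    predict_fn: Callable[[List[str]], Any],
    texts: Sequence[str],
    batch_size: int = 32,
) -> List[Any]:
    """Call predict_fn once per batch and return the results in input order."""
    return [predict_fn(batch) for batch in batched(texts, batch_size)]
```

With the classifier from the prediction example, one possible call is `predict_in_batches(lambda batch: aspect_classifier.predict(batch, output_result="all"), test_data, batch_size=16)`.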
4. Aspect sentiment analysis: fine-tuning the model
from senti_c import AspectSentimentModel
aspect_classifier = AspectSentimentModel()
aspect_classifier.train(data_dir="./data/aspect_data", output_dir="test_fine_tuning_aspect")  # adjust the parameters as needed
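Both fine-tuning calls take a `data_dir` and an `output_dir`. Before starting a long training run, it can be worth checking those paths. The sketch below is one way to do so, assuming that an empty data directory or a non-empty output directory is a mistake. `check_training_dirs` is a hypothetical helper, and whether `train` itself creates or overwrites `output_dir` is not stated here.

```python
from pathlib import Path


def check_training_dirs(data_dir: str, output_dir: str) -> None:
    """Raise early if the training paths look wrong; return None otherwise."""
    data_path = Path(data_dir)
    if not data_path.is_dir():
        raise FileNotFoundError(f"data_dir does not exist: {data_path}")
    if not any(data_path.iterdir()):
        raise ValueError(f"data_dir is empty: {data_path}")

    output_path = Path(output_dir)
    # Assumption: reusing a populated output directory is unintended.
    if output_path.is_dir() and any(output_path.iterdir()):
        raise FileExistsError(f"output_dir already contains files: {output_path}")
```

For example, `check_training_dirs("./data/aspect_data", "test_fine_tuning_aspect")` could be called immediately before `aspect_classifier.train(...)`.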
Example code
For demos of these features, see the function_demo file in the examples folder.
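Data
This study collected restaurant and hotel reviews from Google reviews and annotated them for sentence sentiment classification and aspect sentiment analysis (aspect labels and sentiment labels).
For the data formats, see the data folder.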
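Citation
1. Thesis:
凃育婷 (2020). 基於順序遷移學習開發繁體中文情感分析工具 [Developing a Traditional Chinese sentiment analysis tool based on sequential transfer learning]. Master's thesis, Graduate Institute of Information Management, National Taiwan University, Taipei.
2. Lab: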
Business Analytics and Economic Impact Research Lab
Department of Information Management
National Taiwan University
http://www.im.ntu.edu.tw/~lu/index.htm
Acknowledgements
This package is built on the open-source transformers library from the Hugging Face team.
File details
Details for the file senti_c-0.1.1.tar.gz.
File metadata
- Download URL: senti_c-0.1.1.tar.gz
- Upload date:
- Size: 39.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2bab0389e2efdd55dec1540818e6c64cca71b5b67160b9bc44c26c18de050e4b
MD5 | 32a9a301b876db626e0f63c52cba6e85
BLAKE2b-256 | 6e856202225e86c212b1a3b0c16916062c871cb4fcc88de392f42448340ac2a3
File details
Details for the file senti_c-0.1.1-py3-none-any.whl.
File metadata
- Download URL: senti_c-0.1.1-py3-none-any.whl
- Upload date:
- Size: 48.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 728aee0cd8433437cd8b09ee3e715b4d62aaa6cd0d8f8e170afabff9431b079c
MD5 | 47d659234c382b706e4c4261550d7236
BLAKE2b-256 | 5ad320de26bd81fbcd0aa1c642d02ed5679c95599bc346764912331fd5c7516c