Chinese Herbal Medicine E-commerce Sentiment Analysis System

Chinese Herbal Medicine Sentiment Analysis System

A comprehensive Natural Language Processing (NLP) toolkit specifically designed for analyzing customer reviews and evaluating supply chain quality in Chinese herbal medicine e-commerce platforms.

🎯 Features

🔍 Sentiment Analysis

Dictionary-based Analysis: Traditional sentiment analysis using Chinese sentiment dictionaries
Machine Learning Models: SVM, Naive Bayes, and Logistic Regression classifiers
Deep Learning Models: LSTM, TextCNN, and BERT-based sentiment analysis
Graph-based Analysis: TextRank algorithm for sentiment analysis
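
As a rough illustration of the dictionary-based approach listed above, here is a minimal sketch that labels a pre-tokenized review using a tiny lexicon. The word lists, the `score_tokens` function, and the negation rule are assumptions made for illustration; they are not the package's dictionaries or its implementation, and the input is assumed to be already segmented into words.

```python
# Hypothetical mini-lexicon; a real sentiment dictionary is far larger.
POSITIVE = {"好", "不错", "快", "满意"}
NEGATIVE = {"差", "慢", "失望"}
NEGATORS = {"不", "没", "没有"}


def score_tokens(tokens):
    """Return 'positive', 'negative', or 'neutral' for a list of tokens."""
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True  # flip the polarity of the next sentiment word
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)  # +1, -1, or 0
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A call such as `score_tokens(["包装", "很", "差"])` counts one negative word and no positive ones, so the review is labeled negative.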

🔑 Keyword Extraction

TF-IDF: Term Frequency-Inverse Document Frequency for keyword extraction
TextRank: Graph-based algorithm for keyword ranking
LDA: Latent Dirichlet Allocation for topic-based keyword extraction

📊 Supply Chain Evaluation

Multi-dimensional Analysis: Upstream (raw materials), midstream (processing), downstream (distribution)
Quality Metrics: Comprehensive evaluation of supply chain quality indicators
Visualization: Rich visualizations for analysis results
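
The multi-dimensional analysis above can be pictured as a lookup from extracted keywords to supply chain stages. The following is only a sketch: the keyword lists and the `map_to_dimensions` helper are hypothetical and do not reflect the package's actual mapping.

```python
# Hypothetical keyword lists for each supply chain stage.
DIMENSION_KEYWORDS = {
    "upstream": {"原料", "产地", "药材"},      # raw materials
    "midstream": {"加工", "包装", "质量"},     # processing
    "downstream": {"物流", "配送", "服务"},    # distribution
}


def map_to_dimensions(keywords):
    """Group keywords by supply chain stage; unmatched ones go to 'other'."""
    grouped = {name: [] for name in DIMENSION_KEYWORDS}
    grouped["other"] = []
    for kw in keywords:
        for name, vocab in DIMENSION_KEYWORDS.items():
            if kw in vocab:
                grouped[name].append(kw)
                break
        else:
            # No stage vocabulary contained this keyword.
            grouped["other"].append(kw)
    return grouped
```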

🛠️ Utility Features

Data Processing: Efficient handling of large-scale review datasets
Visualization Tools: Comprehensive plotting and charting capabilities
Command-line Interface: Easy-to-use CLI for batch processing
Modular Design: Flexible and extensible architecture

🚀 Quick Start

Installation

# Basic installation
pip install chinese-herbal-sentiment

# With deep learning support (quotes keep shells such as zsh from expanding the brackets)
pip install "chinese-herbal-sentiment[deep_learning]"

# With development tools
pip install "chinese-herbal-sentiment[dev]"

# Complete installation
pip install "chinese-herbal-sentiment[all]"

Basic Usage

import pandas as pd
from chinese_herbal_sentiment import SentimentAnalyzer, KeywordExtractor

# Sample data
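data = pd.DataFrame({
    '评论内容': [  # column name means "review content"
        '这个中药质量很好，效果不错',  # "This herbal medicine is good quality and works well"
        '包装很差，质量一般',  # "The packaging is poor and the quality is mediocre"
        '服务态度很好，物流快'  # "The service is very good and shipping is fast"
    ]
})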

# Sentiment analysis
analyzer = SentimentAnalyzer()
sentiment_results = analyzer.analyze_all_methods(data)

# Keyword extraction
extractor = KeywordExtractor()
keyword_results = extractor.extract_all_methods(data, num_keywords=10)

print("Sentiment Results:", sentiment_results.head())
print("Keywords:", keyword_results.head())

Command Line Usage

# Analyze sentiment
chinese-herbal-analyze data/reviews.xlsx --method all --output results.csv

# Extract keywords
chinese-herbal-keywords data/reviews.xlsx --method tfidf --num_keywords 20

# Full analysis
chinese-herbal-full data/reviews.xlsx --mode all --output_dir results/

📚 Documentation

API Reference

SentimentAnalyzer

from chinese_herbal_sentiment import SentimentAnalyzer

analyzer = SentimentAnalyzer()

# Single method analysis
results = analyzer.analyze_sentiment(data, method='svm')

# All methods analysis
results = analyzer.analyze_all_methods(data)

Methods:

dictionary: Dictionary-based sentiment analysis
svm: Support Vector Machine classifier
naive_bayes: Naive Bayes classifier
logistic_regression: Logistic Regression classifier
all: All available methods

KeywordExtractor

from chinese_herbal_sentiment import KeywordExtractor

extractor = KeywordExtractor()

# Single method extraction
keywords = extractor.extract_keywords(data, method='tfidf', num_keywords=20)

# All methods extraction
keywords = extractor.extract_all_methods(data, num_keywords=20)

Methods:

tfidf: TF-IDF keyword extraction
textrank: TextRank algorithm
lda: Latent Dirichlet Allocation
all: All available methods
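
For readers unfamiliar with the `textrank` method, the sketch below shows the general technique: build a co-occurrence graph over tokens and run a PageRank-style iteration on it. This is a standard-library illustration under stated assumptions, not the package's implementation; the `textrank` function, its defaults, and the fixed iteration count are hypothetical, and the input is assumed to be already tokenized.

```python
from collections import defaultdict


def textrank(tokens, window_size=4, damping=0.85, iterations=50):
    """Rank tokens by PageRank over an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, tok in enumerate(tokens):
        # Link each token to the tokens that follow it inside the window.
        for other in tokens[i + 1 : i + window_size]:
            if other != tok:
                neighbors[tok].add(other)
                neighbors[other].add(tok)
    scores = {tok: 1.0 for tok in neighbors}
    for _ in range(iterations):
        # Each token receives an equal share of every neighbor's score.
        scores = {
            tok: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[tok])
            for tok in neighbors
        }
    # Highest score first; ties are broken by the token itself.
    return sorted(scores, key=lambda t: (-scores[t], t))
```

Tokens that co-occur with many different neighbors accumulate score from all of them, so they rise to the top of the ranking.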

Deep Learning Models

from chinese_herbal_sentiment import BERTSentimentAnalyzer, TextCNNSentimentAnalyzer

# BERT analysis
bert_analyzer = BERTSentimentAnalyzer()
bert_results = bert_analyzer.analyze_sentiment(data)

# TextCNN analysis
textcnn_analyzer = TextCNNSentimentAnalyzer()
textcnn_results = textcnn_analyzer.analyze_sentiment(data)

Advanced Usage

Custom Analysis Pipeline

from chinese_herbal_sentiment import (
    DataAnalyzer,
    KeywordExtractor,
    SentimentAnalyzer,
    Visualizer,
)

# Load and preprocess data
data_analyzer = DataAnalyzer()
data = data_analyzer.load_data('reviews.xlsx', sample_size=10000)

# Perform analysis
analyzer = SentimentAnalyzer()
extractor = KeywordExtractor()
sentiment_results = analyzer.analyze_all_methods(data)
keyword_results = extractor.extract_all_methods(data)

# Generate visualizations
visualizer = Visualizer()
visualizer.plot_sentiment_distribution(sentiment_results, save_path='sentiment.png')
visualizer.plot_keyword_cloud(keyword_results, save_path='keywords.png')

Supply Chain Quality Evaluation

from chinese_herbal_sentiment.utils.keyword_mapping import KeywordMapper

# Map keywords to supply chain dimensions
# (keyword_results is the output of KeywordExtractor.extract_all_methods)
mapper = KeywordMapper()
supply_chain_results = mapper.map_keywords_to_dimensions(keyword_results)

# Analyze quality indicators
quality_metrics = mapper.calculate_quality_metrics(supply_chain_results)
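
The source does not say how `calculate_quality_metrics` derives its indicators. One simple possibility, shown here purely as an assumption, is the share of positive reviews per supply chain dimension. The `positive_share_by_dimension` helper and its input format are hypothetical.

```python
from collections import defaultdict


def positive_share_by_dimension(records):
    """Compute the fraction of positive labels for each dimension.

    records is an iterable of (dimension, sentiment) pairs.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for dimension, sentiment in records:
        totals[dimension] += 1
        positives[dimension] += sentiment == "positive"
    return {dim: positives[dim] / totals[dim] for dim in totals}
```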

📊 Output Examples

Sentiment Analysis Results

评论内容	dictionary_sentiment	svm_sentiment	naive_bayes_sentiment	logistic_regression_sentiment
质量很好，效果不错	positive	positive	positive	positive
包装很差，质量一般	negative	negative	negative	negative
服务态度很好	positive	positive	positive	positive
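
When the methods disagree on a review, one possible way to reduce the per-method columns to a single label is a majority vote. This is a sketch and not something the package is documented to do; `majority_label` and its tie rule are assumptions.

```python
from collections import Counter


def majority_label(labels):
    """Return the most common label; an empty input or a tie gives 'neutral'."""
    counts = Counter(labels).most_common()
    if not counts:
        return "neutral"
    # A tie for first place means there is no clear majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "neutral"
    return counts[0][0]
```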

Keyword Extraction Results

keyword	score	method
质量	0.85	TF-IDF
包装	0.72	TF-IDF
服务	0.68	TF-IDF
效果	0.65	TextRank
物流	0.58	TextRank

🔧 Configuration

Data Format

The package expects data in the following format:

import pandas as pd

# Excel/CSV file with one review-text column and three optional columns:
data = pd.DataFrame({
    '评论内容': ['review text 1', 'review text 2'],  # Review content
    '评分': [5, 4],  # Rating (optional)
    '时间': ['2024-01-01', '2024-01-02'],  # Timestamp (optional)
    '用户ID': ['user1', 'user2']  # User ID (optional)
})
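
Before running an analysis it can help to check that a file has the expected columns. The sketch below assumes that `评论内容` is the only required column, since the others are marked optional above; `check_columns` is a hypothetical helper, not part of the package.

```python
REQUIRED_COLUMNS = {"评论内容"}  # assumed required: review text
OPTIONAL_COLUMNS = {"评分", "时间", "用户ID"}


def check_columns(columns):
    """Return (missing required columns, unrecognized columns), both sorted."""
    columns = set(columns)
    missing = sorted(REQUIRED_COLUMNS - columns)
    unknown = sorted(columns - REQUIRED_COLUMNS - OPTIONAL_COLUMNS)
    return missing, unknown
```

For a pandas DataFrame, `check_columns(data.columns)` works as written, because the column index is iterable.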

Model Configuration

# Custom model parameters
analyzer = SentimentAnalyzer(
    vectorizer_params={'max_features': 5000},
    classifier_params={'C': 1.0}
)

extractor = KeywordExtractor(
    tfidf_params={'max_features': 1000},
    textrank_params={'window_size': 4}
)

🧪 Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=chinese_herbal_sentiment

# Run specific test file
pytest tests/test_sentiment_analysis.py

📈 Performance

Accuracy Comparison

Method	Accuracy	Precision	Recall	F1-Score
Dictionary	0.72	0.71	0.72	0.71
SVM	0.85	0.84	0.85	0.84
Naive Bayes	0.82	0.81	0.82	0.81
Logistic Regression	0.87	0.86	0.87	0.86
BERT	0.91	0.90	0.91	0.90
TextCNN	0.89	0.88	0.89	0.88
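
For reference, the four metrics in the table can be computed from true and predicted labels as sketched below. The table does not state how scores were averaged across classes, so this example treats a single class as positive; `binary_metrics` is a hypothetical helper and not the evaluation code behind the reported numbers.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```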

Processing Speed

Small dataset (< 1K reviews): ~1-2 seconds
Medium dataset (1K-10K reviews): ~10-30 seconds
Large dataset (> 10K reviews): ~2-5 minutes

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

# Clone the repository
git clone https://github.com/chenxingqiang/chinese-herbal-sentiment.git
cd chinese-herbal-sentiment

# Install in development mode (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black chinese_herbal_sentiment tests

# Lint code
flake8 chinese_herbal_sentiment tests

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Research Foundation: Based on master's thesis research on Chinese herbal medicine e-commerce supply chain quality evaluation
Open Source Libraries: Built on top of scikit-learn, transformers, PyTorch, and other excellent open-source projects
Academic Community: Inspired by research in sentiment analysis and supply chain management

📞 Support

Documentation: GitHub Wiki
Issues: GitHub Issues
Email: chenxingqiang@turingai.cc

🔄 Changelog

v0.1.0 (2024-12-XX)

Initial release
Basic sentiment analysis (dictionary, SVM, Naive Bayes, Logistic Regression)
Keyword extraction (TF-IDF, TextRank, LDA)
Deep learning models (BERT, TextCNN, LSTM)
Command-line interface
Comprehensive documentation and examples

Note: This package is designed specifically for Chinese herbal medicine e-commerce review analysis and supply chain quality evaluation. For general sentiment analysis tasks, consider using more general-purpose NLP libraries.

Release history

1.0.0: Aug 26, 2025
0.1.0 (this version): Aug 25, 2025

File details

Details for the file chinese_herbal_sentiment-0.1.0.tar.gz.

File metadata

Download URL: chinese_herbal_sentiment-0.1.0.tar.gz
Upload date: Aug 25, 2025
Size: 153.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.6

File hashes

Hashes for chinese_herbal_sentiment-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`cd0c85a5303d6280fb9962174f03abe4d74bf65bb62b807a177dc8ce35f2a9f8`
MD5	`f25710337e23b6b6df817bf9dc17f5ab`
BLAKE2b-256	`19a3d2404a8da24d3717e789a9a617f4667d27a33d45eb1129bded18cc4e164a`


File details

Details for the file chinese_herbal_sentiment-0.1.0-py3-none-any.whl.

File metadata

Download URL: chinese_herbal_sentiment-0.1.0-py3-none-any.whl
Upload date: Aug 25, 2025
Size: 146.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.6

File hashes

Hashes for chinese_herbal_sentiment-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ac9a91bb8a54e0a8ebbf540216dc03fe74509dc0e67e87bcb70dcd4e6a1fc666`
MD5	`43bf236178fd603e4fa89ff3698cd391`
BLAKE2b-256	`8ae86668eb88d9fb528219a85865c7ea9ce7a11d2ac40ba84958294f539e2309`

