A flexible sentiment analysis classifier package supporting multiple pre-trained models, customizable preprocessing, visualization tools, fine-tuning capabilities, and seamless integration with pandas DataFrames.
Sentiment Analysis Classifier
Overview
sentimentclassifier is a Python package for classifying sentiment in text using pre-trained models from Hugging Face's Transformers library. It provides a user-friendly interface for sentiment classification, along with tools for data preprocessing, visualization, fine-tuning, and integration with pandas DataFrames.
Features
- Multiple Model Support: Easily switch between different pre-trained models.
- Customizable Preprocessing: Clean and preprocess text data with customizable functions.
- Visualization Tools: Visualize sentiment distributions and trends over time.
- Fine-tuning Capability: Fine-tune models on your own datasets.
- User-friendly CLI: Command-line interface for quick sentiment classification.
- Integration with Data Platforms: Seamless integration with pandas DataFrames.
- Extended Post-processing: Additional utilities for detailed sentiment analysis.
Installation
You can install the package using pip:
pip install sentimentclassifier
Usage
Basic Usage
Here's an example of how to use the SentimentClassifier to classify a single text:
from sentiment_classifier import SentimentClassifier
# Initialize the classifier with the default model
classifier = SentimentClassifier()
# Classify a single text
text = "I am very happy today!"
result = classifier.predict(text)
print("Sentiment:", result['label'])
print("Confidence:", result['confidence'])
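When predictions feed downstream logic, it can help to handle low-confidence results separately. The sketch below assumes only the result shape shown above (a dict with 'label' and 'confidence' keys). The with_threshold helper, the 0.6 cutoff, the "uncertain" fallback, and the sample label names are illustrative assumptions, not part of the package.

```python
def with_threshold(result, threshold=0.6, fallback="uncertain"):
    """Return the predicted label, or `fallback` when confidence is below `threshold`."""
    if result["confidence"] < threshold:
        return fallback
    return result["label"]


# Hand-written stand-ins for classifier.predict(...) output (assumed label names)
confident = {"label": "positive", "confidence": 0.97}
borderline = {"label": "negative", "confidence": 0.52}

print(with_threshold(confident))   # positive
print(with_threshold(borderline))  # uncertain
```

A suitable cutoff depends on the model and the data, so treat 0.6 as a placeholder to tune.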
Batch Processing
You can classify multiple texts at once using the predict_batch method:
texts = ["I am very happy today!", "I am so sad."]
results = classifier.predict_batch(texts)
print("Batch processing results:", results)
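Batch results are often easier to read as a per-label summary. The sketch below assumes that predict_batch returns a list of dicts shaped like the single-text result (with 'label' and 'confidence' keys), which this README does not state explicitly. The summarize helper and the sample data are hypothetical.

```python
from collections import defaultdict


def summarize(results):
    """Return the count and mean confidence for each label in `results`."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for r in results:
        counts[r["label"]] += 1
        totals[r["label"]] += r["confidence"]
    return {
        label: {"count": n, "mean_confidence": totals[label] / n}
        for label, n in counts.items()
    }


# Hand-written stand-ins for classifier.predict_batch(...) output
sample_results = [
    {"label": "positive", "confidence": 0.875},
    {"label": "negative", "confidence": 0.75},
    {"label": "positive", "confidence": 0.625},
]

summary = summarize(sample_results)
for label, stats in summary.items():
    print(label, stats["count"], round(stats["mean_confidence"], 3))
```

If the actual return type differs (for example, a list of label strings), adapt the key lookups accordingly.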
Visualization
To plot the predicted probability of each sentiment label for a single text:
from sentiment_classifier.visualization import plot_sentiment_distribution
result = classifier.predict("I am very happy today!")
plot_sentiment_distribution(result['probabilities'], classifier.labels.values())
CLI Usage
You can also use the package from the command line:
sentimentclassifier --model roberta --text "I am very happy today!"
DataFrame Integration
Integrate with pandas DataFrames to classify text columns:
import pandas as pd
from sentiment_classifier.integration import DataFrameSentimentClassifier
df = pd.DataFrame({
'text': ["I am very happy today!", "I am so sad."]
})
# Use a separate name so the SentimentClassifier instance created earlier is not overwritten
df_classifier = DataFrameSentimentClassifier()
df = df_classifier.classify_dataframe(df, 'text')
print(df)
Sentiment Trends Over Time
Analyze and plot sentiment trends over time:
from sentiment_classifier.trends import SentimentAnalysisTrends
texts = ["I am very happy today!", "I am feeling okay.", "I am very sad."]
trends = SentimentAnalysisTrends()
sentiments = trends.analyze_trends(texts)
trends.plot_trends(sentiments)
Fine-tuning
Fine-tune a pre-trained model on your own dataset:
from sentiment_classifier.fine_tune import fine_tune_model
# Define your train and validation datasets
train_dataset = ...
val_dataset = ...
# Fine-tune the model
fine_tune_model(classifier.model, classifier.tokenizer, train_dataset, val_dataset, output_dir='fine_tuned_model')
Logging Configuration
By default, the sentimentclassifier package logs messages at the WARNING level and above. If you need more detailed logging (e.g., for debugging), you can set the logging level to INFO or DEBUG:
from sentiment_classifier.logger import set_logging_level
# Set logging level to INFO
set_logging_level('INFO')
# Set logging level to DEBUG
set_logging_level('DEBUG')
You can set the logging level to one of the following: DEBUG, INFO, WARNING, ERROR, CRITICAL.
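For readers curious how a helper like this can be built, the sketch below sets a logger's level by name using only Python's standard logging module. It is not the package's implementation: the logger name "sentiment_classifier" and the set_level_by_name function are assumptions made for illustration.

```python
import logging

# Assumption: the package logs under a logger named after its import package.
logger = logging.getLogger("sentiment_classifier")


def set_level_by_name(level_name):
    """Set the logger level from a name such as 'INFO' or 'debug'.

    Logger.setLevel accepts level names as strings and raises
    ValueError for names it does not recognize.
    """
    logger.setLevel(level_name.upper())


set_level_by_name("debug")
print(logger.level == logging.DEBUG)  # True
```

Messages only appear if a handler is attached, for example by calling logging.basicConfig() in your application.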
Running Tests
Run the tests using pytest:
poetry run pytest
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
This package uses pre-trained models from the Hugging Face Transformers library.