A flexible sentiment analysis predictor package supporting multiple pre-trained models, customizable preprocessing, visualization tools, fine-tuning capabilities, and seamless integration with pandas DataFrames.

Sentiment Analysis Predictor

Overview

sentimentpredictor is a Python package that classifies the sentiment of text using pre-trained models from Hugging Face's Transformers library. It provides a user-friendly interface for sentiment classification, along with tools for data preprocessing, visualization, fine-tuning, and integration with pandas DataFrames.

Features

  • Multiple Model Support: Easily switch between different pre-trained models.
  • Customizable Preprocessing: Clean and preprocess text data with customizable functions.
  • Visualization Tools: Visualize sentiment distributions and trends over time.
  • Fine-tuning Capability: Fine-tune models on your own datasets.
  • User-friendly CLI: Command-line interface for quick sentiment classification.
  • Integration with Data Platforms: Seamless integration with pandas DataFrames.
  • Extended Post-processing: Additional utilities for detailed sentiment analysis.

Installation

You can install the package using pip:

pip install sentimentpredictor

Usage

Basic Usage

Here's an example of how to use the SentimentPredictor to classify a single text:

from sentimentpredictor import SentimentPredictor

# Initialize the predictor with the default model
predictor = SentimentPredictor()

# Classify a single text
text = "I am very happy today!"
result = predictor.predict(text)
print("Sentiment:", result['label'])
print("Confidence:", result['confidence'])

Batch Processing

You can classify multiple texts at once using the predict_batch method:

# Reuses the predictor created in Basic Usage
texts = ["I am very happy today!", "I am so sad."]
results = predictor.predict_batch(texts)
print("Batch processing results:", results)
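
Once you have batch results, you may want a quick summary of them. The following is a minimal sketch, assuming predict_batch returns a list of dictionaries shaped like the single-text result above (with 'label' and 'confidence' keys); that shape is an assumption, not something this README confirms. The helper summarize_results, the 0.5 threshold, and the sample label names are hypothetical and only serve the illustration.

```python
from collections import Counter


def summarize_results(results, min_confidence=0.5):
    """Count labels among results whose confidence meets the threshold."""
    confident = [r for r in results if r["confidence"] >= min_confidence]
    return Counter(r["label"] for r in confident)


# Hypothetical results in the assumed shape; real labels and scores depend on the model.
sample_results = [
    {"label": "positive", "confidence": 0.98},
    {"label": "negative", "confidence": 0.91},
    {"label": "positive", "confidence": 0.42},
]

summary = summarize_results(sample_results)
print(summary)
```

Replace sample_results with the list returned by predictor.predict_batch(texts) if its entries turn out to have this shape.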

Visualization

To plot the probability that the model assigns to each sentiment label for a single text:

from sentimentpredictor import plot_sentiment_distribution

result = predictor.predict("I am very happy today!")
plot_sentiment_distribution(result['probabilities'], predictor.labels.values())

CLI Usage

You can also use the package from the command line:

sentimentpredictor --model roberta --text "I am very happy today!"
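
If you need to call the CLI from a script, passing the arguments as a list avoids shell-quoting problems with apostrophes and other special characters in the text. This is a rough sketch that uses only the --model and --text flags shown above; the helper names are hypothetical, and the format of the CLI's output is not assumed.

```python
import shlex
import subprocess


def build_cli_command(text, model="roberta"):
    """Return the argument list for one sentimentpredictor CLI call."""
    return ["sentimentpredictor", "--model", model, "--text", text]


def classify_with_cli(text, model="roberta"):
    """Run the CLI and return its raw standard output as a string."""
    completed = subprocess.run(
        build_cli_command(text, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return completed.stdout


# Print a shell-safe version of the command without running it.
command = build_cli_command("I'm not sure how I feel.")
print(shlex.join(command))
```

classify_with_cli requires the package to be installed so that the sentimentpredictor command is on your PATH.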

DataFrame Integration

Integrate with pandas DataFrames to classify text columns:

import pandas as pd
from sentimentpredictor import DataFrameSentimentPredictor

df = pd.DataFrame({
    'text': ["I am very happy today!", "I am so sad."]
})

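# Use a separate name so the SentimentPredictor instance from earlier examples is not overwritten
df_predictor = DataFrameSentimentPredictor()
df = df_predictor.classify_dataframe(df, 'text')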
print(df)

Sentiment Trends Over Time

Analyze and plot sentiment trends over time:

from sentimentpredictor import SentimentAnalysisTrends

texts = ["I am very happy today!", "I am feeling okay.", "I am very sad."]
trends = SentimentAnalysisTrends()
sentiments = trends.analyze_trends(texts)
trends.plot_trends(sentiments)
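
Short-term noise can hide a trend, so it is sometimes useful to smooth sentiment scores before plotting them. The sketch below is independent of the package: it assumes you have already converted each prediction to a number, and both the LABEL_SCORES mapping and the moving_average helper are hypothetical. The return format of analyze_trends is not documented here, so this example works on plain label lists instead.

```python
# Hypothetical mapping from label to numeric score; adjust it to your model's labels.
LABEL_SCORES = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def moving_average(scores, window=3):
    """Return the trailing moving average of scores, one value per input."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    for i in range(len(scores)):
        # Use up to `window` most recent values, fewer at the start of the series.
        chunk = scores[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Illustrative labels in chronological order.
labels = ["positive", "neutral", "negative"]
scores = [LABEL_SCORES[label] for label in labels]
smoothed = moving_average(scores, window=2)
print(smoothed)  # [1.0, 0.5, -0.5]
```

A larger window gives a smoother curve but reacts more slowly to real changes in sentiment.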

Fine-tuning

Fine-tune a pre-trained model on your own dataset:

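from sentimentpredictor import SentimentPredictor
from sentimentpredictor.fine_tune import fine_tune_model

# Load the pre-trained model and tokenizer to fine-tune
predictor = SentimentPredictor()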

# Define your train and validation datasets
train_dataset = ...
val_dataset = ...

# Fine-tune the model
fine_tune_model(predictor.model, predictor.tokenizer, train_dataset, val_dataset, output_dir='fine_tuned_model')

Logging Configuration

By default, the sentimentpredictor package logs messages at the WARNING level and above. If you need more detailed logging (e.g., for debugging), you can set the logging level to INFO or DEBUG:

from sentimentpredictor.logger import set_logging_level

# Set logging level to INFO
set_logging_level('INFO')

# Set logging level to DEBUG
set_logging_level('DEBUG')

You can set the logging level to one of the following: DEBUG, INFO, WARNING, ERROR, CRITICAL.
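
For readers who want to see how level names map onto Python's standard logging module, here is a minimal sketch of a helper in the spirit of set_logging_level. It is not the package's implementation: the function name set_package_log_level and the logger name "sentimentpredictor" are assumptions made for illustration.

```python
import logging

# The five level names listed above, mapped to the standard library constants.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def set_package_log_level(level_name, logger_name="sentimentpredictor"):
    """Set the level of a named logger and return that logger."""
    try:
        level = LEVELS[level_name.upper()]
    except KeyError:
        raise ValueError(f"Unknown logging level: {level_name!r}") from None
    logger = logging.getLogger(logger_name)  # logger name is an assumption
    logger.setLevel(level)
    return logger


logger = set_package_log_level("debug")
print(logging.getLevelName(logger.level))  # DEBUG
```

Messages only appear if a handler is attached, for example by calling logging.basicConfig() once at program start.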

Running Tests

Run the test suite with pytest through Poetry:

poetry run pytest

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

This package uses pre-trained models from the Hugging Face Transformers library.

Contributing

Contributions are welcome! Please see the CONTRIBUTING file for guidelines on how to contribute to this project.
