A comprehensive pipeline for sentiment analysis using deep learning models

NLP Sentiment Analysis Pipeline

A comprehensive, modular pipeline for sentiment analysis using deep learning models. This package provides tools for data extraction, preprocessing, model training, and evaluation.

Features

  • Data Preparation: Extract and preprocess text data for sentiment analysis
  • Modeling: Baseline models with TF-IDF vectorization and neural networks
  • Evaluation: Comprehensive model evaluation utilities

Installation

From PyPI

pip install nlp-sentiment-pipeline

From Source

git clone https://github.com/FranzCastillo/NLP-Tweets-Sentiment-Analysis-DL-Models
cd NLP-Tweets-Sentiment-Analysis-DL-Models/pipeline
pip install -e .

For Development

pip install -e ".[dev]"
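
To confirm that either install route worked, a short check along these lines may help. It is a minimal sketch using only the standard library; the distribution name comes from the pip command above and the import name from the usage example below, and the helper name installed_version is hypothetical.

```python
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution name as used with pip; import name as used in the usage example.
print("distribution version:", installed_version("nlp-sentiment-pipeline"))
print("pipeline importable:", util.find_spec("pipeline") is not None)
```

A result of None for the version together with an importable pipeline module would suggest that a different package named pipeline is shadowing this one.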

Usage

As a Python Package

from pipeline.data_preparation import DataExtractor, TextPreprocessor, DataSplitter
from pipeline.modeling import BaselineModel, TfidfVectorizerWrapper
from pipeline.evaluation import ModelEvaluator

# Extract the training split
extractor = DataExtractor(split="train")
df = extractor.extract()

# Preprocess text: clean every entry of the raw 'text' column
preprocessor = TextPreprocessor(remove_stopwords=True)
df['clean_text'] = df['text'].apply(preprocessor.preprocess)

# Train model
model = BaselineModel()
# ... training code ...
# Note: X_test and y_test used below are not defined in this snippet; they must
# come from the elided step (presumably via DataSplitter and TfidfVectorizerWrapper).

# Evaluate on the held-out test features and labels
evaluator = ModelEvaluator(model, X_test, y_test)
results = evaluator.evaluate()

As a Command-Line Tool

nlp-sentiment-pipeline

Package Structure

pipeline/
├── __init__.py
├── main.py
├── data_preparation/      # Data extraction and preprocessing
│   ├── __init__.py
│   ├── extraction.py
│   ├── preprocessing.py
│   └── data_splitter.py
├── modeling/              # Model definitions and utilities
│   ├── __init__.py
│   ├── baseline.py
│   ├── vectorizer.py
│   └── model_evaluator.py
└── evaluation/            # Evaluation utilities
    ├── __init__.py
    ├── evaluator.py
    └── model_evaluator.py

Subpackages

data_preparation

Tools for data extraction and preprocessing:

  • DataExtractor: Extract datasets from various sources
  • TextPreprocessor: Clean and preprocess text data
  • DataSplitter: Split data into train/validation/test sets
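
The kind of work these tools describe can be sketched with the standard library alone. The snippet below is an illustration under stated assumptions, not the package's implementation: clean_text, split_dataset, the tiny STOPWORDS set, and the 80/10/10 split are all hypothetical, and the real TextPreprocessor and DataSplitter may behave differently.

```python
import random
import re

# Tiny illustrative stopword list (an assumption); a real pipeline would
# likely use a fuller list such as those shipped with nltk or spaCy.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "this"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_text(text, remove_stopwords=True):
    """Lowercase, strip URLs, mentions and punctuation, optionally drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return " ".join(tokens)


def split_dataset(rows, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle a copy of rows and cut it into train/validation/test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


cleaned = clean_text("Check this out @bob: https://t.co/xyz It is GREAT!!!")
train, val, test = split_dataset(range(100))
```

Fixing the seed keeps the three sets stable between runs, which matters when comparing models trained at different times.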

modeling

Model implementations and utilities:

  • BaselineModel: Baseline neural network model
  • TfidfVectorizerWrapper: TF-IDF vectorization wrapper
  • Various deep learning models
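
For readers unfamiliar with TF-IDF, the weighting can be sketched in a few lines of plain Python. This is a hedged illustration of one common smoothed variant with L2 normalization, not the code behind TfidfVectorizerWrapper; the function names fit_idf and tfidf_vector and the sample documents are hypothetical.

```python
import math
from collections import Counter


def fit_idf(tokenized_docs):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_vector(tokens, idf):
    """Return an L2-normalized {term: weight} mapping for one document."""
    counts = Counter(t for t in tokens if t in idf)  # ignore unseen terms
    weights = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0.0:
        return {}
    return {term: w / norm for term, w in weights.items()}


# Made-up example documents, already tokenized.
docs = [
    "great movie loved it".split(),
    "terrible movie hated it".split(),
    "great acting".split(),
]
idf = fit_idf(docs)
vec = tfidf_vector(docs[0], idf)
```

Terms that appear in fewer documents (here "loved") end up with a larger weight than terms shared across documents (here "movie"), which is what makes the representation useful as input to a baseline classifier.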

evaluation

Model evaluation tools:

  • ModelEvaluator: Comprehensive model evaluation
  • evaluate_model: Quick evaluation function
  • print_evaluation_results: Pretty-print evaluation metrics
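
The metrics such tools typically report can be computed by hand, which is useful for sanity-checking results. The sketch below assumes binary labels and a hypothetical helper named binary_metrics; it does not reproduce the output format of ModelEvaluator or evaluate_model, which is not documented here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every ratio against an empty denominator.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = positive sentiment, 0 = negative sentiment.
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

Reporting precision and recall alongside accuracy is a common choice for sentiment data, where class imbalance can make accuracy alone misleading.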

Requirements

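  • Python >= 3.8 (as declared; note that the pandas and scikit-learn minimums listed below themselves require Python 3.9 or newer)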
  • TensorFlow >= 2.15.0
  • pandas >= 2.3.1
  • scikit-learn >= 1.5.2
  • nltk >= 3.9.1
  • spacy >= 3.8.7

See requirements.txt for a complete list of dependencies.

Development

Running Tests

pytest

Code Formatting

black pipeline/

Type Checking

mypy pipeline/

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Authors

Francisco Castillo - cas21562@uvg.edu.gt

Changelog

0.1.0 (2025-11-14)

  • Initial release
  • Data preparation subpackage
  • Modeling subpackage
  • Evaluation subpackage
