Text Classification with Transformers

Overview

NLPipes is for people unfamiliar with Transformers who want an end-to-end solution for practical text classification problems, including:

  • Single-label classification: A typical use case is sentiment detection, where one wants to detect the overall sentiment polarity (e.g., positive, neutral, negative) of a review.
  • Multi-label classification: A typical use case is aspect category detection, where one wants to detect the multiple aspects mentioned in a review (e.g., product_quality, delivery_time, price, ...).
  • Aspect-based classification: A typical use case is aspect-based sentiment analysis, where one wants to detect the sentiment polarity associated with each aspect category mentioned in a review (e.g., product_quality: neutral, delivery_time: negative, price: positive, ...).

NLPipes exposes a Model API that provides a single, simple abstraction for all of these tasks. The library maintains a common usage pattern across models (train, evaluate, predict, save) and a clear, consistent data structure (Python lists as input and output data).

Built with

NLPipes is built with TensorFlow and HuggingFace Transformers:

  • TensorFlow: An end-to-end open source deep learning framework
  • Transformers: A general-purpose open-source library for transformer-based architectures

Getting Started

Installation

  1. Create a virtual environment
     python3 -m venv nlpipesenv
     source nlpipesenv/bin/activate
  2. Install the package
     pip install nlpipes
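
To confirm that the package landed in the active virtual environment, a quick check like the following can help. This is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of NLPipes.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("nlpipes")
    if version is None:
        print("nlpipes is not installed in this environment")
    else:
        print(f"nlpipes {version} is installed")
```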

Tasks

To train a model for a specific task, first load a backbone model. The train method takes at least two parameters, X and Y, where X is a list of texts to train on and Y holds the training targets.

The expected format of the training targets depends on the task:

Single Label Classification:

Give one label name for each sequence of text in X:

 # Assumption: Model is exported at the top level of the nlpipes package.
 from nlpipes import Model

 model = Model("albert-base-v2",
               task='single-label-classification',
               all_labels=["NEG", "NEU", "POS"],
              )
 
 X = ["This was bad.", "This was great!"]
 Y = ["NEG", "POS"]
 
 model.train(X, Y)
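
Because X and Y are plain Python lists, it can be worth checking them before training. The sketch below is a hypothetical helper, not part of the NLPipes API; it assumes only what is stated above, namely that Y holds one label per text and that every label should appear in all_labels.

```python
from typing import Sequence


def check_single_label_data(
    X: Sequence[str], Y: Sequence[str], all_labels: Sequence[str]
) -> None:
    """Raise ValueError if X and Y cannot form a single-label training set."""
    # Every text needs exactly one label.
    if len(X) != len(Y):
        raise ValueError(f"X has {len(X)} texts but Y has {len(Y)} labels")
    known = set(all_labels)
    # Every label must be one of the declared label names.
    for index, label in enumerate(Y):
        if label not in known:
            raise ValueError(f"Y[{index}] = {label!r} is not in all_labels")


# Passes silently for the example data above.
check_single_label_data(
    ["This was bad.", "This was great!"],
    ["NEG", "POS"],
    ["NEG", "NEU", "POS"],
)
```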

Multiple Label Classification:

Give a list of class names for each sequence of text in X:

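 # The full set of label names; here, just the labels used in Y below.
 all_labels = ["billing", "tech support"]

 model = Model("albert-base-v2",
               task='multi-label-classification',
               all_labels=all_labels,
              )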
 
 X = ["I want a refund!",
      "The bill I got is not correct and I also have technical issues",
      "All good"]
 Y = [
      ["billing"], 
      ["billing", "tech support"],
      []
     ]
     
 model.train(X, Y)
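
Multi-label targets like the ones above are commonly turned into multi-hot vectors, with one position per entry in all_labels. The sketch below illustrates that general technique in plain Python; it is an assumption about how such data is typically encoded, not a description of what NLPipes does internally, and to_multi_hot is a hypothetical name.

```python
from typing import List, Sequence


def to_multi_hot(
    Y: Sequence[Sequence[str]], all_labels: Sequence[str]
) -> List[List[int]]:
    """Encode each list of label names as a 0/1 vector aligned with all_labels."""
    index = {label: position for position, label in enumerate(all_labels)}
    vectors = []
    for labels in Y:
        row = [0] * len(all_labels)
        for label in labels:
            row[index[label]] = 1  # a KeyError here means an unknown label
        vectors.append(row)
    return vectors


all_labels = ["billing", "tech support"]
Y = [["billing"], ["billing", "tech support"], []]
print(to_multi_hot(Y, all_labels))  # [[1, 0], [1, 1], [0, 0]]
```

An empty label list, as for "All good", maps to an all-zero vector.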

Aspect Based Classification:

Give a list of [aspect, label] pairs for each text in X:

 model = Model("albert-base-v2",
               task='class-label-classification',
               all_labels=["NEG", "NEU", "POS"],
              )
 
 X = ["The room was nice.",
      "The food was great, but the staff was unfriendly.",
      "The room was horrible, but the waiters were welcoming"]
 Y = [
      [["room", "POS"]],
      [["food", "POS"], ["staff", "NEG"]],
      [["room", "NEG"], ["staff", "POS"]],
     ]
     
 model.train(X, Y)
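
The nested target format is easy to get wrong by one bracket, so it can help to flatten it into one row per (text, aspect, label) for inspection. This is a minimal sketch in plain Python; flatten_aspect_targets is a hypothetical helper and does not reflect how NLPipes processes the data internally.

```python
from typing import List, Sequence, Tuple


def flatten_aspect_targets(
    X: Sequence[str], Y: Sequence[Sequence[Sequence[str]]]
) -> List[Tuple[str, str, str]]:
    """Expand per-text [aspect, label] pairs into (text, aspect, label) rows."""
    if len(X) != len(Y):
        raise ValueError(f"X has {len(X)} texts but Y has {len(Y)} target lists")
    rows = []
    for text, pairs in zip(X, Y):
        for aspect, label in pairs:  # fails if an entry is not a 2-item pair
            rows.append((text, aspect, label))
    return rows


X = ["The room was nice.",
     "The food was great, but the staff was unfriendly.",
     "The room was horrible, but the waiters were welcoming"]
Y = [
    [["room", "POS"]],
    [["food", "POS"], ["staff", "NEG"]],
    [["room", "NEG"], ["staff", "POS"]],
]

rows = flatten_aspect_targets(X, Y)
for row in rows:
    print(row)
```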

Examples

Here are some examples on open datasets that show how to use NLPipes on different tasks:

Name | Notebook | Description | Task | Size | Memory | Speed
GooglePlay Sentiment Detection | Available | Train a model to detect the sentiment polarity from the GooglePlay store | Single label classification | | |
StackOverflow Tags Detection | Available | Train a model to detect tags from StackOverflow questions | Multiple label classification | | |
Amazon Aspect Based Sentiment Detection | Available | Train a model to detect the aspect-based sentiment polarity of Amazon laptop reviews | Class label classification | | |

Notices

  • NLPipes is still at an early stage. The library comes with no warranty, and future releases could bring substantial API and behavior changes.
