Python wrapper for the webis Twitter sentiment identification tool

Python wrapper for the webis Twitter sentiment evaluation ensemble

This is a Python wrapper around the Java implementation of a Twitter sentiment evaluation framework presented by Hagen et al. (2015). The example script fetches Tweets from a PostgreSQL database, uses PyJnius to call the Java modules to evaluate the sentiment, and saves results to a table in the same database.

Dependencies

The script is written in Python 3 and depends on the Python modules PyJnius, pandas and emojientities.

On top of that, a Java Runtime Environment (JRE) is required, plus a matching Java Development Kit (JDK). We used Java 8, but other versions might work just as well. OpenJDK works fine.
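
To see which Java version is installed before trying the wrapper, a small check along the following lines may help. This is a minimal sketch and not part of python-webis: the helper names parse_java_major and installed_java_major are made up for illustration, and the parsing assumes the usual `version "…"` banner that `java -version` prints.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_292").
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

Calling installed_java_major() returns None if no `java` executable is found on the PATH, which is a hint that the JRE still needs to be installed.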

To install all dependencies on a Debian-based system, run:

apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv cython3 openjdk-8-jdk-headless openjdk-8-jre-headless ca-certificates-java

Installation

  • Using pip or similar:

    pip3 install webis

  • OR manually:

    • Clone this repository:

      git clone https://gitlab.com/christoph.fink/python-webis.git

    • Change to the cloned directory and use the Python setuptools to install the package:

      cd python-webis
      python3 ./setup.py install

  • OR (Arch Linux only) from the AUR:

    # e.g. using yaourt
    yaourt python-webis

Usage

First, make sure the environment variable JAVA_HOME is set and points to your Java installation. For instance, add the following line to ~/.bashrc:

export JAVA_HOME="$(readlink -f $(which javac) 2>/dev/null | sed "s:/bin/javac::")"
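
Whether JAVA_HOME really points at a JDK can be verified from Python before importing webis. The following is a rough sketch under the assumption that the JDK keeps its compiler at bin/javac below JAVA_HOME (the same layout the export line above relies on); find_javac is a hypothetical helper, not part of python-webis.

```python
import os
from pathlib import Path


def find_javac(environ=None):
    """Return the path to javac below JAVA_HOME, or None if it cannot be found."""
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        # JAVA_HOME is unset or empty.
        return None
    # Assumption: the compiler lives in JAVA_HOME/bin (javac.exe on Windows).
    for name in ("javac", "javac.exe"):
        candidate = Path(java_home) / "bin" / name
        if candidate.is_file():
            return candidate
    return None
```

If find_javac() returns None, JAVA_HOME is either unset or points to a directory without a Java compiler, for example a JRE-only installation.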

Import the webis module in a Python 3 script. On first run, python-webis will download and compile the Java backend – this might take a few minutes.

Then instantiate a webis.SentimentIdentifier object and call its identifySentiment() method, passing in either a list of tuples ([(tweetId, tweetText), (tweetId, tweetText), … ]) or a pandas.DataFrame (the first column is treated as the identifier, the second as the tweet text).

identifySentiment() returns a list of dicts ([{"tweetId": tweetId, "sentiment": sentiment}, … ]) or a data frame (first column id, second column sentiment) containing only the rows for which it could identify a sentiment.

import webis

sentimentIdentifier = webis.SentimentIdentifier()

tweets = [
    (1, "What a beautiful morning! There’s nothing better than cycling to work on a sunny day 🚲."),
    (2, "Argh, I hate it when you find seven (7!) cars blocking the bike lane on a five-mile commute")
]

sentimentIdentifier.identifySentiment(tweets)
# [(1, "positive"), (2, "negative")]

import pandas
tweets = pandas.DataFrame(tweets)
sentimentIdentifier.identifySentiment(tweets)
#   sentiment tweetId
# 0  positive       1
# 1  negative       2
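
Because only successfully classified rows are returned, it can be useful to match the results back to the input and see which tweets are missing. The sketch below works on the list-style return value only; since the description mentions dicts while the example output shows tuples, it accepts both shapes. The helper names sentiments_by_id and unclassified_ids are hypothetical and not part of python-webis.

```python
def sentiments_by_id(results):
    """Build a {tweetId: sentiment} lookup from list-style results."""
    lookup = {}
    for row in results:
        if isinstance(row, dict):
            # Shape described in the text: {"tweetId": ..., "sentiment": ...}
            lookup[row["tweetId"]] = row["sentiment"]
        else:
            # Shape shown in the example output: (tweetId, sentiment)
            tweet_id, sentiment = row
            lookup[tweet_id] = sentiment
    return lookup


def unclassified_ids(tweets, results):
    """Return the ids of input tweets that have no sentiment in the results."""
    found = sentiments_by_id(results)
    return [tweet_id for tweet_id, _text in tweets if tweet_id not in found]
```

With the tweets list from above, unclassified_ids(tweets, sentimentIdentifier.identifySentiment(tweets)) would give the ids that came back without a sentiment.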
