Python wrapper for the webis Twitter sentiment identification tool
Python wrapper for the webis Twitter sentiment evaluation ensemble
This is a Python wrapper around the Java implementation of a Twitter sentiment evaluation framework presented by Hagen et al. (2015). It uses PyJnius to call the Java modules to evaluate sentiment.
If you use python-webis for scientific research, please cite it in your publication:
Fink, C. (2019): python-webis: Python wrapper for the webis Twitter sentiment evaluation ensemble. doi:10.5281/zenodo.2547461.
Dependencies
The script is written in Python 3 and depends on the Python modules PyJnius, pandas and emojientities.
On top of that, a Java Runtime Environment (JRE) and a matching Java Development Kit (JDK) are required. We used Java 8; other versions might work just as well. OpenJDK works fine.
To install all dependencies on a Debian-based system, run:
apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv cython3 openjdk-8-jdk-headless openjdk-8-jre-headless ca-certificates-java
(There is also an Arch Linux AUR package that pulls in all dependencies; see further down.)
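Before installing, it can help to confirm that both a JRE and a JDK are actually reachable. The following is a minimal sketch that only looks for the `java` and `javac` executables on the PATH; the helper name `find_java_tools` is made up for this illustration and is not part of python-webis, and the check says nothing about the Java version found.

```python
import shutil


def find_java_tools():
    """Return the paths of `java` (JRE) and `javac` (JDK), or None if missing."""
    # shutil.which() returns the full path of an executable on PATH, else None.
    return {tool: shutil.which(tool) for tool in ("java", "javac")}


if __name__ == "__main__":
    for tool, path in find_java_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If `javac` is reported as missing while `java` is found, the system most likely has only a runtime installed and still needs the JDK package.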
Installation
- using pip or similar:
pip3 install webis
- OR: manually:
- Clone this repository
git clone https://gitlab.com/christoph.fink/python-webis.git
- Change to the cloned directory
- Use the Python setuptools to install the package:
cd python-webis
python3 ./setup.py install
- OR: (Arch Linux only) from AUR:
# e.g. using yay
yay python-webis
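Whichever installation route is used, one quick sanity check is to ask Python whether the package and its Python dependencies can be located. This is only a sketch: the import names below are assumptions (PyJnius is imported as `jnius`; the others are assumed to match their package names), and `missing_modules` is a hypothetical helper, not something python-webis provides.

```python
import importlib.util

# Assumed import names; PyJnius is imported as `jnius`.
MODULES = ("webis", "jnius", "pandas", "emojientities")


def missing_modules(names=MODULES):
    """Return the module names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("all modules found" if not missing else f"missing: {', '.join(missing)}")
```

Because `find_spec()` only locates a top-level module and does not import it, this check does not itself start the Java backend.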
Usage
Import the webis module. On first run, python-webis will download and compile the Java backend; this might take a few minutes.
Then instantiate a webis.SentimentIdentifier object and call its identifySentiment() method, passing in a list of tuples ([(tweetId, tweetText), (tweetId, tweetText), … ]), a dict ({tweetId: tweetText, … }) or a pandas.DataFrame (the first column is treated as the identifier, the second as the tweet text).
The method returns a list of tuples ([(tweetId, sentiment), … ]), a dict ({tweetId: sentiment, … }) or a data frame (first column id, second column sentiment) containing the rows for which it successfully identified a sentiment. The type of the return value matches the type of the argument with which the method is called. The tweetId values are cast to the type of the first row’s tweetId.
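The input and output conventions described above can be illustrated without the Java backend. The sketch below is not python-webis code: `identify_with_matching_type` and `toy_classifier` are hypothetical stand-ins that mimic the documented behaviour for lists and dicts (return type matches the argument, identifiers are cast to the type of the first row's identifier, rows without a sentiment are dropped). The data frame case is left out to keep the example dependency-free.

```python
def identify_with_matching_type(tweets, classify):
    """Apply `classify` to each tweet text and return the same container type.

    `classify` is a hypothetical stand-in for the Java backend: it takes a
    tweet text and returns a sentiment label, or None if it cannot decide.
    """
    is_dict = isinstance(tweets, dict)
    pairs = list(tweets.items()) if is_dict else list(tweets)
    if not pairs:
        return {} if is_dict else []

    # Cast every identifier to the type of the first row's identifier.
    id_type = type(pairs[0][0])
    results = []
    for tweet_id, text in pairs:
        sentiment = classify(text)
        if sentiment is not None:  # keep only rows with an identified sentiment
            results.append((id_type(tweet_id), sentiment))

    return dict(results) if is_dict else results


def toy_classifier(text):
    """Toy keyword rule, for illustration only (not the webis ensemble)."""
    lowered = text.lower()
    if "beautiful" in lowered:
        return "positive"
    if "hate" in lowered:
        return "negative"
    return None
```

With this stand-in, a list such as `[(1, "What a beautiful morning!"), ("2", "I hate traffic")]` comes back as a list of tuples with both identifiers as `int`, and a dict comes back as a dict.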
By default, messages from the Java classes (written to System.out and System.err) are suppressed. To print all messages, pass the keyword argument suppressJavaMessages=False to the constructor of webis.SentimentIdentifier or to the shorthand function webis.identifySentiment (see further down).
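As a small illustration of the keyword argument, the sketch below wraps the constructor in a helper with a more conventional `verbose` switch. The helper `make_identifier` is hypothetical; only `webis.SentimentIdentifier` and `suppressJavaMessages` come from the description above. The import is placed inside the function so that defining the helper does not itself require the package or trigger any backend setup.

```python
def make_identifier(verbose=False):
    """Create a SentimentIdentifier; with verbose=True, Java output is printed."""
    # Imported lazily: the first use of python-webis may download and
    # compile the Java backend, which can take a few minutes.
    import webis

    return webis.SentimentIdentifier(suppressJavaMessages=not verbose)
```

Calling `make_identifier(verbose=True)` can be handy when the backend misbehaves, since the messages written to System.out and System.err then become visible.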
import webis

sentimentIdentifier = webis.SentimentIdentifier()

# list of tuples
tweets = [
    (1, "What a beautiful morning! There’s nothing better than cycling to work on a sunny day 🚲."),
    (2, "Argh, I hate it when you find seven (7!) cars blocking the bike lane on a five-mile commute")
]
tweets = sentimentIdentifier.identifySentiment(tweets)
# [(1, "positive"), (2, "negative")]

# pandas DataFrame
import pandas
tweets = pandas.DataFrame([
    (1, "What a beautiful morning! There’s nothing better than cycling to work on a sunny day 🚲."),
    (2, "Argh, I hate it when you find seven (7!) cars blocking the bike lane on a five-mile commute")
])
tweets = sentimentIdentifier.identifySentiment(tweets)
#   sentiment  tweetId
# 0  positive        1
# 1  negative        2

# dict
tweets = {
    1: "What a beautiful morning! There’s nothing better than cycling to work on a sunny day 🚲.",
    2: "Argh, I hate it when you find seven (7!) cars blocking the bike lane on a five-mile commute"
}
tweets = sentimentIdentifier.identifySentiment(tweets)
# { 1: "positive", 2: "negative" }
webis.SentimentIdentifier can act as a context manager:
with webis.SentimentIdentifier() as s:
    tweets = s.identifySentiment(tweets)
webis.identifySentiment() is a shorthand for initialising a SentimentIdentifier object and calling its identifySentiment() method:
tweets = webis.identifySentiment(tweets)
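Once sentiments have been identified, a common next step is to summarise them. The sketch below counts labels in a result shaped like the list or dict return values described earlier; `sentiment_counts` is a hypothetical helper, and the sample data simply repeats the example results shown above rather than calling the backend.

```python
from collections import Counter


def sentiment_counts(results):
    """Count sentiment labels in a list of (tweetId, sentiment) tuples or a dict."""
    if isinstance(results, dict):
        labels = results.values()
    else:
        labels = (sentiment for _, sentiment in results)
    return Counter(labels)


# Sample data copied from the example results above.
counts = sentiment_counts([(1, "positive"), (2, "negative")])
print(counts["positive"], counts["negative"])
```

A `Counter` returns 0 for labels that never occur, so looking up a label that is absent from the results does not raise an error.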