Target Dependent Sentiment Analysis (TDSA) framework.

Development Status
- 3 - Alpha
License
- OSI Approved :: MIT License
Natural Language
- English
Programming Language
- Python :: 3.6
Topic
- Text Processing
- Text Processing :: Linguistic

Project description

Bella

Target Dependent Sentiment Analysis (TDSA) framework.

Requirements and Installation

Python 3.6
pip install bella-tdsa
Install Docker
Start Stanford CoreNLP server: docker run -p 9000:9000 -d --rm mooreap/corenlp
Start the TweeboParser API server: docker run -p 8000:8000 -d --rm mooreap/tweeboparserdocker
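
Once both containers are started, a quick way to confirm that they accept connections is to probe the two published ports. The sketch below is not part of Bella. It only checks that something is listening on localhost ports 9000 and 8000 (the ports from the commands above), and the helper name `port_is_open` is made up for this illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the docker run commands above.
    servers = {"Stanford CoreNLP": 9000, "TweeboParser": 8000}
    for name, port in servers.items():
        status = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

An open port only shows that the container is up; it does not prove that the server inside has finished loading its models.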

To stop the docker servers running:

Find the name assigned to the running docker container using: docker ps
Then stop the relevant docker container: docker stop name_of_container

NOTE: Both of these servers will run with as many threads as your machine has CPUs. To limit this, do the following:

For Stanford CoreNLP: docker run -p 9000:9000 -d --rm mooreap/corenlp -threads 6 will run it with 6 threads
For TweeboParser: docker run -p 8000:8000 -d --rm mooreap/tweeboparserdocker --threads 6 will run it with 6 threads
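
Because the two images spell the thread flag differently (-threads for CoreNLP, --threads for TweeboParser), it can help to generate the commands rather than type them. The following is a minimal sketch, not part of Bella: the function name `docker_run_command`, the short server keys and the "half of the CPUs" policy are assumptions for illustration, while the image names, ports and flags are copied from the commands above.

```python
import os
from typing import List, Optional

# Image names, ports and thread flags copied from the commands above.
SERVERS = {
    "corenlp": {"port": 9000, "image": "mooreap/corenlp", "flag": "-threads"},
    "tweebo": {"port": 8000, "image": "mooreap/tweeboparserdocker", "flag": "--threads"},
}


def docker_run_command(server: str, threads: Optional[int] = None) -> List[str]:
    """Build the docker run argument list for one server.

    With threads=None no flag is added, so the server keeps its default
    of one thread per CPU.
    """
    spec = SERVERS[server]
    port = spec["port"]
    command = ["docker", "run", "-p", f"{port}:{port}", "-d", "--rm", spec["image"]]
    if threads is not None:
        if threads < 1:
            raise ValueError("threads must be at least 1")
        command += [spec["flag"], str(threads)]
    return command


if __name__ == "__main__":
    # Example policy (an assumption): leave half of the CPUs free.
    limit = max(1, (os.cpu_count() or 2) // 2)
    for name in SERVERS:
        print(" ".join(docker_run_command(name, limit)))
```

The returned list can be passed directly to `subprocess.run` if you prefer to start the containers from Python.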

Dataset

All of the datasets must be downloaded, as they are not stored in this repository. We recommend using the config file to state where the datasets are stored, as we did, but this is not a requirement as you can state where they are stored explicitly in the code. For more details on the datasets and downloading them see the dataset notebook. The datasets used:

SemEval 2014 Restaurant dataset. We used Train dataset version 2 and the test dataset.
SemEval 2014 Laptop dataset. We used Train dataset version 2 and the test dataset.
Election dataset
Dong et al. Twitter dataset
Youtubean dataset by Marrese-Taylor et al.
Mitchell dataset which was released with this paper.

NOTE: Before using the Mitchell and YouTuBean datasets, please go through their pre-processing notebooks (Mitchell, YouTuBean), which split the data and, in the Mitchell case, describe which train/test split to use.
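
To illustrate the idea of keeping dataset locations in one place, here is a minimal sketch. It does not use Bella's own config reader or its config file format: the JSON layout, the dataset key names and the `load_dataset_paths` helper are all assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Dict


def load_dataset_paths(config_file: Path) -> Dict[str, Path]:
    """Read a JSON mapping of dataset name -> file path and check each file exists.

    Hypothetical helper: the JSON schema here is not Bella's config format.
    """
    with config_file.open(encoding="utf-8") as handle:
        raw = json.load(handle)
    paths = {name: Path(location).expanduser() for name, location in raw.items()}
    missing = [name for name, path in paths.items() if not path.exists()]
    if missing:
        raise FileNotFoundError(
            f"Datasets not found on disk: {', '.join(sorted(missing))}"
        )
    return paths
```

Failing early on a missing file is a deliberate choice here, since a wrong path is easier to diagnose before any model training has started.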

Lexicons

These lexicons must be downloaded if you use any methods that require them. Please see the config file for storing the location of the lexicons:

MPQA can be found here
NRC here
Hu and Liu here

Word Vectors

All the word vectors are automatically downloaded for you and stored in a directory called '.Bella', which is created in your user directory, e.g. on Linux that would be ~/.Bella/. The word vectors included in this repository are the following:

SSWE
Word Vectors trained on sentences that contain emojis
GloVe Common Crawl
GloVe Twitter
GloVe Wiki Giga
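
Since the downloads can be large, it may be useful to see what is already cached. The sketch below only relies on the '.Bella' directory location described above; the function names are made up for illustration, and nothing is assumed about how Bella lays out files inside that directory.

```python
from pathlib import Path
from typing import List


def bella_cache_dir() -> Path:
    """Directory the text above says the word vectors are downloaded to."""
    return Path.home() / ".Bella"


def cached_files(cache_dir: Path) -> List[Path]:
    """List every file under cache_dir, or an empty list if it does not exist yet."""
    if not cache_dir.is_dir():
        return []
    return sorted(path for path in cache_dir.rglob("*") if path.is_file())


if __name__ == "__main__":
    cache = bella_cache_dir()
    print(f"Cache directory: {cache}")
    for path in cached_files(cache):
        size_mb = path.stat().st_size / 1e6
        print(f"{size_mb:10.1f} MB  {path.relative_to(cache)}")
```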

Model Zoo

The model zoo can be found in the "model zoo" folder.

The notebooks

Can be found here

The best order to look at the notebooks is to first look at the data with this notebook, then look at the notebook that describes how to load and use the saved models from the model zoo. After that, explore the rest if you would like:

The Mass evaluation notebooks are the following:
- Mass Evaluation - TDParse for the TDParse models
- Mass Evaluation - Target Dependent for the Target Dependent models
- Mass Evaluation LSTM for the LSTM models
None of these contain any analysis; they just demonstrate how we gathered the results. They also create the model zoo. For the analysis of the Mass evaluations see the Mass Evaluation Result Analysis notebook.
For the analysis of the reproduction of the Target Dependent model of Vo and Zhang see this notebook
For the analysis of the reproduction of the TDParse model of Wang et al. see this notebook
For the analysis of the reproduction of the LSTM models of Tang et al. see this notebook
For the statistics of the datasets and where to find them see this notebook
For the code on creating training and test splits for the YouTuBean dataset see this notebook
For the code on creating training and test splits for Mitchell et al. dataset see this notebook

Release history

- 0.3.29: Apr 13, 2019
- 0.3.28: Feb 11, 2019
- 0.3.27: Feb 9, 2019
- 0.3.26: Feb 6, 2019
- 0.3.25: Feb 6, 2019
- 0.3.24: Feb 6, 2019
- 0.3.23: Feb 6, 2019
- 0.3.22: Feb 5, 2019
- 0.3.21: Feb 5, 2019
- 0.3.3: Feb 5, 2019
- 0.3.2: Jan 31, 2019
- 0.3.1: Jan 29, 2019
- 0.3.0: Jan 28, 2019
- 0.2.9: Jan 28, 2019
- 0.2.8: Jan 24, 2019
- 0.2.7: Jan 22, 2019
- 0.2.6: Nov 29, 2018
- 0.2.5: Oct 26, 2018
- 0.2.4: Oct 25, 2018
- 0.2.3: Oct 6, 2018
- 0.2.2: Sep 11, 2018
- 0.2.1: Sep 7, 2018
- 0.2.0: Aug 17, 2018
- 0.1.0: Jun 11, 2018 (this version)

Download files

Source Distribution

bella_tdsa-0.1.0.tar.gz (65.5 kB), uploaded Jun 11, 2018, Source

Built Distribution

bella_tdsa-0.1.0-py3-none-any.whl (76.4 kB), uploaded Jun 11, 2018, Python 3

Hashes for bella_tdsa-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`48a16c002618a18ea79dd2f8ba971c43c90c2a58c080476f7761643fc00a1306`
MD5	`9469f067e66bc437bbf2312a8eccd8c9`
BLAKE2b-256	`36969f0a5d0e1879cd57835ddea2bee03473368909f589ecc6ea64ad64b2e14d`

Hashes for bella_tdsa-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aef075e4f3aa005a6524fe1e16b6de5483754100186071ea5498027911969dac`
MD5	`1b3468abba75d7653a0afd178111ee1c`
BLAKE2b-256	`ee873ac67657e9becfb39bb55b33b145afbfccb95ffd9fc83fac06628e324ff7`
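
The digests above can be used to check a downloaded file before installing it. The sketch below uses only the Python standard library; the helper name `sha256_of_file` and the assumption that the archive sits in the current directory are illustrative, while the expected digest is the SHA256 value listed above for bella_tdsa-0.1.0.tar.gz.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Digest copied from the table above; the local path is an assumption.
    expected = "48a16c002618a18ea79dd2f8ba971c43c90c2a58c080476f7761643fc00a1306"
    download = Path("bella_tdsa-0.1.0.tar.gz")
    if download.exists():
        print("OK" if sha256_of_file(download) == expected else "MISMATCH")
    else:
        print(f"{download} not found")
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 65.5 kB archive but makes the helper reusable for larger downloads.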