MTSB (Movie Tweet Sentiment Boxoffice) is a Python module that collects tweets about movies, runs sentiment analysis on them, and correlates the results with each movie's box-office takings over the 7 days after its release.
- Collects tweets about movies
- Creates hashtags for each movie
- Performs sentiment analysis on those tweets using Google's API or TextBlob and returns the average score and the average magnitude
- Gets box-office data from Box Office Mojo
- Correlates the sentiment analysis results with the box-office data
- Python >= 3.5 (it might work on older versions, but this has not been tested)
- The package has only been tested on Linux, with the following Docker Compose environment: https://gitlab.com/aletundo/data-management-lab
- All module dependencies are installed automatically with the package, but you will also need:
- A correctly set-up nltk module: https://www.nltk.org/install.html
- To have run "nltk.download()" at least once
- API keys for tweet collection: https://developer.twitter.com/en.html
- If you plan on using Google's API, API keys for the Google Natural Language service: https://cloud.google.com/natural-language/docs/setup
- You also need to have the following services installed (tested on Linux):
To install MTSB, run:
pip install mtsb
Collects tweets about movies. You can choose between movies released in 2019 and movies releasing in 2020. The function then builds a list of hashtags from each movie's name and top actors and uses it to collect tweets from Twitter.
    import mtsb
    mtsb.tweet_collector()
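How MTSB derives its hashtags is not documented here, so the following is only a minimal sketch of the idea of turning a movie title and its top actors into hashtags. The helper name `make_hashtags` and the extra "Movie" suffix are assumptions made for illustration, not part of the mtsb API.

```python
import re


def make_hashtags(title, actors):
    """Build candidate hashtags from a movie title and its top actors.

    Hypothetical helper: the rules mtsb actually applies may differ.
    """
    def squash(text):
        # Keep only ASCII letters and digits, so "Knives Out" -> "KnivesOut".
        return re.sub(r"[^0-9A-Za-z]", "", text)

    tags = []
    # Assumed candidates: the title, the title plus "Movie", each actor name.
    for name in [title, title + " Movie", *actors]:
        tag = "#" + squash(name)
        # Skip empty tags and duplicates while preserving order.
        if len(tag) > 1 and tag not in tags:
            tags.append(tag)
    return tags


print(make_hashtags("Knives Out", ["Daniel Craig", "Ana de Armas"]))
```

A list like this could then be passed as search terms to whatever Twitter client is in use.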
Performs sentiment analysis on the collected tweets using Google's API or TextBlob and returns the average score, the average magnitude, their standard deviations, and the percentage of positive tweets.
    import mtsb
    mtsb.sentiment()
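To make the returned statistics concrete, here is a rough sketch of how per-tweet results could be aggregated once each tweet has a score and a magnitude. The function `summarize_sentiment`, its dictionary keys, the use of the population standard deviation, and counting a score of 0 as positive are all assumptions for this example; mtsb's internals may differ.

```python
from statistics import mean, pstdev


def summarize_sentiment(results):
    """Aggregate per-tweet (score, magnitude) pairs into summary statistics.

    Hypothetical helper, not part of mtsb.
    """
    if not results:
        raise ValueError("need at least one analysed tweet")
    scores = [score for score, _ in results]
    magnitudes = [magnitude for _, magnitude in results]
    # Assumption: a tweet with score >= 0 counts as positive.
    positive = sum(1 for score in scores if score >= 0)
    return {
        "score_mean": mean(scores),
        "score_std": pstdev(scores),  # population standard deviation (assumed)
        "magnitude_mean": mean(magnitudes),
        "magnitude_std": pstdev(magnitudes),
        "positive_pct": 100.0 * positive / len(scores),
    }


# Made-up values, for illustration only.
summary = summarize_sentiment([(0.5, 0.5), (-0.5, 0.5), (0.0, 0.0), (1.0, 1.0)])
print(summary["score_mean"], summary["positive_pct"])
```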
Creates a dataframe with the following info for each movie:
- Movie title and genres
- Mean and standard deviation of the tweets' scores and magnitudes
- Percentage of tweets labelled positive and negative (a tweet with score == 0 is labelled positive)
- Sum of the box-office takings over the 7 days after the movie's release
    import mtsb
    mtsb.sentiment_boxoffice_all()
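The 7-day box-office sum can be pictured with a small sketch like the one below. The function `first_week_gross` is hypothetical, and the choice to start the window on the release day itself is an assumption, since the exact window mtsb uses is not spelled out here.

```python
from datetime import date, timedelta


def first_week_gross(daily_gross, release_date, days=7):
    """Sum daily grosses over `days` consecutive days starting at release_date.

    `daily_gross` maps a date to that day's gross; missing days count as 0.
    Hypothetical helper, not part of mtsb.
    """
    window = (release_date + timedelta(days=offset) for offset in range(days))
    return sum(daily_gross.get(day, 0) for day in window)


# Made-up figures, for illustration only: 100, 200, ... over ten days.
release = date(2020, 1, 1)
daily = {release + timedelta(days=i): 100 * (i + 1) for i in range(10)}
print(first_week_gross(daily, release))
```

Only the first seven entries fall inside the window, so the later days are ignored.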
Performs a Spearman correlation using the dataframe returned by sentiment_boxoffice_all().
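For readers unfamiliar with the measure, Spearman's coefficient is the Pearson correlation of the ranks of the two variables. The sketch below spells that out with the standard library only; the function names are made up for this example, and it is not how mtsb computes the value. In practice, `scipy.stats.spearmanr` or pandas' `DataFrame.corr(method="spearman")` would do the same job.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up values: sentiment rises with box office, but not linearly.
print(spearman([0.1, 0.2, 0.3, 0.4], [10, 20, 30, 1000]))
```

Because only the ordering matters, the outlier in the second list does not lower the coefficient, which is one reason to prefer Spearman over Pearson for skewed box-office figures.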
Useful Python libraries used:
MIT licensed. See the bundled LICENSE file for more details.