Real Time Tweets Analysis.
Real-Time Tweets Sentiment Analysis Package
Overview
Retrieves real-time tweets using the Twitter API, Apache Kafka, and Apache Spark Streaming, then uses a TensorFlow deep learning model to classify each tweet as positive, negative, or neutral. Everything is bundled in a PyPI package.
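As a minimal sketch of the last step in that pipeline, the snippet below turns a three-way score vector into a sentiment label. The label order and the function name `label_from_scores` are assumptions made for illustration; the package's actual output encoding may differ.

```python
from typing import Sequence

# Hypothetical label order; the real model's class indices may differ.
LABELS = ("negative", "neutral", "positive")


def label_from_scores(scores: Sequence[float]) -> str:
    """Return the label whose score is highest (a plain argmax)."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best_index]


print(label_from_scores([0.1, 0.2, 0.7]))
```

With the assumed ordering, the highest score in the example sits in the last position, so the tweet would be labelled positive.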
TweetsAnalysis
The streamer and model package, available on PyPI as TweetsAnalysis.
Package Requirements
- gensim
- pandas
- pyspark
- kafka-python
- streamlit
- scikit-learn
- seaborn
- tensorflow
- tweepy==3.9.0
- pydantic
- strictyaml
- joblib
Model
The model architecture:
The model reaches about 85.5% accuracy on the training set and 84.4% accuracy on the test set, which contains 160000 tweets. The gap of roughly 1.1 percentage points suggests little over-fitting.
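The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers reported above; the variable names are illustrative, and the accuracies are approximate as stated.

```python
# Figures reported above (approximate).
TRAIN_ACCURACY = 0.855
TEST_ACCURACY = 0.844
TEST_SET_SIZE = 160_000

# Difference between train and test accuracy, in percentage points.
gap_points = (TRAIN_ACCURACY - TEST_ACCURACY) * 100

# Approximate number of test tweets classified correctly and incorrectly.
correct = round(TEST_ACCURACY * TEST_SET_SIZE)
misclassified = TEST_SET_SIZE - correct

print(f"train/test gap: {gap_points:.1f} percentage points")
print(f"correct: {correct}, misclassified: {misclassified}")
```

A gap this small is consistent with the claim of little over-fitting, although it still leaves roughly 25000 of the 160000 test tweets misclassified.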
Run
First, install the package and its requirements with:
pip install TweetsAnalysis
To train the model, first specify the model and data directories in the config file, then run:
python train_model.py
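Before starting a training run, it can help to confirm that the configured directories exist. The following is a sketch only: the key names `model_dir` and `data_dir` and the helper `missing_directories` are hypothetical and are not taken from the package's config file.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical key names; check the package's config file for the real ones.
REQUIRED_KEYS = ("model_dir", "data_dir")


def missing_directories(config: Dict[str, str]) -> List[str]:
    """Return the required keys that are unset or do not point to a directory."""
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if not value or not Path(value).is_dir():
            missing.append(key)
    return missing
```

An empty list means both directories were found; any returned key names the setting that needs fixing before running the training script.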
Streaming
Start ZooKeeper and then Kafka with:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
Then create a Kafka topic (tweets_stream) with:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tweets_stream
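Once the topic exists, messages can be published to it from Python. The sketch below uses kafka-python's `KafkaProducer`; the broker address `localhost:9092`, the JSON payload shape, and the helper names `encode_tweet` and `send_tweets` are assumptions for illustration, not the package's own streamer.

```python
import json
from typing import Iterable

TOPIC = "tweets_stream"               # the topic created above
BOOTSTRAP_SERVERS = "localhost:9092"  # assumption: Kafka's default listener


def encode_tweet(text: str) -> bytes:
    """Serialize a tweet's text as UTF-8 encoded JSON (hypothetical payload shape)."""
    return json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")


def send_tweets(texts: Iterable[str]) -> None:
    """Publish each text to the topic; needs kafka-python and a running broker."""
    # Imported here so encode_tweet stays usable without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=BOOTSTRAP_SERVERS,
        value_serializer=encode_tweet,
    )
    for text in texts:
        producer.send(TOPIC, value=text)
    producer.flush()  # block until buffered messages have been sent
    producer.close()


print(encode_tweet("Kafka is up and running!"))
```

With ZooKeeper and Kafka running as described above, calling `send_tweets(["some tweet text"])` would publish one message to tweets_stream.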
Hashes for TweetsAnalysis-1.1.8-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 62f1fbc34d64967055fad4d9cd11ed3ad4b68e3b86952f9720d42188752f736e
MD5 | 40b79784bb1f382870500caedd62aec5
BLAKE2b-256 | bf3a916e9f6a58a574ab9f8d061518571ba259011037aead55fd69d20d5e6269