Real Time Tweets Analysis.
Project description
Real-Time Tweets Sentiment Analysis Package
Overview
Retrieves real-time tweets using the Twitter API, Apache Kafka, and Apache Spark Streaming, then uses a TensorFlow deep learning model to classify each tweet as positive, negative, or neutral; all in a PyPI package.
TweetsAnalysis
The streamer and model package, available on PyPI as TweetsAnalysis.
Package Requirements
- gensim
- pandas
- pyspark
- kafka-python
- streamlit
- scikit-learn
- seaborn
- tensorflow
- tweepy==3.9.0
- pydantic
- strictyaml
- joblib
Model
The model architecture:
The model reaches about 85.5% accuracy on the training set and 84.4% accuracy on the test set, which contains 160000 tweets; the gap of about one percentage point suggests no significant over-fitting.
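As a quick sanity check on the figures above, the following minimal sketch computes the train/test accuracy gap and the approximate number of correctly classified test tweets. The variable names are illustrative, and the inputs are the rounded values quoted above, so the results are approximate.

```python
# Rounded figures quoted in the Model section.
train_acc = 0.855
test_acc = 0.844
test_size = 160_000

# Difference between training and test accuracy, as a fraction.
gap = train_acc - test_acc

# Approximate number of test tweets classified correctly.
correct = round(test_acc * test_size)

print(f"accuracy gap: {gap * 100:.1f} percentage points")
print(f"correctly classified test tweets: about {correct}")
```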
Run
First we need to install the requirements with:
pip install TweetsAnalysis
To train the model, first specify the model and data directories in the config file, then run:
python train_model.py
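The config format is not shown here, so the following is only a sketch of the kind of check worth doing before training: confirm that the data directory exists and create the model directory if it is missing. The names TrainPaths, check_paths, model_dir, and data_dir are hypothetical and are not taken from the package.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainPaths:
    """Hypothetical container for the two directories the training step needs."""
    model_dir: Path
    data_dir: Path


def check_paths(model_dir, data_dir) -> TrainPaths:
    """Validate the data directory and make sure the model directory exists."""
    data = Path(data_dir)
    if not data.is_dir():
        # Training cannot start without input data, so fail early.
        raise FileNotFoundError(f"data directory not found: {data}")
    model = Path(model_dir)
    # The model directory is an output location, so create it if needed.
    model.mkdir(parents=True, exist_ok=True)
    return TrainPaths(model_dir=model, data_dir=data)
```

Calling check_paths with the two directories from the config file before starting a long training run surfaces a mistyped path immediately rather than partway through.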
Streaming
Start ZooKeeper and then Kafka, each in its own terminal, from the Kafka installation directory:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
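Then create a Kafka topic (tweets_stream) with the command below. On Kafka 3.0 and later the --zookeeper option has been removed; use --bootstrap-server localhost:9092 in its place.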
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tweets_stream
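Once the topic exists, messages can be published to it from Python. The sketch below uses kafka-python (listed in the requirements) and assumes a broker listening on localhost:9092, the default in config/server.properties. The JSON message shape and the helper names encode_tweet and send_tweets are assumptions made for illustration, not the package's actual streamer.

```python
import json

TOPIC = "tweets_stream"  # topic name created in the step above


def encode_tweet(text: str) -> bytes:
    """Serialize a tweet's text as UTF-8 JSON (assumed message format)."""
    return json.dumps({"text": text}).encode("utf-8")


def send_tweets(texts, bootstrap_servers="localhost:9092"):
    """Publish each text to the topic; requires a running Kafka broker."""
    # Imported here so the helper above can be used without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=encode_tweet,
    )
    for text in texts:
        producer.send(TOPIC, text)
    # Block until all buffered messages have been delivered, then clean up.
    producer.flush()
    producer.close()
```

With the broker running, send_tweets(["great day", "terrible service"]) would place two JSON messages on tweets_stream for a downstream consumer to read.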
Hashes for TweetsAnalysis-1.0.5-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9ece59c31129d998d57140323a0bc990b92fab86800530cefb7796d173c95432
MD5 | 1fa9422acf8c92d796bfbbe291755a6e
BLAKE2b-256 | 4dfebf352994bf58504398744ed6e598b09e068854698bc19e97945e1a413300