Real Time Tweets Analysis.
Project description
Real-Time Tweets Sentiment Analysis Package
Overview
Retrieves real-time tweets using the Twitter API, Apache Kafka, and Apache Spark Streaming, then uses a TensorFlow deep learning model to classify each tweet as positive, negative, or neutral, all in a PyPI package.
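As a rough illustration of the final classification step, the sketch below maps a model score to one of the three labels. It assumes the model emits a single score between 0 and 1 and that a middle band is treated as neutral. The function name `label_from_score` and the 0.4 and 0.6 thresholds are hypothetical and are not taken from the package.

```python
def label_from_score(score: float, low: float = 0.4, high: float = 0.6) -> str:
    """Map a score in [0, 1] to a sentiment label.

    Assumption: scores near 0 mean negative, scores near 1 mean positive,
    and anything between `low` and `high` (inclusive) counts as neutral.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < low:
        return "negative"
    if score > high:
        return "positive"
    return "neutral"
```

The width of the neutral band is a tuning choice; a narrower band labels more tweets as positive or negative.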
TweetsAnalysis
The streamer and model package, available on PyPI as TweetsAnalysis.
Package Requirements
- gensim
- pandas
- pyspark
- kafka-python
- streamlit
- scikit-learn
- seaborn
- tensorflow
- tweepy==3.9.0
- pydantic
- strictyaml
- joblib
Model
The model architecture:
The model reaches about 85.5% accuracy on the training set and 84.4% accuracy on the test set, which contains 160000 tweets. The gap of roughly one percentage point suggests little over-fitting.
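The figures above can be checked with a few lines of arithmetic. This is only a worked example using the reported numbers; the helper names are illustrative and not part of the package.

```python
TRAIN_ACCURACY = 0.855  # reported accuracy on the training set
TEST_ACCURACY = 0.844   # reported accuracy on the test set
TEST_SIZE = 160_000     # number of tweets in the test set


def accuracy_gap_points(train_acc: float, test_acc: float) -> float:
    """Return the train/test accuracy gap in percentage points."""
    return round((train_acc - test_acc) * 100, 1)


def correct_predictions(accuracy: float, n_samples: int) -> int:
    """Return the approximate number of correctly classified samples."""
    return round(accuracy * n_samples)


gap = accuracy_gap_points(TRAIN_ACCURACY, TEST_ACCURACY)
correct = correct_predictions(TEST_ACCURACY, TEST_SIZE)
print(f"gap: {gap} percentage points")
print(f"correct test tweets: about {correct} of {TEST_SIZE}")
```

A small gap between the two accuracies is the usual informal sign that the model generalizes about as well as it fits the training data.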
Run
First, install the package and its requirements with:
pip install TweetsAnalysis
To train the model, first specify the model and data directories in the config file, then run:
python train_model.py
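Before starting a training run, it can help to confirm that the two directories named in the config file are usable. The sketch below is a hypothetical pre-flight check and is not part of the package: the names `TrainingPaths` and `prepare_paths` are made up, and it assumes the two directory values have already been read from the config file.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainingPaths:
    """Hypothetical container for the two directories used in training."""
    model_dir: Path
    data_dir: Path


def prepare_paths(model_dir: str, data_dir: str) -> TrainingPaths:
    """Check that the data directory exists and create the model directory."""
    data = Path(data_dir)
    if not data.is_dir():
        raise FileNotFoundError(f"data directory not found: {data}")
    model = Path(model_dir)
    # The model directory is an output location, so create it if missing.
    model.mkdir(parents=True, exist_ok=True)
    return TrainingPaths(model_dir=model, data_dir=data)
```

Failing early on a missing data directory avoids discovering the problem partway through a long training run.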
Streaming
Start Kafka with:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
Then create a Kafka topic (tweets_stream) with:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tweets_stream
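Once the topic exists, messages can be published to it from Python. The sketch below uses `KafkaProducer` from kafka-python, which is listed in the package requirements. It is a minimal illustration and not the package's own streamer: the helper names and the `{"text": ...}` JSON payload are assumptions, and `localhost:9092` is assumed because it is the default broker address in `config/server.properties`.

```python
import json

TOPIC = "tweets_stream"  # topic created with kafka-topics.sh above


def encode_tweet(text: str) -> bytes:
    """Serialize a tweet as UTF-8 JSON (hypothetical payload shape)."""
    return json.dumps({"text": text}).encode("utf-8")


def publish_tweet(producer, text: str, topic: str = TOPIC) -> None:
    """Send one tweet to the topic using any object with a send() method."""
    producer.send(topic, encode_tweet(text))


def make_producer(bootstrap_servers: str = "localhost:9092"):
    """Create a kafka-python producer; requires a running broker."""
    from kafka import KafkaProducer  # imported here so the helpers work without it
    return KafkaProducer(bootstrap_servers=bootstrap_servers)
```

With a broker running, `producer = make_producer()` followed by `publish_tweet(producer, "some tweet text")` and `producer.flush()` would push one message to the topic. Passing the producer in as an argument keeps the serialization logic testable without a live Kafka instance.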
Hashes for TweetsAnalysis-1.1.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a02c2e7421809be1d80f98b9bb91629b334e65afc0ec7d05ee4f427553dad0aa
MD5 | 92fb9508cdcd468dbba2e89132d0239b
BLAKE2b-256 | 790233663f6559bcd23bec9144ac401d1d57f981efbfb56e2b9c079b3093b4d8