Real Time Tweets Analysis.
Real-Time Tweets Sentiment Analysis Package
Overview
Retrieves real-time tweets using the Twitter API, Apache Kafka, and Apache Spark Streaming, then uses a TensorFlow deep learning model to classify each tweet as positive, negative, or neutral; all in a PyPI package.
TweetsAnalysis
The streamer and model package, available on PyPI as TweetsAnalysis.
Package Requirements
- gensim
- pandas
- pyspark
- kafka-python
- streamlit
- scikit-learn
- seaborn
- tensorflow
- tweepy==3.9.0
- pydantic
- strictyaml
- joblib
Model
The model architecture:
The model reaches about 85.5% accuracy on the training set and 84.4% accuracy on the test set, which contains 160,000 tweets; the gap of roughly one percentage point suggests little over-fitting.
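The over-fitting argument above is simple arithmetic on the two reported accuracies. The sketch below only restates those figures; the function names are illustrative and are not part of the TweetsAnalysis package, and the count of correct predictions is approximate because the reported accuracies are rounded.

```python
# Figures reported in this README (rounded, so derived values are approximate).
TRAIN_ACC = 0.855
TEST_ACC = 0.844
TEST_SIZE = 160_000


def generalization_gap(train_acc: float, test_acc: float) -> float:
    """Return train-minus-test accuracy in percentage points."""
    return round((train_acc - test_acc) * 100, 2)


def approx_correct(acc: float, n: int) -> int:
    """Return the approximate number of correct predictions out of n."""
    return round(acc * n)


gap = generalization_gap(TRAIN_ACC, TEST_ACC)
correct = approx_correct(TEST_ACC, TEST_SIZE)
print(f"train/test gap: {gap} percentage points")
print(f"approx. correct test predictions: {correct} of {TEST_SIZE}")
```

A gap of about one percentage point on a test set of this size is the basis for the claim; a much larger gap would point toward over-fitting.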
Run
First, install the package and its requirements with:
pip install TweetsAnalysis
To train the model, first specify the model and data directories in the config file, then run:
python train_model.py
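Because training depends on the two configured directories, it can help to check them before launching train_model.py. The following is a minimal sketch under stated assumptions: the names `model_dir` and `data_dir` and the helper `missing_dirs` are hypothetical, and the package's actual config keys and format are not shown in this README.

```python
from pathlib import Path
from typing import List


def missing_dirs(model_dir: str, data_dir: str) -> List[str]:
    """Return one message per configured path that is not an existing directory.

    The parameter names are hypothetical; substitute the values from your config file.
    """
    problems = []
    for name, value in (("model_dir", model_dir), ("data_dir", data_dir)):
        if not Path(value).is_dir():
            problems.append(f"{name} is not an existing directory: {value}")
    return problems
```

An empty list means both paths exist; otherwise each message names the setting that needs fixing before training starts.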
Streaming
Start ZooKeeper and then Kafka, each in its own terminal from the Kafka installation directory:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
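Then create a Kafka topic (tweets_stream) with the command below. On Kafka 3.0 and later, where the --zookeeper option was removed, use --bootstrap-server localhost:9092 instead: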
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tweets_stream
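Once the topic exists, messages can be published to it with kafka-python, which is listed in the package requirements. This is a minimal sketch, not the package's own streamer: the broker address `localhost:9092` assumes Kafka's default listener, and the JSON payload shape and the helper names `encode_tweet` and `send_tweets` are assumptions made for illustration.

```python
import json
from typing import Iterable

TOPIC = "tweets_stream"            # topic created in the step above
BOOTSTRAP = "localhost:9092"       # assumption: default Kafka broker address


def encode_tweet(text: str) -> bytes:
    """Serialize a tweet's text as UTF-8 JSON (payload shape is an assumption)."""
    return json.dumps({"text": text}).encode("utf-8")


def send_tweets(texts: Iterable[str],
                bootstrap_servers: str = BOOTSTRAP,
                topic: str = TOPIC) -> None:
    """Publish each text to the topic; requires a running Kafka broker."""
    # Imported here so the encoding helper stays usable without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        for text in texts:
            producer.send(topic, value=encode_tweet(text))
        producer.flush()  # block until buffered messages are delivered
    finally:
        producer.close()
```

With the broker running, calling `send_tweets(["what a great day"])` should publish one message to tweets_stream, which a consumer on that topic can then read.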
Built Distribution
Hashes for TweetsAnalysis-1.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b02b21fb9dd54f031ecd5570f738a3cde60a789cf1fe70337a67e41a05d38b38
MD5 | f2ed3cdfb090388cfbbc1fff1b05088c
BLAKE2b-256 | 6e51901e6feb4fce62603bb7a4f3b13c95a710e215c29dc3758892ea1f68eb10