Real Time Tweets Analysis.
Real-Time Tweets Sentiment Analysis Package
Overview
This package retrieves real-time tweets using the Twitter API, Apache Kafka, and Apache Spark Streaming, then uses a TensorFlow deep learning model to classify each tweet as positive, negative, or neutral. Everything is bundled in a single PyPI package.
TweetsAnalysis
The streamer and model package, available on PyPI as TweetsAnalysis.
Package Requirements
- gensim
- pandas
- pyspark
- kafka-python
- streamlit
- scikit-learn
- seaborn
- tensorflow
- tweepy==3.9.0
- pydantic
- strictyaml
- joblib
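Since tweepy is the only requirement pinned to an exact version, it can be worth confirming which version is actually installed before running the streamer. The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical and not part of the package.

```python
from importlib import metadata

# Only the tweepy pin comes from the requirements list above.
PINNED = {"tweepy": "3.9.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package matches; any entry shows the expected version next to the one found.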
Model
The model architecture:
The model reaches about 85.5% accuracy on the training set and 84.4% accuracy on the test set, which has 160,000 tweets. The gap of roughly one percentage point suggests the model is not noticeably over-fitting.
Run
First, install the package and its requirements with:
pip install TweetsAnalysis
To train the model, first specify the model and data directories in the config file, then run:
python train_model.py
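Because training depends on the two directories set in the config file, a quick check before launching can save a failed run. The sketch below is illustrative only: the `TrainingPaths` class, its field names `model_dir` and `data_dir`, and `validate_paths` are hypothetical and do not reflect the package's actual config keys.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainingPaths:
    # Hypothetical field names; the package's real config keys may differ.
    model_dir: Path
    data_dir: Path


def validate_paths(paths):
    """Return a list of problems; an empty list means the paths look usable."""
    problems = []
    # The training data must already exist.
    if not paths.data_dir.is_dir():
        problems.append(f"data directory not found: {paths.data_dir}")
    # The model directory may not exist yet, but its parent must,
    # so that the directory can be created when the model is saved.
    if not paths.model_dir.exists() and not paths.model_dir.parent.is_dir():
        problems.append(f"cannot create model directory: {paths.model_dir}")
    return problems
```

For example, `validate_paths(TrainingPaths(Path("models"), Path("data")))` returns a message for each directory that is missing or cannot be created.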
Streaming
Start ZooKeeper and then Kafka, each in its own terminal, with:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
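Then create a Kafka topic (tweets_stream) with the command below. The --zookeeper option applies to older Kafka releases; Kafka 3.0 and later removed it in favor of --bootstrap-server localhost:9092.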
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic tweets_stream
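Once the topic exists, a short smoke test can confirm that messages reach it. The following is a rough sketch using kafka-python from the requirements list, not the package's own streamer. The JSON message layout, the helper names `encode_tweet` and `send_tweets`, and the broker address `localhost:9092` (Kafka's default listener) are assumptions.

```python
import json

TOPIC = "tweets_stream"       # topic name from the command above
BOOTSTRAP = "localhost:9092"  # assumption: Kafka's default listener address


def encode_tweet(text, tweet_id=None):
    """Serialize one tweet to UTF-8 JSON bytes for a Kafka message value.

    The {"id", "text"} layout is an assumed format for illustration.
    """
    payload = {"id": tweet_id, "text": text}
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def send_tweets(texts, topic=TOPIC, bootstrap_servers=BOOTSTRAP):
    """Publish each text to the topic; requires a running broker."""
    # Imported here so encode_tweet stays usable without kafka-python.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        for text in texts:
            producer.send(topic, value=encode_tweet(text))
        producer.flush()  # block until buffered messages are sent
    finally:
        producer.close()
```

With ZooKeeper and Kafka running, calling `send_tweets(["a test tweet"])` should publish one message that a consumer subscribed to tweets_stream can read.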
Built Distribution
Hashes for TweetsAnalysis-1.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 34998c1d3194f13ba3e6df54fed8885033c41ace09771a252fc5ccc923f63a2a
MD5 | 5fd254cda7f7d09dcbe2393b0f60252e
BLAKE2b-256 | c2d50fa42c65c9214fff09ef8243dabc13ef1a2061fb9ce8f1acaa025874b175