Twitter tools
This project contains different tools to help search and analyze in twitter.
Analyzing tweets often means repeating the same pipeline across projects. This toolkit compiles and wraps several tools to streamline Twitter analysis. First, it provides search utilities, either by Twitter query or by identifier. It then integrates models to infer a user's age, gender, and whether the account belongs to a person or an organisation; this inference is not perfect and should be used with caution, but it can give an overview of the types of users analyzed. There is also user-location inference for Spanish locations, based on the location text or description in the user's Twitter profile. For text analysis, it provides a topic-analysis pipeline using the LDA algorithm, as well as sentiment analysis. Finally, it provides functions to build tweet and user networks for network analysis.
Twitter Search
Credentials
To run this you need to provide your Twitter API credentials in the form of a YAML file.
For example:
search_tweets_api:
  endpoint: https://api.twitter.com/2/tweets/search/all
  consumer_key: XXXXXXXXXXXX
  consumer_secret: XXXXXXXXXXXXXXXX
  bearer_token: XXXXXXXXXXXXXXXXX
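The toolkit reads this file itself, so nothing beyond writing it is required. Purely as a convenience, here is a stdlib-only sketch that writes such a credentials file from a Python dict; the key names follow the example above, and the placeholder values must of course be replaced with your own keys:

```python
from pathlib import Path

# Placeholder credentials -- substitute your own API keys.
creds = {
    "endpoint": "https://api.twitter.com/2/tweets/search/all",
    "consumer_key": "XXXXXXXXXXXX",
    "consumer_secret": "XXXXXXXXXXXXXXXX",
    "bearer_token": "XXXXXXXXXXXXXXXXX",
}

def write_credentials(path, creds):
    """Write the credentials file in the YAML layout shown above."""
    lines = ["search_tweets_api:"]
    lines += [f"  {key}: {value}" for key, value in creds.items()]
    Path(path).write_text("\n".join(lines) + "\n")

write_credentials("twitter_creds.yaml", creds)
```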
Searching tweets
You can query tweets with search_tweets_by_query.
For more detail on the parameters, take a look at the Twitter API documentation.
from twitter_tools.search_tools import TweetSearchUtil
tsu = TweetSearchUtil('path_to_yaml_creds')
tweets = tsu.search_tweets_by_query(
"python OR #python"
,tweet_fields="author_id,conversation_id,created_at,id,in_reply_to_user_id,lang,public_metrics,text"
)
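Assuming search_tweets_by_query returns a list of dict-like tweets carrying the requested tweet_fields (the sample tweets below are invented for illustration), the public_metrics field — the Twitter API v2 per-tweet counters object — can be aggregated directly:

```python
# Invented sample results standing in for search_tweets_by_query output.
tweets = [
    {"id": "1", "lang": "en",
     "public_metrics": {"retweet_count": 3, "reply_count": 1,
                        "like_count": 10, "quote_count": 0}},
    {"id": "2", "lang": "es",
     "public_metrics": {"retweet_count": 7, "reply_count": 0,
                        "like_count": 2, "quote_count": 1}},
]

# Sum each engagement metric across the whole result set.
totals = {}
for tweet in tweets:
    for metric, count in tweet["public_metrics"].items():
        totals[metric] = totals.get(metric, 0) + count

print(totals)
# {'retweet_count': 10, 'reply_count': 1, 'like_count': 12, 'quote_count': 1}
```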
Searching by id
You can also retrieve tweets and users by their id.
from twitter_tools.search_tools import TweetSearchUtil
tsu = TweetSearchUtil('path_to_yaml_creds')
tweets = tsu.retreive_tweets_by_id(
['12341','12342']
,tweet_fields="author_id,conversation_id,created_at,id,in_reply_to_user_id,lang,public_metrics,text"
)
users = tsu.retreive_users_by_id(
['4321','4322']
,user_fields="created_at,description,id,name,profile_image_url,public_metrics,username"
)
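The Twitter v2 lookup endpoints cap the number of ids per request (100 for both tweet and user lookup). Whether the retreive_*_by_id functions batch internally is not documented here, so if you need to split long id lists yourself, a small stdlib helper suffices:

```python
def chunked(ids, size=100):
    """Yield successive batches of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

all_ids = [str(n) for n in range(250)]
batches = list(chunked(all_ids, size=100))
# 250 ids -> three batches of 100, 100, and 50
```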
Twitter Inference
This is a wrapper around M3Inference that makes it easier to use and to combine into a general pipeline with the rest of this toolkit.
from twitter_tools.user_inference import TwitterUserInference
users = [{...},...]
tui = TwitterUserInference()
inference = tui.infer_users(users, lang='en')
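The exact shape of infer_users' return value is not shown above. Assuming it maps user ids to per-attribute probability dicts (as M3Inference itself does), reducing each attribute to its most likely label could look like this — the sample probabilities and the output shape are assumptions, not the toolkit's documented API:

```python
# Invented sample standing in for the inference output: one entry per
# user, each attribute mapping candidate labels to probabilities.
inference = {
    "4321": {
        "gender": {"male": 0.2, "female": 0.8},
        "org": {"non-org": 0.9, "is-org": 0.1},
    },
}

# Pick the highest-probability label for each attribute of each user.
labels = {
    user_id: {attr: max(probs, key=probs.get)
              for attr, probs in attrs.items()}
    for user_id, attrs in inference.items()
}

print(labels["4321"])  # {'gender': 'female', 'org': 'non-org'}
```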
Users Location
This tool is only available for locations in Spain.
To support other countries, a JSON file in the same format as places_spain.json should be added.
This tool determines a user's location from their profile location text and description when no geolocation is available. It looks for city/country/region words in the user profile to try to identify the location.
from location.location_detector import LocationDetector
user = {...}
detector = LocationDetector('path_to_places_json')
loc, method = detector.identify_location(user['location'], user['description'])
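LocationDetector's internals are not shown here, but the word-matching idea described above can be sketched with a toy lookup table. The place names and the tuple format below are illustrative assumptions, not the actual places_spain.json schema:

```python
# Toy lookup table; the real places_spain.json schema may differ.
PLACES = {
    "madrid": ("Madrid", "city"),
    "barcelona": ("Barcelona", "city"),
    "andalucia": ("Andalucía", "region"),
    "spain": ("Spain", "country"),
}

def identify_location(location_text, description):
    """Scan the profile location first, then the description,
    for a known city/region/country word."""
    for source, text in (("location", location_text),
                         ("description", description)):
        for word in text.lower().replace(",", " ").split():
            if word in PLACES:
                place, kind = PLACES[word]
                return place, f"{kind} in {source}"
    return None, "not found"

print(identify_location("Madrid, Spain", "I love tapas"))
# ('Madrid', 'city in location')
```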
Topic analysis
This tool performs every step of topic analysis using LDA.
The typical pipeline can be represented as follows.
from twitter_tools.topic_analysis import TopicAnalysis
tweets = [...]
analyzer = TopicAnalysis(language='es')
tweets_clean = analyzer.clean_docs(tweets)
tweets_lemmas = analyzer.lemmatize(tweets_clean,
    filter_postags=['ADJ', 'ADV', 'NOUN', 'VERB'])
ldamodel, docs_dict = analyzer.topic_analysis(tweets_lemmas,
topics_nb=10, print_words=10)
Sentiment Analysis
Sentiment analysis of text using pretrained models.
from twitter_tools.topic_analysis import TopicAnalysis
tweets = [...]
analyzer = TopicAnalysis(language='es')
sentiments = [analyzer.sentiment_analysis(t) for t in tweets]
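The return type of sentiment_analysis is not documented above. Assuming it yields one label per tweet (the 'pos'/'neg'/'neu' labels below are an assumption), the distribution over a corpus can be summarized with a Counter:

```python
from collections import Counter

# Invented labels standing in for sentiment_analysis output.
sentiments = ["pos", "neg", "neu", "pos", "pos"]

distribution = Counter(sentiments)
print(distribution.most_common(1))  # [('pos', 3)]
```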
Network creation
This tool creates graphs based on the tweets and users interactions.
It can create the user and the tweet graph.
The tweet dict-like objects must contain at least the following fields:
id, retweeted_by, favorited_by.
The user dict-like objects must contain at least the following fields:
id, screen_name.
from twitter_tools.network_tools import create_tweets_network, create_users_network
users = [...]
tweets = [...]
T = create_tweets_network(tweets)
U = create_users_network(users, tweets)
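The field requirements above can be illustrated with minimal records, together with a stdlib sketch of how interaction edges could be derived from them. The edge rule shown is an assumption about how such a graph might be built, not the toolkit's actual implementation:

```python
# Minimal records satisfying the field requirements listed above.
users = [
    {"id": "u1", "screen_name": "alice"},
    {"id": "u2", "screen_name": "bob"},
]
tweets = [
    {"id": "t1", "retweeted_by": ["u2"], "favorited_by": []},
    {"id": "t2", "retweeted_by": [], "favorited_by": ["u1", "u2"]},
]

# One plausible edge rule: connect each interacting user to the
# tweet they retweeted or favorited.
edges = [(uid, tweet["id"])
         for tweet in tweets
         for field in ("retweeted_by", "favorited_by")
         for uid in tweet[field]]

print(edges)  # [('u2', 't1'), ('u1', 't2'), ('u2', 't2')]
```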
Once the network is created, you can export it and open the file in Gephi to visualize and analyze it.
import networkx as nx
nx.write_gml(T, "tweets_network.gml")
File details
Details for the file twitter-toolkit-0.1.0.tar.gz.
File metadata
- Download URL: twitter-toolkit-0.1.0.tar.gz
- Upload date:
- Size: 26.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.2.1 CPython/3.10.9 Linux/6.1.8-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e0269723dca373f0f1687ba2885c1e835c726db4a9c66d277db38f328e7dccce |
| MD5 | 52443a59961f572465b1836fc8489aff |
| BLAKE2b-256 | 9bdfe397a5302c85dfd79c59b855dadd26f0e20ef690f3f5fb4de2077d760079 |
File details
Details for the file twitter_toolkit-0.1.0-py3-none-any.whl.
File metadata
- Download URL: twitter_toolkit-0.1.0-py3-none-any.whl
- Upload date:
- Size: 26.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.2.1 CPython/3.10.9 Linux/6.1.8-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bdac6e5b900e03f075cd806713ca07d140b72e594b464d397c465877afd140e2 |
| MD5 | 93d257387ae8b17954c290bcd1472ec3 |
| BLAKE2b-256 | e9c416be81e9eba75cace7b55990a4b4585c9ee19fa0f08b05dd8e157b57204f |