
A package to do everything from getting tweets to pre-processing them

Project description

Tweetl

By using Tweetl, you can simplify the steps from getting tweets to pre-processing them. If you don't have a Twitter API key, you can get one here.

This package helps you to:

  • get tweets by a target account name or by any keyword.
  • pre-process them as follows (a rough sketch follows this list):
    • remove hashtags, URLs, pictographs, mentions, image strings, and RT markers.
    • unify characters (uppercase to lowercase, halfwidth forms to fullwidth forms).
    • replace numbers with zero.
    • remove duplicates (because they might be retweets).
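For a sense of what these steps do in practice, here is a minimal, hypothetical sketch of the cleansing (it is not Tweetl's internal implementation, and the function name is made up for illustration):

import re
import unicodedata

def cleanse_text(text):
    # remove RT markers, URLs, image strings, mentions and hashtags
    text = re.sub(r"^RT\s+", "", text)
    text = re.sub(r"https?://\S+|pic\.twitter\.com/\S+", "", text)
    text = re.sub(r"[@＠]\w+", "", text)
    text = re.sub(r"[#＃]\S+", "", text)
    # drop pictographs (Unicode 'Symbol, other' characters such as emoji)
    text = "".join(ch for ch in text if not unicodedata.category(ch).startswith("So"))
    # unify character width and case, then replace numbers with zero
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"\d+", "0", text)
    return text.strip()

# Duplicates (possible retweets) would then be dropped at the DataFrame level,
# e.g. with df.drop_duplicates().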

Installation

pip install Tweetl

Usage

Getting Tweets

Create an instance of the 'GetTweet' class.

import Tweetl

# your api keys
consumer_api_key = "xxxxxxxxx"
consumer_api_secret_key = "xxxxxxxxx"
access_token = "xxxxxxxxx"
access_token_secret = "xxxxxxxxx"

# create an instance
tweet_getter = Tweetl.GetTweet(
                    consumer_api_key,
                    consumer_api_secret_key, 
                    access_token, 
                    access_token_secret
                )

With a target name

You can collect a target user's tweets with the 'get_tweets_target' method by passing the target's name without the '@'. It returns the collected tweets as a DataFrame, and you can specify the number of tweets.

# get 1000 tweets of @Deepblue_ts
df_target = tweet_getter.get_tweets_target("Deepblue_ts", 1000)
df_target.head()
(Screenshot: output of df_target.head())

With any keyword

You can also get tweets about any keyword with the 'get_tweets_keyword' method by passing the keyword as a string. Here too, you can specify the number of tweets.

# get 1000 tweets about 'deep learning'
df_keyword = tweet_getter.get_tweets_keyword("deep learning", 1000)
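Like the target method, 'get_tweets_keyword' returns a DataFrame (it is what gets passed to 'cleansing_df' below), so you can inspect it the same way:

df_keyword.head()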

Cleansing Tweets

Create an instance of the 'CleansingTweets' class. Then, using the 'cleansing_df' method, you can pre-process the tweets. You can select the columns that you want to cleanse; the default is the 'text' column only.

# create an instance
tweet_cleanser = Tweetl.CleansingTweets()
cols = ["text", "user_description"]
df_clean = tweet_cleanser.cleansing_df(df_keyword, subset_cols=cols)
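As a quick sanity check, you can compare a raw tweet with its cleansed counterpart (assuming, as in the example above, that the tweet text is in the 'text' column):

# the first tweet before and after cleansing
print(df_keyword["text"].iloc[0])
print(df_clean["text"].iloc[0])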

Author

deepblue

License

This software is released under the MIT License, see LICENSE.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Tweetl-0.0.7.tar.gz (4.9 kB)

Uploaded Source

Built Distribution

Tweetl-0.0.7-py3-none-any.whl (5.3 kB)

Uploaded Python 3
