A Python package for scraping Twitter data without the official API, with proxy and account-cookie support.

These details have not been verified by PyPI

Project links

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

ReverseTwitterScraper

Description

ReverseTwitterScraper is a Python package that provides an easy-to-use tool for scraping tweets from one or more Twitter accounts. It uses Selenium and httpx to scrape tweets and other account data.

IMPORTANT NOTICE in 0.8

Twitter has changed its system and now requires account cookies for viewing tweets.

Please note that using cookies carries a high risk of account suspension or ban by Twitter.
Using account cookies essentially means that you are automating actions on behalf of a specific user account, which is against Twitter's Terms of Service. Use this tool responsibly and avoid excessive or suspicious activity that could lead to account limitations. Make sure that you are informed about and compliant with the Twitter Terms of Service, as well as all relevant privacy laws and regulations.
The creators of ReverseTwitterScraper are not responsible for any misuse of the tool or violations of these terms.

Links

GitHub: https://github.com/1220moritz/reverse-twitter-scraper
PyPI: https://pypi.org/project/ReverseTwitterScraper/

Installation

To install the package, simply run the following command:

pip install ReverseTwitterScraper

Usage

To use this package, follow these steps:

1. Import the TwitterScraper class from the package.
2. Create an object of the TwitterScraper class.
3. Call any method of the TwitterScraper class.

Here is an example:

from ReverseTwitterScraper import TwitterScraper

# Path to your local chromedriver executable (needed by Selenium)
chromedriver_path = "C:/Program Files (x86)/chromedriver.exe"
# Required: header values copied from a logged-in browser session
cookies = {'Cookie': 'your Cookie', 'X-Csrf-Token': 'your csrf token'}
# Optional: proxies in the format ip:port:user:pw
proxy_list = []

twitter_handle = ["elonmusk"]  # single account
twitter_handles = ["elonmusk", "POTUS", "latestinspace"]  # multiple accounts

# Pass twitter_handles instead to scrape several accounts at once
scraper = TwitterScraper(twitter_handle, chromedriver_path, cookies, proxy_list)
tweets = scraper.getTweetsText()

print(tweets)

In the above code, we first import the TwitterScraper class from the package. Then, we create an object of the TwitterScraper class with the required parameters. Finally, we call the getTweetsText() method to get the tweets of the specified Twitter account.
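The same call can be wrapped in a small helper that accepts one handle or several. This is only a sketch: `scrape_texts`, its `scraper_cls` parameter, and `FakeScraper` are hypothetical names introduced here (they are not part of the package) so that the helper can be tried without a browser or network access. It assumes the constructor takes the four positional arguments shown in the example above.

```python
def scrape_texts(handles, chromedriver_path, cookies, proxies=None, scraper_cls=None):
    """Return getTweetsText() for one or more handles.

    scraper_cls is a hypothetical hook: leave it as None to use the real
    TwitterScraper, or pass a stand-in class for a dry run.
    """
    if scraper_cls is None:
        # Imported lazily so the helper can be defined without the package installed.
        from ReverseTwitterScraper import TwitterScraper as scraper_cls

    if isinstance(handles, str):
        handles = [handles]  # the constructor expects a list of handles

    scraper = scraper_cls(list(handles), chromedriver_path, cookies, proxies or [])
    return scraper.getTweetsText()


class FakeScraper:
    """Stand-in for a dry run; it only echoes the handles it was given."""

    def __init__(self, handles, chromedriver_path, cookies, proxy_list):
        self.handles = handles

    def getTweetsText(self):
        return [
            {"entryId": f"demo-{h}", "retweet": False, "text": f"hello from {h}"}
            for h in self.handles
        ]


demo = scrape_texts(["elonmusk", "POTUS"], "chromedriver", {}, scraper_cls=FakeScraper)
print(demo)
```

Dropping the `scraper_cls` argument makes the helper use the real TwitterScraper class, which then needs a valid chromedriver path and account cookies.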

Parameters

The TwitterScraper class takes the following parameters:

twitterHandle (Required): A list of the Twitter handles of the account(s) to be scraped. For example, if the account URL is https://twitter.com/elonmusk, then the twitterHandle parameter should be set to ['elonmusk'].
chromedriverPath (Required): The path to the ChromeDriver executable, which Selenium needs to control Chrome.
cookies (Required): The cookies of a logged-in Twitter account, passed as a dictionary with the keys 'Cookie' and 'X-Csrf-Token'. Since version 0.8, these are needed to view any tweets. To scrape tweets that are not publicly available, use the cookies of an account that has access to them.
proxyList (Optional): A list of proxies to use for scraping. The list should contain proxy addresses in the format ip:port:user:pw.
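Because a malformed proxy entry is easy to miss, it can help to check each string against the ip:port:user:pw format before building the list. The sketch below is an assumption-laden helper, not part of the package: `Proxy` and `parse_proxy` are hypothetical names, and the package itself is still given the plain strings.

```python
from typing import NamedTuple


class Proxy(NamedTuple):
    ip: str
    port: int
    user: str
    password: str


def parse_proxy(entry: str) -> Proxy:
    """Split an 'ip:port:user:pw' string and check that it is well formed."""
    parts = entry.strip().split(":", 3)  # maxsplit=3 keeps any ':' inside the password
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected ip:port:user:pw, got {entry!r}")
    ip, port, user, password = parts
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"invalid port in {entry!r}")
    return Proxy(ip, int(port), user, password)


# Placeholder address from the documentation range 203.0.113.0/24
proxy_list = ["203.0.113.10:8080:alice:s3cret"]
for entry in proxy_list:
    print(parse_proxy(entry))
```

The check only covers the shape of the string; it does not test whether the proxy is reachable.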

How to get account cookies + x-csrf-token:

To access the data of a private "target" account, the logged-in account must follow it. Its account cookies can then be used to scrape that account.

1. Open the Chrome browser and go to the Twitter website.
2. If you're not already logged in, log in to your Twitter account.
3. Right-click anywhere on the page and select "Inspect" from the context menu. Alternatively, you can press "Ctrl+Shift+I" (Windows) or "Cmd+Option+I" (Mac) on your keyboard.
4. This opens the Developer Tools pane. Click on the "Network" tab at the top and then filter by "Fetch/XHR".
5. In the request list on the left, click on any request.
6. You should now see the metadata for this request. Look for the "Request Headers" section and find the "cookie" and "x-csrf-token" entries. Copy the entire value of each.
7. In your Python code, create a new instance of the TwitterScraper class and pass both values in the "cookies" parameter, as the 'Cookie' and 'X-Csrf-Token' entries of the dictionary.

That's it! You can now use the TwitterScraper class to scrape data with your Twitter account.
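The two copied header values can be assembled into the dictionary with a small helper that also catches empty or mismatched values. This is a sketch only: `build_cookies` is a hypothetical name, the dictionary keys come from the usage example, and the ct0 comparison rests on the assumption that Twitter's ct0 cookie normally carries the same value as the x-csrf-token header.

```python
import warnings


def build_cookies(cookie_header: str, csrf_token: str) -> dict:
    """Build the cookies dictionary from the two copied request-header values."""
    cookie_header = cookie_header.strip()
    csrf_token = csrf_token.strip()
    if not cookie_header or not csrf_token:
        raise ValueError("both the cookie header and the x-csrf-token are required")

    # Parse 'name=value; name2=value2' into a dict for the sanity check below.
    pairs = {}
    for chunk in cookie_header.split(";"):
        name, sep, value = chunk.strip().partition("=")
        if sep:
            pairs[name] = value

    # Assumption: the ct0 cookie normally matches the x-csrf-token header.
    ct0 = pairs.get("ct0")
    if ct0 is not None and ct0 != csrf_token:
        warnings.warn("x-csrf-token differs from the ct0 cookie; "
                      "copy both values from the same request")

    return {"Cookie": cookie_header, "X-Csrf-Token": csrf_token}


# Placeholder values, not real credentials
cookies = build_cookies("auth_token=abc123; ct0=tok456", "tok456")
print(cookies)
```

Treat the resulting dictionary like a password: anyone who has it can act as the logged-in account.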

Methods:

Get Twitter data

getUserPlain()

    gets all (unfiltered) data from every account in your handle list ("unnecessary" data and ads included)
    returns [{"handle": handle1, "id": id1, "resp": data1}, {"handle": handle2, "id": id2, "resp": data2}]

getTweetsPlain()

    gets all (unfiltered) tweets from every account in your handle list ("unnecessary" data and ads included)
    returns [{"handle": handle1, "id": id1, "resp": data1}, {"handle": handle2, "id": id2, "resp": data2}]

getTweetsText()

    gets the text of all tweets from every account in your handle list
    returns [{'entryId': entryId1, 'retweet': retweet1, 'text': text1}, {'entryId': entryId2, 'retweet': retweet2, 'text': text2}]
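The list returned by getTweetsText() can be post-processed with plain Python. The sketch below keeps only the entries that are not marked as retweets; `original_texts` is a hypothetical helper, the sample data is made up, and it assumes the 'retweet' value is truthy for retweets.

```python
def original_texts(tweets):
    """Return the text of every entry that is not marked as a retweet."""
    return [t["text"] for t in tweets if not t.get("retweet")]


# Made-up sample in the documented shape of getTweetsText()
sample = [
    {"entryId": "tweet-1", "retweet": False, "text": "first post"},
    {"entryId": "tweet-2", "retweet": True, "text": "RT someone else"},
    {"entryId": "tweet-3", "retweet": False, "text": "second post"},
]

print(original_texts(sample))
```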

Filter tweet data

filterRetweetInfo(singlePlainTweet, getRetweetInfo=False)

    checks whether the tweet is a retweet (returns True or False, or the info of the retweeted tweet if getRetweetInfo=True)
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info
    :param getRetweetInfo: default=False; set to True to get all info about the retweeted tweet

filterTweetCreatedAt(singlePlainTweet)

    returns the creation date and time of a tweet
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info

filterTweetID(singlePlainTweet)

    returns the ID of a tweet
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info

filterRetweetCount(singlePlainTweet)

    returns how many times the tweet has been retweeted
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info

filterReplyCount(singlePlainTweet)

    returns how many replies the tweet has
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info

filterViews(singlePlainTweet)

    returns how many views the tweet has
    :param singlePlainTweet: plain (unfiltered) info of a tweet. Use getTweetsPlain() to get the info
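The tweet filters can be combined into one summary per tweet. The sketch below assumes the filter functions are callable as methods on a TwitterScraper object and that you already hold a single plain tweet taken from the getTweetsPlain() data; how to split that data into single tweets is not covered here. `summarize_tweet` and `StubScraper` are hypothetical names, and the stub's return values are made up so the sketch runs without the package.

```python
def summarize_tweet(scraper, single_plain_tweet):
    """Collect the documented tweet filters into one dictionary."""
    return {
        "id": scraper.filterTweetID(single_plain_tweet),
        "created_at": scraper.filterTweetCreatedAt(single_plain_tweet),
        "retweets": scraper.filterRetweetCount(single_plain_tweet),
        "replies": scraper.filterReplyCount(single_plain_tweet),
        "views": scraper.filterViews(single_plain_tweet),
        "is_retweet": scraper.filterRetweetInfo(single_plain_tweet),
    }


class StubScraper:
    """Stand-in with made-up lookups; the real methods read Twitter's raw data."""

    def filterTweetID(self, tweet):
        return tweet["id"]

    def filterTweetCreatedAt(self, tweet):
        return tweet["created"]

    def filterRetweetCount(self, tweet):
        return tweet["retweets"]

    def filterReplyCount(self, tweet):
        return tweet["replies"]

    def filterViews(self, tweet):
        return tweet["views"]

    def filterRetweetInfo(self, tweet, getRetweetInfo=False):
        return tweet["is_retweet"]


# Made-up tweet; the keys only exist for the stub above
fake_tweet = {"id": "1", "created": "2023-07-08", "retweets": 3,
              "replies": 1, "views": 250, "is_retweet": False}
summary = summarize_tweet(StubScraper(), fake_tweet)
print(summary)
```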

Filter account data

filterPinnedTweetInfo(singleUserPlain)

    returns all (unfiltered) information about the pinned tweet
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

filterIsBusinessAccount(singleUserPlain)

    returns whether the account is a business account
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

filterUserID(singleUserPlain)

    returns the ID of an account
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

filterIsBlueVerified(singleUserPlain)

    returns whether the account is verified with a Twitter Blue checkmark
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

filterAccountCreationDate(singleUserPlain)

    returns the creation time of an account
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

filterDescription(singleUserPlain)

    returns the description of an account
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info

getUserSpecificData(singleUserPlain)

    returns all (unfiltered) data about an account
    :param singleUserPlain: plain (unfiltered) info of a Twitter account. Use getUserPlain() to get the info
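The account filters can be applied to every entry returned by getUserPlain() to build a compact table. This sketch rests on two unverified assumptions: that the filters are callable as methods on the scraper object, and that the "resp" value of each entry is what the filters expect as singleUserPlain. The `pick` parameter makes the second assumption easy to change. `summarize_accounts` and `StubScraper` are hypothetical names with made-up data.

```python
def summarize_accounts(scraper, pick=lambda entry: entry["resp"]):
    """Apply the documented account filters to each getUserPlain() entry."""
    rows = []
    for entry in scraper.getUserPlain():
        user_plain = pick(entry)  # assumption: "resp" holds the single-user data
        rows.append({
            "handle": entry["handle"],
            "user_id": scraper.filterUserID(user_plain),
            "created": scraper.filterAccountCreationDate(user_plain),
            "description": scraper.filterDescription(user_plain),
            "blue_verified": scraper.filterIsBlueVerified(user_plain),
            "business": scraper.filterIsBusinessAccount(user_plain),
        })
    return rows


class StubScraper:
    """Stand-in with made-up data so the sketch runs without the package."""

    def getUserPlain(self):
        return [{"handle": "demo", "id": "42",
                 "resp": {"id": "42", "created": "2009-06-02", "bio": "hello",
                          "blue": True, "business": False}}]

    def filterUserID(self, user):
        return user["id"]

    def filterAccountCreationDate(self, user):
        return user["created"]

    def filterDescription(self, user):
        return user["bio"]

    def filterIsBlueVerified(self, user):
        return user["blue"]

    def filterIsBusinessAccount(self, user):
        return user["business"]


rows = summarize_accounts(StubScraper())
print(rows)
```

If the filters turn out to expect the whole entry instead, passing `pick=lambda entry: entry` adapts the helper without further changes.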

Release history

- 0.8 (this version): Jul 8, 2023
- 0.7: May 24, 2023
- 0.6: May 9, 2023
- 0.5: May 4, 2023
- 0.4: Mar 27, 2023
- 0.3: Mar 14, 2023
- 0.2: Mar 9, 2023
- 0.1: Mar 9, 2023

Download files

Source Distribution

ReverseTwitterScraper-0.8.tar.gz (10.5 kB view details)

Uploaded Jul 8, 2023 Source

Built Distribution

ReverseTwitterScraper-0.8-py3-none-any.whl (9.1 kB view details)

Uploaded Jul 8, 2023 Python 3

File details

Details for the file ReverseTwitterScraper-0.8.tar.gz.

File metadata

Download URL: ReverseTwitterScraper-0.8.tar.gz
Upload date: Jul 8, 2023
Size: 10.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.2

File hashes

Hashes for ReverseTwitterScraper-0.8.tar.gz
Algorithm	Hash digest
SHA256	`bfad3b21b4ca93f2b50e106f615cbce150f53328798049003949d2e12a79b780`
MD5	`a3c3a6a4ba70090eae6a5b46a29cdca4`
BLAKE2b-256	`b2e3ff4255848380a4904869f486bb54fec8defe599a44c9031ff6431d420b5b`
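A downloaded archive can be checked against the published SHA256 digest with the standard library. The sketch below is one possible way to do that: `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the expected digest is copied from the table above.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 published for ReverseTwitterScraper-0.8.tar.gz
EXPECTED = "bfad3b21b4ca93f2b50e106f615cbce150f53328798049003949d2e12a79b780"


def matches_published_hash(path, expected=EXPECTED):
    """True if the file at path has the expected SHA256 digest."""
    return sha256_of_file(path) == expected.lower()


# Self-check on a throwaway file, since the archive may not be present locally
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of_file(tmp.name)
demo_matches = matches_published_hash(tmp.name)
os.remove(tmp.name)
print(demo_digest, demo_matches)
```

For the real check, call `matches_published_hash` with the path of the downloaded ReverseTwitterScraper-0.8.tar.gz file.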

File details

Details for the file ReverseTwitterScraper-0.8-py3-none-any.whl.

File metadata

Download URL: ReverseTwitterScraper-0.8-py3-none-any.whl
Upload date: Jul 8, 2023
Size: 9.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.2

File hashes

Hashes for ReverseTwitterScraper-0.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cdddf657f4da21eb3134da4748add726f1c4e16f63ca7f7f42132e8a732a8d1d`
MD5	`0ceb47ac835db2e1c3963487702c2343`
BLAKE2b-256	`a2bbfa06a2cca8f9bab6bbf6b6c722df3d933a696dd3b41f22e72be77cf897a3`
