Unofficial library to scrape Twitter profiles and posts from Nitter instances
Unofficial Nitter scraper
This is a simple library to scrape Nitter instances for tweets. It can:
- search and scrape tweets with a certain term
- search and scrape tweets with a certain hashtag
- scrape tweets from a user profile
- get profile information of a user, such as display name, username, number of tweets, profile picture ...
If no instance is passed to the scraper, it picks one at random from those listed at https://github.com/zedeus/nitter/wiki/Instances.
Installation
pip install ntscraper
How to use
First, initialize the library:
from ntscraper import Nitter
scraper = Nitter(log_level=1)
The valid logging levels are:
- None = no logs
- 0 = only warning and error logs
- 1 = previous + informational logs (default)
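As an illustration of the three values above, here is a minimal sketch of how they could map onto Python's standard logging module. The helper `to_logging_level` and the logger name `nitter_demo` are hypothetical and are not part of ntscraper; the library's own handling may differ.

```python
import logging


def to_logging_level(log_level):
    """Translate a log_level value into a logging-module level.

    Hypothetical helper: the mapping follows the list above,
    not the library's source code.
    """
    if log_level is None:
        # One step above CRITICAL, so no record passes the threshold.
        return logging.CRITICAL + 1
    if log_level == 0:
        return logging.WARNING  # warnings and errors only
    if log_level == 1:
        return logging.INFO  # warnings, errors and informational logs
    raise ValueError(f"invalid log level: {log_level!r}")


logger = logging.getLogger("nitter_demo")  # illustrative logger name
logger.setLevel(to_logging_level(0))
logger.info("not shown at level 0")
logger.warning("shown at level 0")
```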
Then call the function that matches what you want to do.
Scrape tweets
github_hash_tweets = scraper.get_tweets("github", mode='hashtag')
bezos_tweets = scraper.get_tweets("JeffBezos", mode='user')
Parameters:
- term: search term, hashtag or username, depending on the mode
- mode: how to search. Default is 'term', which looks for tweets containing the search term. Other modes are 'hashtag', to search for a hashtag, and 'user', to scrape tweets from a user profile
- number: number of tweets to scrape. Default is 5. Ignored if 'since' is specified.
- since: date to start scraping from, formatted as YYYY-MM-DD. Default is None
- until: date to stop scraping at, formatted as YYYY-MM-DD. Default is None
- max_retries: max retries to scrape a page. Default is 5
- instance: Nitter instance to use. Default is None and will be chosen at random
Returns a dictionary with tweets and threads for the term.
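Since 'since' and 'until' must be formatted as YYYY-MM-DD, it can help to check them before a scrape starts. The sketch below wraps `get_tweets` with that check. The helpers `scrape_date_range` and `count_results` are hypothetical, and the 'tweets' and 'threads' key names are an assumption based on the description of the return value.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # YYYY-MM-DD, as the parameter list requires


def scrape_date_range(scraper, term, since, until=None, mode="term"):
    """Check the dates, then call scraper.get_tweets with them."""
    # strptime raises ValueError if the string cannot be parsed as a date.
    start = datetime.strptime(since, DATE_FORMAT)
    if until is not None:
        end = datetime.strptime(until, DATE_FORMAT)
        if end < start:
            raise ValueError("'until' must not be earlier than 'since'")
    # 'number' is left out on purpose: it is bypassed when 'since' is set.
    return scraper.get_tweets(term, mode=mode, since=since, until=until)


def count_results(result):
    """Return (tweet count, thread count); the key names are assumed."""
    return len(result.get("tweets", [])), len(result.get("threads", []))


# Usage with the scraper created earlier (needs network access):
# result = scrape_date_range(scraper, "github", "2023-01-01",
#                            until="2023-01-31", mode="hashtag")
# print(count_results(result))
```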
Get profile information
bezos_information = scraper.get_profile_info("JeffBezos")
Parameters:
- username: username of the page to scrape
- max_retries: max retries to scrape a page. Default is 5
- instance: Nitter instance to use. Default is None and will be chosen at random
Returns a dictionary of the profile's information.
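To look up several users in one go, the call can be wrapped in a loop. This is only a sketch: `fetch_profiles` is a hypothetical helper, and because the failure behavior of `get_profile_info` is not described here, it treats both an exception and an empty result as "skip this user".

```python
def fetch_profiles(scraper, usernames, instance=None, max_retries=5):
    """Return {username: profile dictionary}, skipping failed lookups."""
    profiles = {}
    for name in usernames:
        try:
            info = scraper.get_profile_info(
                name, max_retries=max_retries, instance=instance
            )
        except Exception:
            # Assumption: any error means this profile could not be scraped.
            continue
        if info:
            # Assumption: an empty or None result also means failure.
            profiles[name] = info
    return profiles


# Usage with the scraper created earlier (needs network access):
# profiles = fetch_profiles(scraper, ["JeffBezos", "github"])
# for name, info in profiles.items():
#     print(name, info)
```

Passing the same `instance` for every lookup keeps the results comparable; leaving it as None lets each call choose at random.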
Get random Nitter instance
random_instance = scraper.get_random_instance()
Returns a random Nitter instance.
Note
Some Nitter instances may not work properly due to recent changes on Twitter's side. If you have trouble scraping with a certain instance, try changing it and check if the problem persists.
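The advice above can be automated by trying several random instances in turn. The following is a rough sketch rather than a feature of the library: `get_tweets_with_fallback` is a hypothetical helper, and it assumes that a failing instance either raises an exception or returns a result with no entries under a 'tweets' key. A search that legitimately matches nothing would therefore also count as a failure.

```python
def get_tweets_with_fallback(scraper, term, attempts=3, **kwargs):
    """Try up to `attempts` random instances; return (result, instance).

    Do not pass `instance` in kwargs: it is chosen here on each attempt.
    """
    tried = []
    for _ in range(attempts):
        instance = scraper.get_random_instance()
        tried.append(instance)
        try:
            result = scraper.get_tweets(term, instance=instance, **kwargs)
        except Exception:
            # Assumption: an error means this instance is unusable.
            continue
        # Assumption: scraped tweets are stored under the 'tweets' key.
        if result and result.get("tweets"):
            return result, instance
    raise RuntimeError(f"no usable result after trying: {tried}")


# Usage with the scraper created earlier (needs network access):
# result, used = get_tweets_with_fallback(scraper, "github", mode="hashtag")
# print("scraped from", used)
```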
To do list
- Add scraping of individual posts with comments
Hashes for ntscraper-0.1.8-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e9ed319ea1f1fef6f2bf42b3e7caa54d4173c40b7905b01634f7586a3860b5a4
MD5 | 65c5b29bf30cbd9bc530d601af1ad18d
BLAKE2b-256 | ee155e75d0401722c84f38f0b124846436e1a8cf180ebb781bb7df06628209a1