cli utility for data mining https://nitter.net/serach
Project description
scraper for nitter *twitter mirror
usage
this scraper is designed to work in debain/ubuntu/wsl2
# clone and cd to the nitter_stuff repository
git clone 'https://github.com/hashirkz/nitter_scraper'
cd nitter_scraper
# AUTOMATED SETUP
./setup.sh
. ./nitter_stuff/bin/activate
# bank tweets searcher already setup not dynamic
python3 -m nitter_scraper
# cli utility see cli usage below
python3 -m nitter
maintaining
-
./twitter_queries.csv
- contains all the queries for each bank being searched
- add any new banks as columns and any new rows to add new queries to existing banks
see twitter_queries.csv: bmo cibc rbc scotiabank td @BMO CIBC RBCNewsroom Scotiabank TDbank BMO Field CIBCInnovation RBC Newsroom ScotiabankHelps TD Bank BMOVanMarathon BanqueCIBC RBC_Newsroom ScotiabankArena TD_Bank BMOmedia TorontoRun @RBC ScotiabankCtr TDCanada BMOHarrisBank CIBC_FCIB RBCCanada ScotiabankTT TD Canada Bank Of Montreal CIBCFCIBBS RBC Canada ScotiabankGY TDEconomics BankOfMontreal CIBCFCIBJM RBC_Canada sccniagara TD Economics BMO Bank CIBCCareers RBCGAMAdvisor ScotiaEconomics TDDirectInvest BMO_Bank CCS_RFTC_OG RBCGAMNews scotiahockey TD DirectInvest BMO Capital Markets CIBCMellon RBC4Students GillerPrize TD_DirectInvest BMO_Media CIBCFCIBBB RBCCareers RunCRS TDCareers lifeatbmo CIBC_PWM_US RBCInsurance scotiacapital TD Careers RFTCBlueMtn RBC_Trading ScotiabankViews TD_Careers Canadian Imperial Bank of Commerce RBC_Insurance ScotiabankBB TD Garden CIBC Private Wealth US RBC Insurance ScotiaColpatria TD_Garden CIBC Mellon rbc bank RunCRSWest TDNewsCanada CIBC_Mellon royal bank of canada ScotiabankPE TDNews Canada CIBC Bank RBC Wealth ScotiabankFC TDNews_Canada CIBC_Bank RBC_Wealth ScotiabankBS TDBankUS CIBCBank RBCWealth ScotiabankMX TDBank US CIBC Wood Gundy RBC capital markets CHINPicnic TDBank_US CIBC Run for the Cure RBC_capital_markets ScotiaCaribbean TDNewsUS CIBC Future Heroes RBCcapitalmarkets ScotiabankJM TDNews US CIBC World Markets RBC Global Asset Management TDBank US RBC GAM TD_Insurance RBC_GAM TDInsurance RBCGAM TD Insurance rbc visa card TD_Canada rbc_bank TD Asset Management rbcbank TDAM_Canada TD Visa card -
updating css selectors for nitter
- in nitter_scraper.py > nitter_scraper > search()
- pass selectors for tweet_css='tweet-content', showmore_css='show-more', username_css='username', date_css='tweet-date'
- only needed if the html/css changes for nitter.net *probably wont for a while
notes
-
OSError: Could not find a suitable TLS CA certificate bundle, invalid path: /etc/ssl/certs/ca-certificates.crt
- sudo ln /etc/ssl/certs/<any .pem cert> /etc/ssl/certs/ca-certificates.crt
- sudo update-ca-certificates
-
issues with emoji package
- downgrade to emoji==1.7.0 i.e pip uninstall emoji pip install emoji==1.7.0
cli usage
python3 -m nitter <query: str> xor <query_file: str> OPTIONS
*xor meaning only 1 of these can be used at a time dont mix together lol
query:
query to search nitter for
query_file:
if ur constantly searching the same set of queries make a queryfile
to avoid having to retype commands and stuff e.x ./twitter_queries.csv
has banks as columns and under each bank the queries to search for
OPTIONS ? DEFAULT:
-p / --pgs: int, max number of pgs to search for query ? 50
-m / --mirror: str -> which mirror to search ? https://nitter.net/search
--retweets: flag -> if present wont filter out retweets ? false
--no-sentiments: flag -> if present wont append sentiments ? false
--no-save: flag -> if present wont save to a file ? false
example usage *see "./231120_mcgill or mcgill university_262.csv":
>> python3 -m nitter -q '"mcgill" OR "mcgill university"' -p 50
[nltk_data] Downloading package stopwords to
[nltk_data] /home/sleepyzzz/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /home/sleepyzzz/nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data] /home/sleepyzzz/nltk_data...
[nltk_data] Package wordnet is already up-to-date!
pgs 1/50
searching https://nitter.net/search?f=tweets&q="mcgill" OR "mcgill university"&since=&until=&near=
pgs 2/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9abby9rQJAAIAAIAAAACCAADAAAAAAgABAAAAAAKAAUX9agjk8AnEAoABhf1qCOTv9jwAAA
pgs 3/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9aVlDpsgwgAIAAIAAAACCAADAAAAAAgABAAAAAEKAAUX9agjk8AnEAoABhf1qCOTv7HgAAA
pgs 4/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9aKKVZaAuQAIAAIAAAACCAADAAAAAAgABAAAAAIKAAUX9agjk8AnEAoABhf1qCOTv4rQAAA
pgs 5/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Z8QwlcAIAAIAAIAAAACCAADAAAAAAgABAAAAAMKAAUX9agjk8AnEAoABhf1qCOTv2PAAAA
pgs 6/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9ZuI9tvQdgAIAAIAAAACCAADAAAAAAgABAAAAAQKAAUX9agjk8AnEAoABhf1qCOTvzywAAA
pgs 7/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Zj3BJqR7AAIAAIAAAACCAADAAAAAAgABAAAAAUKAAUX9agjk8AnEAoABhf1qCOTvxWgAAA
pgs 8/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9ZY0V1uRvwAIAAIAAAACCAADAAAAAAgABAAAAAYKAAUX9agjk8AnEAoABhf1qCOTvu6QAAA
pgs 9/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9ZQrIBtxKAAIAAIAAAACCAADAAAAAAgABAAAAAcKAAUX9agjk8AnEAoABhf1qCOTvseAAAA
pgs 10/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9ZKAedphugAIAAIAAAACCAADAAAAAAgABAAAAAgKAAUX9agjk8AnEAoABhf1qCOTvqBwAAA
pgs 11/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9ZDxMdqBzgAIAAIAAAACCAADAAAAAAgABAAAAAkKAAUX9agjk8AnEAoABhf1qCOTvnlgAAA
pgs 12/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Y9Q1Nbh0wAIAAIAAAACCAADAAAAAAgABAAAAAoKAAUX9agjk8AnEAoABhf1qCOTvlJQAAA
pgs 13/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Y4ropbxDgAIAAIAAAACCAADAAAAAAgABAAAAAsKAAUX9agjk8AnEAoABhf1qCOTvitAAAA
pgs 14/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Y1EE9rQ5gAIAAIAAAACCAADAAAAAAgABAAAAAwKAAUX9agjk8AnEAoABhf1qCOTvgQwAAA
pgs 15/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Ywwy5uxGAAIAAIAAAACCAADAAAAAAgABAAAAA0KAAUX9agjk8AnEAoABhf1qCOTvd0gAAA
pgs 16/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YtLYJqRmAAIAAIAAAACCAADAAAAAAgABAAAAA4KAAUX9agjk8AnEAoABhf1qCOTvbYQAAA
pgs 17/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YocKFbQ8QAIAAIAAAACCAADAAAAAAgABAAAAA8KAAUX9agjk8AnEAoABhf1qCOTvY8AAAA
pgs 18/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Ykw8ZYgJwAIAAIAAAACCAADAAAAAAgABAAAABAKAAUX9agjk8AnEAoABhf1qCOTvWfwAAA
pgs 19/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YhKFVehBgAIAAIAAAACCAADAAAAAAgABAAAABEKAAUX9agjk8AnEAoABhf1qCOTvUDgAAA
pgs 20/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Ycov5qAOgAIAAIAAAACCAADAAAAAAgABAAAABIKAAUX9agjk8AnEAoABhf1qCOTvRnQAAA
pgs 21/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YXgWtewFAAIAAIAAAACCAADAAAAAAgABAAAABMKAAUX9agjk8AnEAoABhf1qCOTvPLAAAA
pgs 22/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YU4H9aQZgAIAAIAAAACCAADAAAAAAgABAAAABQKAAUX9agjk8AnEAoABhf1qCOTvMuwAAA
pgs 23/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YR-WZZATAAIAAIAAAACCAADAAAAAAgABAAAABUKAAUX9agjk8AnEAoABhf1qCOTvKSgAAA
pgs 24/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YOa7BqQyQAIAAIAAAACCAADAAAAAAgABAAAABYKAAUX9agjk8AnEAoABhf1qCOTvH2QAAA
pgs 25/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YKVXlbR1wAIAAIAAAACCAADAAAAAAgABAAAABcKAAUX9agjk8AnEAoABhf1qCOTvFaAAAA
pgs 26/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YHz0tth6gAIAAIAAAACCAADAAAAAAgABAAAABgKAAUX9agjk8AnEAoABhf1qCOTvC9wAAA
pgs 27/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YGWc1ugNwAIAAIAAAACCAADAAAAAAgABAAAABkKAAUX9agjk8AnEAoABhf1qCOTvAhgAAA
pgs 28/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YDvGReBiAAIAAIAAAACCAADAAAAAAgABAAAABoKAAUX9agjk8AnEAoABhf1qCOTu-FQAAA
pgs 29/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9YBlBRbBiwAIAAIAAAACCAADAAAAAAgABAAAABsKAAUX9agjk8AnEAoABhf1qCOTu7pAAAA
pgs 30/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9X_MT5cgXAAIAAIAAAACCAADAAAAAAgABAAAABwKAAUX9agjk8AnEAoABhf1qCOTu5MwAAA
pgs 31/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9X7zHNYRmQAIAAIAAAACCAADAAAAAAgABAAAAB0KAAUX9agjk8AnEAoABhf1qCOTu2wgAAA
pgs 32/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9X4IrNvAfwAIAAIAAAACCAADAAAAAAgABAAAAB4KAAUX9agjk8AnEAoABhf1qCOTu0UQAAA
pgs 33/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9X1xINpQZwAIAAIAAAACCAADAAAAAAgABAAAAB8KAAUX9agjk8AnEAoABhf1qCOTux4AAAA
pgs 34/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XztZtYRMwAIAAIAAAACCAADAAAAAAgABAAAACAKAAUX9agjk8AnEAoABhf1qCOTuvbwAAA
pgs 35/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9Xww5NfBhAAIAAIAAAACCAADAAAAAAgABAAAACEKAAUX9agjk8AnEAoABhf1qCOTus_gAAA
pgs 36/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XupEpfQDAAIAAIAAAACCAADAAAAAAgABAAAACIKAAUX9agjk8AnEAoABhf1qCOTuqjQAAA
pgs 37/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XtAZJbwtwAIAAIAAAACCAADAAAAAAgABAAAACMKAAUX9agjk8AnEAoABhf1qCOTuoHAAAA
pgs 38/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XqEZhrgowAIAAIAAAACCAADAAAAAAgABAAAACQKAAUX9agjk8AnEAoABhf1qCOTulqwAAA
pgs 39/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XnTWRYgFwAIAAIAAAACCAADAAAAAAgABAAAACUKAAUX9agjk8AnEAoABhf1qCOTujOgAAA
pgs 40/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XktFpZgqgAIAAIAAAACCAADAAAAAAgABAAAACYKAAUX9agjk8AnEAoABhf1qCOTugyQAAA
pgs 41/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XhLBZoQFgAIAAIAAAACCAADAAAAAAgABAAAACcKAAUX9agjk8AnEAoABhf1qCOTueWAAAA
pgs 42/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XdZEtfBpAAIAAIAAAACCAADAAAAAAgABAAAACgKAAUX9agjk8AnEAoABhf1qCOTub5wAAA
pgs 43/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XWv4ZtQIQAIAAIAAAACCAADAAAAAAgABAAAACkKAAUX9agjk8AnEAoABhf1qCOTuZdgAAA
pgs 44/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XUMQFdxxgAIAAIAAAACCAADAAAAAAgABAAAACoKAAUX9agjk8AnEAoABhf1qCOTuXBQAAA
pgs 45/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XPdW5dAWAAIAAIAAAACCAADAAAAAAgABAAAACsKAAUX9agjk8AnEAoABhf1qCOTuUlAAAA
pgs 46/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XLXGxsgqAAIAAIAAAACCAADAAAAAAgABAAAACwKAAUX9agjk8AnEAoABhf1qCOTuSIwAAA
pgs 47/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XJLTpaRRgAIAAIAAAACCAADAAAAAAgABAAAAC0KAAUX9agjk8AnEAoABhf1qCOTuPsgAAA
pgs 48/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XFuEpcAQwAIAAIAAAACCAADAAAAAAgABAAAAC4KAAUX9agjk8AnEAoABhf1qCOTuNQQAAA
ity%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9XCDc9eRFgAIAAIAAAACCAADAAAAAAgABAAAAC8KAAUX9agjk8AnEAoABhf1qCOTuK0AAAA
pgs 50/50
searching https://nitter.net/search/?f=tweets&q=%22mcgill%22+OR+%22mcgill+university%22&cursor=DAADDAABCgABF_Wn_9hW8WQKAAIX9W-MNVawWQAIAAIAAAACCAADAAAAAAgABAAAADAKAAUX9agjk8AnEAoABhf1qCOTuIXwAAA
results: 262
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
nitter-miner-0.0.1.tar.gz
(13.3 kB
view details)
Built Distribution
File details
Details for the file nitter-miner-0.0.1.tar.gz
.
File metadata
- Download URL: nitter-miner-0.0.1.tar.gz
- Upload date:
- Size: 13.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 57a32acff582e0f9e4f9ac1c7c89247deeb9a8b2c0a0e5a7e3c140189b73a817 |
|
MD5 | 9f76e61690898fe6406ed0844a225f4c |
|
BLAKE2b-256 | 0413cdfbbd1582b283de2101b1299f33628001d47e3ad101c1ef62b65156aa9a |
File details
Details for the file nitter_miner-0.0.1-py3-none-any.whl
.
File metadata
- Download URL: nitter_miner-0.0.1-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c7a9b282c3f943bb43f9c6949d7f3e36b45f0182b97b42bb46156b5e4040abbb |
|
MD5 | 02007154d83178157207321063af4fea |
|
BLAKE2b-256 | af7a5c454d2295be3bbd301b12972aa5712f7ba83db2e356a57cd45267931dd7 |