Extract accounts' identifiers from personal pages on various platforms
socid_extractor
Extract information about a user from profile webpages / API responses and save it in a machine-readable format.
Usage
As a command-line tool:
$ socid_extractor --url https://www.deviantart.com/muse1908
country: France
created_at: 2005-06-16 18:17:41
gender: female
username: Muse1908
website: www.patreon.com/musemercier
links: ['https://www.facebook.com/musemercier', 'https://www.instagram.com/muse.mercier/', 'https://www.patreon.com/musemercier']
tagline: Nothing worth having is easy...
Without installing:
$ ./run.py --url https://www.deviantart.com/muse1908
As a Python library:
>>> import socid_extractor, requests
>>> r = requests.get('https://www.patreon.com/annetlovart')
>>> socid_extractor.extract(r.text)
{'patreon_id': '33913189', 'patreon_username': 'annetlovart', 'fullname': 'Annet Lovart', 'links': "['https://www.facebook.com/322598031832479', 'https://www.instagram.com/annet_lovart', 'https://twitter.com/annet_lovart', 'https://youtube.com/channel/UClDg4ntlOW_1j73zqSJxHHQ']"}
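Note that in the output above the `links` value is a string holding a Python-style list literal, not a real list. A minimal sketch of turning it into a list, assuming the field keeps that shape (the helper name `parse_links` is hypothetical and not part of socid_extractor; the sample dict is a shortened copy of the output above):

```python
import ast


def parse_links(value):
    """Return the 'links' field as a list of URL strings.

    Hypothetical helper, not part of socid_extractor. It assumes the field
    is either a real list or a string containing a list literal, as in the
    sample output above.
    """
    if isinstance(value, (list, tuple)):
        return [str(v) for v in value]
    if not value:
        return []
    try:
        # literal_eval only evaluates literals, so it is safe on untrusted text
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError, TypeError):
        # Not a literal: treat the whole value as a single link
        return [str(value)]
    if isinstance(parsed, (list, tuple)):
        return [str(v) for v in parsed]
    return [str(parsed)]


# Shortened copy of the dict shown above
info = {
    'patreon_id': '33913189',
    'patreon_username': 'annetlovart',
    'links': "['https://www.facebook.com/322598031832479', "
             "'https://www.instagram.com/annet_lovart']",
}

links = parse_links(info.get('links'))
for url in links:
    print(url)
```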
Installation
$ pip3 install socid-extractor
The latest development version can be installed directly from GitHub:
$ pip3 install -U git+https://github.com/soxoj/socid_extractor.git
Sites and methods
More than 100 methods for different sites and platforms are supported!
- Google (all documents pages, maps contributions), cookies required
- Yandex (disk, albums, znatoki, music, realty, collections), cookies required to prevent captcha blocks
- Mail.ru (my.mail.ru user mainpage, photo, video, games, communities)
- Facebook (user & group pages)
- VK.com (user page)
- OK.ru (user page)
- Medium
- Flickr
- Tumblr
- TikTok
- GitHub
...and many others.
You can also check the tests file for data examples and the schemes file to explore all the methods.
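For the sites marked "cookies required", one possible approach with the library interface is to send your own browser cookies with the request and pass the response body to `extract`. The following is only a sketch: the helper names, the cookie names and the timeout are assumptions, and which cookies a given site actually needs is not specified here.

```python
def cookie_header_to_dict(cookie_str):
    """Turn a 'name=value; name2=value2' cookie string into a dict."""
    cookies = {}
    for part in cookie_str.split(';'):
        name, sep, value = part.strip().partition('=')
        if sep and name:
            cookies[name] = value
    return cookies


def fetch_and_extract(url, cookie_str='', timeout=10):
    """Fetch a page with the given cookies and run the extractor on it.

    Hypothetical wrapper: only socid_extractor.extract(text) is taken
    from the usage shown earlier; the rest is plain requests.
    """
    # Imported lazily so the cookie helper works without these packages
    import requests
    import socid_extractor

    response = requests.get(
        url,
        cookies=cookie_header_to_dict(cookie_str),
        timeout=timeout,
    )
    response.raise_for_status()
    return socid_extractor.extract(response.text)


# Placeholder cookie names and values, for illustration only
print(cookie_header_to_dict('session_id=abc123; lang=en'))
```

Calling `fetch_and_extract(url, cookie_str)` with a cookie string copied from a logged-in browser session would then return the same kind of dict as in the library example.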
When it may be useful
- Getting all available info by username and/or account UID. Examples: Week in OSINT, OSINTCurious
- User tracking: checking that an account was previously known (by ID) even if all its public info has changed. Examples: Aware Online
- Searching by commonly used cross-service UIDs (GAIA ID, Facebook UID, Yandex Public ID, etc.)
- DB leaks of forums and platforms in SQL format
- Indexed links that contain target profile ID
- Searching for tracking data by comparison with other IDs: how it works, how it can be used.
- Law enforcement online requests
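The "previously known by ID" use case above can be sketched as a comparison of two extraction results. This is an illustration only: the helper names are hypothetical, the rule that identifier fields are named `uid` or end in `_id` / `_uid` is an assumption based on the `patreon_id` key in the sample output, and the second snapshot is made-up data.

```python
def collect_ids(info):
    """Pick identifier-like fields from an extracted dict.

    Assumption: ID fields are named 'uid' or end in '_id' / '_uid'.
    """
    return {
        key: str(value)
        for key, value in info.items()
        if key == 'uid' or key.endswith('_id') or key.endswith('_uid')
    }


def shared_ids(first, second):
    """Return the ID fields that have the same value in both results."""
    ids_first = collect_ids(first)
    ids_second = collect_ids(second)
    return {
        key: value
        for key, value in ids_first.items()
        if ids_second.get(key) == value
    }


# Earlier snapshot, taken from the library example
earlier = {'patreon_id': '33913189', 'patreon_username': 'annetlovart'}
# Later snapshot with a changed username (made-up data for illustration)
later = {'patreon_id': '33913189', 'patreon_username': 'renamed_user'}

match = shared_ids(earlier, later)
if match:
    print('Same account, matched on:', match)
else:
    print('No shared identifiers')
```

Matching on stable IDs rather than usernames is what lets an account be recognised after a rename.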
Tools using socid_extractor
- Maigret - powerful namechecker, generate a report with all available info from accounts found.
- TheScrapper - scrape emails, phone numbers and social media accounts from a website.
- InfoHunter - An open source OSINT tool that allows you to search, collect and analyze information online to get a complete picture of the person or company you are interested in.
- YaSeeker - tool to gather all available information about Yandex account by login/email.
- Marple - scrape search engines results for a given username.
Testing
python3 -m pytest tests/test_e2e.py -n 10 -k 'not cookies' -m 'not github_failed and not rate_limited'
Contributing
See the separate contributing page if you want to add new methods or fix anything.
Hashes for socid_extractor-0.0.26-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 694d822589114c344cfe0feb2977d39b0530a9803bb4831ad3eea990a3fce933
MD5 | ad954196c2db5f869d785e1e49a8a95d
BLAKE2b-256 | e8e6828b7907ca29db725d5bc44ed4fa4686ac99747061cb3ae2f0823a2a7f94