Python framework to scrape PasteBin pastes and analyze them
pastepwn - Paste-Scraping Python Framework
Pastebin is a very helpful tool for storing, or rather sharing, ASCII-encoded data online. In the world of OSINT, researchers all around the world use Pastebin to retrieve, for example, leaked account data in order to find indicators of security breaches.
Pastepwn is a framework to scrape pastes and scan them for certain indicators. There are several analyzers and actions to be used out-of-the-box, but it is also easily extensible - you can create your own analyzers and actions on the fly.
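To make the analyzer/action idea concrete, here is a minimal, library-independent sketch. It does not use pastepwn's own classes: `Paste`, `EmailAnalyzer`, `CollectAction` and `scan` are hypothetical stand-ins written only for illustration, and the method names `match` and `perform` are assumptions of this sketch. An analyzer decides whether a paste is interesting, and an action reacts to every match.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for a scraped paste (not pastepwn's own paste class).
Paste = namedtuple("Paste", ["key", "body"])


class EmailAnalyzer:
    """Flags pastes that contain something shaped like an email address."""

    PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def __init__(self, action):
        self.action = action

    def match(self, paste):
        # Return every email-like string found in the paste body.
        return self.PATTERN.findall(paste.body)


class CollectAction:
    """Records matches in memory; a real action might send a notification."""

    def __init__(self):
        self.hits = []

    def perform(self, paste, matches):
        self.hits.append((paste.key, matches))


def scan(pastes, analyzers):
    """Run every analyzer on every paste and trigger its action on a match."""
    for paste in pastes:
        for analyzer in analyzers:
            matches = analyzer.match(paste)
            if matches:
                analyzer.action.perform(paste, matches)


action = CollectAction()
pastes = [
    Paste("abc123", "login: alice@example.com / hunter2"),
    Paste("def456", "nothing to see here"),
]
scan(pastes, [EmailAnalyzer(action)])
print(action.hits)  # [('abc123', ['alice@example.com'])]
```

In the framework itself the out-of-the-box analyzers and actions take the place of these toy classes; the sketch only shows how the two roles fit together.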
Please note: This framework is not to be used for illegal actions. It can be used for querying public Pastebin pastes for e.g. your username or email address in order to increase your own security.
⚠️ Important note
In April 2020 Pastebin disabled access to its scraping API for a short period of time. At first the scraping API could not be accessed in any way; later, access to the API setup page was re-enabled. Since then, however, it has not been possible to scrape "text" pastes, only pastes with some kind of syntax set. That reduces the number of scrapable pastes to a minimum, which reduces the usefulness of this tool.
Setting up pastepwn
To use the pastepwn framework you need to follow these simple steps:
- Make sure to have a Pastebin premium account!
- Install pastepwn via pip (pip3 install pastepwn)¹
- Create a file (e.g. main.py) in your project root, where you put your code²
- Fill that file with content - add analyzers and actions. Check the example implementation.
¹ Note that pastepwn only works with Python 3.5 or above (so better use pip3)
² If you want to store all pastes, make sure to set up a mongodb, mysql or sqlite instance
Behind a proxy
There are two ways to use this tool behind a proxy:
- Define the following environment variables: HTTP_PROXY, HTTPS_PROXY, NO_PROXY.
- When initializing the PastePwn object, use the proxies argument. proxies is a dict as defined in requests' documentation.
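As a rough sketch of the second option: requests documents the proxies dict as a mapping from URL scheme to proxy URL. The proxy addresses below are hypothetical placeholders, and `create_pastepwn` is a helper name invented for this example; only the proxies argument itself comes from the description above.

```python
# Hypothetical proxy addresses; replace them with your own.
proxies = {
    "http": "http://proxy.example.com:3128",
    "https": "http://proxy.example.com:3128",
}


def create_pastepwn(proxies):
    """Build a PastePwn object and hand it the proxies dict."""
    # Imported lazily so the dict above can be inspected without pastepwn installed.
    from pastepwn import PastePwn

    return PastePwn(proxies=proxies)
```

Calling create_pastepwn(proxies) from your main.py would then give you the object to which analyzers are added.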
Troubleshooting
If you run into trouble, check out the wiki pages first. If your question or issue is not resolved there, feel free to create an issue or contact me on Telegram.
Roadmap and ToDos
Check the bug tracker on GitHub to get an up-to-date status about features and ToDos.
- REST API for querying paste data (will be another project)
- Add a helpful wiki with instructions and examples