Python framework to scrape PasteBin pastes and analyze them

Project description

pastepwn - Paste-Scraping Python Framework

Pastebin is a useful tool for storing and sharing ASCII-encoded data online. In the OSINT world, researchers use Pastebin to retrieve, for example, leaked account data and to find indicators of security breaches.

Pastepwn is a framework to scrape pastes and scan them for certain indicators. Several analyzers and actions can be used out of the box, and the framework is easily extensible: you can create your own analyzers and actions on the fly.
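
To illustrate the analyzer/action idea described above, here is a minimal, self-contained sketch that uses only the standard library. It is not pastepwn's own implementation: the names Paste, RegexAnalyzer, scan and record are hypothetical and exist only for this example. An analyzer decides whether a paste matches, and every action attached to a matching analyzer is then run.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Paste:
    """Hypothetical stand-in for a scraped paste."""
    key: str
    body: str


# An action is any callable that receives the paste and the analyzer name.
Action = Callable[[Paste, str], None]


@dataclass
class RegexAnalyzer:
    """Hypothetical analyzer: matches when the pattern occurs in the paste body."""
    name: str
    pattern: str
    actions: List[Action] = field(default_factory=list)

    def match(self, paste: Paste) -> bool:
        return re.search(self.pattern, paste.body) is not None


def scan(paste: Paste, analyzers: List[RegexAnalyzer]) -> List[str]:
    """Run every analyzer on the paste and fire the actions of those that match."""
    matched = []
    for analyzer in analyzers:
        if analyzer.match(paste):
            matched.append(analyzer.name)
            for action in analyzer.actions:
                action(paste, analyzer.name)
    return matched


hits = []


def record(paste: Paste, analyzer_name: str) -> None:
    """Example action: remember which analyzer matched which paste."""
    hits.append((paste.key, analyzer_name))


mail = RegexAnalyzer("mail", r"[\w.+-]+@[\w-]+\.[\w.-]+", [record])
word = RegexAnalyzer("password", r"(?i)password", [record])

result = scan(Paste("abc123", "user@example.com Password: hunter2"), [mail, word])
```

In this sketch a custom analyzer is just another object with a match method, and a custom action is just another callable, which is the kind of extensibility the paragraph above refers to.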

Please note: This framework is not to be used for illegal actions. It can be used for querying public Pastebin pastes for e.g. your username or email address in order to increase your own security.

⚠️ Important note

In April 2020 Pastebin disabled access to their scraping API for a short period of time. At first the scraping API could not be accessed in any way; later, access to the API setup page was re-enabled. Since then, however, it has not been possible to scrape "text" pastes, only pastes that have some kind of syntax set. That reduces the number of available pastes to a minimum, which reduces the usefulness of this tool.

Setting up pastepwn

To use the pastepwn framework you need to follow these simple steps:

  1. Make sure to have a Pastebin premium account!
  2. Install pastepwn via pip (pip3 install pastepwn)¹
  3. Create a file (e.g. main.py) in your project root to hold your code²
  4. Fill that file with content - add analyzers and actions. Check the example implementation.

¹ Note that pastepwn only works with Python 3.6 or above
² If you want to store all pastes, make sure to set up a MongoDB, MySQL or SQLite instance
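
As a rough sketch of what the main.py from steps 3 and 4 might look like: the Python version check follows the footnote above, while the pastepwn names used inside build_scraper (PastePwn, LogAction, MailAnalyzer, WordAnalyzer, add_analyzer, start) are assumptions based on the project's example implementation and should be verified against the installed version. The helper names python_is_supported and build_scraper are hypothetical.

```python
import sys

# Minimum interpreter version stated in the footnote above.
MIN_PYTHON = (3, 6)


def python_is_supported(version=None):
    """Return True if the given (or running) interpreter is 3.6 or newer."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def build_scraper():
    """Assemble a scraper with two analyzers that share one logging action.

    The imports are done lazily so this file can be loaded even when
    pastepwn is not installed yet. All pastepwn names below are assumed
    from the project's example implementation, not verified here.
    """
    from pastepwn import PastePwn
    from pastepwn.actions import LogAction
    from pastepwn.analyzers import MailAnalyzer, WordAnalyzer

    scraper = PastePwn()  # assumption: no database passed, so pastes are not stored
    log_action = LogAction()
    scraper.add_analyzer(MailAnalyzer(log_action))
    scraper.add_analyzer(WordAnalyzer(log_action, "password"))
    return scraper


# To start scraping (requires a Pastebin premium account, see step 1):
# build_scraper().start()
```

Keeping the setup in a function means the file can be imported and checked without immediately contacting Pastebin.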

Behind a proxy

There are two ways to use this tool behind a proxy:

  • Define the following environment variables: HTTP_PROXY, HTTPS_PROXY, NO_PROXY.
  • When initializing the PastePwn object, use the proxies argument. proxies is a dict as defined in requests' documentation.
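
The two options above can be combined in a small helper. The following is a sketch, not part of pastepwn: the function name proxies_from_env and the proxy URL are hypothetical. It builds a dict in the scheme-to-URL form that requests expects for its proxies parameter, which could then be passed as the proxies argument mentioned above.

```python
import os


def proxies_from_env(environ=None):
    """Build a requests-style proxies dict from HTTP_PROXY / HTTPS_PROXY.

    Accepts any mapping for easier testing; falls back to os.environ.
    Lower-case variable names are checked as a fallback.
    """
    environ = os.environ if environ is None else environ
    proxies = {}
    for scheme in ("http", "https"):
        value = environ.get(f"{scheme.upper()}_PROXY") or environ.get(f"{scheme}_proxy")
        if value:
            proxies[scheme] = value
    return proxies


# Hypothetical proxy address, used for illustration only.
example = proxies_from_env({
    "HTTP_PROXY": "http://proxy.example:3128",
    "HTTPS_PROXY": "http://proxy.example:3128",
})

# The resulting dict would then be handed to the framework, e.g.:
# PastePwn(proxies=example)
```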

Troubleshooting

If you run into trouble, check the wiki pages first. If they do not resolve your question or issue, feel free to create an issue or contact me on Telegram.

Roadmap and ToDos

Check the bug tracker on GitHub for the up-to-date status of features and ToDos.

  • REST API for querying paste data (will be another project)
  • Add a helpful wiki with instructions and examples

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pastepwn-2.1.0.tar.gz (68.3 kB view details)

Uploaded Source

Built Distribution

pastepwn-2.1.0-py3-none-any.whl (123.8 kB view details)

Uploaded Python 3

File details

Details for the file pastepwn-2.1.0.tar.gz.

File metadata

  • Download URL: pastepwn-2.1.0.tar.gz
  • Upload date:
  • Size: 68.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for pastepwn-2.1.0.tar.gz
Algorithm Hash digest
SHA256 f596da9e2bad04db6fe5ee7d4d19bd27f14bcb2324579529d11cab8468d4bc30
MD5 4153130a62d76857b0cb0479a9868f3c
BLAKE2b-256 37278d6926847641757737b864bf00ab824f9a364311319b4236033406abfedc

See more details on using hashes here.

File details

Details for the file pastepwn-2.1.0-py3-none-any.whl.

File metadata

  • Download URL: pastepwn-2.1.0-py3-none-any.whl
  • Upload date:
  • Size: 123.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for pastepwn-2.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0e373f7fe9f060e148caf8034352e92566bc4b0d639fe9082053542b1847cf73
MD5 391d0fb7038094ec9f588174ebd5ad4e
BLAKE2b-256 5c72a5600bc3d69508d7aa9959a4474c2a084347048febb6fdca254ec12f5c73

See more details on using hashes here.
