A small web crawler for the pastebin.com website.
Simple-Pastebin-Parser
This is a simple parser for the pastebin.com website.
It iterates over posts and parses their elements using lxml.
Installation:
pip install simple-pastebin-parser
Example usage:
import simple_pastebin_parser
for paste in simple_pastebin_parser.get_pastes():
    print("Title: ", paste.Title)
    print("Author: ", paste.Author)
    print("date: ", paste.Date)
    print("Content: ")
    print(paste.Content)
    print("*" * 20)
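Because the loop above only reads attributes from each paste, the same pattern can be used to filter posts before printing them. The following is a minimal sketch, not part of the package: `filter_pastes` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real pastes so the example runs offline. It assumes only the `Title`, `Author`, `Date` and `Content` attributes shown in the usage example.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def filter_pastes(pastes: Iterable, keyword: str) -> Iterator:
    """Yield pastes whose Title or Content contains keyword (case-insensitive).

    Hypothetical helper: it relies only on the attributes shown in the
    usage example, so it accepts any iterable of paste-like objects.
    """
    needle = keyword.lower()
    for paste in pastes:
        title = (paste.Title or "").lower()
        content = (paste.Content or "").lower()
        if needle in title or needle in content:
            yield paste


# Stand-in objects (not real pastes) so the sketch runs without network access.
sample = [
    SimpleNamespace(Title="Config dump", Author="alice", Date=None,
                    Content="password=hunter2"),
    SimpleNamespace(Title="Untitled", Author="bob", Date=None,
                    Content="hello world"),
]

matches = list(filter_pastes(sample, "password"))
for paste in matches:
    print(paste.Title, "by", paste.Author)
```

With the real package, `sample` would presumably be replaced by `simple_pastebin_parser.get_pastes()`; since the helper is a generator, pastes would be filtered as they arrive rather than collected first.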
Release notes:
v0.1.0 - P.O.C
Initial proof of concept. Nothing special, just doing the dirty work of parsing the posts.
How to execute:
1. Create a Python 3.6 virtual environment.
2. Install the requirements.
3. Run: python poc.py
v0.2.5 (2020-03-07)
integration with Travis CI
v0.2.6 (2020-03-07)
changed the POC code to work with the installed PyPI package
v0.3.0 (2020-03-07)
created the Paste() object for pastebin posts
ability to stream data
v0.3.3 (2020-03-07)
small fixes
v0.3.5 (2020-03-07)
update README
v0.4.0 (2020-03-08)
added documentation
cleaned up most PEP 8 issues
some tests
v0.5.0 (2020-03-08)
parse date in UTC
add some logs
add id to Paste()
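To illustrate what "parse date in UTC" can mean in practice, here is a minimal sketch using only the standard library. It is not the package's implementation: the `to_utc` helper and the timestamp format below are assumptions chosen for illustration, and the format pastebin.com actually uses may differ.

```python
from datetime import datetime, timezone


def to_utc(dt: datetime) -> datetime:
    """Return a timezone-aware datetime in UTC.

    Hypothetical helper: naive datetimes are assumed to already be in UTC,
    aware datetimes are converted from their own offset.
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


# Assumed example timestamp with an explicit UTC offset (illustrative only).
raw = "2020-03-08 14:30:00 -0500"
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S %z")
utc_date = to_utc(parsed)

print(utc_date.isoformat())  # 2020-03-08T19:30:00+00:00
```

Normalizing to UTC at parse time means dates from different pastes can be compared and sorted directly, regardless of the offset they were published with.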
Hashes for simple_pastebin_parser-0.5.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | ecc9f3e75c0673c6e10ed116ec057e5719e3582d43c49269660291d6adac38b1
MD5 | 263cd950e376bfd01206bf89ae1d3487
BLAKE2b-256 | de9304b3acb1e4710a83a07f05a92dfa6b4aa4d5d63e4e3720ecd1ba8c6461ee
Hashes for simple_pastebin_parser-0.5.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 798093067df8073ea5d1dabc2e098569bfa01de4ab42dbfcb18d084e6a057ebc
MD5 | b0a9e06de9cf345637e42ae203835cfe
BLAKE2b-256 | 8bfa62c06224b6fdf29bccf7ed42a99431e823db7d12930616ef2844d0432a55