Screen scraping and web crawling framework
Project description
Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the hard Twisted dependency.
Features:
- Pure Python
- Only one dependency, and only on Python 2.x: concurrent.futures (the backport of the Python 3 package)
- Supports single-file applications; Pomp doesn’t force a specific project layout or impose other restrictions.
- Pomp is a meta framework like Paste: you may use it to create your own scraping framework.
- Extensible networking: you may use any sync or async method.
- No parsing libraries in the core; use your preferred approach.
- Pomp instances may be distributed and are designed to work with an external queue.
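The "extensible networking" point can be illustrated with a minimal sketch of a pluggable downloader. This is not Pomp's actual API: the class name ThreadedDownloader, its download method, and the fake_fetch helper are hypothetical names chosen for illustration. The sketch only shows the idea that any blocking fetch callable can be swapped in and run concurrently with concurrent.futures.

```python
from concurrent.futures import ThreadPoolExecutor


class ThreadedDownloader:
    """Hypothetical downloader (not Pomp's real base class).

    Wraps any blocking callable of the form fetch(url) -> result
    and runs it in a thread pool.
    """

    def __init__(self, fetch, max_workers=4):
        self.fetch = fetch              # pluggable: urllib, requests, a stub, ...
        self.max_workers = max_workers

    def download(self, urls):
        """Fetch all URLs concurrently; return (url, result) pairs in input order."""
        urls = list(urls)               # allow generators; we iterate twice below
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            # Executor.map yields results in the same order as its input.
            return list(zip(urls, pool.map(self.fetch, urls)))


def fake_fetch(url):
    """Stand-in for a real network call, so the sketch runs offline."""
    return "<html>%s</html>" % url
```

Replacing fake_fetch with a function that performs real HTTP requests changes the transport without touching the rest of the code, which is the kind of separation the feature list describes.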
Pomp makes no attempt to accommodate:
- redirects
- proxies
- caching
- database integration
- cookies
- authentication
- etc.
If you want proxies, redirects, or similar, you may use the excellent requests library as the Pomp downloader.
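As a rough sketch of that suggestion, the class below delegates redirects and proxies to a requests session. The class name RequestsDownloader and its get method are hypothetical and do not reflect Pomp's real downloader interface, which this description does not show; only the requests calls (Session, Session.get with proxies, timeout and allow_redirects) are real. The session is injectable so the class can be exercised without network access.

```python
class RequestsDownloader:
    """Hypothetical downloader built on requests (not Pomp's real interface)."""

    def __init__(self, session=None, proxies=None, timeout=10.0):
        if session is None:
            # Third-party dependency, imported lazily so that a
            # custom or fake session can be injected instead.
            import requests
            session = requests.Session()
        self.session = session
        self.proxies = proxies      # e.g. {"http": "http://proxy.local:8080"}
        self.timeout = timeout

    def get(self, url):
        """Fetch url; return (final_url, status_code, body_text)."""
        response = self.session.get(
            url,
            proxies=self.proxies,
            timeout=self.timeout,
            allow_redirects=True,   # requests follows redirects for us
        )
        # response.url is the URL after any redirects were followed.
        return response.url, response.status_code, response.text
```

Because a session also persists cookies between requests, the same object would cover several of the items Pomp deliberately leaves out of its core.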
Pomp is written and maintained by Evgeniy Tatarkin and is licensed under the BSD license.
Download files
Source distribution: pomp-0.2.1.tar.gz (17.5 kB)
Built distribution: pomp-0.2.1-py2.py3-none-any.whl (18.1 kB)