Screen scraping and web crawling framework

Project Description

Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the hard Twisted dependency.

Features:

  • Pure Python
  • Only one dependency, and only on Python 2.x: the backport of the concurrent.futures package
  • Supports single-file applications; Pomp doesn’t force a specific project layout or other restrictions.
  • Pomp is a meta framework like Paste: you may use it to create your own scraping framework.
  • Extensible networking: you may use any sync or async method.
  • No parsing libraries in the core; use your preferred approach.
  • Pomp instances may be distributed and are designed to work with an external queue.
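
The pluggable design described in the list above can be pictured with a small, framework-independent sketch. It does not use Pomp's API: crawl, fetch, and extract are hypothetical names chosen for illustration, and a thread pool from concurrent.futures stands in for whichever sync or async transport you prefer.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def crawl(entry_urls, fetch, extract, max_workers=4):
    """Breadth-first crawl with a pluggable downloader and parser.

    fetch(url) -> body and extract(url, body) -> (items, next_urls)
    are supplied by the caller (a hypothetical interface, not Pomp's).
    """
    queue = deque(entry_urls)  # an external queue could take this role
    seen = set(entry_urls)
    items = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while queue:
            batch = list(queue)
            queue.clear()
            # Download the batch concurrently; map() keeps input order.
            for url, body in zip(batch, pool.map(fetch, batch)):
                found, next_urls = extract(url, body)
                items.extend(found)
                for next_url in next_urls:
                    if next_url not in seen:
                        seen.add(next_url)
                        queue.append(next_url)
    return items
```

Because the downloader and the parser are plain callables, either one can be replaced without touching the crawl loop, which is the kind of separation the feature list describes.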

Pomp makes no attempt to accommodate:

  • redirects
  • proxies
  • caching
  • database integration
  • cookies
  • authentication
  • etc.

If you want proxies, redirects, or similar, you may use the excellent requests library as the Pomp downloader.
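
As a rough illustration of delegating those concerns to requests, the sketch below wraps a requests session in a callable that takes a URL and returns the response text. RequestsFetcher is a hypothetical name and this is not Pomp's downloader interface; wiring such a callable into Pomp's own base classes is left to the Pomp docs. Only standard requests calls are used (Session.get with proxies, timeout, and allow_redirects).

```python
class RequestsFetcher(object):
    """Callable downloader backed by requests (illustrative, not Pomp's API)."""

    def __init__(self, session=None, proxies=None, timeout=10.0):
        if session is None:
            # Third-party dependency; imported lazily so that a stand-in
            # session object can be injected instead (e.g. in tests).
            import requests
            session = requests.Session()
        self.session = session    # a session also keeps cookies between calls
        self.proxies = proxies    # e.g. {"http": "http://localhost:3128"}
        self.timeout = timeout

    def __call__(self, url):
        response = self.session.get(
            url,
            proxies=self.proxies,
            timeout=self.timeout,
            allow_redirects=True,  # let requests follow redirects
        )
        response.raise_for_status()  # raise on 4xx/5xx responses
        return response.text
```

Keeping redirects, proxies, and cookies inside the downloader means the crawling logic never needs to know about them.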

Pomp examples

Pomp docs

Pomp is written and maintained by Evgeniy Tatarkin and is licensed under the BSD license.

Release History

  • 0.2.1 (this version)
  • 0.2
  • 0.2.dev0
  • 0.1.2.dev
  • 0.1.1.dev
  • 0.1
  • 0.0.2.dev
  • 0.0.1.dev

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name                                    Version  File Type  Upload Date
pomp-0.2.1-py2.py3-none-any.whl (18.1 kB)    3.5      Wheel      Sep 12, 2016
pomp-0.2.1.tar.gz (17.5 kB)                           Source     Sep 12, 2016
