A framework for creating web content extractors

Project description

Scrapple is a framework for creating web scrapers and web crawlers from a key-value based configuration file. It provides a command line interface that runs a scraper or crawler from a given JSON configuration file, as well as a web interface for writing that configuration.
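
To illustrate the general idea of configuration-driven extraction, here is a minimal sketch using only the Python standard library. It does not use Scrapple and does not follow Scrapple's configuration schema: the key names ("fields", "name", "path", "attr"), the sample page, and the extract function are all assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical configuration: these key names are illustrative only and are
# NOT taken from Scrapple's actual configuration schema.
CONFIG_JSON = """
{
    "fields": [
        {"name": "title", "path": ".//h1", "attr": "text"},
        {"name": "link", "path": ".//a[@class='next']", "attr": "href"}
    ]
}
"""

# A small well-formed page stands in for a downloaded document.
PAGE = """
<html><body>
  <h1>Example page</h1>
  <a class="next" href="/page/2">Next</a>
</body></html>
"""


def extract(page_source, config):
    """Return a dict mapping each configured field name to its extracted value."""
    root = ET.fromstring(page_source.strip())
    result = {}
    for field in config["fields"]:
        node = root.find(field["path"])
        if node is None:
            # No element matched the configured path.
            result[field["name"]] = None
        elif field["attr"] == "text":
            result[field["name"]] = (node.text or "").strip()
        else:
            result[field["name"]] = node.get(field["attr"])
    return result


config = json.loads(CONFIG_JSON)
record = extract(PAGE, config)
print(json.dumps(record, sort_keys=True))
```

The point of the sketch is that the extraction logic stays generic while the configuration decides which fields are collected. Refer to the Scrapple documentation for the real configuration format and commands.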

On Debian-based systems, install the build dependencies and then install Scrapple from PyPI:

$ sudo apt-get install libxml2-dev libxslt-dev python-dev lib32z1-dev
$ pip install scrapple

You can read the complete documentation.

Maintained by Alex Mathew and Harish Balakrishnan.

History

0.2.4 - 2015-04-13

  • Update documentation
  • Minor fixes

0.2.3 - 2015-03-11

  • Include implementation to use csv as the output format

0.2.2 - 2015-02-22

  • Fix bug in generate script template

0.2.1 - 2015-02-21

  • Update tests

0.2.0 - 2015-02-20

  • Include implementation for scrapple run and scrapple generate for crawlers
  • Modify web interface for editing scraper config files
  • Revise skeleton configuration files

0.1.1 - 2015-02-10

  • Release on PyPI with revisions
  • Include web interface for editing scraper config files
  • Modify implementations of certain functions

0.1.0 - 2015-02-04

  • First release on PyPI

Download files

Files for scrapple, version 0.2.4
  • scrapple-0.2.4.linux-i686.tar.gz (318.7 kB): source
  • scrapple-0.2.4-py2.7.egg (334.5 kB): egg, Python 2.7
  • scrapple-0.2.4-py2-none-any.whl (325.7 kB): wheel, py2
  • scrapple-0.2.4-py2.py3-none-any.whl (325.7 kB): wheel, py2.py3
  • scrapple-0.2.4.tar.gz (1.8 MB): source
