Crawler integration with INSPIRE-HEP.
Project description
Crawler integration with INSPIRE-HEP using scrapy project HEPCrawl.
This module schedules crawler jobs on a Scrapyd instance that serves a Scrapy project. By default, that project is HEPCrawl.
It integrates directly with the invenio-workflows module to create a workflow for every record harvested by the crawler.
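To make the scheduling step concrete, here is a minimal sketch of how a job can be submitted to a Scrapyd instance through Scrapyd's schedule.json HTTP endpoint. It is not this module's own code: the helper names, the Scrapyd URL, the project name "hepcrawl", the spider name "arXiv" and the from_date argument are all assumptions chosen for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(scrapyd_url, project, spider, **spider_args):
    """Build a POST request for Scrapyd's schedule.json endpoint.

    Hypothetical helper: extra keyword arguments are sent as form fields,
    which Scrapyd forwards to the spider as spider arguments.
    """
    payload = {"project": project, "spider": spider}
    payload.update(spider_args)
    data = urlencode(payload).encode("utf-8")
    url = scrapyd_url.rstrip("/") + "/schedule.json"
    return Request(url, data=data, method="POST")


def schedule_job(scrapyd_url, project, spider, **spider_args):
    """Send the request and return Scrapyd's decoded JSON response.

    Hypothetical helper: requires a reachable Scrapyd instance.
    """
    request = build_schedule_request(scrapyd_url, project, spider, **spider_args)
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Scrapyd instance running locally on its default port, a call such as schedule_job("http://localhost:6800", "hepcrawl", "arXiv", from_date="2020-01-01") would queue one crawl. The request is built separately from the network call so that the payload can be inspected without a running server.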
This module is meant to be used only with the INSPIRE-HEP overlay. Use at your own risk.
Full documentation is hosted here: http://pythonhosted.org/inspire-crawler/
See also documentation of HEPCrawl: http://pythonhosted.org/hepcrawl/
Source distribution: inspire-crawler-2.0.3.tar.gz (34.5 kB)