Django application which crawls and downloads online content following instructions
Project description
django-scraper is a Django application which crawls and downloads online content following configurable instructions.
Extract content from given websites/pages using XPath queries.
Automatically browse and download content from related pages, up to a given depth.
Support metadata extraction along with other content.
Provide content refinement rules and blacklisted-word filtering.
Store downloaded content and prevent duplicates.
Support HTTP and HTTPS proxies.
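The features above (XPath extraction, depth-limited crawling, and duplicate prevention) can be illustrated with a minimal sketch. This is not django-scraper's API: it uses only the Python standard library, and the names `PAGES` and `crawl` are hypothetical. An in-memory dictionary of well-formed XHTML strings stands in for real HTTP responses, and duplicates are detected by hashing the extracted text, which is one possible approach and not necessarily the one the application uses.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical in-memory "site" standing in for real HTTP responses.
PAGES = {
    "/index": "<html><body><h1>Home</h1><a href='/a'>A</a><a href='/b'>B</a></body></html>",
    "/a": "<html><body><h1>Article</h1><a href='/c'>C</a></body></html>",
    "/b": "<html><body><h1>Article</h1></body></html>",  # same content as /a
    "/c": "<html><body><h1>Deep page</h1></body></html>",
}


def crawl(start, depth, content_xpath=".//h1", link_xpath=".//a"):
    """Breadth-first crawl up to `depth` link hops, extracting text via XPath."""
    seen_urls = {start}
    seen_hashes = set()
    results = []
    queue = deque([(start, 0)])
    while queue:
        url, level = queue.popleft()
        root = ET.fromstring(PAGES[url])
        # Extract content with the (limited) XPath subset ElementTree supports.
        for node in root.findall(content_xpath):
            text = (node.text or "").strip()
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if digest not in seen_hashes:  # skip content already stored
                seen_hashes.add(digest)
                results.append((url, text))
        # Follow links only while the depth limit has not been reached.
        if level < depth:
            for link in root.findall(link_xpath):
                href = link.get("href")
                if href in PAGES and href not in seen_urls:
                    seen_urls.add(href)
                    queue.append((href, level + 1))
    return results


if __name__ == "__main__":
    for url, text in crawl("/index", depth=2):
        print(url, text)
```

With a depth of 1 the sketch visits /index, /a, and /b but keeps only one "Article" entry, because /b repeats the content of /a. Real-world HTML is rarely well-formed XML, so an actual crawler would need an HTML-aware parser with full XPath support.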
Documentation
The full documentation is not ready yet. For notes about installation and usage, see: https://github.com/zniper/django-scraper
Support
If you have any questions or ideas regarding this application, please email me[at]zniper.net