A high-level Web Crawling and Web Scraping framework
Overview
Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
For more information, including a list of features, see the Scrapy homepage: http://scrapy.org
Requirements
Python 2.7 or Python 3.3+
Works on Linux, Windows, Mac OS X, and BSD
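The version requirement above excludes Python 3.0 through 3.2, which is easy to miss. As a minimal sketch, the check can be written as follows; the function name `is_supported` is hypothetical and not part of Scrapy, and it only encodes the requirement as stated here, not whether a given interpreter actually runs this release.

```python
import sys


def is_supported(version_info):
    """Return True for Python 2.7 or Python 3.3 and later.

    Hypothetical helper: it mirrors the stated requirement only.
    """
    major, minor = version_info[0], version_info[1]
    if major == 2:
        # On the 2.x line, only 2.7 is listed as supported.
        return minor == 7
    # On the 3.x line, anything from 3.3 upward satisfies "3.3+".
    return (major, minor) >= (3, 3)


if __name__ == "__main__":
    # Report whether the running interpreter meets the requirement.
    print(is_supported(sys.version_info))
```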
Install
The quick way:
pip install scrapy
For more details, see the installation section of the documentation: http://doc.scrapy.org/en/latest/intro/install.html
Releases
You can download the latest stable and development releases from: http://scrapy.org/download/
Documentation
Documentation is available online at http://doc.scrapy.org/ and in the docs directory.
Community (blog, Twitter, mailing list, IRC)
Contributing
Please note that this project is released with a Contributor Code of Conduct (see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md).
By participating in this project you agree to abide by its terms. Please report unacceptable behavior to opensource@scrapinghub.com.
Companies using Scrapy
Commercial Support
Download files
Hashes for cyberplant-Scrapy-1.2.0.dev2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 1c2e64205ea728fa177147eb8179c94aebbe6a919b5f56c38f75b558bb0bfb46
MD5 | 18bd13f36466d2596d6e82389ddebb7d
BLAKE2b-256 | dc4488b02cbc6b924e6e4239befbe5e29c5ed3325640efb1c7c3a60754c97e2a
Hashes for cyberplant_Scrapy-1.2.0.dev2-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 41393a25df0bb371a2c7a05ead2e64b8d2df1c263109e3fac2971ca0f35974cb
MD5 | 29819d0e3f51745d1667a66dfa4dde23
BLAKE2b-256 | 424198d4ed775d4122ccea52e09fb1b5f4471686edd678c39896f51437720856
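A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path in the example comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    # Normalize whitespace and case so a pasted digest still compares equal.
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (path is an assumption about where the file was saved):
# matches_published_hash(
#     "cyberplant-Scrapy-1.2.0.dev2.tar.gz",
#     "1c2e64205ea728fa177147eb8179c94aebbe6a919b5f56c38f75b558bb0bfb46",
# )
```

A result of `False` means the file differs from the one that was published, for example because the download was truncated or altered.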