Scrapy Save Statistics: Save statistics extension for Scrapy
Project description
Saves crawl statistics to MongoDB for later analytics.
Install
The quick way:
pip install scrapy-save-statistics
Or install from GitHub:
pip install git+git://github.com/light4/scrapy-save-statistics.git@master
Or checkout the source and run:
python setup.py install
settings.py
MongoDB settings for saving statistics; a statistics collection is required.
MONGO_HOST = "127.0.0.1"
MONGO_PORT = 27017
MONGO_DB = "myspider"
MONGO_STATISTICS = "statistics"
EXTENSIONS = {
'scrapy_save_statistics.SaveStatistics': 100,
}
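A Scrapy extension typically pulls these values out of the crawler settings in its `from_crawler` hook. The sketch below shows that pattern with the documented defaults; the helper name and fallbacks are illustrative, not necessarily this package's exact code:

```python
def read_mongo_settings(settings):
    """Pull the MONGO_* values used by the extension out of a
    Scrapy-style settings mapping, falling back to the values
    shown in the settings.py example above."""
    return {
        "host": settings.get("MONGO_HOST", "127.0.0.1"),
        "port": settings.get("MONGO_PORT", 27017),
        "db": settings.get("MONGO_DB", "myspider"),
        "collection": settings.get("MONGO_STATISTICS", "statistics"),
    }

conf = read_mongo_settings({"MONGO_DB": "myspider"})
print(conf["collection"])  # statistics
```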
Spider
The spider must have a `statistics` attribute, a list of dicts each containing `spider_url`. The extension saves that info to MongoDB.
import scrapy

class TestSpider(scrapy.Spider):
    name = "test"

    def __init__(self, name=None, **kwargs):
        super(TestSpider, self).__init__(name=name, **kwargs)
        # The extension reads this list when the spider finishes.
        self.statistics = []

    def parse(self, response):
        # expected_crawl_num and total_page are placeholders: compute
        # them from your own crawl logic before appending.
        crawl_info = {'spider_url': response.url,
                      'expected_crawl_num': expected_crawl_num,
                      'pages': total_page}
        self.statistics.append(crawl_info)
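Conceptually, when the spider closes the extension reads `spider.statistics` and writes one document per entry to the configured collection. A minimal sketch of that save step, assuming the document shape above (the function and field names here are illustrative, not the package's actual API):

```python
import datetime

def build_statistics_docs(spider_name, statistics):
    """Attach the spider name and a timestamp to each crawl_info dict
    so every entry is self-describing once stored in MongoDB."""
    now = datetime.datetime.utcnow()
    return [dict(info, spider=spider_name, saved_at=now) for info in statistics]

# On spider_closed the extension would then do something like
# (requires pymongo and the MONGO_* settings above):
#   client = pymongo.MongoClient(MONGO_HOST, MONGO_PORT)
#   client[MONGO_DB][MONGO_STATISTICS].insert_many(
#       build_statistics_docs(spider.name, spider.statistics))

docs = build_statistics_docs("test", [
    {"spider_url": "http://example.com", "expected_crawl_num": 10, "pages": 3},
])
print(docs[0]["spider"])  # test
```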
Download files
Download the file for your platform.
Source Distributions
Built Distribution
File details
Details for the file scrapy_save_statistics-0.2-py2.py3-none-any.whl.
File metadata
- Download URL: scrapy_save_statistics-0.2-py2.py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aeb9743d06da179c480c5747cd086b1c19c5d0757295b72e950bc43c223f7e53 |
| MD5 | 571d7b75a8fa663a540ed5b7c30b0dc7 |
| BLAKE2b-256 | c0ddd05e4c25af97e1bd384e1666b12d59e9a452f032b975e182a1dea6b67be2 |