Scrapy Save Statistics: Save statistics extension for Scrapy

Saves Scrapy crawl statistics to MongoDB for analytics.

Install

The quick way:

pip install scrapy-save-statistics

Or install from GitHub:

pip install git+https://github.com/light4/scrapy-save-statistics.git@master

Or check out the source and run:

python setup.py install

settings.py

MongoDB settings used to save the statistics. A statistics database is required.

MONGO_HOST = "127.0.0.1"
MONGO_PORT = 27017
MONGO_DB = "myspider"
MONGO_STATISTICS = "statistics"

EXTENSIONS = {
    'scrapy_save_statistics.SaveStatistics': 100,
}
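As an illustration of how these settings fit together, here is a minimal sketch that reads the four MONGO_* values and builds a standard MongoDB connection URI from the host and port. The helper name mongo_target is hypothetical and not part of scrapy-save-statistics, and the extension's own connection code may differ. The source does not say whether MONGO_STATISTICS names a collection or a separate database, so the sketch only passes the value through.

```python
# Hypothetical helper for illustration; not part of scrapy-save-statistics.
def mongo_target(settings):
    """Return (uri, db_name, statistics_name) from the MONGO_* settings."""
    host = settings.get("MONGO_HOST", "127.0.0.1")
    port = int(settings.get("MONGO_PORT", 27017))
    db_name = settings["MONGO_DB"]
    # Assumption: the value is passed through unchanged, whatever it names.
    statistics_name = settings.get("MONGO_STATISTICS", "statistics")
    uri = "mongodb://{}:{}".format(host, port)
    return uri, db_name, statistics_name


# A plain dict stands in for the Scrapy settings object here.
settings = {
    "MONGO_HOST": "127.0.0.1",
    "MONGO_PORT": 27017,
    "MONGO_DB": "myspider",
    "MONGO_STATISTICS": "statistics",
}
uri, db_name, statistics_name = mongo_target(settings)
print(uri, db_name, statistics_name)
```

The resulting URI can be handed to any MongoDB client to check that the server is reachable before starting a crawl.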

Spider

The spider must have a statistics attribute, and each record in it must contain spider_url. That information is saved to MongoDB.

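import scrapy


class TestSpider(scrapy.Spider):
    name = "test"

    def __init__(self, name=None, **kwargs):
        super(TestSpider, self).__init__(name=name, **kwargs)
        # One dict per crawl record; each must contain 'spider_url'.
        self.statistics = []

    def parse(self, response):
        # Placeholder values for illustration; compute them from your own pages.
        spider_url = response.url
        expected_crawl_num = 0
        total_page = 0
        crawl_info = {'spider_url': spider_url,
                      'expected_crawl_num': expected_crawl_num,
                      'pages': total_page}
        self.statistics.append(crawl_info)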
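Because every record is expected to carry spider_url, it can help to check the list before the crawl finishes. The following is a minimal sketch of such a check. The function invalid_records is hypothetical and not provided by the extension, and the example URL and numbers are made up for illustration.

```python
# Hypothetical check for illustration; not part of scrapy-save-statistics.
def invalid_records(statistics):
    """Return indexes of records that are not dicts with a non-empty 'spider_url'."""
    bad = []
    for index, record in enumerate(statistics):
        if not isinstance(record, dict) or not record.get("spider_url"):
            bad.append(index)
    return bad


records = [
    {"spider_url": "http://example.com/list", "expected_crawl_num": 20, "pages": 2},
    {"expected_crawl_num": 5, "pages": 1},  # missing spider_url
]
print(invalid_records(records))  # prints [1]
```

One possible place to call it is the spider's closed() method, passing self.statistics and logging any indexes it returns.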
