A simple Python module for generating Scrapy projects that use Zyte API

zyte-api-convertor

A Python module that converts a Zyte API JSON payload into a Scrapy project configured for Zyte API. It uses Scrapy and the scrapy-zyte-api plugin to generate the project, and black to format the generated code.

Requirements

Python 3.6+
Scrapy
scrapy-zyte-api
black

Documentation

Zyte API Documentation

Test the Zyte API payload with Postman or curl first. Once it returns the desired response, pass the same payload to this module to convert it into a Scrapy project that uses Zyte API.
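
If you would rather check the payload from Python than from Postman or curl, the following is a minimal sketch that uses only the standard library. It assumes the Zyte API extract endpoint is https://api.zyte.com/v1/extract and that it accepts HTTP Basic authentication with the API key as the username and an empty password. build_extract_request is a hypothetical helper written for this illustration; it is not part of zyte-api-convertor.

```python
import base64
import json
import urllib.request

# Assumed endpoint for Zyte API extraction requests.
ZYTE_API_URL = "https://api.zyte.com/v1/extract"


def build_extract_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON payload."""
    # Basic auth: the API key is the username and the password is empty.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        ZYTE_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {"url": "https://httpbin.org/ip", "browserHtml": True, "screenshot": True}
request = build_extract_request(payload, "YOUR_API_KEY")
# To actually send it (requires a real key and network access):
# body = urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to inspect the body and headers before spending any API credits.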

Installation

pip install zyte-api-convertor

Usage

    Usage: zyte-api-convertor <payload> --project-name <project_name> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

    Usage: zyte-api-convertor <payload> --project-name <project_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project

    Usage: zyte-api-convertor <payload> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --spider-name sample_spider

    Usage: zyte-api-convertor <payload>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}'

Example

At a minimum, zyte-api-convertor expects a valid JSON payload. The --project-name and --spider-name options are optional and set the project and spider names; if you omit them, the default project and spider names are used.
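
Because the payload must be valid JSON, it can help to check it before invoking the tool. The sketch below is one possible way to do that from Python: it parses the payload, then assembles the command line using only the --project-name and --spider-name options shown above. build_command is a hypothetical helper, and the check that the payload is a JSON object is an assumption of this sketch rather than documented behavior of zyte-api-convertor.

```python
import json
import shlex
from typing import List, Optional


def build_command(
    payload: str,
    project_name: Optional[str] = None,
    spider_name: Optional[str] = None,
) -> List[str]:
    """Validate the payload and return the argument list for the CLI."""
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid JSON.
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    command = ["zyte-api-convertor", payload]
    # Both options are optional; omit them to get the default names.
    if project_name:
        command += ["--project-name", project_name]
    if spider_name:
        command += ["--spider-name", spider_name]
    return command


command = build_command(
    '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}',
    project_name="sample_project",
    spider_name="sample_spider",
)
# Shell-safe string for copy and paste; the list form can be passed
# directly to subprocess.run(command, check=True).
shell_line = " ".join(shlex.quote(part) for part in command)
```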

zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

Output:

mukthy@Mukthys-MacBook-Pro % zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider
Code Generated!
Writing to file...
Writing Done!
reformatted sample_project/sample_project/spiders/sample_project.py

All done!  🍰 1 file reformatted.
Formatting Done!

Project Created Successfully.

mukthy@Mukthys-MacBook-Pro sample_project % tree
.
├── sample_project
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── sample_project.py
└── scrapy.cfg

3 directories, 8 files

Sample Spider Code:

import scrapy


class SampleQuotesSpider(scrapy.Spider):
    name = "sample_spider"

    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
            "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
        },
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000
        },
        "REQUEST_FINGERPRINTER_CLASS": "scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter",
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        "ZYTE_API_KEY": "YOUR_API_KEY",
    }

    def start_requests(self):
        yield scrapy.Request(
            url="https://httpbin.org/ip",
            meta={
                "zyte_api": {
                    "javascript": False,
                    "screenshot": True,
                    "browserHtml": True,
                    "actions": [],
                    "requestHeaders": {},
                    "geolocation": "US",
                    "experimental": {"responseCookies": False},
                }
            },
        )

    def parse(self, response):
        print(response.text)
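
The generated parse method only prints the response text, while the payload also requests a screenshot. As a hedged sketch of how the screenshot might be retrieved, the helper below assumes that scrapy-zyte-api exposes the raw Zyte API response as a dict on response.raw_api_response and that the "screenshot" field holds base64-encoded image data. extract_screenshot is a hypothetical helper and is not generated by zyte-api-convertor.

```python
import base64


def extract_screenshot(raw_api_response: dict) -> bytes:
    """Return the decoded screenshot bytes, or b"" if none is present."""
    encoded = raw_api_response.get("screenshot")
    if not encoded:
        return b""
    # Assumption: the screenshot is delivered as a base64 string.
    return base64.b64decode(encoded)


# Stand-in for response.raw_api_response, for illustration only.
sample = {"url": "https://httpbin.org/ip", "screenshot": base64.b64encode(b"img").decode()}
image_bytes = extract_screenshot(sample)
```

Inside the spider, this could be called as extract_screenshot(response.raw_api_response) from parse, with the returned bytes written to a file.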

Please note that the ZYTE_API_KEY value in the spider's custom_settings is only a placeholder (YOUR_API_KEY). Replace it with your own Zyte API key before running the spider.
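
To avoid committing a real key to the generated spider, one option is to read it from the environment. This is a minimal sketch, not something zyte-api-convertor generates: zyte_settings is a hypothetical helper, and using an environment variable named ZYTE_API_KEY is a choice made for this example.

```python
import os


def zyte_settings(env_var: str = "ZYTE_API_KEY") -> dict:
    """Return the API key setting, read from an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending
        # unauthenticated requests.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"ZYTE_API_KEY": api_key}
```

In the spider, the returned dict could be merged into custom_settings in place of the hard-coded placeholder, for example {**other_settings, **zyte_settings()}, where other_settings stands for the remaining entries.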
