
A simple Python module for generating Scrapy projects that use Zyte API

Project description

zyte-api-convertor

A Python module that converts a Zyte API JSON payload into a Scrapy ZyteAPI project. It uses Scrapy and the scrapy-zyte-api plugin to generate the project, and black to format the generated code.

Requirements

Python 3.6+
Scrapy
scrapy-zyte-api
black

Documentation

Zyte API Documentation

Test the Zyte API payload with Postman or curl. Once it returns the desired response, pass the same payload to this module to convert it into a Scrapy ZyteAPI project.
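If you would rather test the payload from Python than from Postman or curl, the check can be sketched with the standard library alone. This is a minimal sketch, not part of zyte-api-convertor: it assumes the Zyte API endpoint https://api.zyte.com/v1/extract with HTTP Basic auth (API key as the username, empty password), and the helper names build_extract_request and send are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: the Zyte API extraction endpoint.
ZYTE_API_URL = "https://api.zyte.com/v1/extract"


def build_extract_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON payload."""
    # Basic auth: the API key is the username and the password is empty.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        ZYTE_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs network and a real key)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = {"url": "https://httpbin.org/ip", "browserHtml": True}
request = build_extract_request(payload, "YOUR_API_KEY")
# result = send(request)  # uncomment once a real key replaces the placeholder
```

Once send returns the response you expect, the same payload dictionary, serialized as JSON, is what you pass to zyte-api-convertor.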

Installation

pip install zyte-api-convertor

Usage

    Usage: zyte-api-convertor <payload> --project-name <project_name> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

    Usage: zyte-api-convertor <payload> --project-name <project_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project

    Usage: zyte-api-convertor <payload> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --spider-name sample_spider

    Usage: zyte-api-convertor <payload>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}'
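Quoting a JSON payload by hand in a shell is easy to get wrong, so it can help to build the command from Python. The following is a rough sketch under the assumption that zyte-api-convertor is installed and on your PATH; build_command is a hypothetical helper, and only the --project-name and --spider-name options shown above are used.

```python
import json
import shlex
from typing import List, Optional


def build_command(
    payload: dict,
    project_name: Optional[str] = None,
    spider_name: Optional[str] = None,
) -> List[str]:
    """Return the argument list for one zyte-api-convertor invocation."""
    # json.dumps guarantees the payload argument is valid JSON.
    argv = ["zyte-api-convertor", json.dumps(payload)]
    if project_name:
        argv += ["--project-name", project_name]
    if spider_name:
        argv += ["--spider-name", spider_name]
    return argv


argv = build_command(
    {"url": "https://httpbin.org/ip", "browserHtml": True, "screenshot": True},
    project_name="sample_project",
    spider_name="sample_spider",
)

# Shell-safe rendering of the same command, for copy and paste.
command_line = " ".join(shlex.quote(arg) for arg in argv)
print(command_line)

# To run the tool directly instead of printing the command:
# import subprocess
# subprocess.run(argv, check=True)
```

Passing the list to subprocess.run avoids the shell entirely, so no quoting of the payload is needed in that case.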

Example

zyte-api-convertor requires, at minimum, a valid JSON payload. The optional --project-name and --spider-name options set the project and spider names. If you omit them, the default project and spider names are used.

zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

Output:

mukthy@Mukthys-MacBook-Pro % zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider
Code Generated!
Writing to file...
Writing Done!
reformatted sample_project/sample_project/spiders/sample_project.py

All done!  🍰 1 file reformatted.
Formatting Done!

Project Created Successfully.

mukthy@Mukthys-MacBook-Pro sample_project % tree
.
├── sample_project
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── sample_project.py
└── scrapy.cfg

3 directories, 8 files

Sample Spider Code:

import scrapy


class SampleQuotesSpider(scrapy.Spider):
    name = "sample_spider"

    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
            "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
        },
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000
        },
        "REQUEST_FINGERPRINTER_CLASS": "scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter",
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        "ZYTE_API_KEY": "YOUR_API_KEY",
    }

    def start_requests(self):
        yield scrapy.Request(
            url="https://httpbin.org/ip",
            meta={
                "zyte_api": {
                    "javascript": False,
                    "screenshot": True,
                    "browserHtml": True,
                    "actions": [],
                    "requestHeaders": {},
                    "geolocation": "US",
                    "experimental": {"responseCookies": False},
                }
            },
        )

    def parse(self, response):
        print(response.text)

Please note that ZYTE_API_KEY in the spider's custom_settings is only a placeholder ("YOUR_API_KEY"). Replace it with your own Zyte API key before running the spider.
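One way to avoid hard-coding the key in the generated spider is to read it from an environment variable. This is only a sketch of that idea: the helper zyte_api_key is hypothetical, is not generated by zyte-api-convertor, and assumes you export the key under the name ZYTE_API_KEY.

```python
import os


def zyte_api_key(env_var: str = "ZYTE_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message."""
    key = os.environ.get(env_var, "").strip()
    # Treat the generated placeholder as "not set" so it is never sent by mistake.
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError(f"Set the {env_var} environment variable to your Zyte API key.")
    return key


# In the generated spider, the placeholder line could then become:
#     "ZYTE_API_KEY": zyte_api_key(),
```

With this in place, the spider fails at import time with a clear message instead of sending requests with the placeholder key.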

