A simple Python module for generating Scrapy ZyteAPI projects

zyte-api-convertor

A Python module that converts a Zyte API JSON payload into a Scrapy ZyteAPI project. It uses Scrapy and the scrapy-zyte-api plugin to generate the project, and black to format the generated code.

Requirements

Python 3.6+
Scrapy
scrapy-zyte-api
black

Documentation

Zyte API Documentation

Test the Zyte API payload with Postman or curl first. Once it returns the desired response, pass the same payload to this module to convert it into a Scrapy ZyteAPI project.
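
If you prefer to stay in Python for that check, the same test request can be sketched with the standard library alone. This is only an illustration: `build_extract_request` is a hypothetical helper, not part of zyte-api-convertor, and it assumes the Zyte API `/v1/extract` endpoint with the API key sent as the HTTP Basic auth username and an empty password.

```python
import base64
import json
import urllib.request

# Assumption: the Zyte API extract endpoint, authenticated with HTTP Basic
# auth (API key as the username, empty password).
ZYTE_API_URL = "https://api.zyte.com/v1/extract"


def build_extract_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the payload."""
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(
        ZYTE_API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    payload = {"url": "https://httpbin.org/ip", "browserHtml": True, "screenshot": True}
    request = build_extract_request(payload, "YOUR_API_KEY")
    # Sending needs a real key and network access, so it is left commented out:
    # with urllib.request.urlopen(request) as resp:
    #     print(list(json.load(resp)))
    print(request.get_method(), request.full_url)
```

Once the response contains the fields you expect, the payload dictionary can be serialized with `json.dumps` and passed to zyte-api-convertor unchanged.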

Installation

pip install zyte-api-convertor

Usage

    Usage: zyte-api-convertor <payload> --project-name <project_name> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

    Usage: zyte-api-convertor <payload> --project-name <project_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project

    Usage: zyte-api-convertor <payload> --spider-name <spider_name>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --spider-name sample_spider

    Usage: zyte-api-convertor <payload>
    Example: zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}'

Example

zyte-api-convertor requires, at minimum, a valid JSON payload. The optional --project-name and --spider-name options set the project and spider names. If you omit them, the default project and spider names are used.
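
When the command is driven from a script, it can help to validate the payload before calling the tool. The sketch below is a hypothetical wrapper (`build_command` is not part of this package). It only uses the options documented above, and it assumes that a usable payload is a JSON object with a "url" key.

```python
import json
import shlex


def build_command(payload, project_name=None, spider_name=None):
    """Return the argv list for a zyte-api-convertor call.

    Raises ValueError when the payload is not a JSON object with a "url" key
    (the "url" requirement is an assumption of this sketch).
    """
    if isinstance(payload, str):
        try:
            payload = json.loads(payload)
        except json.JSONDecodeError as exc:
            raise ValueError(f"payload is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict) or "url" not in payload:
        raise ValueError('payload must be a JSON object with a "url" key')

    argv = ["zyte-api-convertor", json.dumps(payload)]
    # Both options are optional; the tool falls back to its defaults.
    if project_name:
        argv += ["--project-name", project_name]
    if spider_name:
        argv += ["--spider-name", spider_name]
    return argv


if __name__ == "__main__":
    argv = build_command(
        '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}',
        project_name="sample_project",
        spider_name="sample_spider",
    )
    # shlex.join (Python 3.8+) shows the shell-quoted form of the command.
    print(shlex.join(argv))
```

The returned list can be handed to `subprocess.run(argv, check=True)`, which avoids shell quoting problems with the JSON string.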

zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider

Output:

mukthy@Mukthys-MacBook-Pro % zyte-api-convertor '{"url": "https://httpbin.org/ip", "browserHtml": true, "screenshot": true}' --project-name sample_project --spider-name sample_spider
Code Generated!
Writing to file...
Writing Done!
reformatted sample_project/sample_project/spiders/sample_project.py

All done!  🍰 1 file reformatted.
Formatting Done!

Project Created Successfully.

mukthy@Mukthys-MacBook-Pro sample_project % tree
.
├── sample_project
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── sample_project.py
└── scrapy.cfg

3 directories, 8 files

Sample Spider Code:

import scrapy


class SampleQuotesSpider(scrapy.Spider):
    name = "sample_spider"

    custom_settings = {
        "DOWNLOAD_HANDLERS": {
            "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
            "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
        },
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000
        },
        "REQUEST_FINGERPRINTER_CLASS": "scrapy_zyte_api.ScrapyZyteAPIRequestFingerprinter",
        "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
        "ZYTE_API_KEY": "YOUR_API_KEY",
    }

    def start_requests(self):
        yield scrapy.Request(
            url="https://httpbin.org/ip",
            meta={
                "zyte_api": {
                    "javascript": False,
                    "screenshot": True,
                    "browserHtml": True,
                    "actions": [],
                    "requestHeaders": {},
                    "geolocation": "US",
                    "experimental": {"responseCookies": False},
                }
            },
        )

    def parse(self, response):
        print(response.text)
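
The generated parse method only prints the response text. Because the sample payload also requests a screenshot, a possible next step is to save that image. The helper below is a hypothetical sketch, not generated by zyte-api-convertor. It assumes the raw Zyte API response is a dictionary whose "screenshot" field is base64-encoded (scrapy-zyte-api exposes this dictionary as response.raw_api_response) and that the image is in the default JPEG format.

```python
import base64
from pathlib import Path


def save_screenshot(raw_api_response: dict, path: str = "screenshot.jpg") -> bool:
    """Decode the base64 "screenshot" field and write it to path.

    Returns False when the response carries no screenshot.
    """
    encoded = raw_api_response.get("screenshot")
    if not encoded:
        return False
    Path(path).write_bytes(base64.b64decode(encoded))
    return True
```

Inside the spider this could be called as `save_screenshot(response.raw_api_response)` from parse.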

Please note that ZYTE_API_KEY in the spider's custom_settings is only a placeholder ("YOUR_API_KEY"). Replace it with your own API key before running the spider.
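
One way to avoid hard-coding the key in the generated file is to read it from the environment. This is a minimal sketch, not something the tool generates: `zyte_settings` is a hypothetical helper, and using an environment variable named ZYTE_API_KEY is an assumption of this example.

```python
import os


def zyte_settings(env=os.environ) -> dict:
    """Return the ZYTE_API_KEY setting, refusing the unedited placeholder."""
    key = env.get("ZYTE_API_KEY", "")
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError("Set the ZYTE_API_KEY environment variable first")
    return {"ZYTE_API_KEY": key}
```

In the spider, the hard-coded entry could then be replaced by merging the result into the settings dictionary, for example `custom_settings = {**other_settings, **zyte_settings()}`, where other_settings stands for the remaining generated entries.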
