Generate selector schema classes from a DSL-like language based on Python

Selector schema codegen

ssc_codegen is a generator of parsers for various programming languages (primarily for HTML). Parsers are described in Python DSL configurations with a built-in declarative language.

It is designed to port parsers to various programming languages and libraries.

Motivation

  • gain practice in writing a DSL-like language
  • reduce boilerplate code in web parsers
  • write once, then convert to other mainstream HTML parser libraries
  • keep the set of operations minimal so that other libraries and languages are easy to add in the future
    • it includes CSS, XPath, attribute operations, regex, and minimal string formatting operations
  • pre-validate CSS/XPath queries and logic before generating code
  • standardization: generate classes with minimal dependencies and a documented parsed signature

Install

pipx (recommended for CLI usage)

pipx install ssc_codegen

pip

pip install ssc_codegen

Supported libraries and languages

language | lib                                                          | xpath | css | formatter
python   | bs4                                                          | NO    | YES | black
-        | parsel                                                       | YES   | YES | -
-        | selectolax (modest)                                          | NO    | YES | -
-        | scrapy (based on parsel, but class init argument - Response) | YES   | YES | -
dart     | universal_html                                               | NO    | YES | dart format

Quickstart

See the example and read its commented code.

Recommendations

  • Prefer CSS selectors: they are guaranteed to be convertible to XPath (for target languages that do not support CSS selectors).
  • Use simple operations for better compatibility with other libraries.
    • Some libraries may not fully support the selector specifications.
    • For example, the selector #product_description+ p works in parsel, but does not work in selectolax or the Dart libraries.
  • There is an XPath-to-CSS converter for simple queries, with no guarantee that the result works. For example, CSS has no analogue of XPath's contains.
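
To illustrate why simple CSS selectors port well, here is a minimal sketch of a CSS-to-XPath translation for a small subset of CSS (tag, #id, .class, and the descendant, child and adjacent-sibling combinators). It is not ssc_codegen's converter; the function names css_to_xpath and compound_to_xpath are hypothetical, and anything outside the subset is rejected rather than guessed at.

```python
import re

# One compound selector: an optional tag followed by any number of #id / .class parts.
_COMPOUND = re.compile(r"^(?P<tag>[a-zA-Z][\w-]*|\*)?(?P<rest>(?:[#.][\w-]+)*)$")
_PART = re.compile(r"([#.])([\w-]+)")


def compound_to_xpath(compound: str) -> tuple[str, str]:
    """Return (tag, predicates) for one compound selector such as 'div.card'."""
    match = _COMPOUND.match(compound)
    if match is None:
        raise ValueError(f"unsupported selector part: {compound!r}")
    tag = match.group("tag") or "*"
    predicates = ""
    for kind, name in _PART.findall(match.group("rest")):
        if kind == "#":
            predicates += f"[@id='{name}']"
        else:
            # Match one class token inside a space-separated class attribute.
            predicates += (
                "[contains(concat(' ', normalize-space(@class), ' '), "
                f"' {name} ')]"
            )
    return tag, predicates


def css_to_xpath(css: str) -> str:
    """Translate a simple CSS selector into an equivalent XPath expression."""
    # Surround '>' and '+' with spaces so a whitespace split yields clean tokens.
    tokens = re.sub(r"\s*([>+])\s*", r" \1 ", css.strip()).split()
    if not tokens:
        raise ValueError("empty selector")
    xpath = ""
    combinator = " "  # the first compound is searched anywhere in the document
    for token in tokens:
        if token in (">", "+"):
            combinator = token
            continue
        tag, predicates = compound_to_xpath(token)
        if combinator == " ":
            xpath += f"//{tag}{predicates}"
        elif combinator == ">":
            xpath += f"/{tag}{predicates}"
        else:
            # Adjacent sibling: the very next element, which must be <tag>.
            xpath += f"/following-sibling::*[1][self::{tag}]{predicates}"
        combinator = " "
    return xpath


if __name__ == "__main__":
    print(css_to_xpath("#product_description+ p"))
    print(css_to_xpath("div.card > a"))
```

The reverse direction is harder because XPath is more expressive than CSS: an expression that uses a text-matching contains() predicate has no CSS counterpart, so a converter can only reject it.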
