Generate selector schema classes from a DSL-like language based on Python
Selector schema codegen
ssc_codegen is a generator of parsers for various programming languages (HTML is the priority). It uses Python DSL configurations with a built-in declarative language.
It is designed for porting parsers to various programming languages and libs.
Motivation
- it is interesting to write a DSL-like language in practice
- decrease boilerplate code for web parsers
- write once, then convert to other mainstream HTML parser libs
- keep the set of operations minimal so that other libs and languages are easy to add in the future
- include CSS, XPath, attribute operations, regex, and minimal string formatting operations
- pre-validate CSS/XPath queries and logic before generating code
- standardisation: generate classes with minimal dependencies and a documented parsed signature
Install
pipx (recommended for CLI usage)
pipx install ssc_codegen
pip
pip install ssc_codegen
Supported libs and languages
language | lib | xpath | css | formatter |
---|---|---|---|---|
python | bs4 | NO | YES | black |
- | parsel | YES | YES | - |
- | selectolax (modest) | NO | YES | - |
- | scrapy (based on parsel, but the class init argument is a Response) | YES | YES | - |
dart | universal_html | NO | YES | dart format |
Quickstart
See the example and read the code with comments.
Recommendations
- Use CSS selectors: they are guaranteed to be convertible to XPath (if the target language does not support CSS selectors).
- Use simple operations for better compatibility with other libraries.
- Some libraries may not fully support selector specifications.
  - For example, the `#product_description+ p` selector works fine in `parsel`, but does not work in the `selectolax` and `dart` libs.
- There is an XPath to CSS converter for simple queries, without guarantees of functionality. For example, CSS has no analogue of `contains` from XPath.
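
To illustrate why simple CSS selectors port well, here is a minimal sketch of a CSS to XPath conversion for a small subset of selectors (tag names, `#id`, `.class`, descendant and `>` child combinators). The function name `simple_css_to_xpath` is hypothetical and this is not the converter used by ssc_codegen; it only shows that the basic constructs map mechanically, while anything outside the subset (such as the `+` sibling combinator) is rejected.

```python
import re

# Hypothetical helper for illustration only; not part of the ssc_codegen API.
# Matches an optional '#' or '.' prefix followed by an identifier.
_TOKEN = re.compile(r"([#.]?)([A-Za-z_][\w-]*)")


def simple_css_to_xpath(css: str) -> str:
    """Convert a simple CSS selector to an XPath expression.

    Supported: tag names, #id, .class, descendant (space) and child (>)
    combinators. Anything else raises ValueError.
    """
    # Pad '>' with spaces so that split() yields it as a separate item.
    parts = css.replace(">", " > ").split()
    xpath = ""
    axis = "//"  # descendant axis is the default between compound selectors
    for part in parts:
        if part == ">":
            axis = "/"  # the next compound selector is a direct child
            continue
        tag = "*"
        predicates = []
        pos = 0
        for match in _TOKEN.finditer(part):
            if match.start() != pos:
                raise ValueError(f"unsupported selector part: {part!r}")
            pos = match.end()
            kind, name = match.groups()
            if kind == "":
                tag = name
            elif kind == "#":
                predicates.append(f"@id='{name}'")
            else:
                # Match one class name inside a space-separated class list.
                predicates.append(
                    "contains(concat(' ', normalize-space(@class), ' '), "
                    f"' {name} ')"
                )
        if pos != len(part):
            raise ValueError(f"unsupported selector part: {part!r}")
        xpath += axis + tag + "".join(f"[{p}]" for p in predicates)
        axis = "//"  # reset to the descendant axis
    return xpath
```

Under these assumptions, `simple_css_to_xpath("#product_description p")` yields an XPath that selects every `p` descendant of the element with that id, while `#product_description+ p` raises `ValueError` because the sibling combinator is outside the supported subset.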