ChadSelect

Unified data extraction — CSS, XPath, Regex, and JMESPath behind one query interface.

from chadselect import ChadSelect

cs = ChadSelect()
cs.add_html(html)
cs.add_json(json_str)

# One syntax, four engines
title = cs.select(0, "css:h1.title")
author = cs.select(0, "xpath://span[@class='author']/text()")
vin = cs.select(0, r"regex:[A-HJ-NPR-Z0-9]{17}")
name = cs.select(0, "json:data.products[0].name")

# Function piping
clean = cs.select(0, "css:.price >> trim >> uppercase()")

Install

pip install chadselect

Query Syntax

Queries use an engine:expression prefix:

Prefix   Engine                        Best For
css:     CSS Selectors (selectolax)    HTML element selection
xpath:   XPath 1.0 (lxml)              Complex HTML/XML traversal
regex:   Regular Expressions (re)      Pattern matching on raw text
json:    JMESPath (jmespath)           JSON field extraction

A query with no prefix defaults to regex.
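To make the prefix rule concrete, here is a minimal standard-library sketch of how a query string could be split into an engine and an expression. It is not ChadSelect's implementation: `split_query` and `KNOWN_ENGINES` are hypothetical names, and the choice to treat only the four documented prefixes as engines (so that a regex containing a colon is not misread) is an assumption.

```python
# Hypothetical helper, not part of chadselect: illustrates the
# engine:expression convention and the regex default described above.
KNOWN_ENGINES = ("css", "xpath", "regex", "json")


def split_query(query):
    """Split 'engine:expression' into (engine, expression).

    Anything that does not start with one of the four documented
    prefixes is treated as a regex, colon or not (an assumption).
    """
    prefix, sep, rest = query.partition(":")
    if sep and prefix in KNOWN_ENGINES:
        return prefix, rest
    return "regex", query


print(split_query("css:h1.title"))
print(split_query(r"[A-HJ-NPR-Z0-9]{17}"))
```

Checking the prefix against a fixed list rather than splitting on the first colon keeps patterns such as `(?:foo)bar` intact when they fall through to the regex default.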

Function Piping

Chain text transformations with >>:

cs.select(0, "css:.price >> trim >> substring-after('$') >> uppercase()")

Available functions: trim, uppercase(), lowercase(), normalize-space(), substring-after('delim'), substring-before('delim'), substring(start, len), replace('old', 'new'), get-attr('name').
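The effect of a `>>` chain can be pictured with a small standard-library sketch that applies each step to the output of the previous one. This only illustrates the idea and is not the library's code: `apply_pipeline` and `FUNCTIONS` are hypothetical, only some of the listed functions are covered, and the XPath-like behaviour of `substring-after` and `substring-before` (empty string when the delimiter is missing) is an assumption.

```python
import ast
import re

# Hypothetical stand-ins for some of the listed functions; the real
# library may behave differently in edge cases.
FUNCTIONS = {
    "trim": lambda s: s.strip(),
    "uppercase": lambda s: s.upper(),
    "lowercase": lambda s: s.lower(),
    "normalize-space": lambda s: " ".join(s.split()),
    "substring-after": lambda s, d: s.partition(d)[2],
    "substring-before": lambda s, d: s.partition(d)[0] if d in s else "",
    "replace": lambda s, old, new: s.replace(old, new),
}

# A step is a function name with optional parenthesised arguments.
STEP = re.compile(r"^([\w-]+)(?:\((.*)\))?$")


def apply_pipeline(value, pipeline):
    """Apply each '>>'-separated step to the result of the previous one."""
    # Naive split: a quoted argument containing '>>' would break this sketch.
    for step in pipeline.split(">>"):
        match = STEP.match(step.strip())
        if match is None:
            raise ValueError(f"cannot parse step: {step!r}")
        name, raw_args = match.groups()
        # Parse quoted or numeric arguments as Python literals.
        args = ast.literal_eval(f"({raw_args},)") if raw_args and raw_args.strip() else ()
        value = FUNCTIONS[name](value, *args)
    return value


print(apply_pipeline("  $19.99 usd ", "trim >> substring-after('$') >> uppercase()"))
```

Each step receives the previous step's output, so order matters: trimming before `substring-after('$')` gives the same text here, but uppercasing before a case-sensitive `replace` would not.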

API

cs = ChadSelect()

# Load content
cs.add_html(html_string)
cs.add_json(json_string)
cs.add_text(plain_text)

# Query (index: 0=first, -1=all)
results = cs.query(-1, "css:.price")          # List[str] — all matches
value = cs.select(0, "css:.price")            # str — first match or ""

# Multi-query
first_hit = cs.select_first([(0, "css:#id"), (0, "xpath://fallback")])
combined = cs.select_many([(-1, "css:.a"), (-1, "css:.b")])

# Batch (fastest for many fields)
results = cs.query_batch([(-1, "css:.title"), (-1, "json:data.name")])

# With validators
results = cs.select_where(0, "css:.vin", lambda v: len(v) == 17)
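The index convention, the empty-string fallback, the ordered fallback across queries, and the validator filter can be sketched with plain `re` over a single string. The function names below are hypothetical stand-ins rather than ChadSelect's methods, and the exact return shapes (for example, a list from the validator variant) are assumptions based on the comments above.

```python
import re

VIN_PATTERN = r"[A-HJ-NPR-Z0-9]{17}"  # same pattern as the regex: example


def query_text(text, index, pattern):
    """Index -1 returns every match; any other index returns a list
    holding at most the match at that position (assumed semantics)."""
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    if index == -1:
        return matches
    return matches[index:index + 1]


def select_text(text, index, pattern):
    """Return a single string, or "" when nothing matches."""
    hits = query_text(text, index, pattern)
    return hits[0] if hits else ""


def select_first_text(text, queries):
    """Try (index, pattern) pairs in order; return the first non-empty hit."""
    for index, pattern in queries:
        value = select_text(text, index, pattern)
        if value:
            return value
    return ""


def select_where_text(text, index, pattern, validator):
    """Keep only the matches that pass the validator callable."""
    return [v for v in query_text(text, index, pattern) if validator(v)]


sample = "VIN: 1HGCM82633A004352, stock #A17"
print(select_first_text(sample, [(0, r"\d{30}"), (0, VIN_PATTERN)]))
```

Returning "" instead of raising keeps fallback chains simple, because each candidate query can be tested for truthiness in order.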

License

MIT
