RDX parser for Python — parse .rdx documents at Rust speed

Project description

rdx-py

Python bindings for the RDX parser, built with PyO3 and maturin. Parse .rdx documents at Rust speed and get plain Python dicts back.

Installation

pip install rdx-parser

The distribution is named rdx-parser, but the module is imported as rdx.

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with built-in transforms (auto-slug headings + table of contents).

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".
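
Because transforms are selected by name, a typo such as "autoslug" is easy to make. The following is a minimal sketch of a pre-check, assuming the two names listed above are the complete set in this release. `checked_transforms` is a hypothetical helper, not part of rdx, and this page does not say how rdx itself reacts to unknown names.

```python
# Names taken from the list above; assumed to be the complete set.
KNOWN_TRANSFORMS = ("auto-slug", "toc")


def checked_transforms(names):
    """Return names de-duplicated in order, rejecting unknown ones.

    Hypothetical helper: not part of the rdx API.
    """
    unknown = [n for n in names if n not in KNOWN_TRANSFORMS]
    if unknown:
        raise ValueError(f"unknown transform(s): {', '.join(unknown)}")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(names))
```

The result can then be passed straight through, for example rdx.parse_with_transforms(source, checked_transforms(["toc"])).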

rdx.validate(ast: dict, schema: dict) -> list[dict]

Validate an AST against a component schema and return a list of diagnostics, one dict per problem found.

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")
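
In a CI check the diagnostics list usually has to become a pass/fail decision. A minimal sketch, assuming each diagnostic is a dict with the `severity`, `message`, and `line` keys used above, and assuming (not confirmed by this page) that blocking problems carry the severity string "error". `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(diagnostics, failing=("error",)):
    """Return (counts_by_severity, ok) for a list of diagnostic dicts.

    `failing` lists the severities that should fail a build; the
    default value "error" is an assumption about rdx's output.
    """
    counts = Counter(d["severity"] for d in diagnostics)
    # Counter returns 0 for severities that never occurred.
    ok = not any(counts[s] for s in failing)
    return dict(counts), ok


# Hand-written diagnostics in the shape shown above (not real rdx output).
sample = [
    {"severity": "warning", "message": "unknown prop", "line": 7},
    {"severity": "error", "message": "missing required prop", "line": 7},
]
counts, ok = summarize(sample)
```

A build script could exit non-zero whenever `ok` is false and print `counts` as a one-line summary.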

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = len(words) // 200  # minutes
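
Integer division rounds down, so any text under 200 words reports 0 minutes. Here is a small sketch that rounds up instead. The rate of 200 words per minute comes from the snippet above, while `reading_minutes` and its `wpm` parameter are hypothetical names.

```python
import math


def reading_minutes(text, wpm=200):
    """Estimate reading time in whole minutes, rounding up.

    Returns 0 only for text that contains no words at all.
    """
    words = len(text.split())
    return math.ceil(words / wpm)
```

It takes the string returned by rdx.collect_text, for example reading_minutes(rdx.collect_text(ast)).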

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")
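
Conceptually, a query like this is a depth-first walk of the tree. The sketch below is a pure-Python illustration, not the library's implementation. It assumes every node is a dict with a `type` key and that nested nodes live in an optional `children` list, as in the parse output under Usage. `walk` and `find_all` are hypothetical names.

```python
def walk(node):
    """Yield node and all of its descendants, depth-first, in document order."""
    yield node
    for child in node.get("children") or []:
        yield from walk(child)


def find_all(ast, node_type):
    """Collect every node whose "type" equals node_type."""
    return [n for n in walk(ast) if n.get("type") == node_type]


# Hand-built tree in the assumed shape (not real parser output).
tree = {
    "type": "root",
    "children": [
        {"type": "heading", "children": [{"type": "text", "value": "Intro"}]},
        {"type": "component", "children": [
            {"type": "heading", "children": []},
        ]},
    ],
}
nested_headings = find_all(tree, "heading")
```

The same generator is handy for one-off filters that a type match cannot express, such as selecting nodes by a prop value.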

rdx.version() -> str

Return the RDX parser version.

RAG / AI Pipeline Example

from typing import List  # list[str] annotations fail at import time on Python 3.8

import rdx

def prepare_for_embedding(rdx_source: str) -> List[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    def to_text(nodes):
        # Wrap the nodes in a synthetic root so collect_text can process them.
        return rdx.collect_text({"type": "root", "frontmatter": None, "children": nodes, "position": ast["position"]})

    for node in ast["children"]:
        if node["type"] == "heading":
            # A heading starts a new chunk; emit the one collected so far.
            if current:
                chunks.append(to_text(current))
            current = [node]
        else:
            current.append(node)

    # Emit the final chunk after the last heading.
    if current:
        chunks.append(to_text(current))

    return chunks
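
A section under a single heading can still exceed an embedding model's input limit. The following is a minimal follow-up sketch that splits any over-long chunk into word-bounded pieces. `split_long` and the `max_words` default are hypothetical, and counting whitespace-separated words is only a rough stand-in for a model's real tokenizer.

```python
def split_long(chunks, max_words=300):
    """Split each text chunk into pieces of at most max_words words.

    Whitespace is normalized to single spaces and empty chunks are dropped.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    pieces = []
    for chunk in chunks:
        words = chunk.split()
        # Step through the word list in windows of max_words.
        for start in range(0, len(words), max_words):
            pieces.append(" ".join(words[start:start + max_words]))
    return pieces
```

It composes with the function above as split_long(prepare_for_embedding(source)).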

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rdx_parser-0.1.0b4.tar.gz (48.2 kB)

Uploaded: Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rdx_parser-0.1.0b4-cp314-cp314-macosx_11_0_arm64.whl (874.8 kB)

Uploaded: CPython 3.14, macOS 11.0+ ARM64

rdx_parser-0.1.0b4-cp312-cp312-win_amd64.whl (788.5 kB)

Uploaded: CPython 3.12, Windows x86-64

rdx_parser-0.1.0b4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)

Uploaded: CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file rdx_parser-0.1.0b4.tar.gz.

File metadata

  • Download URL: rdx_parser-0.1.0b4.tar.gz
  • Upload date:
  • Size: 48.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rdx_parser-0.1.0b4.tar.gz
Algorithm Hash digest
SHA256 68a9b3f4c3983e6dd5d19fe5143bab665c7f93e15b9dc9bd93cff40709828cb4
MD5 3ff8a8df5756ccb9f26d91e4dbd1f51c
BLAKE2b-256 ce8af46f4d75e91b5ea50fa0c3e48d507a47587856e9daa31b405cf8f357936f

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b4.tar.gz:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.0b4-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for rdx_parser-0.1.0b4-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a34865fa600546ec7f45735f30bedfe08bfbb55f52d0dfe054a4950e8030b003
MD5 786079faf696f70ea83e401de03a647b
BLAKE2b-256 636711a7a62e331f5fc49a7c038640d67031a1eec140058a3a7c90611018af72

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b4-cp314-cp314-macosx_11_0_arm64.whl:

Publisher: release.yml on rdx-lang/rdx

File details

Details for the file rdx_parser-0.1.0b4-cp312-cp312-win_amd64.whl.

File hashes

Hashes for rdx_parser-0.1.0b4-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 9eb42d0b34ff2c1a27fbd54bb7e6ce869529dc08877c7e11e15addc8f85082c8
MD5 c7fce91b1e4a231750ef75a4a45ce43c
BLAKE2b-256 a496940c21eb8accdec5769bb1a3b5a3ee2f2393b9d9b14d8594bacf9c4087b3

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b4-cp312-cp312-win_amd64.whl:

Publisher: release.yml on rdx-lang/rdx

File details

Details for the file rdx_parser-0.1.0b4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rdx_parser-0.1.0b4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c2d5aed2bb8113e91db05dba1b005359e8b8f1543ddd0859bf2d176d5ebbdf0d
MD5 7f20347f3b8bb549ce8c602b33959363
BLAKE2b-256 ecbbda585b9faf1fe2ca2a89b4033572efe6e7e484a4ae6803c82ddf206df930

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on rdx-lang/rdx
