RDX parser for Python — parse .rdx documents at Rust speed

Project description

rdx-py

Python bindings for the RDX parser, built with PyO3 and maturin. Parse .rdx documents at Rust speed and get plain Python dicts back.

Installation

pip install rdx-parser

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'
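
Because the result is a plain dict, ordinary dict access is all that is needed to read it. The helper below is a small sketch for reading a frontmatter field with a fallback. It assumes the "frontmatter" key holds either a dict or None when the document has no frontmatter block; that is an assumption, not something stated here. frontmatter_value is a hypothetical name, and the sample dicts are hand-written stand-ins for rdx.parse output.

```python
from typing import Any, Optional


def frontmatter_value(ast: dict, key: str, default: Optional[Any] = None) -> Any:
    """Return a frontmatter field, or `default` if it is missing."""
    # Assumption: "frontmatter" is a dict, or None/absent when the
    # document has no frontmatter block.
    frontmatter = ast.get("frontmatter") or {}
    return frontmatter.get(key, default)


# Hand-written stand-ins for rdx.parse() output.
with_title = {"type": "root", "frontmatter": {"title": "API Reference"}, "children": []}
without_frontmatter = {"type": "root", "frontmatter": None, "children": []}

print(frontmatter_value(with_title, "title"))                       # API Reference
print(frontmatter_value(without_frontmatter, "title", "Untitled"))  # Untitled
```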

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with built-in transforms (auto-slug headings + table of contents).

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with only the selected transforms. Available names: "auto-slug" and "toc".
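
When transform names come from configuration or user input, it can help to check them before calling the parser. The sketch below validates names against the two listed above and drops duplicates. checked_transforms and KNOWN_TRANSFORMS are hypothetical helpers, not part of the rdx module, and how the library itself reacts to an unknown name is not stated here.

```python
from typing import Iterable, List

# The two transform names listed in the API description.
KNOWN_TRANSFORMS = ("auto-slug", "toc")


def checked_transforms(names: Iterable[str]) -> List[str]:
    """Validate transform names and drop duplicates, keeping first-seen order."""
    result: List[str] = []
    for name in names:
        if name not in KNOWN_TRANSFORMS:
            raise ValueError(
                f"unknown transform {name!r}; expected one of {KNOWN_TRANSFORMS}"
            )
        if name not in result:
            result.append(name)
    return result


print(checked_transforms(["toc", "auto-slug", "toc"]))  # ['toc', 'auto-slug']
```

The returned list can then be passed on, for example rdx.parse_with_transforms(source, checked_transforms(names)).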

rdx.validate(ast: dict, schema: dict) -> list[dict]

Validate an AST against a component schema.

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")
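
For CI checks it is often handier to group diagnostics than to print them one by one. The sketch below relies only on the severity, message, and line keys used above. The severity values "error" and "warning" and the sample messages are made up for illustration, and summarize_diagnostics is a hypothetical helper.

```python
from typing import Dict, List


def summarize_diagnostics(diagnostics: List[dict]) -> Dict[str, List[str]]:
    """Group formatted messages by severity, ordered by line number."""
    grouped: Dict[str, List[str]] = {}
    for d in sorted(diagnostics, key=lambda d: d["line"]):
        grouped.setdefault(d["severity"], []).append(
            f"line {d['line']}: {d['message']}"
        )
    return grouped


# Hand-written sample; real entries would come from rdx.validate(ast, schema).
sample = [
    {"severity": "warning", "message": "unknown component <Foo>", "line": 12},
    {"severity": "error", "message": "missing required prop 'type'", "line": 5},
]

summary = summarize_diagnostics(sample)
for severity, messages in summary.items():
    print(severity, messages)
```

A build script could then fail when summary.get("error") is non-empty, assuming "error" is the severity string the validator emits.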

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = max(1, len(words) // 200)  # minutes at ~200 words per minute; never reports 0

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")
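
For unit tests that should run without the native extension, a pure-Python walk over the dict gives a rough equivalent. This is a sketch, not the library's implementation: it assumes child nodes live under a "children" key, and the "text", "paragraph", "value", and "name" fields in the sample are invented for illustration. Whether rdx.query_all returns nested matches in the same order is not stated here.

```python
from typing import Iterator, List


def iter_nodes(node: dict) -> Iterator[dict]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    # Assumption: children are stored in a "children" list; leaves may omit it.
    for child in node.get("children") or []:
        yield from iter_nodes(child)


def find_all(ast: dict, node_type: str) -> List[dict]:
    """Collect every node whose "type" equals node_type."""
    return [n for n in iter_nodes(ast) if n.get("type") == node_type]


# Hand-written stand-in for a parsed document.
sample = {
    "type": "root",
    "children": [
        {"type": "heading", "children": [{"type": "text", "value": "Intro"}]},
        {"type": "component", "name": "Notice", "children": [
            {"type": "paragraph", "children": [{"type": "text", "value": "Deprecated."}]},
        ]},
    ],
}

print(len(find_all(sample, "text")))  # 2
```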

rdx.version() -> str

Return the RDX parser version string.

RAG / AI Pipeline Example

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    for node in ast["children"]:
        if node["type"] == "heading":
            if current:
                chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))
            current = [node]
        else:
            current.append(node)

    if current:
        chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))

    return chunks
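
Heading-based chunks can be much longer than an embedding model accepts. A possible follow-up step, sketched below, splits oversized chunks into fixed-size word windows. split_long_chunks and the max_words default are hypothetical; a real pipeline would more likely count tokens with the model's own tokenizer.

```python
from typing import List


def split_long_chunks(chunks: List[str], max_words: int = 200) -> List[str]:
    """Split any chunk longer than max_words into consecutive word windows."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    result: List[str] = []
    for chunk in chunks:
        # str.split() also normalizes runs of whitespace; empty chunks
        # produce no windows and are dropped.
        words = chunk.split()
        for start in range(0, len(words), max_words):
            result.append(" ".join(words[start:start + max_words]))
    return result


print(split_long_chunks(["one two three four five", "six"], max_words=2))
# ['one two', 'three four', 'five', 'six']
```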

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rdx_parser-0.1.0b1.tar.gz (48.2 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rdx_parser-0.1.0b1-cp314-cp314-macosx_11_0_arm64.whl (876.5 kB)

Uploaded CPython 3.14, macOS 11.0+ ARM64

rdx_parser-0.1.0b1-cp312-cp312-win_amd64.whl (789.6 kB)

Uploaded CPython 3.12, Windows x86-64

rdx_parser-0.1.0b1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)

Uploaded CPython 3.8, manylinux: glibc 2.17+ x86-64

File details

Details for the file rdx_parser-0.1.0b1.tar.gz.

File metadata

  • Download URL: rdx_parser-0.1.0b1.tar.gz
  • Upload date:
  • Size: 48.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rdx_parser-0.1.0b1.tar.gz
Algorithm Hash digest
SHA256 d3d345785d55f15e575b766813addc6423a6fdd7f959be30305ff85c2deab57d
MD5 4426d1372a65f4296d2a4fb1cc438bfc
BLAKE2b-256 be305dae4815356d0463d8c5846223c2327b82d57c99cd42058c88bc0a075997

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b1.tar.gz:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.0b1-cp314-cp314-macosx_11_0_arm64.whl.

File hashes

Hashes for rdx_parser-0.1.0b1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 61c395c7acd10ddfe7773b3c336517e6101503b2e9feed89efd69c2cee534f7f
MD5 87296c9a48966fc65a21d9f89b2bd2a2
BLAKE2b-256 c5209e2330d8f808bbdebddd946dd09dd88d2eb1482677d91da3614cc730c3dd

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b1-cp314-cp314-macosx_11_0_arm64.whl:

Publisher: release.yml on rdx-lang/rdx

File details

Details for the file rdx_parser-0.1.0b1-cp312-cp312-win_amd64.whl.

File hashes

Hashes for rdx_parser-0.1.0b1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 13c4fa7ef8a7b9fafe6e7e69c6f6f5a8f15db0351ce2ad16b116a5a0ab251ca7
MD5 9d8855b95844858b5ca309860cb241ca
BLAKE2b-256 b804a3416ba0b3b550206c9847ba07e665ac697e5c1fb8bcc75fc27b07597746

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b1-cp312-cp312-win_amd64.whl:

Publisher: release.yml on rdx-lang/rdx

File details

Details for the file rdx_parser-0.1.0b1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for rdx_parser-0.1.0b1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bf5fe61591301bd8eb0b786fef335f7b9be769b2cc7d0195e55b687022a35b1d
MD5 b1f91c43809493508eaacc9076a335f3
BLAKE2b-256 26cdd36469653dcf20a34242731e4efac8c09c28311283129a2c70e6b3c4c715

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on rdx-lang/rdx
