Skip to main content

RDX parser for Python — parse .rdx documents at Rust speed

Project description

rdx-py

Python bindings for the RDX parser via PyO3 and maturin. Parse .rdx documents at Rust speed, get plain Python dicts back.

Installation

pip install rdx-parser

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with built-in transforms (auto-slug headings + table of contents).

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".

rdx.validate(ast: dict, schema: dict) -> list[dict]

Validate an AST against a component schema.

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = len(words) // 200  # minutes

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")

rdx.version() -> str

Returns the RDX parser version.

RAG / AI Pipeline Example

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    for node in ast["children"]:
        if node["type"] == "heading":
            if current:
                chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))
            current = [node]
        else:
            current.append(node)

    if current:
        chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))

    return chunks

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rdx_parser-0.1.0b3.tar.gz (48.2 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rdx_parser-0.1.0b3-cp314-cp314-macosx_11_0_arm64.whl (875.5 kB view details)

Uploaded CPython 3.14macOS 11.0+ ARM64

rdx_parser-0.1.0b3-cp312-cp312-win_amd64.whl (788.9 kB view details)

Uploaded CPython 3.12Windows x86-64

rdx_parser-0.1.0b3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB view details)

Uploaded CPython 3.8manylinux: glibc 2.17+ x86-64

File details

Details for the file rdx_parser-0.1.0b3.tar.gz.

File metadata

  • Download URL: rdx_parser-0.1.0b3.tar.gz
  • Upload date:
  • Size: 48.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rdx_parser-0.1.0b3.tar.gz
Algorithm Hash digest
SHA256 b95a61eb7cd859c835241cff0ed057eb8ae5a3e8ce5175d1d83a0485981702a8
MD5 e7ba07faa903df9034003ac9fc534d12
BLAKE2b-256 7643fcbbeb19220d28fdfbab95a9c9c0eb5242675b4e0f3b48b60512c1f2cb7c

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b3.tar.gz:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.0b3-cp314-cp314-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for rdx_parser-0.1.0b3-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 10f9d13886bd99b4bb9e7e3cf17660cf53345ae84c426116d95aa957a99ec4e5
MD5 a9331a1588bd0f53762707ed5d68efe9
BLAKE2b-256 18498ab63d82721a76c32c121b438947c30b8fe3a4496abf836e70e5d24c5aac

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b3-cp314-cp314-macosx_11_0_arm64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.0b3-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for rdx_parser-0.1.0b3-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 5182ba1034ba32fc5f991f0bed80d746fa3d1347301d65e78508f7a046d67d95
MD5 18f1bbd37e75449b556f029040bff453
BLAKE2b-256 f1aa13a518f8a40675f49d38c76657565da1c41824242b31ab2c8bde2faecb87

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b3-cp312-cp312-win_amd64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.0b3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for rdx_parser-0.1.0b3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 65d60a91ac669f6bede5289dab54806b775a7f3c7c1f2b61f2ea312e7d6da613
MD5 a78eb9b708b10106d266facfa78b1f39
BLAKE2b-256 2ee87e77a2d30fca350705068cd30ab54826f0179c795a904e1f13a32a034dee

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.0b3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page