RDX parser for Python — parse .rdx documents at Rust speed

rdx-py

Python bindings for the RDX parser, built with PyO3 and maturin. Parse .rdx documents at Rust speed and get plain Python dicts back.

Installation

pip install rdx-parser

The distribution on PyPI is named rdx-parser; the module it installs is imported as rdx.

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # heading
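
To get a feel for the dict that comes back, it can help to tally the node types it contains. The sketch below assumes only what the usage example shows: nodes are dicts with a "type" key and an optional "children" list. `count_node_types` and the hand-built `sample_ast` are illustrative names, not part of rdx, and the "text" and "paragraph" node types in the sample are guesses. In practice you would pass the result of rdx.parse instead.

```python
from collections import Counter


def count_node_types(node: dict) -> Counter:
    """Recursively tally the "type" of every node in an AST-like dict."""
    counts = Counter()
    node_type = node.get("type")
    if node_type is not None:
        counts[node_type] += 1
    # Assumption: child nodes, when present, live under "children".
    for child in node.get("children") or []:
        if isinstance(child, dict):
            counts += count_node_types(child)
    return counts


# Hand-built stand-in for the result of rdx.parse(...); real ASTs carry more keys,
# and the "text" / "paragraph" type names here are assumptions.
sample_ast = {
    "type": "root",
    "frontmatter": {"title": "API Reference"},
    "children": [
        {"type": "heading", "children": [{"type": "text"}]},
        {"type": "component", "children": [
            {"type": "paragraph", "children": [{"type": "text"}]},
        ]},
    ],
}

print(count_node_types(sample_ast))
```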

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with the built-in transforms applied: auto-slugged headings and a table of contents.

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".
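
This page does not say how the parser reacts to a transform name it does not know, so a caller may want to catch typos before the call. The guard below is a minimal sketch: `KNOWN_TRANSFORMS` and `check_transforms` are hypothetical helpers, and the two names are simply copied from the list above.

```python
# Names copied from the "Available" list above; anything else is treated as a typo.
KNOWN_TRANSFORMS = ("auto-slug", "toc")


def check_transforms(names: list[str]) -> list[str]:
    """Return a copy of names, or raise ValueError listing the unknown ones."""
    unknown = [name for name in names if name not in KNOWN_TRANSFORMS]
    if unknown:
        raise ValueError(
            f"unknown transforms: {unknown}; expected any of {KNOWN_TRANSFORMS}"
        )
    return list(names)


transforms = check_transforms(["auto-slug"])
# With rdx installed: ast = rdx.parse_with_transforms(source, transforms)
```

Keeping the check on the Python side means a misspelled name fails with a message you control, whatever the binding itself does.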

rdx.validate(ast: dict, schema: dict) -> list[dict]

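Validate an AST against a component schema. Returns a list of diagnostic dicts; the example below reads the severity, message, and line keys of each.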

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")
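
In a docs build you will likely want to sort the diagnostics and fail when any of them is an error. The helpers below are a sketch under two assumptions: each diagnostic has the severity, message, and line keys used in the loop above, and an error is marked by the string "error" (the actual severity values are not listed on this page). `has_errors`, `format_diagnostics`, and the sample data are illustrative.

```python
def has_errors(diagnostics: list[dict]) -> bool:
    """True if any diagnostic is an error (assumes severity == "error")."""
    return any(d.get("severity") == "error" for d in diagnostics)


def format_diagnostics(diagnostics: list[dict], path: str) -> list[str]:
    """Render diagnostics as 'path:line: severity: message', sorted by line."""
    ordered = sorted(diagnostics, key=lambda d: d.get("line", 0))
    return [
        f"{path}:{d.get('line', '?')}: {d['severity']}: {d['message']}"
        for d in ordered
    ]


# Hand-written sample in the shape used by the loop above, not real rdx output.
sample = [
    {"severity": "warning", "message": "unknown component", "line": 9},
    {"severity": "error", "message": "missing required prop", "line": 5},
]

for line in format_diagnostics(sample, "docs/api.rdx"):
    print(line)
print("failed" if has_errors(sample) else "ok")
```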

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = len(words) // 200  # whole minutes at ~200 words per minute; 0 for texts under 200 words
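
The floor division above yields 0 for anything shorter than 200 words. If you would rather show "1 min" for short pages, a rounding-up variant is one option. `reading_time_minutes` is a hypothetical helper, and 200 words per minute is just the figure used above.

```python
import math


def reading_time_minutes(text: str, words_per_minute: int = 200) -> int:
    """Estimate reading time in minutes, rounding up; empty text gives 0."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    word_count = len(text.split())
    return math.ceil(word_count / words_per_minute)


# With rdx installed you would pass rdx.collect_text(ast) here.
print(reading_time_minutes("word " * 450))
```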

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")
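
Conceptually, a query like this is a tree search over the AST dict. The function below is a rough pure-Python sketch of that idea, not the library's implementation: it assumes nodes nest under "children" and it may order or scope results differently from rdx.query_all. `find_all` and the sample tree are illustrative.

```python
def find_all(node: dict, node_type: str) -> list[dict]:
    """Depth-first search returning every node whose "type" equals node_type."""
    found = []
    if node.get("type") == node_type:
        found.append(node)
    # Assumption: nested nodes are stored in a "children" list.
    for child in node.get("children") or []:
        if isinstance(child, dict):
            found.extend(find_all(child, node_type))
    return found


# Hand-built tree; the "id" keys exist only to make the result easy to read.
tree = {
    "type": "root",
    "children": [
        {"type": "heading", "id": "h1"},
        {"type": "component", "children": [{"type": "heading", "id": "h2"}]},
    ],
}

print([node["id"] for node in find_all(tree, "heading")])
```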

rdx.version() -> str

Return the RDX parser version string.

RAG / AI Pipeline Example

from __future__ import annotations  # lets the list[str] annotation work on Python 3.8

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    def to_text(nodes):
        # Wrap the nodes in a synthetic root node before extracting their text.
        return rdx.collect_text({"type": "root", "frontmatter": None, "children": nodes, "position": ast["position"]})

    for node in ast["children"]:
        if node["type"] == "heading":
            # A heading closes the previous chunk and starts a new one.
            if current:
                chunks.append(to_text(current))
            current = [node]
        else:
            current.append(node)

    # Flush whatever follows the last heading.
    if current:
        chunks.append(to_text(current))

    return chunks
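
Splitting by heading can still produce chunks that are too long for an embedding model. One common follow-up is to cut long chunks into overlapping word windows. The sketch below is independent of rdx: `split_long_chunk` is a hypothetical helper, and the 200-word window and 20-word overlap are arbitrary defaults, not recommendations from the RDX project.

```python
def split_long_chunk(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into windows of at most max_words words that share `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    if len(words) <= max_words:
        # Short chunks pass through unchanged; blank text yields nothing.
        return [text] if words else []
    step = max_words - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the text
    return windows


# Usage idea: flatten the heading chunks into model-sized pieces.
# pieces = [p for chunk in prepare_for_embedding(source) for p in split_long_chunk(chunk)]
demo = " ".join(f"w{i}" for i in range(10))
print(split_long_chunk(demo, max_words=4, overlap=1))
```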

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.
