RDX parser for Python — parse .rdx documents at Rust speed

rdx-py

Python bindings for the RDX parser, built with PyO3 and maturin. Parse .rdx documents at Rust speed and get plain Python dicts back.

Installation

pip install rdx-parser

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with built-in transforms (auto-slug headings + table of contents).

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".

rdx.validate(ast: dict, schema: dict) -> list[dict]

Validate an AST against a component schema. Returns a list of diagnostic dicts.

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")
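
One way to act on the diagnostics, for instance in a CI check, is to tally them by severity and decide whether any of them should block a build. The following is a minimal sketch: the helper name `summarize_diagnostics`, the `fail_on` parameter, and the severity strings "error" and "warning" are assumptions for illustration, not part of the rdx API. The sample entries are hand-written in the shape used by the loop above.

```python
from collections import Counter


def summarize_diagnostics(diagnostics, fail_on=("error",)):
    """Count diagnostics per severity and report whether any is blocking.

    The severity names in `fail_on` are assumed values, not confirmed
    by the rdx documentation.
    """
    counts = Counter(d["severity"] for d in diagnostics)
    blocking = any(counts[severity] > 0 for severity in fail_on)
    return dict(counts), blocking


# Hand-written sample data; real entries would come from rdx.validate().
sample = [
    {"severity": "error", "message": "example error message", "line": 5},
    {"severity": "warning", "message": "example warning message", "line": 9},
]

counts, blocking = summarize_diagnostics(sample)
print(counts, blocking)
```

Keeping the blocking severities in a parameter lets a stricter pipeline pass `fail_on=("error", "warning")` without changing the helper.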

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = max(1, len(words) // 200)  # minutes at ~200 words per minute, never below 1
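
The reading-time estimate can be wrapped in a small reusable helper that rounds up, so that short documents still report at least one minute. This is a sketch only: the function name `estimate_reading_minutes` and the 200 words-per-minute default are illustrative assumptions, and the input is expected to be the string returned by rdx.collect_text.

```python
import math


def estimate_reading_minutes(text: str, words_per_minute: int = 200) -> int:
    """Estimate reading time in whole minutes, rounding up.

    Empty text yields 0; any non-empty text yields at least 1.
    """
    word_count = len(text.split())
    if word_count == 0:
        return 0
    return max(1, math.ceil(word_count / words_per_minute))


# Stand-in for the output of rdx.collect_text(ast).
sample_text = "word " * 450
print(estimate_reading_minutes(sample_text))
```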

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")
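
To picture what a type query over the AST involves, here is a pure-Python walk that collects matching nodes in document order. It is not the library's implementation, only a sketch under two assumptions: every node is a dict with a "type" key, and nested nodes live in a "children" list. The function name `find_nodes` and the sample tree are hypothetical.

```python
def find_nodes(node: dict, node_type: str) -> list:
    """Return all nodes of the given type, in document order.

    Assumes nested nodes are stored under a "children" key.
    """
    matches = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.get("type") == node_type:
            matches.append(current)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(current.get("children") or []))
    return matches


# Hand-written tree for illustration; a real one would come from rdx.parse().
tree = {
    "type": "root",
    "children": [
        {"type": "heading", "id": "first", "children": [{"type": "text"}]},
        {
            "type": "component",
            "children": [
                {"type": "paragraph", "children": [{"type": "text"}]},
                {"type": "heading", "id": "nested", "children": []},
            ],
        },
    ],
}

print([h["id"] for h in find_nodes(tree, "heading")])
```

An explicit stack avoids hitting Python's recursion limit on deeply nested documents.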

rdx.version() -> str

Returns the RDX parser version.

RAG / AI Pipeline Example

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    def to_text(nodes):
        # Wrap the nodes in a synthetic root so collect_text can process them.
        root = {"type": "root", "frontmatter": None, "children": nodes, "position": ast["position"]}
        return rdx.collect_text(root)

    for node in ast["children"]:
        if node["type"] == "heading":
            # A heading closes the previous chunk and starts a new one.
            if current:
                chunks.append(to_text(current))
            current = [node]
        else:
            current.append(node)

    # Flush the final chunk.
    if current:
        chunks.append(to_text(current))

    return chunks
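
Sections under a single heading can be long, and embedding models usually accept only a limited amount of input. One possible follow-up step is to cap each chunk at a fixed number of words. This is a sketch: the helper name `split_by_words` and the 200-word default are assumptions, and a real pipeline would pick a limit that suits its model's tokenizer.

```python
def split_by_words(chunk: str, max_words: int = 200) -> list:
    """Split a text chunk into pieces of at most `max_words` words.

    Whitespace is normalized to single spaces in the output.
    """
    words = chunk.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Stand-in for one entry of the list returned by prepare_for_embedding().
pieces = split_by_words("one two three four five", max_words=2)
print(pieces)
```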

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.

Download files

Source Distribution

rdx_parser-0.1.0.tar.gz (53.0 kB)

Built Distributions

rdx_parser-0.1.0-cp314-cp314-macosx_11_0_arm64.whl (882.0 kB): CPython 3.14, macOS 11.0+ ARM64

rdx_parser-0.1.0-cp312-cp312-win_amd64.whl (799.0 kB): CPython 3.12, Windows x86-64

rdx_parser-0.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB): CPython 3.8, manylinux (glibc 2.17+) x86-64
