RDX parser for Python — parse .rdx documents at Rust speed

rdx-py

Python bindings for the RDX parser, built with PyO3 and maturin. Parse .rdx documents at Rust speed and get plain Python dicts back. The package installs as rdx-parser and imports as rdx.

Installation

pip install rdx-parser

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'

API

rdx.parse(input: str) -> dict

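Parse an RDX document into an AST dict. The root dict holds the parsed frontmatter under "frontmatter" and the top-level nodes under "children".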

rdx.parse_with_defaults(input: str) -> dict

Parse and apply the built-in transforms: auto-slug headings and table of contents.

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".
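
A small guard can catch typos in transform names before they reach the parser. This is an illustrative sketch only: `KNOWN_TRANSFORMS` mirrors the two names listed above, `check_transforms` is a hypothetical helper that is not part of rdx, and how `rdx.parse_with_transforms` itself reacts to an unknown name is not documented here.

```python
# Hypothetical helper, not part of rdx: validate transform names up front.
KNOWN_TRANSFORMS = ("auto-slug", "toc")  # the two names listed above


def check_transforms(names):
    """Return names as a de-duplicated list, or raise ValueError on unknown ones."""
    unknown = [n for n in names if n not in KNOWN_TRANSFORMS]
    if unknown:
        raise ValueError(
            f"unknown transforms: {unknown}; available: {list(KNOWN_TRANSFORMS)}"
        )
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(names))


# Usage, assuming rdx is installed:
#   import rdx
#   ast = rdx.parse_with_transforms(source, check_transforms(["auto-slug"]))
```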

rdx.validate(ast: dict, schema: dict) -> list[dict]

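Validate an AST against a component schema. Returns a list of diagnostic dicts; the example below reads their severity, message, and line keys.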

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")
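
In a docs build it can be convenient to turn the diagnostics into a pass/fail result. The following is a minimal sketch under stated assumptions: it relies only on the `severity` key used in the loop above, and it assumes that blocking problems carry the severity string `"error"`, which this page does not confirm. `summarize` is a hypothetical helper, not part of rdx, and the sample diagnostics are hand-written.

```python
from collections import Counter


def summarize(diagnostics, fail_on=("error",)):
    """Count diagnostics per severity and report whether any is blocking.

    `fail_on` holds the severity strings treated as fatal; "error" is an
    assumed value, so adjust it to whatever rdx.validate actually emits.
    """
    counts = Counter(d["severity"] for d in diagnostics)
    failed = any(counts[s] > 0 for s in fail_on)
    return dict(counts), failed


# Hand-written sample shaped like the diagnostics printed above.
sample = [
    {"severity": "error", "message": "missing required prop", "line": 7},
    {"severity": "warning", "message": "unknown component", "line": 12},
]
counts, failed = summarize(sample)
```

With real data, pass the list returned by `rdx.validate(ast, schema)` instead of `sample`, and exit non-zero when `failed` is true.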

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
# Assume ~200 words per minute; never report 0 minutes for short documents.
reading_time = max(1, round(len(words) / 200))  # minutes
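
For embeddings, the collected text usually has to be cut into windows that fit a model's input limit. Below is a minimal sketch, assuming whitespace-separated words are an acceptable unit. `window_words` and its `size` and `overlap` parameters are hypothetical names, not part of rdx.

```python
def window_words(text, size=200, overlap=20):
    """Split text into windows of at most `size` words.

    Consecutive windows share `overlap` words so that text near a
    boundary keeps some surrounding context.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return windows


# Usage, assuming rdx is installed:
#   windows = window_words(rdx.collect_text(ast))
```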

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")
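
To make the shape of the result concrete, here is a pure-Python walk that collects nodes by type from a dict tree. It is a sketch of the idea, not rdx's implementation: it assumes only that nodes are dicts with a `type` key and an optional `children` list, as in the usage example, and the traversal order of the real `rdx.query_all` may differ. `find_nodes` is a hypothetical name and the tree is hand-built.

```python
def find_nodes(node, node_type):
    """Return every dict in the tree whose "type" equals node_type, depth-first."""
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.get("type") == node_type:
            found.append(current)
        # Push children reversed so they are visited in document order.
        stack.extend(reversed(current.get("children") or []))
    return found


# Hand-built tree for illustration; real ASTs come from rdx.parse.
tree = {
    "type": "root",
    "children": [
        {"type": "heading", "children": [{"type": "text"}]},
        {"type": "component", "children": [{"type": "heading"}]},
    ],
}
headings = find_nodes(tree, "heading")
```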

rdx.version() -> str

Return the RDX parser version as a string.

RAG / AI Pipeline Example

from __future__ import annotations  # keeps list[str] annotations valid on Python 3.8

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split it into plain-text chunks, one per heading section."""
    ast = rdx.parse(rdx_source)

    def section_text(nodes: list[dict]) -> str:
        # Wrap the nodes in a synthetic root so collect_text receives a full AST.
        root = {"type": "root", "frontmatter": None, "children": nodes, "position": ast["position"]}
        return rdx.collect_text(root)

    chunks = []
    current = []

    for node in ast["children"]:
        if node["type"] == "heading":
            # A heading starts a new section; flush the one before it.
            if current:
                chunks.append(section_text(current))
            current = [node]
        else:
            current.append(node)

    # Flush the final section.
    if current:
        chunks.append(section_text(current))

    return chunks

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.
