Skip to main content

RDX parser for Python — parse .rdx documents at Rust speed

Project description

rdx-py

Python bindings for the RDX parser via PyO3 and maturin. Parse .rdx documents at Rust speed, get plain Python dicts back.

Installation

pip install rdx-parser

Usage

import rdx

ast = rdx.parse("""---
title: API Reference
---

# {$title}

<Notice type="warning">
  This endpoint is deprecated.
</Notice>
""")

print(ast["frontmatter"])  # {'title': 'API Reference'}
print(ast["children"][0]["type"])  # 'heading'

API

rdx.parse(input: str) -> dict

Parse an RDX document into an AST dict.

rdx.parse_with_defaults(input: str) -> dict

Parse with built-in transforms (auto-slug headings + table of contents).

rdx.parse_with_transforms(input: str, transforms: list[str]) -> dict

Parse with selected transforms. Available: "auto-slug", "toc".

rdx.validate(ast: dict, schema: dict) -> list[dict]

Validate an AST against a component schema.

schema = {
    "strict": True,
    "components": {
        "Notice": {
            "props": {
                "type": {"type": "enum", "required": True, "values": ["info", "warning", "error"]}
            }
        }
    }
}

diagnostics = rdx.validate(ast, schema)
for d in diagnostics:
    print(f"{d['severity']}: {d['message']} at line {d['line']}")

rdx.collect_text(ast: dict) -> str

Extract all plain text from the AST. Useful for search indexing, embeddings, and reading time estimation.

text = rdx.collect_text(ast)
words = text.split()
reading_time = len(words) // 200  # minutes

rdx.query_all(ast: dict, node_type: str) -> list[dict]

Find all nodes of a given type.

headings = rdx.query_all(ast, "heading")
components = rdx.query_all(ast, "component")

rdx.version() -> str

Returns the RDX parser version.

RAG / AI Pipeline Example

import rdx

def prepare_for_embedding(rdx_source: str) -> list[str]:
    """Parse RDX and split into clean text chunks by heading."""
    ast = rdx.parse(rdx_source)
    chunks = []
    current = []

    for node in ast["children"]:
        if node["type"] == "heading":
            if current:
                chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))
            current = [node]
        else:
            current.append(node)

    if current:
        chunks.append(rdx.collect_text({"type": "root", "frontmatter": None, "children": current, "position": ast["position"]}))

    return chunks

Development

pip install maturin
maturin develop
python -c "import rdx; print(rdx.parse('# Hello'))"

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rdx_parser-0.1.1.tar.gz (55.2 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

rdx_parser-0.1.1-cp314-cp314-macosx_11_0_arm64.whl (886.1 kB view details)

Uploaded CPython 3.14macOS 11.0+ ARM64

rdx_parser-0.1.1-cp312-cp312-win_amd64.whl (799.1 kB view details)

Uploaded CPython 3.12Windows x86-64

rdx_parser-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB view details)

Uploaded CPython 3.8manylinux: glibc 2.17+ x86-64

File details

Details for the file rdx_parser-0.1.1.tar.gz.

File metadata

  • Download URL: rdx_parser-0.1.1.tar.gz
  • Upload date:
  • Size: 55.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rdx_parser-0.1.1.tar.gz
Algorithm Hash digest
SHA256 911d1bf992ae0f515b3a0d0730192926186159eb521791574ff8c82113dfbb8b
MD5 2f479a0ff001badac95aef74022d0f52
BLAKE2b-256 076d914220fb0588c2e48c586f9446c68bd863100b8a062d9c692a426cbf6911

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.1.tar.gz:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.1-cp314-cp314-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for rdx_parser-0.1.1-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 719ddbc023116b4d068ef3d6ed855ef9ffa0be7b61030809f3f7437eff4dcdd9
MD5 a7666cf969e1cbbb88b6af77e038dc76
BLAKE2b-256 6601e0dec200d3513df4a8093fb3d57e9ae036daec02d9d7c44f1e9531a15831

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.1-cp314-cp314-macosx_11_0_arm64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.1-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: rdx_parser-0.1.1-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 799.1 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rdx_parser-0.1.1-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 d814e08585fe487f05fb4558b954001df682c7e600f3034fc4a1bc3ba82a0719
MD5 1dc367473ee8fc7ebb8ae1db81b2bb32
BLAKE2b-256 44382b40020330dd662dfec671b76ff13b047ab1135dbeca2e5a543ed0f1fa88

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.1-cp312-cp312-win_amd64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file rdx_parser-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for rdx_parser-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4ba7678d16121df65874f370aa4baa587c9c61b569b4554d96d7c30900fb71b2
MD5 c4af4a96618f8c754f96779db2440184
BLAKE2b-256 0b1f870662d0a59eea414d7efcfed22418a8519fc96919449fc4b8f0976c667e

See more details on using hashes here.

Provenance

The following attestation bundles were made for rdx_parser-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on rdx-lang/rdx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page