Deterministic RLang compiler with cryptographic proof generation for BoR (Blockchain of Reasoning)

RLang Compiler — Deterministic Reasoning Pipeline with Cryptographic Proof Generation

A first-principles compiler that translates RLang source code into executable reasoning pipelines with cryptographic proof generation compatible with the BoR (Blockchain of Reasoning) system. This compiler provides bit-for-bit deterministic execution suitable for trustless verification and cryptographic auditing.

Installation: pip install rlang-compiler
Documentation: See docs/compiler_physics.md for formal specification
Playbook: See docs/compiler_expansion_playbook.md for extension guidelines

Quick Onboarding Guide (Start Here)

What RLang Is

RLang is a deterministic domain-specific language (DSL) designed for building verifiable reasoning pipelines. The compiler translates RLang source code into a canonical intermediate representation (IR) that serves as the "physics layer" for deterministic execution. Every program execution produces a cryptographically verifiable proof bundle compatible with the BoR (Blockchain of Reasoning) system, enabling trustless verification of computation results.

The compiler enforces three non-negotiable invariants: deterministic semantics (same input always produces same output), deterministic proof shape (same execution always produces same trace), and single-source specification (canonical representation ensures hash stability). These invariants are analogous to physical laws—they cannot be violated without breaking fundamental guarantees.

Installation and Setup

Install via PyPI:

pip install rlang-compiler

Install for local development:

git clone https://github.com/your-org/Compiler_implementation.git
cd Compiler_implementation
pip install -e ".[dev,test]"
./run_all.sh

Minimal Working Example

Create a file examples/basic.rlang:

fn inc(x: Int) -> Int;

pipeline main(Int) -> Int {
  inc
}

Compile and inspect output:

rlangc examples/basic.rlang --out out/basic.json

The output JSON contains the canonical IR representation of your program.

Proof Generation and Verification

Generate a proof bundle:

./verify_bundle.sh

Verify with BoR CLI:

borp verify-bundle --bundle out/rich_proof_bundle.json

These commands compile an RLang program, execute it with a provided input, generate a cryptographic proof bundle containing execution traces (TRP), and verify the bundle's integrity using BoR-compatible hashing (HMASTER, HRICH).

Python API Quickstart

from rlang.bor import run_program_with_proof

source = """
fn inc(x: Int) -> Int;
pipeline main(Int) -> Int { inc }
"""

bundle = run_program_with_proof(
    source=source,
    input_value=10,
    fn_registry={"inc": lambda x: x + 1}
)

print("Output:", bundle.output_value)  # 11

Determinism Demonstration (10-second test)

from rlang.bor import run_program_with_proof
import hashlib
import json

src = """
fn inc(x: Int) -> Int;
pipeline main(Int) -> Int { inc }
"""

def compute_hash():
    b = run_program_with_proof(src, 42, fn_registry={"inc": lambda x: x + 1})
    j = json.dumps(b.to_dict(), sort_keys=True)
    return hashlib.sha256(j.encode()).hexdigest()

h1 = compute_hash()
h2 = compute_hash()
assert h1 == h2  # Always true: deterministic execution
print("Determinism verified:", h1 == h2)

This works because RLang execution is purely functional and deterministic: as long as every function in fn_registry is pure, the same program and input always produce identical proof bundles, which is what makes cryptographic verification possible.

End-to-End Compiler & Proof Flow

RLang Source
    |
    v
[Parser] → [Resolver] → [Type Checker]
    |
    v
[IR Lowering] → [Canonical JSON]
    |
    v
[Execution Engine] → [Proof Bundle] → [HRICH]

Or as a Mermaid diagram:

flowchart LR
    A[RLang Source] --> B[Parser]
    B --> C[Resolver]
    C --> D[Type Checker]
    D --> E[IR Lowering]
    E --> F[Canonical JSON]
    F --> G[Execution & Proof Generation]
    G --> H[HRICH Verification]

Where to Go Next

Language Specification: See docs/language.md for complete RLang syntax and semantics
Compiler Architecture: See Architecture Overview for component classification and modification rules
Proof System Documentation: See docs/proof-system.md for BoR integration details
Developer Workflows: See Extension Guidelines for adding new language features
Tests and Golden Files: See Testing & Verification for test suite structure

Table of Contents

1. First Principles: The Three Non-Negotiable Invariants
2. Architecture Overview
3. Language Semantics (Formal)
4. IR Specification: The Physics Layer
5. Canonicalization Specification
6. Execution Semantics
7. Proof System Architecture
8. The Untouchable Core (Frozen Physics)
9. Expandable Surfaces (Safe to Extend)
10. Quick Start
11. API Reference
12. Testing & Verification
13. Extension Guidelines

1. First Principles: The Three Non-Negotiable Invariants

The RLang compiler is built on three non-negotiable invariants that define the "physics layer" of deterministic computation. These invariants are analogous to physical laws—they cannot be violated without breaking fundamental guarantees.

Invariant 1: Deterministic Semantics Invariant

Formal Definition:

For any RLang program P and input value x, there exists a unique output value y such that:

Eval(P, x) = y

This must hold regardless of:

Execution environment (OS, hardware, Python version)
Execution time (today vs. tomorrow)
Execution order (if multiple valid orders exist, they must be equivalent)
Random number generators (none allowed)
External state (none allowed)

Mathematical Properties:

Functionality: ∀P, x. ∃!y. Eval(P, x) = y
Reproducibility: repeated evaluations of Eval(P, x) always yield the same y
Compositionality: Eval(P₁; P₂, x) = Eval(P₂, Eval(P₁, x))

Violation Examples:

FORBIDDEN: Using time.time() in function registry
FORBIDDEN: Reading from /dev/urandom
FORBIDDEN: Non-deterministic iteration order
FORBIDDEN: Floating-point operations that vary by platform

ALLOWED: Pure mathematical operations
ALLOWED: Deterministic string operations
ALLOWED: Fixed-order list operations
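
The difference between an allowed and a forbidden registry function can be illustrated with a small repeatability check. This is a sketch only: `is_repeatable` and both registries are hypothetical names introduced here, not part of the rlang-compiler API.

```python
import itertools

# Hypothetical registries, for illustration only.
pure_registry = {"inc": lambda x: x + 1}

_counter = itertools.count()  # hidden mutable state
impure_registry = {"inc": lambda x: x + next(_counter)}


def is_repeatable(fn, value, runs=5):
    """Return True if fn(value) yields the same result on every call."""
    results = [fn(value) for _ in range(runs)]
    return all(r == results[0] for r in results)


print(is_repeatable(pure_registry["inc"], 41))    # True
print(is_repeatable(impure_registry["inc"], 41))  # False
```

A check like this can expose non-determinism but cannot prove its absence, so it complements the rules above rather than replacing them.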

Invariant 2: Deterministic Proof Shape Invariant

Formal Definition:

For any RLang program P and input value x, there exists a unique execution trace trace such that:

TRP(P, x) = trace

The trace must be:

Complete: Every step execution is recorded
Ordered: Steps appear in execution order
Deterministic: Same execution → same trace
Canonical: Trace structure is stable across serializations

Trace Structure (TRP v1):

trace = {
    "steps": [
        {
            "index": int,           # 0-based step index
            "step_name": str,        # Function name
            "template_id": str,     # Template reference
            "input": Any,           # Input snapshot
            "output": Any           # Output snapshot
        },
        ...
    ],
    "branches": [
        {
            "index": int,           # IF step index
            "path": "then" | "else",
            "condition_value": bool
        },
        ...
    ]
}

Hash Invariants:

Hash(canonical(P)) = H_IR = HMASTER           # Program IR hash (master hash, see Hashing Model)
Hash(subproof hashes, including TRP) = HRICH  # Rich proof hash covering the execution trace

Violation Examples:

FORBIDDEN: Recording steps in non-deterministic order
FORBIDDEN: Including timestamps in trace
FORBIDDEN: Non-deterministic trace serialization
FORBIDDEN: Omitting steps from trace

ALLOWED: Recording all steps in execution order
ALLOWED: Canonical JSON serialization
ALLOWED: Deterministic branch recording

Invariant 3: Single-Source Specification Invariant

Formal Definition:

For any RLang program P, there exists a unique canonical representation canonical(P) such that:

canonical(P₁) = canonical(P₂) ⟺ P₁ ≡ P₂

Where ≡ denotes semantic equivalence.

Canonical Representation Rules:

Key Ordering: All dictionary keys must be sorted alphabetically
Value Normalization: Floats are normalized (integral floats become integers, all others are rounded to 10 decimal places)
Structure Stability: Same structure → same JSON string
Encoding Stability: UTF-8, no BOM, consistent line endings

Hash Stability:

Hash(canonical(P)) = H_IR

This hash must be stable across:

Different compiler versions (if semantics unchanged)
Different platforms
Different Python versions
Different serialization libraries

Violation Examples:

FORBIDDEN: Non-deterministic key ordering
FORBIDDEN: Platform-dependent float formatting
FORBIDDEN: Non-canonical JSON serialization
FORBIDDEN: Including compiler metadata in canonical form

ALLOWED: Alphabetically sorted keys
ALLOWED: Normalized float representation
ALLOWED: Consistent JSON formatting

2. Architecture Overview

Compilation Pipeline

┌─────────┐
│ Source  │
│  Code   │
└────┬────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│                    FRONTEND (EXTENSION-SAFE)                │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐             │
│  │  Lexer   │───▶│  Parser  │───▶│ Resolver │             │
│  │          │    │          │    │          │             │
│  │ PLUGGABLE│    │ PLUGGABLE│    │ PLUGGABLE│             │
│  └──────────┘    └──────────┘    └──────────┘             │
│                                                              │
│                          │                                   │
│                          ▼                                   │
│                  ┌──────────────┐                            │
│                  │ Type Checker │                            │
│                  │              │                            │
│                  │ EXTENSION-   │                            │
│                  │ SAFE         │                            │
│                  └──────────────┘                            │
└──────────────────────────┬───────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│              MIDDLE-END (SAFE BUT STRICT)                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│                  ┌──────────────┐                            │
│                  │   Lowering   │                            │
│                  │              │                            │
│                  │ MUST REMAIN  │                            │
│                  │ DETERMINISTIC│                            │
│                  └──────┬───────┘                            │
│                         │                                     │
│                         ▼                                     │
│                  ┌──────────────┐                            │
│                  │      IR      │                            │
│                  │              │                            │
│                  │   PHYSICS    │                            │
│                  │    LAYER     │                            │
│                  └──────┬───────┘                            │
└─────────────────────────┼─────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│              BACKEND (VERY SENSITIVE)                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  ┌──────────────┐    ┌──────────────┐                      │
│  │ Canonicalizer│───▶│   Executor   │                      │
│  │              │    │              │                      │
│  │    FIXED     │    │ MUST REMAIN  │                      │
│  │              │    │ DETERMINISTIC│                      │
│  └──────┬───────┘    └──────┬───────┘                      │
│         │                    │                               │
│         ▼                    ▼                               │
│  ┌──────────────┐    ┌──────────────┐                      │
│  │   Canonical  │    │  Proof Trace │                      │
│  │     JSON     │    │   (TRP v1)   │                      │
│  │              │    │              │                      │
│  │    FIXED     │    │    FIXED     │                      │
│  └──────┬───────┘    └──────┬───────┘                      │
│         │                    │                               │
│         └──────────┬─────────┘                               │
│                    ▼                                         │
│            ┌──────────────┐                                  │
│            │   Hashing    │                                  │
│            │              │                                  │
│            │ HMASTER/     │                                  │
│            │ HRICH        │                                  │
│            │              │                                  │
│            │    FIXED     │                                  │
│            └──────────────┘                                  │
└─────────────────────────────────────────────────────────────┘

Component Classification

Component	Classification	Rationale
Lexer	`PLUGGABLE`	Tokenization is syntax-level; can extend for new keywords/symbols
Parser	`PLUGGABLE`	AST construction is syntax-level; can add new AST nodes
Resolver	`PLUGGABLE`	Symbol resolution is syntax-level; can extend symbol table
Type Checker	`EXTENSION-SAFE`	Type checking must remain deterministic but can add new types
Lowering	`MUST REMAIN DETERMINISTIC`	IR generation must preserve semantics deterministically
IR	`PHYSICS LAYER`	IR structure defines execution model; changes break proofs
Canonicalizer	`FIXED`	Canonical JSON rules cannot change without breaking hashes
Executor	`MUST REMAIN DETERMINISTIC`	Execution semantics must remain deterministic
Proof System	`FIXED`	TRP structure is frozen; extensions via versioning
Hashing	`FIXED`	Hash algorithms and structure are frozen

3. Language Semantics (Formal)

Type System

Primitive Types

RLang defines five primitive types:

Int: Signed integers, specified as 64-bit (represented as Python int, which is itself unbounded)
Float: IEEE 754 double-precision floating-point (Python float)
String: UTF-8 encoded strings (Python str)
Bool: Boolean values true / false (Python bool)
Unit: Unit type (Python None)

Type Semantics:

Type ::= Int | Float | String | Bool | Unit

Type Equivalence:

Two types T₁ and T₂ are equivalent (T₁ ≡ T₂) if:

Both are primitive and have the same name, OR
Both are generic with same name and equivalent type arguments

Type Aliases

Type aliases provide semantic meaning:

type UserId = Int;
type Email = String;

Semantics:

type_alias ::= type IDENTIFIER = TypeExpr;

Type aliases are transparent during type checking—they resolve to their underlying types.
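
Transparent alias resolution can be sketched as a lookup that follows alias links until it reaches a primitive. The function name `resolve_alias`, the alias table layout, and the extra alias `AdminId` are assumptions made for this example; the compiler's actual resolver may work differently.

```python
PRIMITIVES = {"Int", "Float", "String", "Bool", "Unit"}


def resolve_alias(name, aliases):
    """Follow alias links until a primitive type name is reached."""
    seen = set()
    while name not in PRIMITIVES:
        if name in seen:
            raise ValueError(f"cyclic type alias: {name}")
        if name not in aliases:
            raise KeyError(f"unknown type: {name}")
        seen.add(name)
        name = aliases[name]
    return name


# UserId and Email come from the example above; AdminId is hypothetical.
aliases = {"UserId": "Int", "Email": "String", "AdminId": "UserId"}

print(resolve_alias("AdminId", aliases))  # Int
print(resolve_alias("UserId", aliases) == resolve_alias("Int", aliases))  # True
```

Because both sides resolve to the same primitive, a value of type UserId type-checks wherever an Int is expected.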

Expressions

Literal Expressions

Literal ::= INTEGER | FLOAT | STRING | BOOLEAN

Evaluation:

Eval(42) = 42
Eval(3.14) = 3.14
Eval("hello") = "hello"
Eval(true) = True
Eval(false) = False

Identifier Expressions

Identifier ::= IDENTIFIER

Special Identifiers:

__value: Current pipeline value (runtime context)

Evaluation:

Eval(__value, ctx) = ctx.current_value

Binary Operations

BinaryOp ::= Expr OP Expr
OP ::= + | - | * | / | > | < | >= | <= | == | !=

Arithmetic Operations:

Eval(e₁ + e₂, ctx) = Eval(e₁, ctx) + Eval(e₂, ctx)
Eval(e₁ - e₂, ctx) = Eval(e₁, ctx) - Eval(e₂, ctx)
Eval(e₁ * e₂, ctx) = Eval(e₁, ctx) * Eval(e₂, ctx)
Eval(e₁ / e₂, ctx) = Eval(e₁, ctx) / Eval(e₂, ctx)  [if Eval(e₂, ctx) ≠ 0]

Comparison Operations:

Eval(e₁ > e₂, ctx) = Eval(e₁, ctx) > Eval(e₂, ctx)
Eval(e₁ < e₂, ctx) = Eval(e₁, ctx) < Eval(e₂, ctx)
Eval(e₁ >= e₂, ctx) = Eval(e₁, ctx) >= Eval(e₂, ctx)
Eval(e₁ <= e₂, ctx) = Eval(e₁, ctx) <= Eval(e₂, ctx)
Eval(e₁ == e₂, ctx) = Eval(e₁, ctx) == Eval(e₂, ctx)
Eval(e₁ != e₂, ctx) = Eval(e₁, ctx) != Eval(e₂, ctx)
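
The evaluation rules above can be sketched as a small recursive evaluator. The tuple encoding of expressions and the name `eval_expr` are assumptions for illustration, and `/` is mapped to Python true division here because the rules do not say whether Int / Int is integer division.

```python
import operator

# Operators from the grammar above, mapped to Python callables.
OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv,  # assumption: true division
    ">": operator.gt, "<": operator.lt, ">=": operator.ge,
    "<=": operator.le, "==": operator.eq, "!=": operator.ne,
}


def eval_expr(expr, current_value):
    """Evaluate a tuple-encoded expression against the current pipeline value."""
    kind = expr[0]
    if kind == "literal":
        return expr[1]
    if kind == "identifier":
        if expr[1] != "__value":
            raise NameError(f"unknown identifier: {expr[1]}")
        return current_value
    if kind == "binary_op":
        _, op, left, right = expr
        lhs = eval_expr(left, current_value)   # left operand is evaluated first
        rhs = eval_expr(right, current_value)
        if op == "/" and rhs == 0:
            raise ZeroDivisionError("division is undefined for a zero divisor")
        return OPS[op](lhs, rhs)
    raise ValueError(f"unknown expression kind: {kind}")


# __value > 10, evaluated with the pipeline value set to 42
cond = ("binary_op", ">", ("identifier", "__value"), ("literal", 10))
print(eval_expr(cond, 42))  # True
```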

Type Rules:

Arithmetic: Int + Int → Int, Float + Float → Float, Int + Float → Float
Comparison: T × T → Bool (for comparable types)
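
As a sketch of how these type rules might be checked, the function below returns the result type of a binary operation. It assumes the rule stated for + also applies to - and *, and it treats every type as comparable with itself; the result type of / is deliberately left out because the rules above do not state it. The name `binary_result_type` is hypothetical.

```python
NUMERIC = {"Int", "Float"}
ARITHMETIC = {"+", "-", "*"}  # assumption: - and * follow the rule given for +
COMPARISON = {">", "<", ">=", "<=", "==", "!="}


def binary_result_type(op: str, left: str, right: str) -> str:
    """Return the result type of `left op right`, or raise on a type error."""
    if op == "/":
        raise NotImplementedError("result type of / is not stated in the rules")
    if op in ARITHMETIC:
        if left not in NUMERIC or right not in NUMERIC:
            raise TypeError(f"{op} expects numeric operands, got {left} and {right}")
        # Int op Int stays Int; any Float operand promotes the result to Float
        return "Int" if left == right == "Int" else "Float"
    if op in COMPARISON:
        if left != right:
            raise TypeError(f"cannot compare {left} with {right}")
        return "Bool"
    raise ValueError(f"unknown operator: {op}")


print(binary_result_type("+", "Int", "Int"))    # Int
print(binary_result_type("+", "Int", "Float"))  # Float
print(binary_result_type(">", "Int", "Int"))    # Bool
```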

Function Calls

Call ::= IDENTIFIER ( Expr₁, ..., Exprₙ )

Evaluation:

Eval(f(e₁, ..., eₙ), ctx) = fn_registry[f](Eval(e₁, ctx), ..., Eval(eₙ, ctx))

Type Rules:

f : T₁ × ... × Tₙ → T
e₁ : T₁, ..., eₙ : Tₙ
─────────────────────────
f(e₁, ..., eₙ) : T

Conditional Expressions (v0.2+)

IfExpr ::= if ( Expr ) { Steps } [ else { Steps } ]

Evaluation:

Eval(if (c) { s₁ } else { s₂ }, ctx) = 
    if Eval(c, ctx) then Eval(s₁, ctx) else Eval(s₂, ctx)

Type Rules:

c : Bool
s₁ : T
s₂ : T
─────────────────────────
if (c) { s₁ } else { s₂ } : T

Determinism Requirement:

The condition c must be a pure expression—no side effects, no randomness, no time-dependent operations.

Pipeline Semantics

Pipeline Definition

Pipeline ::= pipeline IDENTIFIER ( Type ) -> Type { Steps }
Steps ::= Step₁ -> Step₂ -> ... -> Stepₙ

Evaluation:

Eval(pipeline main(T_in) -> T_out { s₁ -> ... -> sₙ }, x) =
    Eval(sₙ, Eval(sₙ₋₁, ..., Eval(s₁, x)...))

Composition:

Eval(s₁ -> s₂, x) = Eval(s₂, Eval(s₁, x))
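
The composition rule amounts to a left fold over the step list. The sketch below uses plain Python callables in place of compiled steps; `run_pipeline` and the `double` function are hypothetical names used only for this example.

```python
from functools import reduce


def run_pipeline(steps, fn_registry, x):
    """Apply each named step in order, feeding every output to the next step."""
    return reduce(lambda value, name: fn_registry[name](value), steps, x)


fn_registry = {"inc": lambda x: x + 1, "double": lambda x: x * 2}

print(run_pipeline(["inc", "double"], fn_registry, 10))  # 22

# Composition law: running s1 -> s2 equals running s2 on the result of s1.
whole = run_pipeline(["inc", "double"], fn_registry, 10)
staged = run_pipeline(["double"], fn_registry, run_pipeline(["inc"], fn_registry, 10))
print(whole == staged)  # True
```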

Step Semantics

Function Step:

Eval(f, x) = fn_registry[f](x)

Conditional Step:

Eval(if (c) { s₁ } else { s₂ }, x) =
    if Eval(c, x) then Eval(s₁, x) else Eval(s₂, x)

Deterministic Requirements

No Randomness

FORBIDDEN:

Random number generation
Non-deterministic algorithms
Probabilistic data structures

No I/O

FORBIDDEN:

File system access
Network operations
Standard input/output
Environment variables (except compile-time)

No Time Dependence

FORBIDDEN:

Timestamps
System time
Date/time operations

Fixed Evaluation Order

REQUIRED:

Left-to-right evaluation
Sequential pipeline execution
Deterministic branch selection

4. IR Specification: The Physics Layer

The Intermediate Representation (IR) is the physics layer of RLang. It defines:

What can be executed: Only IR nodes can appear in execution traces
How execution proceeds: IR structure determines execution order
What is provable: Only IR-level operations generate proof records

IR Invariants:

Purity: Every IR node is pure (no side effects)
Determinism: IR evaluation is deterministic
Canonicalizability: Every IR node can be serialized to canonical JSON
Completeness: All semantic constructs must lower to IR

Current IR Node Types (v0.2.2)

IRExpr

Base class for all expressions in IR.

@dataclass(frozen=True)
class IRExpr:
    kind: str  # "literal" | "identifier" | "binary_op" | "call" | "boolean_and" | "boolean_or" | "boolean_not" | "record" | "field_access" | "list"
    # ... fields depend on kind

Kinds:

literal: Literal values
identifier: Variable references (e.g., __value)
binary_op: Binary operations (+, -, *, /, >, <, etc.)
call: Function calls
boolean_and: Boolean AND (&&)
boolean_or: Boolean OR (||)
boolean_not: Boolean NOT (!)
record: Record construction { field1: expr1, ... }
field_access: Field access obj.field
list: List construction [expr1, expr2, ...]

IRIf

Conditional execution node.

@dataclass(frozen=True)
class IRIf:
    condition: IRExpr
    then_steps: list[PipelineStepIR]
    else_steps: list[PipelineStepIR]

Semantics:

Condition must evaluate to Bool
Both branches must produce same output type
Execution is deterministic based on condition value

PipelineStepIR

Single step in a pipeline.

@dataclass(frozen=True)
class PipelineStepIR:
    index: int
    name: str
    template_id: str
    arg_types: list[str]
    input_type: str | None
    output_type: str | None

PipelineIR

Complete pipeline definition.

@dataclass(frozen=True)
class PipelineIR:
    id: str
    name: str
    input_type: str | None
    output_type: str | None
    steps: list[PipelineStepIR | IRIf]

Rules for Adding New IR Nodes

Every new IR node MUST:

Be Pure: No side effects, no hidden state
Be Deterministic: Same inputs → same outputs
Be Canonicalizable: Implement to_dict() with sorted keys
Have Fixed Evaluation Order: No non-deterministic iteration
Preserve Type Information: Include type annotations

Example: Adding IRRecord (v0.3)

@dataclass(frozen=True)
class IRRecord:
    """IR representation of a record construction."""
    fields: dict[str, IRExpr]  # Field name → expression
    
    def to_dict(self) -> dict[str, Any]:
        """Canonical dictionary representation."""
        return {
            "fields": {
                k: v.to_dict() 
                for k, v in sorted(self.fields.items())  # Sorted!
            },
            "kind": "record"
        }

Key Point: Record fields must be sorted alphabetically to ensure canonical representation.
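
A self-contained check of that key point is sketched below. `IRLiteral` is a minimal stand-in invented for this example, not the compiler's actual literal node; the point is only that two records built in different field orders serialize identically.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class IRLiteral:
    """Minimal stand-in for a literal expression node (hypothetical)."""
    value: Any

    def to_dict(self) -> dict[str, Any]:
        return {"kind": "literal", "value": self.value}


@dataclass(frozen=True)
class IRRecord:
    fields: dict[str, IRLiteral]

    def to_dict(self) -> dict[str, Any]:
        return {
            "fields": {k: v.to_dict() for k, v in sorted(self.fields.items())},
            "kind": "record",
        }


def dump(node) -> str:
    # sort_keys is left off on purpose: ordering comes from to_dict() alone
    return json.dumps(node.to_dict(), separators=(",", ":"))


a = IRRecord({"name": IRLiteral("x"), "age": IRLiteral(3)})
b = IRRecord({"age": IRLiteral(3), "name": IRLiteral("x")})
print(dump(a) == dump(b))  # True
```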

5. Canonicalization Specification

Canonical JSON is the stable serialization format that ensures:

Same data structure → same JSON string
Same JSON string → same hash
Deterministic across platforms and Python versions

Key Ordering Rule

RULE: All dictionary keys must be sorted alphabetically.

Implementation:

def canonical_dumps(obj: Any) -> str:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

Example:

{"b": 2, "a": 1} → '{"a":1,"b":2}'

Why This Matters:

Non-deterministic key ordering breaks hash stability:

# WRONG
{"b": 2, "a": 1} → hash₁
{"a": 1, "b": 2} → hash₂  # Different hash!

# CORRECT
{"b": 2, "a": 1} → '{"a":1,"b":2}' → hash
{"a": 1, "b": 2} → '{"a":1,"b":2}' → hash  # Same hash!
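
The same comparison can be run directly with the standard library. This sketch reuses the `canonical_dumps` definition shown above; `sha256_hex` is a helper name introduced here.

```python
import hashlib
import json


def canonical_dumps(obj):
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


a = {"b": 2, "a": 1}
b = {"a": 1, "b": 2}

# Without sorting, insertion order leaks into the serialized string.
print(json.dumps(a) == json.dumps(b))  # False

# With canonical serialization, both dicts yield the same string and hash.
print(canonical_dumps(a))  # {"a":1,"b":2}
print(sha256_hex(canonical_dumps(a)) == sha256_hex(canonical_dumps(b)))  # True
```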

Float Normalization Rule

RULE: Floats must be normalized to ensure platform-independent representation.

Implementation:

def _normalize_floats(obj: Any) -> Any:
    if isinstance(obj, float):
        if obj.is_integer():
            return int(obj)  # 3.0 → 3
        return round(obj, 10)  # Round to 10 decimal places
    elif isinstance(obj, dict):
        return {k: _normalize_floats(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [_normalize_floats(item) for item in obj]
    return obj
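
The effect of this rule is easiest to see on a value that carries floating-point noise. The function below repeats the logic shown above under a public name so the example runs on its own; the sample data is made up.

```python
import json


def normalize_floats(obj):
    # Same logic as _normalize_floats above, repeated to keep this runnable.
    if isinstance(obj, float):
        if obj.is_integer():
            return int(obj)       # 3.0 -> 3
        return round(obj, 10)     # drop noise beyond 10 decimal places
    if isinstance(obj, dict):
        return {k: normalize_floats(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize_floats(item) for item in obj]
    return obj


raw = {"total": 3.0, "ratio": 0.1 + 0.2, "items": [1.5, 2.0]}

print(json.dumps(raw, sort_keys=True))
# {"items": [1.5, 2.0], "ratio": 0.30000000000000004, "total": 3.0}

print(json.dumps(normalize_floats(raw), sort_keys=True))
# {"items": [1.5, 2], "ratio": 0.3, "total": 3}
```

Note that NaN and infinity pass through this rule unchanged, and json.dumps would emit them as NaN and Infinity, which are not valid JSON; rejecting them before canonicalization may be worth considering.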

Whitespace Rule

RULE: Minimal whitespace (compact JSON) unless indentation is explicitly requested.

Implementation:

# Compact (default)
json.dumps(obj, separators=(",", ":"))  # No spaces

# Pretty (for debugging)
json.dumps(obj, indent=2)  # 2-space indentation

Encoding Rule

RULE: UTF-8 encoding, no BOM, consistent line endings.

Implementation:

canonical_json.encode("utf-8")
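
The reason the BOM matters is that it changes the bytes being hashed even though the text is identical. The sketch below compares Python's "utf-8" and "utf-8-sig" codecs on a made-up payload.

```python
import codecs
import hashlib
import json

payload = json.dumps(
    {"name": "café"}, sort_keys=True, separators=(",", ":"), ensure_ascii=False
)

plain = payload.encode("utf-8")         # no BOM
with_bom = payload.encode("utf-8-sig")  # prepends the 3-byte UTF-8 BOM

print(plain.startswith(codecs.BOM_UTF8))     # False
print(with_bom.startswith(codecs.BOM_UTF8))  # True

# Same text, different bytes, therefore different hashes.
print(hashlib.sha256(plain).hexdigest() == hashlib.sha256(with_bom).hexdigest())  # False
```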

What Breaks Determinism

FORBIDDEN:

Non-deterministic key ordering
Platform-dependent float representation
Non-canonical JSON serialization
Including metadata in canonical form
Non-deterministic whitespace

REQUIRED:

Alphabetically sorted keys
Normalized floats
Canonical JSON serialization
Pure data structures only
Consistent encoding

6. Execution Semantics

RLang execution is purely functional and deterministic:

No mutable state
No side effects
No I/O operations
No randomness

Function Application

Semantics:

Apply(f, x) = fn_registry[f](x)

Requirements:

fn_registry[f] must be a pure function
No side effects allowed
Deterministic output for same input

Step Execution

Sequential Execution:

Execute([s₁, ..., sₙ], x₀) =
    let x₁ = Execute(s₁, x₀) in
    let x₂ = Execute(s₂, x₁) in
    ...
    let xₙ = Execute(sₙ, xₙ₋₁) in
    xₙ

Trace Recording:

Each step execution produces a StepExecutionRecord:

StepExecutionRecord(
    index=i,
    step_name=name,
    template_id=template_id,
    input_snapshot=xᵢ,
    output_snapshot=xᵢ₊₁
)
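
Sequential execution with trace recording can be sketched as a loop that snapshots the value before and after each step. Records are plain dicts here, using the field names from the TRP v1 trace structure; `template_id` is omitted because how it is derived is not covered in this section, and `execute_with_trace` is a hypothetical name.

```python
def execute_with_trace(step_names, fn_registry, x0):
    """Run steps in order and return (final_value, step_records)."""
    records = []
    value = x0
    for index, name in enumerate(step_names):  # 0-based, in execution order
        output = fn_registry[name](value)
        records.append({
            "index": index,
            "step_name": name,
            "input": value,
            "output": output,
        })
        value = output  # the output of step i is the input of step i + 1
    return value, records


fn_registry = {"inc": lambda x: x + 1, "double": lambda x: x * 2}
result, trace = execute_with_trace(["inc", "double"], fn_registry, 10)

print(result)  # 22
for record in trace:
    print(record)
```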

Conditional Execution

Branch Selection:

Execute(IRIf(condition=c, then_steps=t, else_steps=e), x) =
    if Eval(c, x) then
        Execute(t, x)
    else
        Execute(e, x)

Branch Recording:

Each conditional execution produces a BranchExecutionRecord:

BranchExecutionRecord(
    index=i,
    path="then" | "else",
    condition_value=bool
)

Determinism:

Same condition value → same branch path → same execution trace.
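
A minimal sketch of branch selection and recording follows. The condition is passed as a Python callable rather than an IRExpr, and `half` is implemented with integer division; both are assumptions made to keep the example self-contained, as is the name `execute_if`.

```python
def execute_if(condition, then_steps, else_steps, fn_registry, x, index=0):
    """Evaluate the condition once, run the chosen branch, and record the choice."""
    condition_value = bool(condition(x))
    path = "then" if condition_value else "else"
    branch_record = {
        "index": index,
        "path": path,
        "condition_value": condition_value,
    }
    value = x
    for name in (then_steps if condition_value else else_steps):
        value = fn_registry[name](value)
    return value, branch_record


fn_registry = {"double": lambda x: x * 2, "half": lambda x: x // 2}


def over_ten(v):
    return v > 10  # corresponds to: if (__value > 10)


print(execute_if(over_ten, ["double"], ["half"], fn_registry, 42))
# (84, {'index': 0, 'path': 'then', 'condition_value': True})
print(execute_if(over_ten, ["double"], ["half"], fn_registry, 4))
# (2, {'index': 0, 'path': 'else', 'condition_value': False})
```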

7. Proof System Architecture

TRP v1 (Current)

TRP (Trace of Reasoning Process) is the execution trace format.

Structure

PipelineProofBundle(
    version: str,
    language: str,
    entry_pipeline: str | None,
    program_ir: PrimaryProgramIR,
    input_value: Any,
    output_value: Any,
    steps: List[StepExecutionRecord],
    branches: List[BranchExecutionRecord]
)

Step Records

StepExecutionRecord(
    index: int,           # 0-based step index
    step_name: str,        # Function name
    template_id: str,      # Template reference
    input_snapshot: Any,   # Input value
    output_snapshot: Any   # Output value
)

Branch Records

BranchExecutionRecord(
    index: int,           # IF step index
    path: str,            # "then" | "else"
    condition_value: bool  # Condition evaluation result
)

Hashing Model

HMASTER

Definition:

HMASTER = Hash(canonical(program_ir))

Computation:

def compute_HMASTER(program_ir: PrimaryProgramIR) -> str:
    canonical_json = program_ir.to_json()
    return hashlib.sha256(canonical_json.encode("utf-8")).hexdigest()

Invariant:

Same program IR → same HMASTER.

HRICH

Definition:

HRICH = Hash(canonical(proof_bundle))

Computation:

def compute_HRICH(proof_bundle: PipelineProofBundle) -> str:
    # Convert to rich bundle format
    rich_bundle = {
        "primary": {
            "master": compute_HMASTER(proof_bundle.program_ir),
            "steps": [step.to_dict() for step in proof_bundle.steps],
            "branches": [branch.to_dict() for branch in proof_bundle.branches]
        },
        "H_RICH": None  # Computed below
    }

    # Derive the subproofs (DIP, DP, PEP, PoPI, CCP, CMIP, PP, TRP).
    # build_subproofs is a placeholder name: the derivation itself is not
    # shown in this section.
    subproofs = build_subproofs(rich_bundle)

    # Compute subproof hashes
    subproof_hashes = compute_subproof_hashes(subproofs)

    # Compute HRICH from subproof hashes
    return compute_HRICH_from_subproof_hashes(subproof_hashes)

Subproof Hashes:

subproof_hashes = {
    "DIP": Hash(DIP_subproof),
    "DP": Hash(DP_subproof),
    "PEP": Hash(PEP_subproof),
    "PoPI": Hash(PoPI_subproof),
    "CCP": Hash(CCP_subproof),
    "CMIP": Hash(CMIP_subproof),
    "PP": Hash(PP_subproof),
    "TRP": Hash(TRP_subproof)
}

HRICH Computation:

HRICH = SHA256(
    "|".join(sorted(subproof_hashes.values()))
)

Invariant:

Same execution trace → same HRICH.
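
The HRICH formula can be tried out with the standard library alone. In this sketch the subproof bodies are placeholder byte strings, the joined string is assumed to be UTF-8 encoded before hashing, and `hrich_from_subproof_hashes` is a hypothetical name; the real subproof contents come from the BoR tooling.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hrich_from_subproof_hashes(subproof_hashes: dict) -> str:
    """SHA-256 over the sorted subproof hashes joined with '|'."""
    joined = "|".join(sorted(subproof_hashes.values()))
    return sha256_hex(joined.encode("utf-8"))  # assumption: UTF-8 encoding


NAMES = ["DIP", "DP", "PEP", "PoPI", "CCP", "CMIP", "PP", "TRP"]

# Placeholder subproof bodies, for illustration only.
hashes = {name: sha256_hex(f"{name}-subproof".encode("utf-8")) for name in NAMES}
hrich = hrich_from_subproof_hashes(hashes)

# Dict insertion order does not matter because the values are sorted first.
reordered = dict(reversed(list(hashes.items())))
print(hrich == hrich_from_subproof_hashes(reordered))  # True

# Changing any single subproof changes HRICH, which is what exposes tampering.
tampered = dict(hashes, TRP=sha256_hex(b"tampered trace"))
print(hrich == hrich_from_subproof_hashes(tampered))  # False
```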

8. The Untouchable Core (Frozen Physics)

These components MUST NEVER BE MODIFIED: any change to them breaks the determinism guarantees.

Component	Frozen?	Why?
Canonical JSON Rules	YES	Breaks HMASTER stability
Hash Algorithms	YES	Breaks verification
TRP Structure Rules	YES	Breaks proof compatibility
Branch Decision Semantics	YES	Breaks determinism
Deterministic Data Structures	YES	Breaks execution determinism
No Non-Deterministic Iteration	YES	Breaks execution determinism
No Mutation in IR	YES	Breaks purity

Partially Frozen Components

These components can be extended but must preserve determinism:

Component	Frozen?	Why?
AST → IR Lowering	PARTIAL	Must remain deterministic
Type System	PARTIAL	Can add types, but rules must be deterministic
Executor	PARTIAL	Semantics must remain deterministic
Parser	NO	Extensions allowed (new syntax)
Resolver	NO	Extensions allowed (new symbols)

Modification Rules

Canonical JSON

NEVER CHANGE:

Key sorting algorithm
Float normalization rules
JSON encoding (UTF-8)
Whitespace rules

ALLOWED:

Adding new fields to existing structures (if canonicalized correctly)

Hash Algorithms

NEVER CHANGE:

SHA-256 algorithm
Hash computation order
Subproof hash structure

ALLOWED:

Adding new hash types (with new names)
Extending hash inputs (additive only)

TRP Structure

NEVER CHANGE:

Step record structure (v1)
Branch record structure (v1)
Record field names

ALLOWED:

Adding new record types (TRP v2)
Extending existing records with additive fields, released under a new TRP version so that v1 records stay unchanged

9. Expandable Surfaces (Safe to Extend)

Frontend Extensions

Lexer

Safe to Add:

New keywords
New operators
New literal types
New comment styles

Parser

Safe to Add:

New AST nodes
New expression forms
New statement types

Resolver

Safe to Add:

New symbol kinds
New scoping rules
New name resolution strategies

Middle-End Extensions

Type System

Safe to Add:

New primitive types
New generic types
New type constructors

Lowering

Safe to Add:

New AST → IR lowering rules
New IR node types (following IR invariants)

Backend Extensions

Executor

Safe to Add:

New execution strategies
New optimization passes
New proof recording formats

Proof System

Safe to Add:

New proof record types (TRP v2)
New subproof types
New verification strategies

Extension Guidelines

Before Adding:

Verify determinism (same input → same output)
Verify canonicalizability (can serialize to JSON)
Verify purity (no side effects)
Add tests (determinism tests required)
Update documentation

After Adding:

Run full test suite
Verify hash stability
Update golden files
Document extension
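
The "verify hash stability" and "update golden files" steps might look like the check below. The golden-file format used here (one SHA-256 hex digest per file) and the names `canonical_hash` and `check_golden` are assumptions for this sketch; the project's actual golden files may be laid out differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def canonical_hash(obj) -> str:
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_golden(obj, golden_path: Path, update: bool = False) -> bool:
    """Compare obj's canonical hash with the stored digest, or rewrite it."""
    actual = canonical_hash(obj)
    if update:
        golden_path.write_text(actual, encoding="utf-8")
        return True
    return golden_path.read_text(encoding="utf-8").strip() == actual


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "basic.sha256"
    ir = {"name": "main", "steps": [{"index": 0, "name": "inc"}]}

    print(check_golden(ir, golden, update=True))  # True: golden file written
    print(check_golden(ir, golden))               # True: hash unchanged

    ir["steps"][0]["name"] = "dec"                # simulate an unintended change
    print(check_golden(ir, golden))               # False: hash drifted
```

Keeping the update step explicit means a changed hash always fails first and is only accepted by a deliberate regeneration.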

10. Quick Start

Installation

pip install rlang-compiler

Basic Usage

Example 1: Simple Pipeline

fn inc(x: Int) -> Int;

pipeline main(Int) -> Int { inc }

Compile:

rlangc examples/simple.rlang --out out/simple.json

Example 2: Conditional Execution

fn double(x: Int) -> Int;
fn half(x: Int) -> Int;

pipeline main(Int) -> Int {
  if (__value > 10) {
    double
  } else {
    half
  }
}

Example 3: Proof Generation

from rlang.bor import run_program_with_proof, RLangBoRCrypto

source = """
fn inc(x: Int) -> Int;
pipeline main(Int) -> Int { inc }
"""

bundle = run_program_with_proof(
    source=source,
    input_value=10,
    fn_registry={"inc": lambda x: x + 1}
)

crypto = RLangBoRCrypto(bundle)
rich = crypto.to_rich_bundle()

print("HMASTER:", rich.rich["primary"]["master"])
print("HRICH:", rich.rich["H_RICH"])

Verification

# Generate proof bundle
python verify_proof_bundle.py

# Verify with BoR CLI
borp verify-bundle --bundle out/rich_proof_bundle.json

11. API Reference

Core Compiler API

from rlang import compile_source_to_ir, compile_source_to_json

# Compile to IR
result = compile_source_to_ir(
    source="fn inc(x: Int) -> Int; pipeline main(Int) -> Int { inc }",
    version="v0",
    language="rlang"
)

# Compile to JSON
json_str = compile_source_to_json(
    source="fn inc(x: Int) -> Int; pipeline main(Int) -> Int { inc }"
)

Proof Generation API

from rlang.bor import run_program_with_proof, RLangBoRCrypto

source = "fn inc(x: Int) -> Int; pipeline main(Int) -> Int { inc }"

# Generate proof bundle
bundle = run_program_with_proof(
    source=source,
    input_value=10,
    fn_registry={"inc": lambda x: x + 1}
)

# Convert to rich bundle
crypto = RLangBoRCrypto(bundle)
rich_bundle = crypto.to_rich_bundle()

CLI Usage

# Compile to stdout
rlangc program.rlang

# Compile to file
rlangc program.rlang --out output.json

# Specify entry pipeline
rlangc program.rlang --entry main --out output.json

12. Testing & Verification

Test Suite

The compiler includes 190+ tests covering:

Lexer (tokenization, comments, floats)
Parser (AST construction, operator precedence)
Type Checker (type inference, type aliases, control flow)
IR (lowering, primary IR construction)
Emitter (end-to-end compilation)
CLI (command-line interface)
BoR Integration (proof generation, crypto hashing, CLI compatibility)
Determinism (SHA256 comparison, tamper detection)

Running Tests

# Run all tests
pytest -q --disable-warnings

# Run specific test file
pytest tests/test_parser.py -v

# Run with coverage
pytest --cov=rlang

Determinism Verification

# Run deterministic test suite
./next_tests.sh

# Verify proof bundles
./verify_bundle.sh

Release Audit

# Run comprehensive release audit
./scripts/run_release_audit.sh

The audit checks:

Environment reset
Static code consistency
Full test suite
Determinism tests
Golden file verification
Canonical JSON boundary audit
IR shape stability
TRP audit
Hash boundary tests
CLI verification
Packaging readiness

13. Extension Guidelines

For detailed extension guidelines, see docs/compiler_expansion_playbook.md.

Quick Checklist

When adding a new feature:

Update grammar in docs/compiler_physics.md
Add lexer tokens
Add parser AST nodes
Add resolver logic
Add type checking rules
Add IR node (if needed)
Add lowering rules
Add execution logic
Verify canonicalization
Add proof recording (if needed)
Add comprehensive tests
Update golden files
Update documentation

Test Matrix

For each new construct:

Parser tests (basic, nested, empty, invalid, edge cases)
Typechecker tests (valid, invalid, inference, nested, edge cases)
Lowering tests (basic, nested, deterministic, edge cases)
IR tests (structure, canonical, deterministic, edge cases)
Executor tests (basic, proof, deterministic, edge cases)
Determinism tests (IR, H_IR, TRP, HRICH, cross-platform)
Canonical JSON tests (sorted keys, float normalization, stable representation)
Proof stability tests (branching, loops, collections, pattern matching)
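
The canonical JSON row of the matrix might translate into pytest-style tests such as the ones sketched below. The `canonical_json` function here is an assumed approximation built on the standard library, not the compiler's own canonicalizer. Float normalization is left out because its exact rule is defined in docs/compiler_physics.md rather than in this section.

```python
import json


def canonical_json(obj) -> str:
    """Assumed canonical form for illustration: sorted keys, compact separators."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def test_sorted_keys():
    # Insertion order of dict keys must not affect the output.
    assert canonical_json({"b": 1, "a": 2}) == canonical_json({"a": 2, "b": 1})


def test_nested_keys_sorted():
    # Key sorting applies recursively to nested objects.
    assert canonical_json({"x": {"z": 1, "y": 2}}) == '{"x":{"y":2,"z":1}}'


def test_stable_representation():
    # Serializing, parsing, and serializing again yields identical text.
    text = canonical_json({"steps": [1, 2, 3], "entry": "main"})
    assert canonical_json(json.loads(text)) == text
```

Saved to a file under tests/, these functions would be collected by the `pytest` commands shown earlier.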

References

Formal Specification: docs/compiler_physics.md — Complete deterministic execution & proof architecture specification
Extension Playbook: docs/compiler_expansion_playbook.md — Implementation checklists, test matrices & modularization guide
Language Specification: docs/language.md — RLang language syntax and semantics
Proof System: docs/proof-system.md — BoR proof system integration

Status

Compiler: Fully functional (190+ tests passing)
Control Flow: Deterministic if/else in pipelines with type-checked branches
Proof Generation: Complete and deterministic, including branch-aware TRP subproofs
BoR Integration: Verified with borp verify-bundle
Determinism: Bit-for-bit reproducible including branch traces
Security: Tamper detection working for both steps and branches
Version: 0.2.2 (published to PyPI)

License: MIT License
Author: Kushagra Bhatnagar
Last Updated: November 2025
