Skip to main content

Standalone XAML workflow parser for automation projects (CPRIMA Forge)

Project description

XAML Parser - Python Implementation

Python implementation of the XAML workflow parser for automation projects.

Installation

From PyPI (when published)

pip install cpmf-uips-xaml

For Development

# Clone the repository
git clone https://github.com/rpapub/cpmf-uips-xaml.git
cd cpmf-uips-xaml/python

# Install with uv (recommended)
uv sync

# Or with pip in editable mode
pip install -e .

Quick Start

Python API

from pathlib import Path
from cpmf_uips_xaml import XamlParser

# Parse a workflow file
parser = XamlParser()
result = parser.parse_file(Path("workflow.xaml"))

if result.success:
    content = result.content
    print(f"Workflow: {content.root_annotation}")
    print(f"Arguments: {len(content.arguments)}")
    print(f"Activities: {len(content.activities)}")

    # Access arguments
    for arg in content.arguments:
        print(f"  {arg.direction} {arg.name}: {arg.type}")
        if arg.annotation:
            print(f"    -> {arg.annotation}")

    # Access activities with annotations
    for activity in content.activities:
        if activity.annotation:
            print(f"{activity.activity_type}: {activity.annotation}")
else:
    print("Parsing failed:", result.errors)

Command Line Interface

Project Parsing (Primary Mode):

# Parse entire project from project.json
cpmf-uips-xaml project.json
cpmf-uips-xaml /path/to/project.json
cpmf-uips-xaml /path/to/project        # Directory containing project.json

# Show workflow dependency graph
cpmf-uips-xaml project.json --graph

# Parse only entry points (no recursive discovery)
cpmf-uips-xaml project.json --entry-points-only

# Save to file
cpmf-uips-xaml project.json --json -o output.json

Individual Workflow Files:

# Parse single workflow
cpmf-uips-xaml Main.xaml

# JSON output
cpmf-uips-xaml Main.xaml --json

# List only arguments
cpmf-uips-xaml Main.xaml --arguments

# Show activity tree
cpmf-uips-xaml Main.xaml --tree

# Process multiple files
cpmf-uips-xaml *.xaml --summary

# Recursive search
cpmf-uips-xaml **/*.xaml --summary

Using with uv (development):

uv run cpmf-uips-xaml project.json
uv run cpmf-uips-xaml workflow.xaml

CLI Options:

All modes:

  • --json - Output as JSON
  • -o FILE - Write output to file
  • --no-expressions - Skip expression extraction (faster)
  • --strict - Fail on any error
  • --help - Show all options

Project mode:

  • --graph - Show workflow dependency graph
  • --entry-points-only - Parse only entry points (no recursive discovery)

File mode:

  • --arguments - Show only arguments
  • --activities - Show only activities
  • --tree - Show activity tree with nesting
  • --summary - Summary for multiple files

Python API for Projects:

from pathlib import Path
from cpmf_uips_xaml import ProjectParser

# Parse entire project
parser = ProjectParser()
result = parser.parse_project(Path("path/to/project"))

if result.success:
    print(f"Project: {result.project_config.name}")
    print(f"Workflows: {result.total_workflows}")

    # Access entry points
    for workflow in result.get_entry_points():
        print(f"Entry: {workflow.relative_path}")

    # Access dependency graph
    for workflow_path, dependencies in result.dependency_graph.items():
        print(f"{workflow_path} invokes:")
        for dep in dependencies:
            print(f"  -> {dep}")
else:
    print("Project parsing failed:", result.errors)

How it works:

  1. Reads project.json to find entry points
  2. Parses entry point workflows
  3. Recursively discovers workflows via InvokeWorkflowFile activities
  4. Builds complete dependency graph
  5. Returns all workflows with parse results

Features

  • Minimal Dependencies: Single required dependency (defusedxml for secure XML parsing)
  • Complete Extraction: Arguments, variables, activities, expressions, annotations
  • Project Parsing: Auto-discover and parse entire UiPath projects with dependency analysis
  • Type Safety: Full type hints for all APIs
  • Error Handling: Graceful degradation with detailed error reporting
  • Schema Validation: Output validates against JSON schemas
  • Performance: Fast parsing even for large workflows
  • CLI Tool: Full-featured command-line interface for batch processing

Configuration

config = {
    'extract_expressions': True,
    'extract_viewstate': False,
    'strict_mode': False,
    'max_depth': 50
}

parser = XamlParser(config)
result = parser.parse_file(file_path)

API Reference

XamlParser

Main workflow parser class:

parser = XamlParser(config=None)
result = parser.parse_file(Path("workflow.xaml"))
result = parser.parse_content(xaml_string)

ProjectParser

Project-level parser class:

parser = ProjectParser(config=None)
result = parser.parse_project(
    project_dir=Path("path/to/project"),
    recursive=True,              # Follow InvokeWorkflowFile references
    entry_points_only=False      # Only parse entry points
)

Models

Data models for parsed content:

Workflow Models:

  • ParseResult: Top-level result with success/error info
  • WorkflowContent: Complete workflow metadata
  • WorkflowArgument: Argument definition
  • WorkflowVariable: Variable definition
  • Activity: Activity with full metadata
  • Expression: Expression with language detection

Project Models:

  • ProjectResult: Complete project parsing result
  • ProjectConfig: Parsed project.json configuration
  • WorkflowResult: Individual workflow result in project context

Validation

Schema-based validation:

from cpmf_uips_xaml.validation import validate_output

errors = validate_output(result)
if errors:
    print("Validation failed:", errors)

Library API (v0.3.0+)

Starting in v0.3.0, the package provides a stable orchestration API that coordinates parsing, analysis, and output generation. This API is the recommended way to integrate XAML parsing into libraries and tools.

Architecture

The package follows a layered architecture:

Your Application
      ↓
API Layer (orchestration) ← You are here
      ↓
Core, UiPS, Emitters, Views (internal implementation)

The API layer provides stable entry points while internal implementation details may change between versions.

Core API Functions

parse_and_analyze_project()

Parse a project and build queryable index in one step:

from pathlib import Path
from cpmf_uips_xaml.api import parse_and_analyze_project

# Parse project and build complete analysis
project_result, analyzer, index = parse_and_analyze_project(
    Path("./MyProject"),
    recursive=True,              # Follow InvokeWorkflowFile references
    entry_points_only=False,     # Parse all workflows, not just entry points
    show_progress=False          # Show progress bars
)

# Access project info
if project_result.project_config:
    print(f"Project: {project_result.project_config.name}")
    print(f"Main workflow: {project_result.project_config.main}")

# Query workflows
workflow_ids = index.list_workflows()
print(f"Total workflows: {len(workflow_ids)}")

# Traverse call graph
for workflow_id in index.list_workflows():
    callees = index.get_callees(workflow_id)
    if callees:
        print(f"{workflow_id} calls: {callees}")

render_project_view()

Transform analysis results into different view formats:

from cpmf_uips_xaml.api import parse_and_analyze_project, render_project_view

# Parse and analyze
project_result, analyzer, index = parse_and_analyze_project(Path("./MyProject"))

# Render nested view (hierarchical structure)
nested = render_project_view(
    analyzer, index,
    view_type="nested"
)

# Render execution view (call graph traversal from entry point)
execution = render_project_view(
    analyzer, index,
    view_type="execution",
    entry_point="Main.xaml",
    max_depth=10
)

# Render slice view (context window around focal activity)
slice_view = render_project_view(
    analyzer, index,
    view_type="slice",
    focus="LogMessage_abc123",
    radius=2
)

emit_workflows()

Output workflows in different formats:

from pathlib import Path
from cpmf_uips_xaml.api import parse_and_analyze_project, emit_workflows

# Parse project
project_result, analyzer, index = parse_and_analyze_project(Path("./MyProject"))

# Get workflow DTOs from analyzer
workflows = list(analyzer.workflows.values())

# Emit as JSON
result = emit_workflows(
    workflows,
    format="json",
    output_path=Path("output.json"),
    pretty=True,
    exclude_none=True
)

if result.success:
    print(f"Written {len(result.files_written)} files")
else:
    print(f"Errors: {result.errors}")

# Emit as Mermaid diagram
emit_workflows(
    workflows,
    format="mermaid",
    output_path=Path("output.mmd")
)

# Emit as Markdown documentation
emit_workflows(
    workflows,
    format="doc",
    output_path=Path("output.md")
)

Available formats: json, mermaid, doc

normalize_parse_results()

Convert raw ParseResult objects to structured WorkflowDto objects:

from pathlib import Path
from cpmf_uips_xaml import XamlParser
from cpmf_uips_xaml.api import normalize_parse_results

# Parse files
parser = XamlParser()
parse_results = [
    parser.parse_file(Path("Main.xaml")),
    parser.parse_file(Path("GetConfig.xaml"))
]

# Normalize to DTOs
workflows = normalize_parse_results(
    parse_results,
    project_dir=Path("./MyProject"),
    sort_output=True,
    calculate_metrics=True,
    detect_anti_patterns=True
)

# Now you have structured DTOs ready for emission or analysis
for workflow in workflows:
    print(f"Workflow: {workflow.name}")
    print(f"  Activities: {len(workflow.activities)}")
    print(f"  Arguments: {len(workflow.arguments)}")

parse_file_to_dto()

Single-file parsing with DTO normalization:

from pathlib import Path
from cpmf_uips_xaml.api import parse_file_to_dto

# Parse and normalize in one call
workflow = parse_file_to_dto(
    Path("Main.xaml"),
    project_dir=Path("./MyProject")
)

print(f"Workflow: {workflow.name}")
print(f"Activities: {len(workflow.activities)}")

Configuration Helpers

from cpmf_uips_xaml.api import load_default_config, create_emitter_config

# Load default parser config
config = load_default_config()
print(config)  # Shows default settings

# Create emitter config with overrides
emitter_config = create_emitter_config(
    pretty=True,
    exclude_none=True,
    field_profile="minimal"
)

Complete Example: Project Analysis Pipeline

from pathlib import Path
from cpmf_uips_xaml.api import (
    parse_and_analyze_project,
    render_project_view,
    emit_workflows
)

# 1. Parse and analyze entire project
project_result, analyzer, index = parse_and_analyze_project(
    Path("./MyProject"),
    recursive=True,
    show_progress=True
)

# 2. Generate execution view from main entry point
execution_view = render_project_view(
    analyzer, index,
    view_type="execution",
    entry_point="Main.xaml",
    max_depth=15
)

# 3. Export workflows as JSON
workflows = list(analyzer.workflows.values())
emit_result = emit_workflows(
    workflows,
    format="json",
    output_path=Path("output.json"),
    pretty=True
)

print(f"Analyzed {len(workflows)} workflows")
print(f"Exported to {emit_result.files_written}")

Migration from v0.2.x

If you were using internal APIs that are no longer exported, use direct imports:

# ❌ v0.2.x - No longer works
from cpmf_uips_xaml import XmlUtils, ActivityExtractor

# ✅ v0.3.0+ - Use direct imports if needed
from cpmf_uips_xaml.core.utils import XmlUtils
from cpmf_uips_xaml.core.extractors import ActivityExtractor

# ✅ v0.3.0+ - Or better, use the API layer
from cpmf_uips_xaml.api import parse_and_analyze_project

Recommended approach: Use the API layer functions instead of reaching into internal modules. The API provides stable contracts while internals may change.

Data Models (DTOs)

The API works with strongly-typed DTO models for all data exchange:

Workflow DTOs:

  • WorkflowDto - Complete workflow with metadata, activities, edges
  • WorkflowCollectionDto - Multiple workflows with project context
  • ActivityDto - Activity with arguments and properties
  • ArgumentDto - Workflow or activity argument
  • VariableDto - Workflow variable
  • EdgeDto - Control flow edge between activities

Project DTOs:

  • ProjectInfo - Project metadata (name, version, dependencies)
  • EntryPointInfo - Entry point definition
  • ProvenanceInfo - Parser version and author tracking

Analysis DTOs:

  • QualityMetrics - Workflow quality scores
  • AntiPattern - Detected anti-patterns
  • IssueDto - Parse errors or warnings

All DTOs are immutable dataclasses with full type hints.

Development

Running Tests

# Run all tests
uv run pytest tests/ -v

# Run with coverage
uv run pytest tests/ --cov=xaml_parser --cov-report=html

# Run specific test file
uv run pytest tests/test_parser.py -v

# Run corpus tests only
uv run pytest tests/test_corpus.py -v -m corpus

Code Quality

# Format code
uv run black xaml_parser/ tests/

# Sort imports
uv run isort xaml_parser/ tests/

# Lint
uv run ruff check xaml_parser/ tests/

# Type check
uv run mypy xaml_parser/

Building

# Build distribution
uv build

# Check package
twine check dist/*

Project Structure

python/
├── xaml_parser/          # Source package
│   ├── __init__.py       # Public API
│   ├── __version__.py    # Version info
│   ├── parser.py         # Main workflow parser
│   ├── project.py        # Project parser (NEW)
│   ├── cli.py            # Command-line interface
│   ├── models.py         # Data models
│   ├── extractors.py     # Extraction logic
│   ├── utils.py          # Utilities
│   ├── validation.py     # Schema validation
│   ├── visibility.py     # ViewState handling
│   └── constants.py      # Configuration
├── tests/                # Test suite
│   ├── conftest.py       # Pytest fixtures
│   ├── test_parser.py    # Parser tests
│   ├── test_project.py   # Project parser tests (NEW)
│   ├── test_corpus.py    # Corpus tests
│   └── test_validation.py
├── pyproject.toml        # Package configuration
├── uv.lock               # Dependency lock
└── README.md             # This file

Requirements

  • Python 3.11+
  • defusedxml (for secure XML parsing)
  • pytest (for development)

Testing Philosophy

Tests reference shared test data in ../testdata/:

  • ../testdata/golden/: Golden freeze test pairs (XAML + JSON)
  • ../testdata/corpus/: Structured test projects

This ensures consistency across language implementations.

Contributing

See the main repository CONTRIBUTING.md for guidelines.

License

This project is dual-licensed:

You may choose which license applies to your use case.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cpmf_uips_xaml-0.1.1.tar.gz (257.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cpmf_uips_xaml-0.1.1-py3-none-any.whl (198.7 kB view details)

Uploaded Python 3

File details

Details for the file cpmf_uips_xaml-0.1.1.tar.gz.

File metadata

  • Download URL: cpmf_uips_xaml-0.1.1.tar.gz
  • Upload date:
  • Size: 257.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for cpmf_uips_xaml-0.1.1.tar.gz
Algorithm Hash digest
SHA256 78e77d2c7395edc9e1dd32c6dceec416d73e5a9988c9cb3208c87f3edc798435
MD5 ae742bf2d035e052aa57eb34644867d0
BLAKE2b-256 5f14efca84d998d9dcaee39f8bde1cf617a48422a407433fe1fe375f43f215c3

See more details on using hashes here.

File details

Details for the file cpmf_uips_xaml-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: cpmf_uips_xaml-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 198.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for cpmf_uips_xaml-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 8aaeebef7ce00dd8864899dbbdb92fdaf0059339ed99b92a2e1e184f141f1542
MD5 454020fe17924dce12ed74b525a4aac1
BLAKE2b-256 ac58725462cbae3e3ef1dddb01280c400ec798a0244973fa9f6880e1863ceb3b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page