Standalone XAML workflow parser for automation projects (CPRIMA Forge)
XAML Parser - Python Implementation
Python implementation of the XAML workflow parser for automation projects.
Installation
From PyPI (when published)
pip install cpmf-uips-xaml
For Development
# Clone the repository
git clone https://github.com/rpapub/cpmf-uips-xaml.git
cd cpmf-uips-xaml/python
# Install with uv (recommended)
uv sync
# Or with pip in editable mode
pip install -e .
Quick Start
Python API
from pathlib import Path
from cpmf_uips_xaml import XamlParser
# Parse a workflow file
parser = XamlParser()
result = parser.parse_file(Path("workflow.xaml"))
if result.success:
    content = result.content
    print(f"Workflow: {content.root_annotation}")
    print(f"Arguments: {len(content.arguments)}")
    print(f"Activities: {len(content.activities)}")
    # Access arguments
    for arg in content.arguments:
        print(f" {arg.direction} {arg.name}: {arg.type}")
        if arg.annotation:
            print(f" -> {arg.annotation}")
    # Access activities with annotations
    for activity in content.activities:
        if activity.annotation:
            print(f"{activity.activity_type}: {activity.annotation}")
else:
    print("Parsing failed:", result.errors)
Command Line Interface
Project Parsing (Primary Mode):
# Parse entire project from project.json
cpmf-uips-xaml project.json
cpmf-uips-xaml /path/to/project.json
cpmf-uips-xaml /path/to/project # Directory containing project.json
# Show workflow dependency graph
cpmf-uips-xaml project.json --graph
# Parse only entry points (no recursive discovery)
cpmf-uips-xaml project.json --entry-points-only
# Save to file
cpmf-uips-xaml project.json --json -o output.json
Individual Workflow Files:
# Parse single workflow
cpmf-uips-xaml Main.xaml
# JSON output
cpmf-uips-xaml Main.xaml --json
# List only arguments
cpmf-uips-xaml Main.xaml --arguments
# Show activity tree
cpmf-uips-xaml Main.xaml --tree
# Process multiple files
cpmf-uips-xaml *.xaml --summary
# Recursive search
cpmf-uips-xaml **/*.xaml --summary
Using with uv (development):
uv run cpmf-uips-xaml project.json
uv run cpmf-uips-xaml workflow.xaml
CLI Options:
All modes:
- --json: Output as JSON
- -o FILE: Write output to file
- --no-expressions: Skip expression extraction (faster)
- --strict: Fail on any error
- --help: Show all options
Project mode:
- --graph: Show workflow dependency graph
- --entry-points-only: Parse only entry points (no recursive discovery)
File mode:
- --arguments: Show only arguments
- --activities: Show only activities
- --tree: Show activity tree with nesting
- --summary: Summary for multiple files
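When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only flags listed above (--json, -o FILE, --strict); the helper name build_command is hypothetical, and the command is run only if the cpmf-uips-xaml executable is found on PATH. Nothing is assumed here about the tool's exit codes or output format.

```python
import shutil
import subprocess


def build_command(target, as_json=False, output=None, strict=False):
    """Assemble an argument list using only flags documented in this README."""
    command = ["cpmf-uips-xaml", str(target)]
    if as_json:
        command.append("--json")
    if output is not None:
        command += ["-o", str(output)]
    if strict:
        command.append("--strict")
    return command


command = build_command("project.json", as_json=True, output="output.json")

# Only run the tool when it is actually installed and on PATH.
if shutil.which(command[0]) is not None:
    completed = subprocess.run(command, capture_output=True, text=True)
    print("exit code:", completed.returncode)
```

Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.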
Python API for Projects:
from pathlib import Path
from cpmf_uips_xaml import ProjectParser
# Parse entire project
parser = ProjectParser()
result = parser.parse_project(Path("path/to/project"))
if result.success:
    print(f"Project: {result.project_config.name}")
    print(f"Workflows: {result.total_workflows}")
    # Access entry points
    for workflow in result.get_entry_points():
        print(f"Entry: {workflow.relative_path}")
    # Access dependency graph
    for workflow_path, dependencies in result.dependency_graph.items():
        print(f"{workflow_path} invokes:")
        for dep in dependencies:
            print(f" -> {dep}")
else:
    print("Project parsing failed:", result.errors)
How it works:
- Reads project.json to find entry points
- Parses entry point workflows
- Recursively discovers workflows via InvokeWorkflowFile activities
- Builds complete dependency graph
- Returns all workflows with parse results
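The discovery steps above can be sketched with the standard library. This is an illustration of the idea, not the package's implementation: it uses xml.etree.ElementTree on in-memory strings instead of defusedxml, and the names UI_NS, make_xaml, WORKFLOWS, find_invoked, and build_dependency_graph are hypothetical. It assumes that InvokeWorkflowFile elements live in the UiPath activities namespace and carry a WorkflowFileName attribute.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Assumed namespace URI for UiPath activities.
UI_NS = "http://schemas.uipath.com/workflow/activities"


def make_xaml(*invoked):
    """Build a tiny stand-in workflow that invokes the given files."""
    calls = "".join(
        f'<ui:InvokeWorkflowFile WorkflowFileName="{name}" />' for name in invoked
    )
    return f'<Activity xmlns:ui="{UI_NS}">{calls}</Activity>'


# Hypothetical in-memory project: file name -> XAML text.
WORKFLOWS = {
    "Main.xaml": make_xaml("GetConfig.xaml", "Process.xaml"),
    "GetConfig.xaml": make_xaml(),
    "Process.xaml": make_xaml("GetConfig.xaml", "Missing.xaml"),
}


def find_invoked(xaml_text):
    """Return the file names referenced by InvokeWorkflowFile elements."""
    root = ET.fromstring(xaml_text)
    tag = f"{{{UI_NS}}}InvokeWorkflowFile"
    return [
        el.attrib["WorkflowFileName"]
        for el in root.iter(tag)
        if "WorkflowFileName" in el.attrib
    ]


def build_dependency_graph(entry_points, workflows):
    """Breadth-first discovery starting from the entry points."""
    graph = {}
    queue = deque(entry_points)
    while queue:
        name = queue.popleft()
        # Skip workflows already parsed and references to missing files.
        if name in graph or name not in workflows:
            continue
        graph[name] = find_invoked(workflows[name])
        queue.extend(graph[name])
    return graph


graph = build_dependency_graph(["Main.xaml"], WORKFLOWS)
for caller, callees in graph.items():
    print(caller, "->", callees)
```

Tracking already-visited names in the graph keeps the walk finite even when workflows invoke each other in a cycle.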
Features
- Minimal Dependencies: Single required dependency (defusedxml for secure XML parsing)
- Complete Extraction: Arguments, variables, activities, expressions, annotations
- Project Parsing: Auto-discover and parse entire UiPath projects with dependency analysis
- Type Safety: Full type hints for all APIs
- Error Handling: Graceful degradation with detailed error reporting
- Schema Validation: Output validates against JSON schemas
- Performance: Fast parsing even for large workflows
- CLI Tool: Full-featured command-line interface for batch processing
Configuration
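from pathlib import Path
from cpmf_uips_xaml import XamlParser
# Pass a config dict to override parser settings
config = {
    'extract_expressions': True,
    'extract_viewstate': False,
    'strict_mode': False,
    'max_depth': 50
}
parser = XamlParser(config)
result = parser.parse_file(Path("workflow.xaml"))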
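Because the config is a plain dict, a misspelled key is easy to miss. The caller-side helper below is a sketch of one way to guard against that before the dict reaches the parser. The name build_config is hypothetical; treating the four keys from the example as the complete set, and their example values as defaults, is an assumption that this README does not confirm.

```python
# Keys copied from the configuration example; assuming this set is complete
# and that these values are sensible defaults is part of the sketch.
KNOWN_DEFAULTS = {
    "extract_expressions": True,
    "extract_viewstate": False,
    "strict_mode": False,
    "max_depth": 50,
}


def build_config(**overrides):
    """Return a full config dict, rejecting keys that are not known."""
    unknown = sorted(set(overrides) - set(KNOWN_DEFAULTS))
    if unknown:
        raise KeyError(f"Unknown config keys: {unknown}")
    return {**KNOWN_DEFAULTS, **overrides}


config = build_config(extract_expressions=False, max_depth=20)
print(config)
```

The resulting dict has the same shape as the example above and could be passed to XamlParser in the same way.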
API Reference
XamlParser
Main workflow parser class:
parser = XamlParser(config=None)
result = parser.parse_file(Path("workflow.xaml"))
result = parser.parse_content(xaml_string)
ProjectParser
Project-level parser class:
parser = ProjectParser(config=None)
result = parser.parse_project(
    project_dir=Path("path/to/project"),
    recursive=True,  # Follow InvokeWorkflowFile references
    entry_points_only=False  # Only parse entry points
)
Models
Data models for parsed content:
Workflow Models:
- ParseResult: Top-level result with success/error info
- WorkflowContent: Complete workflow metadata
- WorkflowArgument: Argument definition
- WorkflowVariable: Variable definition
- Activity: Activity with full metadata
- Expression: Expression with language detection
Project Models:
- ProjectResult: Complete project parsing result
- ProjectConfig: Parsed project.json configuration
- WorkflowResult: Individual workflow result in project context
Validation
Schema-based validation:
from cpmf_uips_xaml.validation import validate_output
errors = validate_output(result)
if errors:
print("Validation failed:", errors)
Library API (v0.3.0+)
Starting in v0.3.0, the package provides a stable orchestration API that coordinates parsing, analysis, and output generation. This API is the recommended way to integrate XAML parsing into libraries and tools.
Architecture
The package follows a layered architecture:
Your Application
↓
API Layer (orchestration) ← You are here
↓
Core, UiPS, Emitters, Views (internal implementation)
The API layer provides stable entry points while internal implementation details may change between versions.
Core API Functions
parse_and_analyze_project()
Parse a project and build a queryable index in one step:
from pathlib import Path
from cpmf_uips_xaml.api import parse_and_analyze_project
# Parse project and build complete analysis
project_result, analyzer, index = parse_and_analyze_project(
    Path("./MyProject"),
    recursive=True,  # Follow InvokeWorkflowFile references
    entry_points_only=False,  # Parse all workflows, not just entry points
    show_progress=False  # Show progress bars
)
# Access project info
if project_result.project_config:
    print(f"Project: {project_result.project_config.name}")
    print(f"Main workflow: {project_result.project_config.main}")
# Query workflows
workflow_ids = index.list_workflows()
print(f"Total workflows: {len(workflow_ids)}")
# Traverse call graph
for workflow_id in index.list_workflows():
    callees = index.get_callees(workflow_id)
    if callees:
        print(f"{workflow_id} calls: {callees}")
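To make the call-graph queries concrete without a real project, the sketch below uses a small stand-in index. TinyCallIndex is hypothetical and mirrors only the two methods used above (list_workflows and get_callees); callers_of is a helper written for this example, not part of the package. It shows how a reverse lookup ("who calls this workflow?") can be derived from those two queries alone.

```python
class TinyCallIndex:
    """Hypothetical stand-in exposing the two queries used in the example."""

    def __init__(self, graph):
        self._graph = graph

    def list_workflows(self):
        return sorted(self._graph)

    def get_callees(self, workflow_id):
        return list(self._graph.get(workflow_id, []))


def callers_of(index, target):
    """Reverse lookup built only on list_workflows() and get_callees()."""
    return [
        workflow_id
        for workflow_id in index.list_workflows()
        if target in index.get_callees(workflow_id)
    ]


index = TinyCallIndex(
    {
        "Main.xaml": ["GetConfig.xaml", "Process.xaml"],
        "Process.xaml": ["GetConfig.xaml"],
        "GetConfig.xaml": [],
    }
)
print(callers_of(index, "GetConfig.xaml"))
```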
render_project_view()
Transform analysis results into different view formats:
from pathlib import Path
from cpmf_uips_xaml.api import parse_and_analyze_project, render_project_view
# Parse and analyze
project_result, analyzer, index = parse_and_analyze_project(Path("./MyProject"))
# Render nested view (hierarchical structure)
nested = render_project_view(
    analyzer, index,
    view_type="nested"
)
# Render execution view (call graph traversal from entry point)
execution = render_project_view(
    analyzer, index,
    view_type="execution",
    entry_point="Main.xaml",
    max_depth=10
)
# Render slice view (context window around focal activity)
slice_view = render_project_view(
    analyzer, index,
    view_type="slice",
    focus="LogMessage_abc123",
    radius=2
)
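The execution and slice views can be pictured with two small functions. This is a sketch of one plausible reading of the parameters, not the package's rendering logic: execution_order walks a call graph depth-first from an entry point and stops at max_depth, and slice_window returns the items within radius positions of a focal item in a flat list. Both function names, the sample graph, and the sample activity IDs are hypothetical; the real slice view may operate on a tree rather than a list.

```python
def execution_order(graph, entry_point, max_depth):
    """Depth-first walk returning (depth, workflow) pairs, cycle-safe."""
    visited = []

    def walk(node, depth, stack):
        visited.append((depth, node))
        if depth >= max_depth:
            return
        for callee in graph.get(node, []):
            if callee in stack:  # already on the current call path
                continue
            walk(callee, depth + 1, stack | {callee})

    walk(entry_point, 0, frozenset({entry_point}))
    return visited


def slice_window(items, focus, radius):
    """Return the focal item plus up to `radius` neighbours on each side."""
    position = items.index(focus)
    return items[max(0, position - radius): position + radius + 1]


call_graph = {
    "Main.xaml": ["GetConfig.xaml", "Process.xaml"],
    "Process.xaml": ["GetConfig.xaml", "Main.xaml"],  # calls back into Main
    "GetConfig.xaml": [],
}
full_walk = execution_order(call_graph, "Main.xaml", max_depth=10)
shallow_walk = execution_order(call_graph, "Main.xaml", max_depth=1)

activities = ["Assign_1", "If_2", "LogMessage_abc123", "Invoke_4", "Assign_5"]
window = slice_window(activities, "LogMessage_abc123", radius=1)
```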
emit_workflows()
Output workflows in different formats:
from pathlib import Path
from cpmf_uips_xaml.api import parse_and_analyze_project, emit_workflows
# Parse project
project_result, analyzer, index = parse_and_analyze_project(Path("./MyProject"))
# Get workflow DTOs from analyzer
workflows = list(analyzer.workflows.values())
# Emit as JSON
result = emit_workflows(
    workflows,
    format="json",
    output_path=Path("output.json"),
    pretty=True,
    exclude_none=True
)
if result.success:
    print(f"Written {len(result.files_written)} files")
else:
    print(f"Errors: {result.errors}")
# Emit as Mermaid diagram
emit_workflows(
    workflows,
    format="mermaid",
    output_path=Path("output.mmd")
)
# Emit as Markdown documentation
emit_workflows(
    workflows,
    format="doc",
    output_path=Path("output.md")
)
Available formats: json, mermaid, doc
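For readers unfamiliar with Mermaid, the sketch below shows what turning a call graph into flowchart text can look like. It is not the package's emitter and makes no claim about the layout or level of detail that format="mermaid" produces; to_mermaid and the sample graph are hypothetical.

```python
def to_mermaid(graph):
    """Render a {caller: [callees]} mapping as Mermaid flowchart text."""
    # Short stable node IDs, since file names contain dots.
    ids = {name: f"n{i}" for i, name in enumerate(sorted(graph))}
    lines = ["flowchart TD"]
    for name in sorted(graph):
        lines.append(f'    {ids[name]}["{name}"]')
    for caller in sorted(graph):
        for callee in graph[caller]:
            if callee in ids:  # ignore references to unknown workflows
                lines.append(f"    {ids[caller]} --> {ids[callee]}")
    return "\n".join(lines)


diagram = to_mermaid({"Main.xaml": ["GetConfig.xaml"], "GetConfig.xaml": []})
print(diagram)
```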
normalize_parse_results()
Convert raw ParseResult objects to structured WorkflowDto objects:
from pathlib import Path
from cpmf_uips_xaml import XamlParser
from cpmf_uips_xaml.api import normalize_parse_results
# Parse files
parser = XamlParser()
parse_results = [
    parser.parse_file(Path("Main.xaml")),
    parser.parse_file(Path("GetConfig.xaml"))
]
# Normalize to DTOs
workflows = normalize_parse_results(
    parse_results,
    project_dir=Path("./MyProject"),
    sort_output=True,
    calculate_metrics=True,
    detect_anti_patterns=True
)
# Now you have structured DTOs ready for emission or analysis
for workflow in workflows:
    print(f"Workflow: {workflow.name}")
    print(f" Activities: {len(workflow.activities)}")
    print(f" Arguments: {len(workflow.arguments)}")
parse_file_to_dto()
Single-file parsing with DTO normalization:
from pathlib import Path
from cpmf_uips_xaml.api import parse_file_to_dto
# Parse and normalize in one call
workflow = parse_file_to_dto(
    Path("Main.xaml"),
    project_dir=Path("./MyProject")
)
print(f"Workflow: {workflow.name}")
print(f"Activities: {len(workflow.activities)}")
Configuration Helpers
from cpmf_uips_xaml.api import load_default_config, create_emitter_config
# Load default parser config
config = load_default_config()
print(config) # Shows default settings
# Create emitter config with overrides
emitter_config = create_emitter_config(
    pretty=True,
    exclude_none=True,
    field_profile="minimal"
)
Complete Example: Project Analysis Pipeline
from pathlib import Path
from cpmf_uips_xaml.api import (
    parse_and_analyze_project,
    render_project_view,
    emit_workflows
)
# 1. Parse and analyze entire project
project_result, analyzer, index = parse_and_analyze_project(
    Path("./MyProject"),
    recursive=True,
    show_progress=True
)
# 2. Generate execution view from main entry point
execution_view = render_project_view(
    analyzer, index,
    view_type="execution",
    entry_point="Main.xaml",
    max_depth=15
)
# 3. Export workflows as JSON
workflows = list(analyzer.workflows.values())
emit_result = emit_workflows(
    workflows,
    format="json",
    output_path=Path("output.json"),
    pretty=True
)
print(f"Analyzed {len(workflows)} workflows")
print(f"Exported to {emit_result.files_written}")
Migration from v0.2.x
If you were using internal APIs that are no longer exported, use direct imports:
# ❌ v0.2.x - No longer works
from cpmf_uips_xaml import XmlUtils, ActivityExtractor
# ✅ v0.3.0+ - Use direct imports if needed
from cpmf_uips_xaml.core.utils import XmlUtils
from cpmf_uips_xaml.core.extractors import ActivityExtractor
# ✅ v0.3.0+ - Or better, use the API layer
from cpmf_uips_xaml.api import parse_and_analyze_project
Recommended approach: Use the API layer functions instead of reaching into internal modules. The API provides stable contracts while internals may change.
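Code that has to run against more than one release can probe for the API layer at import time. The sketch below is a common Python pattern rather than something this package prescribes; the flag name HAS_API_LAYER is hypothetical. It also runs when the package is not installed at all, in which case the flag is simply False.

```python
try:
    # Import path taken from the migration example above.
    from cpmf_uips_xaml.api import parse_and_analyze_project
    HAS_API_LAYER = True
except ImportError:
    # Older release or package not installed: callers must check the flag.
    parse_and_analyze_project = None
    HAS_API_LAYER = False

if HAS_API_LAYER:
    print("API layer available")
else:
    print("API layer not available; fall back or ask the user to upgrade")
```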
Data Models (DTOs)
The API works with strongly-typed DTO models for all data exchange:
Workflow DTOs:
- WorkflowDto: Complete workflow with metadata, activities, edges
- WorkflowCollectionDto: Multiple workflows with project context
- ActivityDto: Activity with arguments and properties
- ArgumentDto: Workflow or activity argument
- VariableDto: Workflow variable
- EdgeDto: Control flow edge between activities
Project DTOs:
- ProjectInfo: Project metadata (name, version, dependencies)
- EntryPointInfo: Entry point definition
- ProvenanceInfo: Parser version and author tracking
Analysis DTOs:
- QualityMetrics: Workflow quality scores
- AntiPattern: Detected anti-patterns
- IssueDto: Parse errors or warnings
All DTOs are immutable dataclasses with full type hints.
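What immutability means in practice can be shown with two stand-in classes. ArgumentSketch and WorkflowSketch are hypothetical and much smaller than the package's DTOs; the sketch assumes only that the real DTOs are frozen dataclasses, as stated above. Assigning to a field raises FrozenInstanceError, and dataclasses.replace is the standard way to derive a modified copy.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ArgumentSketch:
    name: str
    direction: str
    type: str


@dataclass(frozen=True)
class WorkflowSketch:
    name: str
    # A tuple keeps the collection itself immutable as well.
    arguments: tuple[ArgumentSketch, ...] = ()


wf = WorkflowSketch("Main", (ArgumentSketch("in_Config", "In", "String"),))

try:
    wf.name = "Renamed"  # not allowed on a frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False

# Derive a changed copy instead of mutating in place.
renamed = replace(wf, name="Renamed")
print(mutated, wf.name, renamed.name)
```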
Development
Running Tests
# Run all tests
uv run pytest tests/ -v
# Run with coverage
uv run pytest tests/ --cov=xaml_parser --cov-report=html
# Run specific test file
uv run pytest tests/test_parser.py -v
# Run corpus tests only
uv run pytest tests/test_corpus.py -v -m corpus
Code Quality
# Format code
uv run black xaml_parser/ tests/
# Sort imports
uv run isort xaml_parser/ tests/
# Lint
uv run ruff check xaml_parser/ tests/
# Type check
uv run mypy xaml_parser/
Building
# Build distribution
uv build
# Check package
twine check dist/*
Project Structure
python/
├── xaml_parser/ # Source package
│ ├── __init__.py # Public API
│ ├── __version__.py # Version info
│ ├── parser.py # Main workflow parser
│ ├── project.py # Project parser (NEW)
│ ├── cli.py # Command-line interface
│ ├── models.py # Data models
│ ├── extractors.py # Extraction logic
│ ├── utils.py # Utilities
│ ├── validation.py # Schema validation
│ ├── visibility.py # ViewState handling
│ └── constants.py # Configuration
├── tests/ # Test suite
│ ├── conftest.py # Pytest fixtures
│ ├── test_parser.py # Parser tests
│ ├── test_project.py # Project parser tests (NEW)
│ ├── test_corpus.py # Corpus tests
│ └── test_validation.py
├── pyproject.toml # Package configuration
├── uv.lock # Dependency lock
└── README.md # This file
Requirements
- Python 3.11+
- defusedxml (for secure XML parsing)
- pytest (for development)
Testing Philosophy
Tests reference shared test data in ../testdata/:
- ../testdata/golden/: Golden freeze test pairs (XAML + JSON)
- ../testdata/corpus/: Structured test projects
This ensures consistency across language implementations.
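A golden test compares fresh parser output against a stored JSON file for the same workflow. The sketch below shows the pairing and comparison steps using a temporary directory. It assumes that pairs share a file stem (Main.xaml with Main.json), which this README does not state, and the names golden_pairs and matches_golden are hypothetical. The "actual" output here is a hand-built dict standing in for real parser output.

```python
import json
import tempfile
from pathlib import Path


def golden_pairs(golden_dir):
    """Yield (xaml_path, json_path) for XAML files with a sibling JSON file."""
    for xaml_path in sorted(golden_dir.glob("*.xaml")):
        json_path = xaml_path.with_suffix(".json")
        if json_path.exists():
            yield xaml_path, json_path


def matches_golden(actual, json_path):
    """Compare produced output with the stored golden JSON."""
    expected = json.loads(json_path.read_text(encoding="utf-8"))
    return actual == expected


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp)
    (golden / "Main.xaml").write_text("<Activity />", encoding="utf-8")
    (golden / "Main.json").write_text(json.dumps({"name": "Main"}), encoding="utf-8")
    (golden / "Orphan.xaml").write_text("<Activity />", encoding="utf-8")  # no JSON

    pairs = [(x.name, j.name) for x, j in golden_pairs(golden)]
    # Stand-in for parser output: a dict holding only the workflow name.
    all_match = all(matches_golden({"name": x.stem}, j) for x, j in golden_pairs(golden))

print(pairs, all_match)
```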
Contributing
See the main repository CONTRIBUTING.md for guidelines.
License
This project is dual-licensed:
- Code: Apache License 2.0 (see LICENSE-APACHE)
- Documentation & Output: Creative Commons Attribution 4.0 (see LICENSE-CC-BY)
You may choose which license applies to your use case.
Links
- Repository: https://github.com/rpapub/cpmf-uips-xaml
- Issues: https://github.com/rpapub/cpmf-uips-xaml/issues
- PyPI: https://pypi.org/project/cpmf-uips-xaml/ (coming soon)
File details
Details for the file cpmf_uips_xaml-0.1.4.tar.gz.
File metadata
- Download URL: cpmf_uips_xaml-0.1.4.tar.gz
- Upload date:
- Size: 259.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c57392519a1c865995366b5b543bd4a65df3bcdcad6ce337f4322bc5274ae57c |
| MD5 | 239cf2a08b48833b6bb6919c75757fe3 |
| BLAKE2b-256 | 50e0240d3dad77a4096384c3caef7c70e1e4eb8cfb4bdfbb9e01653c95246ee9 |
File details
Details for the file cpmf_uips_xaml-0.1.4-py3-none-any.whl.
File metadata
- Download URL: cpmf_uips_xaml-0.1.4-py3-none-any.whl
- Upload date:
- Size: 199.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf7a9f5ed3a1183ecace71cf036d36f32a077d87969f5d8d8e57a1a2ae2c98ac |
| MD5 | 5d35831cc3f8f6eb40b37898c2586617 |
| BLAKE2b-256 | eb78b521b50f14d07ffc4364ecc407d497d6224af3cf4558120e5632a0ecd570 |