Playwright selectors and utilities for rpachallenge.com automation
cpmf-rpachallenge
Playwright selectors and utilities for automating rpachallenge.com.
Version 0.3.0 - Complete architectural refactoring with functional and procedural paradigms.
Installation
pip install cpmf-rpachallenge
Breaking Changes in 0.3.0
This release introduces a complete architectural refactoring:
- New directory structure: Code organized into procedural/, functional/, actions/, and domain/
- Pure functional data sources: HtmlTableSource is now sync and accepts HTML strings (no async I/O)
- Side effects taxonomy: Actions are decorated with @side_effects() to declare I/O operations
- Deprecated APIs: Old imports still work but issue DeprecationWarning
Migration Guide
Old (Deprecated):
from cpmf_rpachallenge import Downloads, FormFields, Buttons
from cpmf_rpachallenge import from_html_table # REMOVED
records = Downloads.get_challenge_data()
await page.fill(FormFields.FIRST_NAME, "John")
New (Recommended):
from cpmf_rpachallenge.domain import from_xlsx, load_records, ChallengeRecord
from cpmf_rpachallenge.domain.selectors import Pages
from cpmf_rpachallenge.actions import scrape_table_html, parse_html_table
from cpmf_rpachallenge import fetch_challenge_excel
path = fetch_challenge_excel()
source = from_xlsx(path)
records = load_records(source)
await page.fill(Pages.ChallengePage.Fields.FIRST_NAME, "John")
# HTML table scraping (two-step: scrape + parse)
html = await scrape_table_html(page, "table#dataTable") # I/O action
dicts = parse_html_table(html) # Pure transformation
records = [ChallengeRecord.from_dict(d) for d in dicts]
Architecture
The library is organized into five layers:
- procedural/ - Imperative, step-by-step workflows (RPAChallengeClient)
- functional/ - Pure transformations, composable data sources
- actions/ - Discrete I/O operations with declared side effects
- domain/ - Business logic, schemas, validation, results, selectors
- backends/ - Driver implementations (Playwright, API)
Quick Start (Procedural)
High-level client for imperative workflows:
import asyncio
from cpmf_rpachallenge.procedural import RPAChallengeClient
from cpmf_rpachallenge.backends import PlaywrightBackend
from playwright.async_api import async_playwright
async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://rpachallenge.com")
        backend = PlaywrightBackend(page)
        client = RPAChallengeClient(backend=backend)
        # Run complete challenge
        result = await client.run_async()
        print(f"Score: {result.success_rate}% in {result.time_ms}ms")
        await browser.close()
# "async with" is only valid inside a coroutine, so run it through asyncio
asyncio.run(main())
Quick Start (Functional)
Composable data sources and pure transformations:
import asyncio
from cpmf_rpachallenge import fetch_challenge_excel
from cpmf_rpachallenge.domain import from_xlsx, load_records, ChallengeRecord, Results
from cpmf_rpachallenge.functional import filter_records
from cpmf_rpachallenge.domain.selectors import Pages
from playwright.async_api import async_playwright
# Functional data access
path = fetch_challenge_excel()
source = from_xlsx(path)
# Composable filtering
filtered = filter_records(source, lambda r: r["role"] == "Manager")
records = load_records(filtered, as_dataclass=True)
# Use with Playwright
async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("https://rpachallenge.com")
        await page.click(Pages.ChallengePage.Buttons.START)
        for record in records:
            # Fill every field of the current record, then submit once
            for field_name, value in record.as_form_data().items():
                await page.fill(f'input[ng-reflect-name="{field_name}"]', value)
            await page.click(Pages.ChallengePage.Buttons.SUBMIT)
        # Parse results
        message = await page.inner_text(Pages.ChallengePage.Results.MESSAGE_DETAILS)
        result = Results.parse_results(message)
        print(f"Score: {result.success_rate}%")
        await browser.close()
asyncio.run(main())
API Reference
Domain Layer
Page Selectors (Page Object Pattern)
from cpmf_rpachallenge.domain.selectors import Pages
# Challenge page (main form)
Pages.ChallengePage.Fields.FIRST_NAME
Pages.ChallengePage.Fields.LAST_NAME
Pages.ChallengePage.Fields.PHONE
Pages.ChallengePage.Fields.EMAIL
Pages.ChallengePage.Fields.ADDRESS
Pages.ChallengePage.Fields.COMPANY_NAME
Pages.ChallengePage.Fields.ROLE
Pages.ChallengePage.Buttons.START
Pages.ChallengePage.Buttons.SUBMIT
Pages.ChallengePage.Buttons.RESET
Pages.ChallengePage.Results.MESSAGE_CONTAINER
Pages.ChallengePage.Results.MESSAGE_TITLE
Pages.ChallengePage.Results.MESSAGE_DETAILS
# Data table page (paginated tables)
Pages.DataTablePage.TABLE
Pages.DataTablePage.HEADERS
Pages.DataTablePage.ROWS
Pages.DataTablePage.Navigation.NEXT
Pages.DataTablePage.Navigation.PREV
Records and Schemas
from cpmf_rpachallenge.domain import (
    ChallengeRecord,
    RPA_CHALLENGE_SCHEMA,
    from_xlsx,
    load_records,
)
# Load from Excel
source = from_xlsx("challenge.xlsx")
records = load_records(source)
# Create record
record = ChallengeRecord(
    first_name="John",
    last_name="Doe",
    company_name="Acme Corp",
    role="Developer",
    address="123 Main St",
    email="john@example.com",
    phone="1234567890",
)
# Convert to form data
form_data = record.as_form_data()  # {"labelFirstName": "John", ...}
Validation
from cpmf_rpachallenge.domain import DataValidator, from_xlsx, load_records
records = load_records(from_xlsx("challenge.xlsx"))
result = DataValidator.validate(records)
if not result.is_valid:
    print(f"Data issues: {result.summary}")
    for record in result.invalid_records:
        print(f" {record.summary}")
        for error in record.errors:
            print(f" - {error.field}: {error.message}")
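The shape of such a validation pass can be sketched with the standard library alone. This is an illustration, not the implementation behind `DataValidator`: the names `FieldError`, `REQUIRED_FIELDS`, and `validate_record` are hypothetical, the field names are taken from the `ChallengeRecord` constructor shown earlier, and the two rules (non-empty value, `@` in the email) are assumptions chosen for the example.

```python
from dataclasses import dataclass

# Field names as used by the ChallengeRecord constructor in this README.
REQUIRED_FIELDS = (
    "first_name", "last_name", "company_name",
    "role", "address", "email", "phone",
)


@dataclass
class FieldError:
    """Hypothetical error record with the two attributes printed above."""
    field: str
    message: str


def validate_record(record: dict) -> list:
    """Return a list of FieldError objects; an empty list means the record passed."""
    errors = []
    # Rule 1 (assumed): every required field must hold a non-blank value.
    for name in REQUIRED_FIELDS:
        if not str(record.get(name, "")).strip():
            errors.append(FieldError(name, "value is missing or empty"))
    # Rule 2 (assumed): a present email must at least contain "@".
    email = str(record.get("email", ""))
    if email.strip() and "@" not in email:
        errors.append(FieldError("email", "does not look like an email address"))
    return errors
```

Collecting every error rather than stopping at the first one lets a caller report all problems in a record in one pass, which matches the nested loop over `record.errors` above.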
Results
from cpmf_rpachallenge.domain import Results
from cpmf_rpachallenge.domain.selectors import Pages
# page: an open Playwright page showing the finished challenge
message = await page.inner_text(Pages.ChallengePage.Results.MESSAGE_DETAILS)
result = Results.parse_results(message)
print(f"Success rate: {result.success_rate}%")
print(f"Time: {result.time_seconds}s")
print(f"Fields correct: {result.fields_correct}/{result.total_fields}")
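For orientation, parsing of this kind can be approximated with a regular expression. The sketch below is not the library's `Results.parse_results`: the names `ParsedResult` and `parse_result_message` are hypothetical, and the message wording in the comment is an assumption about what the challenge page displays, so the pattern may need adjusting.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedResult:
    """Hypothetical container; attribute names mirror those printed above."""
    success_rate: int
    fields_correct: int
    total_fields: int
    time_ms: int

    @property
    def time_seconds(self) -> float:
        return self.time_ms / 1000


# Assumed message shape (not confirmed by this README):
# "Your success rate is 100% ( 70 out of 70 fields) in 1234 milliseconds"
_PATTERN = re.compile(
    r"(\d+)\s*%\s*\(\s*(\d+)\s+out of\s+(\d+)\s+fields\s*\)\s*in\s+(\d+)\s+milliseconds"
)


def parse_result_message(message: str) -> ParsedResult:
    """Extract the four numbers from the message or raise ValueError."""
    match = _PATTERN.search(message)
    if match is None:
        raise ValueError(f"Unrecognized result message: {message!r}")
    rate, correct, total, ms = (int(group) for group in match.groups())
    return ParsedResult(rate, correct, total, ms)
```

Raising on an unrecognized message, rather than returning zeros, keeps a changed page layout from being mistaken for a failed run.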
Functional Layer
Data Sources
from cpmf_rpachallenge.functional import XlsxSource, HtmlTableSource
# HTML_TABLE_HEADER_MAP is assumed to be exported next to EXCEL_HEADER_MAP
from cpmf_rpachallenge.domain import RPA_CHALLENGE_SCHEMA, EXCEL_HEADER_MAP, HTML_TABLE_HEADER_MAP
# Excel source (pure, sync)
source = XlsxSource("challenge.xlsx", RPA_CHALLENGE_SCHEMA, header_map=EXCEL_HEADER_MAP)
for record in source.load():
    print(record)
# HTML table source (pure, sync - accepts HTML string)
html = "<table>...</table>"
source = HtmlTableSource(html, RPA_CHALLENGE_SCHEMA, header_map=HTML_TABLE_HEADER_MAP)
records = list(source.load())
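To illustrate why a source that accepts an HTML string can stay pure and synchronous, here is a minimal sketch that turns table markup into dictionaries using only the standard library's `html.parser`. It is not how `HtmlTableSource` is implemented; `_TableParser` and `table_html_to_dicts` are hypothetical names, and the sketch assumes a simple table whose first row holds the headers.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # list of rows, each a list of cell strings
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            if self.rows:
                self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def table_html_to_dicts(html: str) -> list:
    """Pure function: the first row is the header, the rest become dicts."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    rows = [row for row in parser.rows if row]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]
```

Because the function depends only on its string argument, it can be tested with literal markup and no browser, while the page access stays in a separate async step.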
Combinators
from cpmf_rpachallenge.functional import filter_records, map_records, collect
from cpmf_rpachallenge.domain import from_xlsx
# Composable filtering
source = from_xlsx("challenge.xlsx")
filtered = filter_records(source, lambda r: r["role"] == "Manager")
records = collect(filtered)
# Composable mapping
mapped = map_records(filtered, lambda r: {**r, "full_name": f"{r['first_name']} {r['last_name']}"})
records = collect(mapped)
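The composition style used here can be sketched with plain generators. The functions below (`filter_rows`, `map_rows`, `collect_rows`) are hypothetical stand-ins written for this example and operate on ordinary iterables of dicts; whether the library's combinators are lazy in the same way is an assumption.

```python
from typing import Callable, Iterable, Iterator


def filter_rows(rows: Iterable[dict], keep: Callable[[dict], bool]) -> Iterator[dict]:
    """Lazily yield only the rows for which `keep` returns True."""
    return (row for row in rows if keep(row))


def map_rows(rows: Iterable[dict], transform: Callable[[dict], dict]) -> Iterator[dict]:
    """Lazily apply `transform` to each row."""
    return (transform(row) for row in rows)


def collect_rows(rows: Iterable[dict]) -> list:
    """Force evaluation of the pipeline and return a list."""
    return list(rows)


rows = [
    {"first_name": "John", "last_name": "Doe", "role": "Manager"},
    {"first_name": "Jane", "last_name": "Roe", "role": "Developer"},
]
managers = filter_rows(rows, lambda r: r["role"] == "Manager")
named = map_rows(managers, lambda r: {**r, "full_name": f"{r['first_name']} {r['last_name']}"})
result = collect_rows(named)
```

Nothing is computed until `collect_rows` runs, and the `{**r, ...}` transform builds new dicts, so the input rows are left unchanged. In this sketch a generator can be consumed only once, so each pipeline should be collected a single time.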
Actions Layer
Actions handle I/O boundaries and are decorated with @side_effects():
from cpmf_rpachallenge.actions import scrape_table_html, parse_html_table, read_excel
# DOM I/O action (async)
html = await scrape_table_html(page, "table#dataTable")
# Pure transformation (sync)
dicts = parse_html_table(html)
# File system action (sync)
dicts = read_excel("challenge.xlsx")
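One way a decorator can declare side effects is by attaching metadata to the function object. The sketch below only illustrates that idea: the effect labels, the `declared_effects` attribute, and the `is_pure` helper are all hypothetical, and the signature of the library's own `@side_effects()` is not documented in this README.

```python
from typing import Callable


def side_effects(*effects: str) -> Callable:
    """Hypothetical decorator: record the declared effects on the function."""

    def decorate(func: Callable) -> Callable:
        # The function is returned unchanged apart from the new attribute,
        # so the same decorator works for sync and async callables.
        func.declared_effects = frozenset(effects)
        return func

    return decorate


@side_effects("filesystem")
def read_text(path: str) -> str:
    with open(path, encoding="utf-8") as handle:
        return handle.read()


@side_effects()
def shout(text: str) -> str:
    return text.upper()


def is_pure(func: Callable) -> bool:
    """True when no effects are declared (undecorated functions also count)."""
    return not getattr(func, "declared_effects", frozenset())
```

Metadata of this kind does not enforce anything at runtime; its value is that tooling and reviewers can query which functions claim to touch the DOM, the network, or the file system.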
Procedural Layer
High-level client for imperative workflows:
from cpmf_rpachallenge.procedural import RPAChallengeClient
from cpmf_rpachallenge.backends import PlaywrightBackend
# page: an open Playwright page
backend = PlaywrightBackend(page)
client = RPAChallengeClient(backend=backend)
# High-level operations
records = client.get_records()
validation = client.validate_records(records)
if validation.is_valid:
    client.start()
    for record in records:
        client.fill_form(record)
        client.submit()
    result = client.get_result()
Readiness Checks
from cpmf_rpachallenge.domain import ReadinessCheck
result = await ReadinessCheck.run_async(page)
if result.is_automatable:
    # Proceed with automation
    pass
else:
    print(f"Page not ready: {result.summary}")
    for name in result.failed_checks:
        print(f" - {result.checks[name].message}")
Screenshots
from pathlib import Path
from cpmf_rpachallenge import ScreenshotCapture, ScreenshotFormat
capture = ScreenshotCapture()
# Capture screenshots
await capture.take_async(page, label="form_filled")
await capture.take_pdf_async(page, label="result_pdf")
# Save all
paths = capture.collection.save_all("./screenshots")
# Create montage
montage = capture.collection.create_montage(columns=5)
Path("montage.png").write_bytes(montage)
Configuration
Configuration hierarchy (highest to lowest priority):
- Explicit parameters passed to functions
- Environment variables (RPACHALLENGE_*)
- Default values
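The precedence order above can be sketched for a single setting as follows. `resolve_timeout_ms` is a hypothetical helper written for this example, not part of the package; it uses the documented `RPACHALLENGE_TIMEOUT_MS` variable and its documented default of 30000.

```python
import os
from typing import Mapping, Optional

DEFAULT_TIMEOUT_MS = 30000  # documented default for RPACHALLENGE_TIMEOUT_MS


def resolve_timeout_ms(
    explicit: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Explicit argument first, then the environment variable, then the default."""
    if explicit is not None:
        return explicit
    # Allow a mapping to be injected so the lookup can be tested without
    # touching the real process environment.
    env = os.environ if env is None else env
    raw = env.get("RPACHALLENGE_TIMEOUT_MS")
    if raw is not None:
        return int(raw)
    return DEFAULT_TIMEOUT_MS
```

Checking `explicit is not None` rather than truthiness matters here, because an explicit `0` is a legitimate value that should still win over the environment.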
Environment Variables
| Variable | Default | Description |
|---|---|---|
| RPACHALLENGE_BASE_URL | https://rpachallenge.com | Base URL |
| RPACHALLENGE_EXCEL_URL | https://rpachallenge.com/assets/downloadFiles/challenge.xlsx | Excel download URL |
| RPACHALLENGE_HEADLESS | true | Run browser headless |
| RPACHALLENGE_TIMEOUT_MS | 30000 | Timeout in milliseconds |
| RPACHALLENGE_DOWNLOAD_DIR | (temp dir) | Download directory |
| RPACHALLENGE_SLOW_MO | 0 | Slow motion delay (ms) |
Using Config
from cpmf_rpachallenge import get_config, RpaChallengeConfig
config = get_config() # Reads from environment
custom = RpaChallengeConfig(headless=False, slow_mo=100)
debug = config.with_overrides(headless=False)
Deprecated APIs
These APIs still work but issue DeprecationWarning:
# DEPRECATED - use domain.selectors.Pages instead
from cpmf_rpachallenge import FormFields, Buttons
# DEPRECATED - use domain imports
from cpmf_rpachallenge import ChallengeRecord, DataValidator
# DEPRECATED - use fetch_challenge_excel() + from_xlsx()
from cpmf_rpachallenge import Downloads
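When migrating, it can help to find out which calls still go through deprecated paths. The sketch below uses only the standard `warnings` module; `old_api` is a stand-in for any deprecated call and is not part of cpmf-rpachallenge, and `find_deprecations` is a hypothetical helper.

```python
import warnings


def old_api() -> str:
    """Stand-in for a deprecated call; not part of cpmf-rpachallenge."""
    warnings.warn("old_api is deprecated; use new_api", DeprecationWarning, stacklevel=2)
    return "value"


def find_deprecations(func) -> list:
    """Run `func` and return the messages of any DeprecationWarning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" overrides the default filters, which hide many
        # DeprecationWarnings outside of __main__.
        warnings.simplefilter("always")
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
```

For a whole test run, starting the interpreter with `python -W error::DeprecationWarning` turns these warnings into exceptions, which makes leftover deprecated imports fail loudly.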
License
Apache-2.0