Deterministic DSL-first browser automation platform powered by Playwright heuristics with optional local AI self-healing (Ollama)

ManulEngine

Status: Alpha

Developed by a single person.

This is an experimental automation runtime with a companion VS Code extension in this repository. There are no promises or guarantees of stability. Bugs are expected, APIs will change, and it is currently meant for exploration and technical feedback, not production CI/CD pipelines.

Deterministic, DSL-first web and desktop automation on top of Playwright, with explainable heuristics and optional local AI fallback.

This project is currently in Alpha. While the core architecture is solid, it is actively being battle-tested. Bugs are expected, APIs may evolve, and there are no promises about stability.

ManulEngine is deliberately positioned as an engineering tool, not a marketing story. The core claim is transparency: when a step works, you should understand why; when it fails, you should have enough signal to diagnose it.

Core Philosophy

ManulEngine is an interpreter for the .hunt DSL. A hunt file expresses intent in plain English, the runtime snapshots the DOM, ranks candidates with heuristics, and executes through Playwright.

Determinism first

The primary resolver is not an LLM. It is a deterministic scoring system backed by DOM traversal and weighted heuristics:

  • DOM collection uses a native TreeWalker in injected JavaScript.
  • Candidate ranking is handled by DOMScorer.
  • Scores are normalized on a 0.0 to 1.0 confidence scale.
  • Weighted channels include cache, semantics, text, attributes, and proximity.

That means the engine can explain more than "element not found". It can show whether a target lost because the text affinity was weak, semantic alignment was poor, the candidate was hidden, or another channel outweighed it.
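
As an illustration of channel-based ranking, here is a minimal Python sketch. It is not the actual DOMScorer: the weights, the Candidate shape, and the function names are all hypothetical, chosen only to show how a per-channel breakdown can explain why one element outranks another.

```python
from dataclasses import dataclass, field

# Hypothetical channel weights (they sum to 1.0); the real DOMScorer weights may differ.
WEIGHTS = {"cache": 0.30, "semantics": 0.25, "text": 0.30, "attributes": 0.10, "proximity": 0.05}


@dataclass
class Candidate:
    label: str
    # Raw per-channel signals, each assumed to be in the range 0.0..1.0.
    signals: dict = field(default_factory=dict)
    hidden: bool = False


def score(candidate: Candidate) -> dict:
    """Return per-channel contributions plus a total clamped to 0.0..1.0."""
    if candidate.hidden:
        # A hidden element cannot win, and the breakdown records why.
        return {"total": 0.0, "reason": "hidden"}
    breakdown = {ch: WEIGHTS[ch] * candidate.signals.get(ch, 0.0) for ch in WEIGHTS}
    breakdown["total"] = min(1.0, sum(breakdown.values()))
    return breakdown


def rank(candidates: list) -> list:
    """Sort candidates by total score, best first."""
    return sorted(candidates, key=lambda c: score(c)["total"], reverse=True)


button = Candidate("Login", {"text": 1.0, "semantics": 0.9})
link = Candidate("Login help", {"text": 0.6, "semantics": 0.2})
ghost = Candidate("Login", {"text": 1.0, "semantics": 1.0}, hidden=True)
best = rank([link, ghost, button])[0]
```

Because the same inputs always produce the same breakdown, a losing candidate can be inspected channel by channel rather than guessed at.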

Transparency instead of AI magic

The recommended default is heuristics-only mode:

{
  "model": null,
  "browser": "chromium",
  "controls_cache_enabled": true,
  "semantic_cache_enabled": true
}

When a local Ollama model is enabled, it acts as a fallback for ambiguous cases rather than the primary execution path.

Dual-persona workflow

The authoring model is intentionally split across two layers:

  • QA, analysts, and operators write plain-English .hunt steps.
  • SDETs extend those flows with Python hooks, lifecycle setup, and custom controls when a UI or backend dependency should not be forced into the generic DSL path.

The intended boundary is straightforward:

  • Keep business intent and readable flow in the DSL.
  • Keep environment setup, backend interaction, and custom widget handling in Python.

Why ManulEngine

Most tools sold as AI automation are cloud wrappers around selectors and retries. ManulEngine is aiming at the opposite design.

Deterministic first, not AI-first

The runtime resolves DOM elements through a native JavaScript TreeWalker plus a weighted DOMScorer. That gives you a repeatable result from page state plus step text, not from prompt variance.

Explainable instead of opaque

When the engine chooses the wrong target, you should be able to inspect the actual scoring channels that drove the result. The point is not just success cases. The point is actionable failure analysis.

One artifact for two personas

QA, ops, and analysts can keep the flow readable in .hunt. SDETs can attach Python, lifecycle hooks, and custom controls without splitting the scenario into two separate systems.

Optional AI fallback, off by default

"model": null remains the recommended default. When a local Ollama model is enabled, it is a fallback for ambiguous cases, not the primary execution engine.

Four Pillars

ManulEngine is not only a test runner. The same runtime and the same DSL can cover four adjacent use cases:

QA and E2E testing

Write plain-English flows, verify outcomes, attach reports and screenshots when needed, and keep selectors out of the test source.

RPA workflows

Use the same DSL to log into portals, download files, fill forms, extract values, and hand work to Python when a backend or filesystem step is involved.

Synthetic monitoring

Pair .hunt files with @schedule: and manul daemon to run scheduled health checks with the same execution model as your test flows.

AI agent execution targets

If an external agent needs to drive the browser, .hunt is a safer constrained target than raw Playwright code because the runtime still owns validation, scoring, retries, and reporting.

Key Features

Explainability layers

The runtime and the companion VS Code extension now expose multiple explainability layers instead of forcing you to inspect a terminal dump.

CLI: --explain

manul --explain tests/saucedemo.hunt
manul --explain --headless tests/ --html-report

That mode prints the candidate rankings and per-channel scoring breakdown for each resolved step.

Representative CLI explain output:

┌─ EXPLAIN: Target = "Login"
│  Step: Click the 'Login' button
│
│  #1 <button> "Login"
│     total:      0.593
│     text:       0.281
│     attributes: 0.050
│     semantics:  0.225
│     proximity:  0.037
│     cache:      0.000
│
└─ Decision: selected "Login" with score 0.593
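
The figures in this representative output are consistent with the total being the plain sum of the channel contributions. A short check, assuming that additive reading is correct:

```python
# Channel contributions copied from the representative output above.
channels = {
    "text": 0.281,
    "attributes": 0.050,
    "semantics": 0.225,
    "proximity": 0.037,
    "cache": 0.000,
}

# Rounded to three decimals to match the precision of the printed output.
total = round(sum(channels.values()), 3)
# The channel that contributed most to the decision.
dominant = max(channels, key=channels.get)
```

Here the text channel dominates, with semantics close behind, which is the kind of reading the breakdown is meant to support.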

VS Code: title bar action

During a debug pause, the extension exposes 🔍 Explain Current Step in the editor title bar so you can request explanation data for the paused step without leaving the editor.

VS Code: hover tooltips in debug mode

Run a hunt in Debug mode through Test Explorer, then hover over any resolved step line in the .hunt file. The extension shows the stored per-channel breakdown directly on that line. That is usually the fastest way to understand why one candidate outranked another.

VS Code hover debugger and explain mode

The VS Code extension is the primary editor integration in this repository for .hunt files.

Important debugging workflows:

  • Run a hunt in Debug mode through Test Explorer.
  • Use the 🔍 Explain Current Step title bar action during a debug pause.
  • Hover over a step line to see the stored scoring breakdown for the resolved target.

The hover flow is the killer DX feature because it keeps the reasoning attached to the exact step line instead of sending you to a separate console search.

Desktop and Electron automation via executable_path

ManulEngine is not limited to browser tabs. Because it runs on Playwright, it can also drive Electron-based desktop applications.

Set executable_path in the runtime config and use OPEN APP instead of NAVIGATE:

{
  "model": null,
  "browser": "chromium",
  "executable_path": "/path/to/YourElectronApp"
}

@context: Desktop smoke test
@title: Desktop Smoke

STEP 1: Attach to the window
    OPEN APP
    VERIFY that 'Welcome' is present

STEP 2: Exercise the main screen
    Click the 'Settings' button
    VERIFY that 'Preferences' is present

DONE.

This is practical for Electron-based apps such as Slack, Discord, VS Code, or internal desktop shells.

Platform-specific examples, first for Linux and then for Windows:

{
  "model": null,
  "browser": "chromium",
  "executable_path": "/usr/share/discord/Discord",
  "controls_cache_enabled": true
}

{
  "model": null,
  "browser": "chromium",
  "executable_path": "C:\\Users\\YourUser\\AppData\\Local\\Discord\\app-1.0.9051\\Discord.exe",
  "controls_cache_enabled": true
}

OPEN APP waits for the default window, attaches to it, and then the rest of the hunt runs the same way as a web scenario.

Smart recorder for native controls

The recorder is meant to capture intent, not just raw pointer activity. A concrete example is native <select> handling: the injected recorder observes semantic change events and emits DSL such as Select 'Option' from 'Dropdown' instead of recording a brittle chain of low-level clicks on <option> elements.
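
The mapping from events to intent can be sketched as follows. This is a Python illustration only: the real recorder is injected JavaScript, and the event dictionary shape and the function name here are hypothetical.

```python
def event_to_dsl(event: dict):
    """Translate a recorded browser event into a DSL line, or None to skip it."""
    if event["type"] == "change" and event["tag"] == "select":
        # One semantic step instead of a click on <select> plus a click on <option>.
        return f"Select '{event['option_text']}' from '{event['label']}'"
    if event["type"] == "click" and event["tag"] == "option":
        # Low-level click already covered by the change event.
        return None
    if event["type"] == "click" and event["tag"] == "button":
        return f"Click the '{event['label']}' button"
    return None


recorded = [
    {"type": "click", "tag": "option", "label": "Country"},
    {"type": "change", "tag": "select", "label": "Country", "option_text": "Ukraine"},
    {"type": "click", "tag": "button", "label": "Save"},
]
steps = [line for line in map(event_to_dsl, recorded) if line]
```

The click on the option element is dropped, so the recording contains one readable Select step rather than two fragile clicks.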

Python hooks and custom controls

When the generic resolver should not be forced to understand a bespoke widget, ManulEngine provides an explicit SDET escape hatch:

  • [SETUP] / [TEARDOWN] hooks for environment and data setup.
  • CALL PYTHON for backend lookups or computed values.
  • @before_all / @after_all lifecycle hooks for suite-wide orchestration.
  • @custom_control handlers for complex UI elements.

That balance is intentional: keep the common path readable, and keep the edge cases programmable.

State, variables, and scope

Variable handling is strict rather than ad hoc. The runtime supports @var:, EXTRACT, SET, and CALL PYTHON ... into {var} with deterministic placeholder substitution in downstream steps.

Useful patterns:

  • @var: for static test data at the top of the file.
  • EXTRACT ... into {var} for values pulled from the UI.
  • SET {var} = value for mid-run assignment.
  • CALL PYTHON module.func into {var} for backend-generated runtime values such as OTPs or tokens.

Scope precedence is explicit:

Priority  Scope         Source
1         Row vars      @data: iteration values
2         Step vars     EXTRACT, SET, CALL PYTHON ... into {var}
3         Mission vars  @var: declarations
4         Global vars   lifecycle hooks and process-level state

That means row data overrides mission defaults, and step-local captures do not silently leak across data rows.
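
A minimal sketch of that precedence, using a ChainMap in which earlier scopes shadow later ones. The variable contents and the substitute helper are hypothetical, and leaving unknown placeholders untouched is an assumption rather than documented runtime behavior.

```python
import re
from collections import ChainMap

# Hypothetical scope contents, listed from highest to lowest priority.
row_vars = {"email": "row-user@example.com"}                              # 1: @data: row
step_vars = {"otp": "123456"}                                            # 2: EXTRACT, SET, CALL PYTHON
mission_vars = {"email": "admin@example.com", "password": "secret123"}   # 3: @var:
global_vars = {"BASE_URL": "https://staging.example.com"}                # 4: lifecycle hooks

# ChainMap searches its maps left to right, so the row value shadows the mission value.
scope = ChainMap(row_vars, step_vars, mission_vars, global_vars)


def substitute(step: str, variables) -> str:
    """Replace {name} placeholders; unknown names are left untouched (an assumption)."""
    return re.sub(r"\{(\w+)\}", lambda m: str(variables.get(m.group(1), m.group(0))), step)


resolved = substitute("Fill 'Email' field with '{email}'", scope)
```

Starting each data row with a fresh step-level mapping is one way to keep captures from leaking between rows.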

Tags and data-driven runs

The runtime also supports selective execution and data-driven loops without changing the DSL model.

@tags: smoke, auth
@data: users.csv

manul tests/ --tags smoke

@tags: filters which hunt files run. @data: reruns the same hunt for every row in a JSON or CSV dataset while keeping scope isolation intact.
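
The selection and iteration logic can be approximated as below. This is a sketch under assumptions: the helper names are hypothetical, and treating a file as selected when it shares at least one tag with the filter is a guess at the matching rule.

```python
import csv
import io


def parse_tags(hunt_text: str) -> set:
    """Collect tags from an '@tags:' header line, if the file has one."""
    for line in hunt_text.splitlines():
        if line.startswith("@tags:"):
            return {tag.strip() for tag in line[len("@tags:"):].split(",") if tag.strip()}
    return set()


def should_run(hunt_text: str, wanted: set) -> bool:
    """Run when no filter is given or when the file shares a tag with the filter."""
    return not wanted or bool(parse_tags(hunt_text) & wanted)


hunt = "@tags: smoke, auth\n@data: users.csv\n"
# Stand-in for users.csv; each row becomes one isolated iteration.
dataset = io.StringIO("email,password\na@example.com,pw1\nb@example.com,pw2\n")
rows = list(csv.DictReader(dataset))
runs = [dict(row) for row in rows] if should_run(hunt, {"smoke"}) else []
```

Copying each row into its own dictionary mirrors the scope isolation described above: one iteration cannot mutate the data seen by the next.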

Lifecycle orchestration and hooks

There are two levels of Python orchestration:

  • Per-file [SETUP] / [TEARDOWN] and inline CALL PYTHON for file-local setup or backend calls.
  • Suite-level manul_hooks.py with @before_all, @after_all, @before_group, and @after_group for shared state across multiple hunts.

That split matters because UI setup should not be abused for infrastructure setup when deterministic Python can do it more directly.

Benchmarks and test coverage

The repo ships with both synthetic tests and adversarial fixtures. The point is not to claim maturity. The point is to show that the scoring model, parser, hooks, recorder, scheduler, and reporter are exercised against concrete failure modes.

  • python manul.py test runs the synthetic and unit suite.
  • benchmarks/run_benchmarks.py exercises dynamic IDs, overlapping traps, nested tables, and custom dropdown fixtures.
  • tests/*.hunt holds integration-style hunts for real browser flows.

Getting Started

Install

pip install manul-engine==0.0.9.6
playwright install

Optional local AI fallback:

pip install "manul-engine[ai]==0.0.9.6"
ollama pull qwen2.5:0.5b
ollama serve

Configuration

Create manul_engine_configuration.json in the workspace root. All keys are optional. The full set of keys is listed here with default values because this file is the main runtime control plane:

{
  "model": null,
  "browser": "chromium",
  "browser_args": [],
  "headless": false,

  "ai_always": false,
  "ai_policy": "prior",
  "ai_threshold": null,

  "timeout": 5000,
  "nav_timeout": 30000,

  "controls_cache_enabled": true,
  "controls_cache_dir": "cache",
  "semantic_cache_enabled": true,

  "log_name_maxlen": 0,
  "log_thought_maxlen": 0,
  "tests_home": "tests",
  "auto_annotate": false,

  "executable_path": null,
  "channel": null,

  "workers": 1,

  "retries": 0,
  "screenshot": "on-fail",
  "html_report": false
}

Notes:

  • model: null keeps the runtime fully heuristics-only.
  • browser_args passes extra launch flags to the browser.
  • ai_always, ai_policy, and ai_threshold only matter when a model is enabled.
  • controls_cache_dir, tests_home, and auto_annotate control runtime filesystem behavior.
  • channel targets an installed browser such as Chrome or Edge.
  • executable_path targets a custom executable such as an Electron app.

Environment variables always win over JSON config. That is useful for CI, shell aliases, and one-off runs:

export MANUL_HEADLESS=true
export MANUL_BROWSER=firefox
export MANUL_MODEL=qwen2.5:0.5b
export MANUL_WORKERS=4
export MANUL_EXPLAIN=true
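
A sketch of the precedence rule, assuming each override is named MANUL_ plus the upper-cased JSON key and that booleans accept values such as true or 1. Both are assumptions for illustration; only the variable names shown above come from the project.

```python
import json
import os

# A small subset of keys, with the defaults listed in the configuration above.
DEFAULTS = {"model": None, "browser": "chromium", "headless": False, "workers": 1}


def coerce(raw: str, default):
    """Convert an environment string to the type of the default value."""
    if isinstance(default, bool):  # checked before int because bool is a subclass of int
        return raw.strip().lower() in {"1", "true", "yes"}
    if isinstance(default, int):
        return int(raw)
    return raw


def load_config(json_text: str, environ=os.environ) -> dict:
    """Merge defaults, then JSON, then environment overrides."""
    config = {**DEFAULTS, **json.loads(json_text)}
    for key, default in DEFAULTS.items():
        raw = environ.get(f"MANUL_{key.upper()}")
        if raw is not None:
            config[key] = coerce(raw, default)  # environment beats JSON
    return config


config = load_config(
    '{"browser": "chromium", "headless": false}',
    environ={"MANUL_HEADLESS": "true", "MANUL_BROWSER": "firefox", "MANUL_WORKERS": "4"},
)
```

Passing the environment as a parameter keeps the function easy to test without mutating the real process environment.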

Configuration reference:

Key                     Default     Description
model                   null        Ollama model name. null keeps the runtime heuristics-only.
headless                false       Hide the browser window.
browser                 "chromium"  Browser engine: chromium, firefox, or webkit.
browser_args            []          Extra launch flags for the browser.
ai_threshold            auto        Score threshold before optional LLM fallback.
ai_always               false       Always ask the LLM picker. Only makes sense when model is set.
ai_policy               "prior"     Treat heuristic score as a prior hint or as a strict constraint.
controls_cache_enabled  true        Enable the persistent per-site controls cache.
controls_cache_dir      "cache"     Cache directory relative to CWD or absolute path.
semantic_cache_enabled  true        Enable in-session semantic cache reuse.
timeout                 5000        Default action timeout in ms.
nav_timeout             30000       Navigation timeout in ms.
log_name_maxlen         0           Truncate element names in logs. 0 means no limit.
log_thought_maxlen      0           Truncate LLM thought strings in logs. 0 means no limit.
workers                 1           Max hunt files to run in parallel. Debug mode forces sequential execution.
tests_home              "tests"     Default output directory for new hunts and scan output.
auto_annotate           false       Insert # 📍 Auto-Nav: comments after URL changes during a run.
channel                 null        Installed browser channel such as chrome or msedge.
executable_path         null        Absolute path to a custom executable such as Electron.
retries                 0           Retry failed hunt files this many times.
screenshot              "on-fail"   Screenshot mode: none, on-fail, or always.
html_report             false       Generate reports/manul_report.html after the run.

Runtime notes:

  • model: null is still the recommended default.
  • workers can come from CLI, JSON config, or MANUL_WORKERS; CLI wins.
  • --debug and --break-lines force workers = 1 because interactive pauses cannot be parallelised safely.
  • Relative controls_cache_dir paths resolve from the current working directory, not from the package install path.
  • channel and executable_path solve different problems: installed browser channel versus custom desktop executable.
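
The worker-count rules above can be condensed into one function. This is a hypothetical sketch, not the runtime's code: it assumes the order is CLI, then environment, then JSON, which follows from CLI winning and from environment variables beating JSON.

```python
def resolve_workers(cli=None, env=None, json_value=None, debug=False, break_lines=False) -> int:
    """Pick the worker count: CLI first, then environment, then JSON, then 1."""
    if debug or break_lines:
        # Interactive pauses cannot be parallelised safely.
        return 1
    for value in (cli, env, json_value):
        if value is not None:
            # Environment values arrive as strings, so convert before use.
            return max(1, int(value))
    return 1
```

For example, resolve_workers(cli=2, env="4", json_value=8) yields 2, while adding debug=True forces the result back to 1.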

First hunt file

@context: Smoke test for a login flow
@title: Login Smoke
@var: {email} = admin@example.com
@var: {password} = secret123

STEP 1: Open the app
    NAVIGATE to https://example.com/login
    VERIFY that 'Sign In' is present

STEP 2: Authenticate
    Fill 'Email' field with '{email}'
    Fill 'Password' field with '{password}'
    Click the 'Sign In' button
    VERIFY that 'Dashboard' is present

DONE.
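
To make the file structure concrete, here is a deliberately small parser sketch. It is not the ManulEngine parser: parse_hunt and its output shape are hypothetical, and it ignores [SETUP] and [TEARDOWN] blocks entirely.

```python
import re

STEP_RE = re.compile(r"^STEP\s+(\d+):\s*(.*)$")


def parse_hunt(text: str) -> dict:
    """Split a hunt file into header directives and numbered steps."""
    headers, steps = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "DONE.":
            break  # nothing after the terminator is read
        match = STEP_RE.match(line)
        if match:
            steps.append({"number": int(match.group(1)), "title": match.group(2), "actions": []})
        elif line.startswith("@") and not steps:
            headers.append(line)  # directives appear before the first STEP
        elif steps:
            steps[-1]["actions"].append(line)
    return {"headers": headers, "steps": steps}


sample = """\
@title: Login Smoke
@var: {email} = admin@example.com

STEP 1: Open the app
    NAVIGATE to https://example.com/login
    VERIFY that 'Sign In' is present

DONE.
"""
hunt = parse_hunt(sample)
```

The result groups each action under its STEP, which is the same shape a reader sees when scanning the file.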

Run it

manul tests/login.hunt

Useful commands:

python manul.py test
manul tests/
manul --headless tests/saucedemo.hunt
manul --html-report --screenshot on-fail tests/
manul --explain tests/saucedemo.hunt

Runtime reference

Capabilities worth keeping in view:

  • OPEN APP plus executable_path lets the same DSL drive Electron apps such as Discord, Slack, or internal desktop shells.
  • @schedule: plus manul daemon turns a hunt into a built-in monitor or RPA task without external cron wiring.
  • @var:, EXTRACT, SET, and CALL PYTHON ... into {var} give you deterministic variable flow without hardcoding runtime values.
  • [SETUP], [TEARDOWN], inline CALL PYTHON, and manul_hooks.py cover environment setup, backend calls, and suite-wide orchestration.
  • @custom_control is the explicit escape hatch when a widget should be handled with raw Playwright instead of generic heuristics.
  • SCAN PAGE and manul record are there to accelerate authoring, not to replace the readable DSL with low-level recordings.

Static variables and hooks

@var: {email} = admin@example.com
@var: {password} = secret123

[SETUP]
    CALL PYTHON db_helpers.seed_admin_user
[END SETUP]

STEP 1: Login
    NAVIGATE to https://example.com/login
    Fill 'Email' field with '{email}'
    Fill 'Password' field with '{password}'
    Click the 'Sign In' button
    VERIFY that 'Dashboard' is present

STEP 2: OTP verification
    Click the 'Send OTP' button
    CALL PYTHON api_helpers.fetch_otp "{email}" into {otp}
    Fill 'OTP' field with '{otp}'
    Click the 'Verify' button
    VERIFY that 'Welcome' is present

[TEARDOWN]
    CALL PYTHON db_helpers.clean_database
[END TEARDOWN]

That pattern keeps test data, backend calls, and cleanup explicit instead of hiding them behind UI setup steps.
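
For orientation, a helper such as api_helpers.fetch_otp could look like the sketch below. The module body is hypothetical: it assumes the runtime passes the substituted argument as a string and stores the return value in {otp}, and it replaces a real backend lookup with an in-memory dictionary.

```python
# api_helpers.py (hypothetical module body; only the name comes from the hunt above)

# Stand-in for a real backend or mailbox lookup.
_OTP_STORE = {"admin@example.com": "493027"}


def fetch_otp(email: str) -> str:
    """Return the latest one-time password issued for this email address."""
    try:
        return _OTP_STORE[email]
    except KeyError:
        # Fail loudly so the hunt stops at the CALL PYTHON step, not at a later Fill.
        raise LookupError(f"no OTP issued for {email}") from None


otp = fetch_otp("admin@example.com")
```

Returning a plain string keeps the value safe to drop into a later Fill step through placeholder substitution.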

Tags, scheduler, and execution controls

@tags: smoke, regression
@schedule: every 5 minutes

manul tests/ --tags smoke
manul daemon tests/ --headless

Use tags to target subsets of hunts. Use schedules when the same file should run as a monitor or recurring automation job. Use CLI/config flags for retries, screenshots, HTML reports, and worker count.

Global lifecycle hooks

from manul_engine import before_all, after_all, GlobalContext

@before_all
def setup(ctx: GlobalContext) -> None:
  ctx.variables["BASE_URL"] = "https://staging.example.com"

@after_all
def teardown(ctx: GlobalContext) -> None:
  pass

Use manul_hooks.py when the concern is suite-wide, not file-local.

Command quick reference

Category      Command Syntax
Navigation    NAVIGATE to [URL], OPEN APP
Input         Fill [Field] with [Text], Type [Text] into [Field]
Click         Click [Element], DOUBLE CLICK [Element], RIGHT CLICK [Element]
Selection     Select [Option] from [Dropdown], Check [Checkbox], Uncheck [Checkbox]
Keyboard      PRESS ENTER, PRESS [Key], PRESS [Key] on [Element]
File Upload   UPLOAD 'File' to 'Element'
Extraction    EXTRACT [Target] into {variable}
Verification  VERIFY ..., VERIFY VISUAL 'Element', VERIFY SOFTLY ...
Network       MOCK METHOD "url" with 'file', WAIT FOR RESPONSE "url"
Flow control  WAIT [seconds], SCROLL DOWN, SET {variable} = value
Debug         DEBUG, PAUSE, --explain, VS Code hover explainability
Finish        DONE.

To make a step non-blocking, append either the phrase if exists or the word optional to it, outside the quoted target text.

Scheduler and reporting

@schedule: every 5 minutes

manul daemon tests/ --headless
manul tests/ --retries 2 --screenshot on-fail --html-report

The scheduler is built into the runtime. Reporting, screenshots, retries, and concurrency are execution controls, not DSL syntax.
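
As a rough idea of what schedule parsing involves, the sketch below converts a header into an interval in seconds. Only the form every 5 minutes appears in this README; the other units, the optional count, and the function name are assumptions, not the runtime's actual grammar.

```python
import re

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
SCHEDULE_RE = re.compile(
    r"^@schedule:\s*every\s+(?:(\d+)\s+)?(second|minute|hour|day)s?$",
    re.IGNORECASE,
)


def interval_seconds(header: str) -> int:
    """Convert '@schedule: every N <unit>' into a number of seconds."""
    match = SCHEDULE_RE.match(header.strip())
    if not match:
        raise ValueError(f"unsupported schedule: {header!r}")
    count = int(match.group(1) or 1)  # a missing count means one unit
    return count * UNIT_SECONDS[match.group(2).lower()]


every_five = interval_seconds("@schedule: every 5 minutes")
```

Rejecting unknown expressions with an error, rather than guessing, keeps a typo in a header from silently disabling a monitor.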

Coverage and benchmarks

The runtime is not production-stable, but it is heavily exercised. The repo ships with:

  • python manul.py test for the synthetic DOM and unit suite.
  • benchmarks/run_benchmarks.py for adversarial fixtures such as overlapping elements, dynamic IDs, nested tables, and custom dropdowns.
  • integration hunts under tests/ for real-world flows.

Representative coverage areas include Shadow DOM, iframe routing, DOMScorer weight ordering, lifecycle hooks, scoped variables, recorder output, scheduler parsing, HTML reporting, verify states, uploads, key presses, and desktop app attachment through OPEN APP.

That matters because the point of the README is not just positioning. It is also the shortest complete runtime reference.

What's New in v0.0.9.6

  • The public README was rewritten around the actual current posture of the project: alpha-stage, technically ambitious, but still being battle-tested.
  • The messaging now emphasizes determinism, transparency, and DX instead of broad marketing claims.
  • The documentation now frames DOMScorer explicitly as a normalized 0.0 to 1.0 heuristic system rather than vague AI behavior.
  • The README now highlights the VS Code extension debugging workflow, especially the 🔍 Explain Current Step action and hover-based scoring inspection.
  • Desktop automation via executable_path and OPEN APP is now documented as a first-class workflow.
  • The smart recorder's handling of native <select> change events is documented explicitly.
  • Install examples are now version-pinned to 0.0.9.6.

License

Version: 0.0.9.6

Apache-2.0.
