
AI-driven functional testing agent. Point it at a URL; it explores the app, generates a test plan, runs it, and reports findings. Web (Playwright) + visual regression + accessibility (axe-core) + REST API contract tests, all driven by your choice of LLM.


Sentinel

Point it at a URL. It explores the app, generates a test plan, runs it, and reports findings. Web + visual regression + accessibility in v0.1; API + mobile in later versions.

PyPI · License: MIT · Status · Built by ThinkNext

Status: alpha, live on PyPI as sentinel-agent==0.1.0a1. Web + visual regression + accessibility ship today. API testing is planned for v0.1.0a2 and mobile (React Native) for v0.1.0a3.

Install: pip install sentinel-agent · Repo: GitHub · Issues: file one

What it does

Point Sentinel at a URL:

sentinel run https://your-app.com

In one command, the agent:

  1. Opens the URL in headless Chromium
  2. Reads the rendered HTML + visible text
  3. Asks the LLM to generate a focused test plan (2-5 scenarios, 3-8 steps each)
  4. Runs the plan in fresh browser sessions per scenario
  5. Captures screenshots and compares against baselines (visual regression)
  6. Scans each page state for WCAG 2.1 AA violations (axe-core)
  7. Reports findings: failed scenarios, visual diffs, accessibility issues, with cost
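The plan bounds in step 3 (2-5 scenarios, 3-8 steps each) can be checked mechanically before anything runs. The sketch below is illustrative only: the `Step`, `Scenario`, and `TestPlan` fields and the `validate_plan` helper are assumptions for this example, not Sentinel's documented data model.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str          # hypothetical action name, e.g. "click" or "screenshot"
    target: str = ""     # selector or URL; empty for steps that need none


@dataclass
class Scenario:
    name: str
    steps: list[Step] = field(default_factory=list)


@dataclass
class TestPlan:
    scenarios: list[Scenario] = field(default_factory=list)


def validate_plan(plan: TestPlan) -> list[str]:
    """Return a list of problems; an empty list means the plan is in bounds."""
    problems = []
    if not 2 <= len(plan.scenarios) <= 5:
        problems.append(f"expected 2-5 scenarios, got {len(plan.scenarios)}")
    for scenario in plan.scenarios:
        if not 3 <= len(scenario.steps) <= 8:
            problems.append(
                f"{scenario.name!r}: expected 3-8 steps, got {len(scenario.steps)}"
            )
    return problems
```

A check like this is one way to reject an out-of-bounds LLM response early instead of discovering it halfway through a browser session.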

Why this exists

The same teams that need Cascade (meeting-to-PR) and Relay (issue-to-PR) need a way to verify that the PRs those agents produce actually work. Hand-writing Playwright tests for every feature is the bottleneck. Sentinel removes the bottleneck: generate tests with the same LLM that writes the code.

Sentinel is fully standalone. It carries its own LLM-client layer and config so it does not depend on any other ThinkNext package at runtime.

Install

# Core install + the LLM provider you want:
pip install 'sentinel-agent[anthropic]'        # Anthropic Claude
pip install 'sentinel-agent[openai]'           # OpenAI
pip install 'sentinel-agent[google]'           # Google Gemini
pip install 'sentinel-agent[claude-code]'      # Local Claude Code subscription, no API key
pip install 'sentinel-agent[all]'              # All providers

# One-time: install the Chromium binary Playwright needs
playwright install chromium

Configure

# Set up an LLM provider. Credentials live at ~/.config/sentinel/config.yaml.
sentinel configure llm anthropic --key sk-ant-xxx --set-default

# Or, if you have Claude Code installed locally (no API key needed):
sentinel configure llm claude_code --set-default

If you want a project-local config (highly recommended, since it lets you set the viewport, baseline directory, and accessibility thresholds):

sentinel init

This scaffolds sentinel.yaml with sensible defaults you can edit.

Run

sentinel run https://cascadeagent.dev

# Output (truncated):
#   ✓  3/3 scenarios passed, 0 visual diff(s), 2 a11y violation(s)
#
#   ✓  Homepage loads and primary CTA is visible  (1.42s)
#   ✓  Get-started link navigates to /getting-started/  (1.83s)
#   ✓  Docs sidebar contains all expected sections  (2.10s)
#
#   Accessibility violations:
#     [moderate] color-contrast: Elements must meet minimum color contrast...
#       sample: .text-slate-500
#       (3 node(s) affected)
#     [minor] image-alt: Images must have alt text...
#       sample: img.hero-illustration
#       (1 node(s) affected)
#
#   cost:    $0.04 (5,210 in / 980 out tokens)
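The accessibility lines in the sample output resemble the shape of an axe-core result. As a rough sketch, assuming the standard axe-core `violations` array (each entry carrying `id`, `impact`, `help`, and a `nodes` list whose items have a `target` selector list), lines like those could be produced as follows. `summarize_violations` is a hypothetical helper, not a documented Sentinel function.

```python
def summarize_violations(violations: list[dict]) -> list[str]:
    """Turn axe-core style violation dicts into short report lines."""
    lines = []
    for violation in violations:
        nodes = violation.get("nodes", [])
        lines.append(
            f"[{violation.get('impact')}] {violation['id']}: {violation['help']}"
        )
        if nodes:
            # axe-core stores each node's selector path under "target";
            # the first node serves as the sample.
            lines.append(f"  sample: {nodes[0]['target'][0]}")
        lines.append(f"  ({len(nodes)} node(s) affected)")
    return lines
```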

What ships in v0.1.0

Capability | Module
Web testing via Playwright | sentinel.browser, sentinel.runner
LLM-driven test plan generation | sentinel.planner
Self-healing tests (LLM re-plan on failed step + retry once) | sentinel.planner.regenerate_step
Multi-page exploration (up to 4 same-origin links) | sentinel.agent
Visual regression (PIL pixel diff) | sentinel.visual
Accessibility scan (axe-core 4.10, WCAG 2.1 AA) | sentinel.a11y
REST API contract testing (OpenAPI + URL-probe modes) | sentinel.api_*
Multi-LLM (Anthropic / OpenAI / Google / Claude Code / Ollama) | sentinel.llm
Mobile (React Native via Detox) | planned for a future release
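To make "up to 4 same-origin links" concrete, here is a minimal sketch of how candidate links might be filtered. It assumes "same origin" means matching scheme and host:port, and that fragments are ignored when de-duplicating; `pick_same_origin_links` is a hypothetical name, and Sentinel's actual selection logic may differ.

```python
from urllib.parse import urljoin, urlsplit


def pick_same_origin_links(base_url: str, hrefs: list[str], limit: int = 4) -> list[str]:
    """Resolve hrefs against base_url and keep the first `limit` same-origin pages."""
    base = urlsplit(base_url)
    origin = (base.scheme, base.netloc.lower())
    start_page = base._replace(fragment="").geturl()
    picked: list[str] = []
    for href in hrefs:
        if len(picked) >= limit:
            break
        parts = urlsplit(urljoin(base_url, href))
        if (parts.scheme, parts.netloc.lower()) != origin:
            continue  # external site, mailto:, javascript:, and so on
        # Drop the fragment so "/docs#intro" and "/docs" count as one page.
        page = parts._replace(fragment="").geturl()
        if page == start_page or page in picked:
            continue
        picked.append(page)
    return picked
```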

How it differs from existing tools

Playwright Codegen Pytest + Playwright Percy / Chromatic Sentinel
Generates tests from a URL partial (record/replay)
Self-hosted
Bring your own LLM n/a n/a n/a
Visual regression
Accessibility scan partial (plugin)
Open source

Sentinel is for teams who want test coverage without spending the engineering hours to author it. The trade-off is that AI-generated tests have failure modes that hand-written tests do not (for example, the LLM may pick a fragile selector). Self-healing, where the LLM re-plans a failed step and the runner retries it once, is the answer to that; the roadmap lists it under v0.1.0a2.
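The re-plan-and-retry-once idea can be sketched in a few lines. This is an illustration under assumptions, not Sentinel's implementation: steps are plain dicts, and `execute` and `regenerate` are hypothetical callables standing in for the Playwright runner and the LLM planner.

```python
from typing import Callable


def run_step_with_healing(
    step: dict,
    execute: Callable[[dict], None],
    regenerate: Callable[[dict, Exception], dict],
) -> dict:
    """Run a step; on failure, ask for a replacement step and retry exactly once."""
    try:
        execute(step)
        return {"status": "passed", "healed": False, "step": step}
    except Exception as first_error:
        # Hand the failed step and its error to the planner for a new attempt.
        replacement = regenerate(step, first_error)
    try:
        execute(replacement)
        return {"status": "passed", "healed": True, "step": replacement}
    except Exception as second_error:
        # No further retries: a second failure is reported as a real failure.
        return {
            "status": "failed",
            "healed": False,
            "step": replacement,
            "error": str(second_error),
        }
```

Capping the retry at one keeps the cost of a genuinely broken page bounded: a second failure surfaces as a finding rather than triggering more LLM calls.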

Configuration

sentinel.yaml (after sentinel init):

version: 1

agent:
  provider: anthropic
  model: claude-opus-4-7
  temperature: 0.2

browser:
  headless: true
  viewport_width: 1280
  viewport_height: 720
  timeout_ms: 30000

visual:
  enabled: true
  baseline_dir: sentinel-baselines
  diff_threshold_percent: 0.5

a11y:
  enabled: true
  fail_on:
    - critical
    - serious
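One plausible reading of `diff_threshold_percent` and `fail_on` is sketched below. Two assumptions are made here and are not confirmed by the config itself: the diff percentage is the share of pixels that differ from the baseline, and a run fails when that share strictly exceeds the threshold or when any violation has an impact listed in `fail_on`. Images are plain nested lists of RGB tuples to keep the example dependency-free (Sentinel itself uses PIL), and both function names are hypothetical.

```python
def changed_pixel_percent(baseline: list[list[tuple]], current: list[list[tuple]]) -> float:
    """Percentage of pixels that differ between two equally sized images."""
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("images must have the same dimensions")
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = sum(
        pa != pb
        for row_a, row_b in zip(baseline, current)
        for pa, pb in zip(row_a, row_b)
    )
    return 100.0 * changed / total


def run_fails(config: dict, diff_percent: float, violations: list[dict]) -> bool:
    """Apply the visual and a11y settings from a parsed sentinel.yaml dict."""
    visual = config.get("visual", {})
    a11y = config.get("a11y", {})
    visual_failed = visual.get("enabled", False) and (
        diff_percent > visual.get("diff_threshold_percent", 0.0)
    )
    blocking = set(a11y.get("fail_on", []))
    a11y_failed = a11y.get("enabled", False) and any(
        v.get("impact") in blocking for v in violations
    )
    return bool(visual_failed or a11y_failed)
```

Under these assumptions, the defaults above would let `moderate` and `minor` violations through as warnings while failing on anything `critical` or `serious`.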

Architecture

   sentinel run <url>
          │
          ▼
   ┌──────────────┐
   │ explore page │  Playwright opens URL, grabs HTML + visible text
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │   planner    │  LLM produces TestPlan (2-5 scenarios, 3-8 steps each)
   └──────┬───────┘
          │
          ▼
   ┌──────────────┐
   │    runner    │  Fresh browser session per scenario
   │              │  Each step is one Playwright action
   │              │  screenshot steps → visual regression check
   │              │  a11y_scan steps → axe-core injection
   └──────┬───────┘
          │
          ▼
   ┌────────────────┐
   │ SentinelReport │  Scenarios + visual diffs + a11y violations + cost
   └────────────────┘
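The final stage aggregates scenario results, visual diffs, accessibility violations, and cost. A minimal sketch of such an aggregate follows; the class and field names are hypothetical rather than the real `SentinelReport` definition, and the `✗` mark for a failing run is an assumption. The headline format mirrors the first line of the sample run output.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    name: str
    passed: bool
    seconds: float


@dataclass
class ReportSketch:
    scenarios: list[ScenarioResult] = field(default_factory=list)
    visual_diffs: int = 0
    a11y_violations: int = 0
    cost_usd: float = 0.0

    def headline(self) -> str:
        """One-line summary in the style of the CLI's first output line."""
        passed = sum(s.passed for s in self.scenarios)
        mark = "✓" if passed == len(self.scenarios) else "✗"
        return (
            f"{mark}  {passed}/{len(self.scenarios)} scenarios passed, "
            f"{self.visual_diffs} visual diff(s), "
            f"{self.a11y_violations} a11y violation(s)"
        )
```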

Roadmap

Version | Status | Highlights
v0.1.0a1 | Shipped (2026-05-26) | Web testing, visual regression, accessibility
v0.1.0a2 | Planned | Multi-page exploration, self-healing tests, API contract testing
v0.1.0a3 | Planned | Mobile (React Native via Detox or Maestro)
v0.2 | Q4 2026 | CI integration (GitHub Actions / GitLab CI / Bitbucket / Azure), parallel execution
v1.0 | Mid-2027 | Stable API, full coverage of web + API + mobile + visual + a11y, baselined

License

MIT. See LICENSE.

About

Built and maintained by ThinkNext Software Solutions, alongside our other open-source projects Cascade (meeting-to-PR) and Relay (issue-to-PR).

Follow along: @ThinkNextHQ · LinkedIn · Blog
