Autonomous web testing agent: crawls a live site and generates a runnable Playwright E2E/functional test suite plus a human-readable report.
Project description
Anjalikastra
An open-source CLI that points at a live website URL and produces:
- A human-readable report — pass/fail per page and endpoint, coverage %, and a drafted list of likely bugs, queued for human review.
- A runnable end-to-end / functional test suite (TypeScript + Playwright Test) — organized test files, config, README, and dependency manifest — that installs and runs with a single command on first try.
The tool discovers what to test by crawling from the URL you give it. It has no access to the target's source code, repo, or CI — everything is inferred from the live site, treating every target as black-box.
These are end-to-end / functional / smoke tests, not unit tests — unit tests require source access, and this tool never has that.
Full documentation: https://khushaljethava.github.io/Anjalikastra/ (built
from docs/ — deploys automatically from main).
Install
pip install -e .
playwright install --with-deps chromium # for the tool's own crawler/endpoint capture
Configure an LLM for full-quality classification and generation: set
ANTHROPIC_API_KEY for Claude, or OPENAI_API_KEY/OPENAI_BASE_URL for any
OpenAI-compatible provider — OpenAI, Ollama, OpenRouter, Gemini, and more (see
"Configuring LLM models" below). Without any key, the tool still runs end-to-end
using a heuristic/template-only fallback — smaller and less targeted, but still
a working suite.
Usage
anjalikastra https://example.com
anjalikastra <url> [options]
--output-dir PATH Where run artifacts are written (default: output/)
--max-pages N Cap on pages crawled (default: 40)
--throttle-ms N Minimum delay between requests, in ms (default: 500)
--openapi PATH Optional OpenAPI spec to enrich endpoint discovery
--public-only / --allow-auth v1: only crawl unauthenticated pages (default: on)
--dry-run Print the plan and exit without making network requests
--resume RUN_ID Resume a previous run (output/<run-id>), skipping discovery if it already finished
--cheap-model NAME Model for classification/summaries (default: claude-haiku-4-5-20251001)
--capable-model NAME Model for test generation/triage (default: claude-sonnet-5)
--llm-provider NAME 'anthropic' or 'openai' (any OpenAI-compatible endpoint); auto-detected by default
--verbose, -v Verbose logging
Run anjalikastra <url> --dry-run first on a new target — it prints exactly what
the tool would do (crawl scope, throttle, model routing) without making a single
network request.
Resuming a crashed or interrupted run
Discovery (crawling + endpoint capture) is checkpointed to output/<run-id>/checkpoint.json.
If the process crashes or is interrupted after discovery completes, resume with:
anjalikastra <url> --resume <run-id>
This skips re-crawling the site — avoiding hitting the target's infrastructure a second time — and picks up from classification/generation. Classification results are also cached independently by content-hash (see "How it works" below), so even a fresh run against an unchanged site is cheap.
What you get
output/<run-id>/
├── report.md # human-readable: coverage, failures, drafted bugs, cost
├── report.json # same data, machine-readable
└── suite/ # the deliverable — yours to keep and maintain
├── package.json
├── playwright.config.ts
├── README.md
└── tests/
├── pages/*.spec.ts
└── api/*.spec.ts
Run the generated suite:
cd output/<run-id>/suite
npm install && npx playwright install --with-deps chromium
npm test
Coverage honesty
The report always states "tested N of M known pages" and lists what wasn't reached — auth-gated, blocked by robots.txt, sitemap-only, or crawl-truncated — next to why. There is no bare green checkmark implying full coverage.
Scope (v1)
- Crawls and tests public pages only. Login-gated flows are not tested;
they're reported as "not covered," never silently skipped or falsely passed.
See
anjalikastra/discovery/auth.pyfor the v2 design. - No bot-detection evasion. If a target blocks the crawler, the tool tells you to allowlist its User-Agent on your own site rather than trying to get around it.
- Nothing is auto-filed or auto-fixed. The tool drafts a bug list; a human decides.
How it works
URL -> Discovery (sitemap + crawl + network capture)
-> Analysis (page/endpoint classification)
-> Test generation (assertions -> Playwright files -> review gate)
-> Execution (run the suite, capture baseline)
-> Triage (classify failures, draft bug reports)
-> Reporting (report.md / report.json)
Classification and routine summaries use a cheap/small model; test generation and
failure triage use a more capable model. A content-hash cache under
output/.cache/ means a second run against an unchanged page skips re-classifying
and re-generating it — see the "Cost" section of report.md for the token delta.
Configuring LLM models
The tool splits LLM work across two tiers, each independently configurable:
| Tier | Used for | Default | Override |
|---|---|---|---|
| cheap | page/endpoint classification, routine summaries | claude-haiku-4-5-20251001 |
--cheap-model flag or ANJALIKASTRA_CHEAP_MODEL env var |
| capable | test generation, failure triage | claude-sonnet-5 |
--capable-model flag or ANJALIKASTRA_CAPABLE_MODEL env var |
export ANTHROPIC_API_KEY=sk-ant-...
anjalikastra https://example.com \
--cheap-model claude-haiku-4-5-20251001 \
--capable-model claude-opus-4-8
CLI flags take precedence over env vars; env vars take precedence over the defaults.
--dry-run shows exactly which models a run would use.
Supported providers
Two backends are supported natively, selected with --llm-provider (or
ANJALIKASTRA_LLM_PROVIDER), and auto-detected from your credentials if you
don't specify one:
| Provider | Covers | Credentials |
|---|---|---|
anthropic |
Claude models via the Anthropic API | ANTHROPIC_API_KEY |
openai |
any OpenAI-compatible endpoint: OpenAI, Ollama (local models), OpenRouter, Gemini, vLLM, LM Studio, ... | OPENAI_API_KEY and/or OPENAI_BASE_URL |
Auto-detection: ANTHROPIC_API_KEY set → anthropic; otherwise
OPENAI_API_KEY or OPENAI_BASE_URL set → openai; neither → heuristic mode.
When using a non-Anthropic provider, pass model names your endpoint actually
serves via --cheap-model / --capable-model.
OpenAI:
export OPENAI_API_KEY=sk-...
anjalikastra https://example.com --cheap-model gpt-5-mini --capable-model gpt-5
Ollama (local models, no API key needed):
export OPENAI_BASE_URL=http://localhost:11434/v1
anjalikastra https://example.com --cheap-model llama3.2 --capable-model qwen2.5-coder:32b
OpenRouter (one key, hundreds of models):
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
export OPENAI_API_KEY=sk-or-...
anjalikastra https://example.com \
--cheap-model google/gemini-2.5-flash --capable-model anthropic/claude-sonnet-4.5
Gemini (via Google's OpenAI-compatible endpoint):
export OPENAI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
export OPENAI_API_KEY=your-gemini-api-key
anjalikastra https://example.com --cheap-model gemini-2.5-flash --capable-model gemini-2.5-pro
Any other server that speaks the OpenAI Chat Completions protocol (vLLM,
LM Studio, LiteLLM proxy, Together, Groq, ...) works the same way: set
OPENAI_BASE_URL to its address and pass its model names.
--dry-run shows the resolved provider, base URL, and models before anything runs.
No key at all? The tool still runs end-to-end in heuristic/template-only mode — classification falls back to URL/DOM heuristics and generation uses the deterministic templates. The suite is smaller and less targeted but still valid and runnable.
Developing this tool
pip install -e ".[dev]"
pytest
anjalikastra/ is the Python orchestrator; it emits TypeScript/Playwright as
output artifacts. See anjalikastra/generation/review_gate.py for the
minimal-code discipline applied to every generated test file before it ships.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file anjalikastra-0.1.0.tar.gz.
File metadata
- Download URL: anjalikastra-0.1.0.tar.gz
- Upload date:
- Size: 54.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c024ed34969ab210b012528432337a59f0d38275675e5b4bfb4f2d28c148db26
|
|
| MD5 |
cab7c10db2ef0484ff41e9d7fffed545
|
|
| BLAKE2b-256 |
ad2d98086ee1b4ae5bd92dcd4525589e85972c298edb123761eea4ad4992c1b3
|
File details
Details for the file anjalikastra-0.1.0-py3-none-any.whl.
File metadata
- Download URL: anjalikastra-0.1.0-py3-none-any.whl
- Upload date:
- Size: 47.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2180fa897df830a3f0a5dea4ab510c42c6bdc39d1688963a09a3cf7fbfd68b40
|
|
| MD5 |
e7080e86ea315b88338e9dbd2d5ea583
|
|
| BLAKE2b-256 |
bda8e3213d4a818a2bc1be5aa8e8b9af5c41ec277f0587e0f8508250e1f734fb
|