Accessibility-first browser automation. Zero mouse telemetry. Works with any LLM.

Fantoma

The undetectable browser automation library. Drives browsers via the accessibility API — the same channel used by screen readers. No mouse movements, no screenshots, no pixel coordinates.

Two classes. Use whichever fits:

from fantoma import Fantoma, Agent

# Tool API — drive the browser step by step
browser = Fantoma()
state = browser.start("https://news.ycombinator.com")
# state["aria_tree"] → feed to your LLM, get back an action
result = browser.click(3)
# result["state"]["aria_tree"] → updated page
browser.stop()

# Convenience API — describe a task, the agent does it
agent = Agent(llm_url="http://localhost:8080/v1")
result = agent.run("Go to github.com/trending and tell me the top repo")

# Login — no LLM needed
browser = Fantoma()
browser.start()
result = browser.login("https://github.com/login", email="me@example.com", password="...")
browser.stop()

Getting Started

pip install fantoma
fantoma setup        # Guided wizard: pick your LLM, done
fantoma test         # Verify it works

Need an LLM? Install Ollama, run ollama pull phi3.5, done. Works on CPU or GPU (8GB+ GPU recommended for speed). Or use a cloud API (OpenAI, Anthropic, DeepSeek) — the wizard handles it.

Requirements: Python 3.10+, Linux or macOS (Windows via WSL). No manual dependency setup: everything else installs automatically.

What It Does

Gets through the gate — login, signup, CAPTCHA solving. Code handles the forms, LLM handles the unexpected.
LLM as brain, code as hands — Code matches form fields by label (fast, zero tokens). When it can't match, one LLM call labels all fields at once. Code fills based on the LLM's answer. Results cached in SQLite — LLM never called twice for the same site.
Signup forms — fills first name, last name, email, username, password, confirm password. Clicks terms checkboxes. Tracks what's been filled to avoid double-submission.
25 real sites tested — GitHub, HN, Etsy, eBay, Reddit, Discord, Spotify, and 18 more. Zero bot detections.
Camoufox anti-detection — passes bot.sannysoft.com and nowsecure.nl. 2,241 stress tests, zero fingerprint detections.
ARIA + raw DOM — always reads both. No form is invisible, even old-school HTML without ARIA labels.
Form Memory — SQLite database records every login page. Gets smarter with every visit.
Universal form filling — one approach for React, Vue, Angular, vanilla HTML. No framework detection.
Resilience — 3-level model escalation (local → cloud → back), retry on slow SPAs.
Multi-API compatible — JSON mode (response_format) only sent to local endpoints. Cloud APIs (DeepSeek, OpenAI, Anthropic) work without 400 errors.
Sequential session safety — after each browser session closes, the asyncio "running loop" pointer is cleared so the next session starts clean. Prevents "Event loop is closed" errors when running many tests back-to-back.
Playwright traces — Agent(trace=True) records full debug sessions
Fingerprint self-test — fantoma test fingerprint runs 7 in-browser checks
Chromium fallback — Agent(browser="chromium") via Patchright for sites that block Firefox
Multi-tab sessions, proxy rotation, CAPTCHA solving, verification code extraction
Session persistence — cookies + localStorage saved to encrypted files per domain + account. Login once, skip forms forever. pip install fantoma[sessions] for encryption.
Unified login pipeline — signup → CAPTCHA → email verification → login-back, all in one login() call. Tries saved session first.
Sensitive data — pass credentials as sensitive_data={"email": "...", "password": "..."}. They appear as <secret:email> in LLM prompts and logs. Real values injected only at execution time.
Inline error detection — JS scans for role="alert", aria-invalid, error CSS classes, and common error text patterns. No LLM needed.
Smart element pruning — relevance-based scoring replaces the hard cap. The LLM sees the most relevant elements for the current task, not the first N on the page.
Tree diffing — new elements (from dropdowns, modals, next form steps) marked with * prefix so the LLM sees what just appeared.
Iframe ARIA extraction — payment forms, embedded logins, and consent dialogs inside iframes are visible. Up to 5 iframes scanned per page.
Inline field state — aria-invalid, required, current value, and error text shown directly in the element list. LLM sees [3] textbox "Email" [invalid: "Please enter a valid email"] instead of guessing why a submit failed.
Adaptive DOM modes — three extraction modes (form/content/navigate) inferred per step from task keywords and page state. Form mode boosts inputs to top with tighter caps. Content mode strips UI for scraping.
ARIA landmark grouping — interactive elements grouped under their nearest ARIA landmark ([form: Login], [navigation: Main nav]). LLM sees structural context, not a flat list.
Cookie consent auto-dismiss — detects and closes consent banners without LLM involvement.
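
The tree diffing item above can be illustrated with a small sketch. It assumes an element is identified by its (role, name) pair and that the list shown to the LLM uses the `[index] role "name"` layout quoted in this section. The function name `render_with_diff` and the data shapes are hypothetical and are not taken from Fantoma's code.

```python
from typing import Iterable

# Assumption: (role, accessible name) is enough to identify an element between snapshots.
Element = tuple[str, str]


def render_with_diff(previous: Iterable[Element], current: list[Element]) -> list[str]:
    """Render the current elements, prefixing ones absent from the previous snapshot with '*'."""
    seen = set(previous)
    lines = []
    for index, (role, name) in enumerate(current):
        marker = "*" if (role, name) not in seen else ""
        lines.append(f'{marker}[{index}] {role} "{name}"')
    return lines


# A login form that reveals its password field after the first step.
before = [("textbox", "Email"), ("button", "Continue")]
after = [("textbox", "Email"), ("textbox", "Password"), ("button", "Continue")]
for line in render_with_diff(before, after):
    print(line)
```

Only the newly revealed password field gets the `*` prefix, so the model can focus on what changed since the last step.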

Accessibility-First Stealth

Fantoma interacts via the browser's accessibility API (ARIA tree) — the same channel used by screen readers like JAWS, NVDA, and VoiceOver.

Zero mouse telemetry. No mouse movements, no click coordinates, no scroll velocity. Anti-bot systems that fingerprint pointer behaviour see nothing because there is no pointer.

Zero visual layer interaction. No screenshots, no pixel coordinates. The browser processes accessibility API calls — identical to what it sees from a screen reader user.

Legally protected channel. Accessibility standards and laws (WCAG, the ADA, and the European Accessibility Act) require websites to work with assistive technology. Blocking accessibility API access means blocking disabled users, which sites cannot do without legal exposure.

Competitors produce detectable signals. browser-use takes screenshots. Stagehand uses CDP. Skyvern combines LLM with computer vision. All three produce signals that anti-bot systems can fingerprint. Fantoma produces none.

Login & Signup (No LLM)

login() handles the full flow: saved session check → form fill → CAPTCHA → email verification → login-back. No LLM needed for known forms. Sessions saved to encrypted files — login once, instant access next time. Available on both Fantoma and Agent.

# Tool API — no LLM needed
browser = Fantoma()
browser.start()
result = browser.login("https://github.com/login", email="me@example.com", password="...")
browser.stop()

# Convenience API
agent = Agent(llm_url="http://localhost:8080/v1")
result = agent.login("https://github.com/login", email="me@example.com", password="...")

# Login with username instead of email
result = browser.login("https://news.ycombinator.com/login", username="myuser", password="pass")

# Signup with name fields
result = browser.login(
    "https://demo.nopcommerce.com/register",
    first_name="Fantoma", last_name="Agent",
    email="me@example.com", password="SecurePass123!"
)
# Fills: FirstName, LastName, Email, Password, ConfirmPassword — all by code

# Result
print(result.success)       # True if login detected
print(result.data)          # {"fields_filled": [...], "url": "...", "steps": 1}

Tested on: the-internet.herokuapp.com (logged in), GitHub (React), HN (vanilla HTML), OrangeHRM (logged in), SauceDemo, DemoQA (4-field signup), nopCommerce (5-field signup), Parabank (logged in), Automationexercise (multi-step).
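
The "code first, LLM only when needed, then cache" idea behind the LLM-free login can be sketched with the standard library. This is an illustration under stated assumptions, not Fantoma's implementation: the keyword table, the `form_memory` table name, and the `ask_llm` callback are all hypothetical.

```python
import json
import sqlite3

# Hypothetical keyword table: credential key -> substrings expected in a field label.
LABEL_KEYWORDS = {
    "email": ("email", "e-mail"),
    "username": ("username", "user name", "login"),
    "password": ("password",),
}


def match_by_label(labels):
    """Map each form label to a credential key using keyword matching only (no LLM)."""
    mapping = {}
    for label in labels:
        lowered = label.lower()
        for key, keywords in LABEL_KEYWORDS.items():
            if any(word in lowered for word in keywords):
                mapping[label] = key
                break
    return mapping


def open_cache(path=":memory:"):
    """Open (or create) the SQLite cache that remembers one mapping per URL."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS form_memory (url TEXT PRIMARY KEY, mapping TEXT NOT NULL)"
    )
    return conn


def resolve_fields(conn, url, labels, ask_llm):
    """Return a label -> key mapping: cache first, then code, then one LLM call."""
    row = conn.execute("SELECT mapping FROM form_memory WHERE url = ?", (url,)).fetchone()
    if row:
        return json.loads(row[0])
    mapping = match_by_label(labels)
    if len(mapping) < len(labels):
        # One call that receives every label at once; code-matched entries are kept.
        mapping = {**ask_llm(labels), **mapping}
    conn.execute(
        "INSERT OR REPLACE INTO form_memory (url, mapping) VALUES (?, ?)",
        (url, json.dumps(mapping)),
    )
    conn.commit()
    return mapping


calls = []


def fake_llm(labels):
    """Stand-in for a real model call; records how often it is used."""
    calls.append(labels)
    return {"Member ID": "username"}


conn = open_cache()
first = resolve_fields(conn, "https://example.com/login", ["Member ID", "Password"], fake_llm)
second = resolve_fields(conn, "https://example.com/login", ["Member ID", "Password"], fake_llm)
print(first, "LLM calls:", len(calls))
```

The second lookup is served from SQLite, so the stand-in model is called once for the site.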

Limitations

CAPTCHAs: Proof-of-work types (ALTCHA) are solved automatically for free. reCAPTCHA and hCaptcha need a paid solver like CapSolver. Most sites never show CAPTCHAs because Camoufox prevents detection.
Context window: Local LLMs need at least 8K tokens. Set --ctx-size 8192 in llama.cpp or num_ctx: 8192 in Ollama.
Small models: A 3.8B model handles browsing, extraction, and simple forms. Complex multi-step signups work better with a larger model. The escalation chain handles this — your local model tries first, and if it gets stuck, Fantoma automatically switches to your cloud API.
IP rate limiting: Reddit starts detecting repeated visits from the same IP after 2+ hours of activity. Use proxy rotation for heavy scraping.
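
The escalation idea in the small-models item above can be sketched as a loop over endpoints. This is a simplified illustration: each endpoint is modelled as a plain callable that returns an action string or None when stuck, the names are hypothetical, and the sketch does not model switching back to the local model afterwards.

```python
from typing import Callable, Optional, Sequence

# Assumption: an endpoint takes a prompt and returns an action, or None when it is stuck.
Endpoint = Callable[[str], Optional[str]]


def run_with_escalation(prompt: str, endpoints: Sequence[Endpoint]) -> tuple[int, str]:
    """Try endpoints in order; return (index, answer) from the first that produces one."""
    for index, ask in enumerate(endpoints):
        try:
            answer = ask(prompt)
        except Exception:
            answer = None  # treat transport errors like "stuck" and move to the next endpoint
        if answer:
            return index, answer
    raise RuntimeError("every endpoint in the escalation chain failed")


def local_model(prompt: str) -> Optional[str]:
    """Stand-in for a small local model that gives up on this step."""
    return None


def cloud_model(prompt: str) -> Optional[str]:
    """Stand-in for a larger cloud model that returns an action."""
    return "CLICK [3]"


print(run_with_escalation("Which element submits the form?", [local_model, cloud_model]))
```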

Examples

# Run a task from the command line
fantoma run "Go to amazon.co.uk and tell me the top deal"

# Interactive mode
fantoma
fantoma> /session https://booking.com
session> /act Search for hotels in London
session> /read What is the cheapest hotel?
session> /done

# Extract structured data
fantoma> /extract https://books.toscrape.com First 3 books with title and price

# Python: structured extraction with schema validation
agent = Agent(llm_url="http://localhost:8080/v1")
books = agent.extract(
    "https://books.toscrape.com",
    "First 3 books",
    schema={"title": str, "price": str}
)

# Python: automatic email verification (IMAP polling)
agent = Agent(
    llm_url="http://localhost:8080/v1",
    email_imap={
        "host": "127.0.0.1", "port": 1143,
        "user": "me@example.com", "password": "bridge-pass",
        "security": "starttls",
    },
)
result = agent.login("https://example.com/register",
                     email="me@example.com", password="SecurePass123!")
# If the site sends a verification email, Fantoma polls IMAP,
# extracts the code/link, and completes verification automatically.

# Python: session persistence — login once, saved for next time
browser = Fantoma()
browser.start()
result = browser.login("https://github.com/login", email="me@example.com", password="...")
browser.stop()
# First call: fills form, logs in, saves session to ~/.local/share/fantoma/sessions/
# Next call: loads saved cookies, skips the form entirely

# Python: sensitive data — credentials never in logs or LLM history
agent = Agent(
    llm_url="http://localhost:8080/v1",
    sensitive_data={"email": "me@example.com", "password": "SecurePass123!"},
)
result = agent.run("Sign up at https://example.com/register")
# LLM sees: TYPE [3] "<secret:email>" — real value injected at execution time

# Python: local model with cloud fallback
agent = Agent(
    llm_url="http://localhost:8080/v1",
    escalation=["http://localhost:8080/v1", "https://api.openai.com/v1"],
)

# Python: with proxy
agent = Agent(
    llm_url="http://localhost:8080/v1",
    proxy="socks5://user:pass@proxy:1080",
)

# Python: debug with traces
agent = Agent(llm_url="http://localhost:8080/v1", trace=True)
# Trace saved to ~/.local/share/fantoma/traces/<domain>-<timestamp>.zip
# View: playwright show-trace <file>.zip

# Python: Chromium instead of Firefox
agent = Agent(llm_url="http://localhost:8080/v1", browser="chromium")
# Requires: pip install fantoma[chromium]
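
The sensitive-data example above relies on placeholders such as `<secret:email>`. A minimal sketch of that masking and injection step follows, using only the `re` module. The helper names `mask` and `inject` are hypothetical; Fantoma's own implementation may differ.

```python
import re

# Matches the <secret:key> placeholder format shown in the example above.
PLACEHOLDER = re.compile(r"<secret:([A-Za-z0-9_]+)>")


def mask(text: str, secrets: dict[str, str]) -> str:
    """Replace every real secret value in text with its <secret:key> placeholder."""
    # Longest values first, so a secret that contains another one is masked whole.
    for key, value in sorted(secrets.items(), key=lambda item: len(item[1]), reverse=True):
        if value:
            text = text.replace(value, f"<secret:{key}>")
    return text


def inject(action: str, secrets: dict[str, str]) -> str:
    """Swap placeholders back to real values just before the action runs."""
    return PLACEHOLDER.sub(lambda match: secrets.get(match.group(1), match.group(0)), action)


secrets = {"email": "me@example.com", "password": "SecurePass123!"}
llm_action = 'TYPE [3] "<secret:email>"'
print(inject(llm_action, secrets))
print(mask("Typed me@example.com into field 3", secrets))
```

Unknown placeholder keys are left untouched by `inject`, which makes a typo in the model's output visible instead of silently typing an empty string.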

Troubleshooting

Problem	Fix
LLM connection fails	Check it's running: `curl http://localhost:8080/v1/models`
Browser won't start	Run `fantoma test` again — Camoufox downloads on first run
Task times out	`Agent(timeout=120)` or use a faster model
Empty LLM responses	Context window too small — need at least 8192 tokens
CAPTCHA blocks you	`Agent(captcha_api="capsolver", captcha_key="...")`
Site detects the bot	`Agent(proxy="socks5://user:pass@host:port")`
Small model misses buttons	Add escalation to a cloud API for hard steps
Form not filled	Check `fantoma logs --trace` for debug data
Login fields invisible	Fantoma falls back to raw DOM — check trace for details
LLM says DONE without acting	Fixed in v0.5.0 — prompt fix included
Same action repeating	Agent has built-in loop detection and escalation
"Event loop is closed" on second run	Fixed — `stop()` cleans up the asyncio event loop
Camoufox SIGSEGV / "Page crashed" on Fedora 43	Use Docker (recommended) or LD_PRELOAD shim. See Fedora 43 / glibc 2.42 below.

Fedora 43 / glibc 2.42 — Camoufox Crash

If Camoufox crashes immediately with TargetClosedError: Page crashed or SIGSEGV on Fedora 43 (or any distro with glibc 2.42+), this is a known compatibility issue.

Root cause: glibc 2.42 calls madvise(MADV_GUARD_INSTALL) during pthread_create for thread stack guard pages. Camoufox's seccomp BPF filter was built before this madvise argument existed — child browser processes (content, RDD, utility) receive SIGSYS and die.
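
To check whether a machine falls in the affected range, the C library version can be read from Python. The sketch below uses the standard `platform.libc_ver()` call; the helper names and the decision to compare against 2.42 come from the description above, not from Fantoma's code.

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.42' into (2, 42); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def needs_shim(libc_name: str, libc_version: str) -> bool:
    """True when the C library is glibc 2.42 or newer, the range named above."""
    return libc_name == "glibc" and parse_version(libc_version) >= (2, 42)


# libc_ver() returns empty strings when it cannot determine the library (for example on macOS).
name, version = platform.libc_ver()
print(f"{name or 'unknown libc'} {version}: shim needed = {needs_shim(name, version)}")
```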

Fix — LD_PRELOAD shim:

// madvise_shim.c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>
#include <linux/filter.h>
#include <stdarg.h>
#include <sys/syscall.h>  // SYS_madvise, SYS_prctl
#include <unistd.h>       // declares syscall()

// Intercept madvise — pass through everything except MADV_GUARD_INSTALL (102) and MADV_GUARD_REMOVE (103)
int madvise(void *addr, size_t length, int advice) {
    if (advice == 102 || advice == 103) return 0;
    return (int)syscall(SYS_madvise, addr, length, advice);
}

// Intercept prctl to block seccomp installation
int prctl(int option, ...) {
    va_list args;
    va_start(args, option);
    unsigned long a2 = va_arg(args, unsigned long);
    unsigned long a3 = va_arg(args, unsigned long);
    unsigned long a4 = va_arg(args, unsigned long);
    unsigned long a5 = va_arg(args, unsigned long);
    va_end(args);
    if (option == PR_SET_SECCOMP) return 0;
    return (int)syscall(SYS_prctl, option, a2, a3, a4, a5);
}

// Intended hook for the SYS_seccomp path. The syscall() override itself (inline assembly,
// to avoid va_arg issues) is not included here, so this weak declaration is only a placeholder.
long syscall(long number, ...) __attribute__((weak));

# Build
gcc -shared -fPIC -O2 -o madvise_shim.so madvise_shim.c -ldl

# Test
LD_PRELOAD=/path/to/madvise_shim.so python3 -c "from fantoma import Agent; a = Agent(); print('OK')"

Fantoma sets LD_PRELOAD automatically when it detects the shim at ~/.local/share/fantoma/madvise_shim.so. Copy your compiled shim there and Fantoma will use it without any other config changes.
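
How such auto-detection could work is sketched below: look for the shim file and, if it exists, prepend it to `LD_PRELOAD` in the environment handed to the browser process. The function name `env_with_shim` is hypothetical and this is not Fantoma's actual code; only the shim path comes from the text above.

```python
import os
from pathlib import Path

# Path named in the paragraph above.
SHIM_PATH = Path.home() / ".local/share/fantoma/madvise_shim.so"


def env_with_shim(base_env: dict[str, str], shim: Path = SHIM_PATH) -> dict[str, str]:
    """Return a copy of base_env with the shim prepended to LD_PRELOAD if the file exists."""
    env = dict(base_env)
    if shim.is_file():
        existing = env.get("LD_PRELOAD", "")
        # Keep any libraries that were already preloaded, without listing the shim twice.
        others = [part for part in existing.split(":") if part and part != str(shim)]
        env["LD_PRELOAD"] = ":".join([str(shim)] + others)
    return env


env = env_with_shim(dict(os.environ))
print(env.get("LD_PRELOAD", "(LD_PRELOAD not set)"))
```

The resulting dictionary would be passed as the environment of the launched browser process, so the current Python process itself is left unchanged.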

You also need Xvfb running and glxtest available:

sudo dnf install xorg-x11-server-Xvfb mesa-libGL
Xvfb :99 -screen 0 1920x1080x24 &
# Copy glxtest from your Firefox install
cp /usr/lib64/firefox/glxtest ~/.cache/camoufox/

After a Camoufox upgrade: upgrades wipe ~/.cache/camoufox/, so re-copy glxtest and run one test to confirm the shim still works.

What does NOT work: binary-patching camoufox-bin or libxul.so, or intercepting madvise at the glibc wrapper level on its own (glibc uses inline syscalls internally, so the wrapper is never called).

Docker API

Fantoma also runs in a Docker container (Ubuntu 22.04 + Camoufox + Xvfb), one session at a time. This is the recommended approach on Fedora 43+ because it avoids the glibc/seccomp issue.

docker compose -f docker-compose.fantoma.yml up -d

Endpoint	Method	Purpose
/health	GET	Status check
/start	POST	Start session: `{"url": "..."}`
/stop	POST	End session
/state	GET	Current ARIA tree + page info
/screenshot	GET	PNG screenshot
/click	POST	`{"element_id": 0}`
/type	POST	`{"element_id": 0, "text": "..."}`
/navigate	POST	`{"url": "..."}`
/scroll	POST	`{"direction": "down"}`
/press_key	POST	`{"key": "Enter"}`
/login	POST	LLM-free login (manages own session)
/extract	POST	Structured extraction (requires session)
/run	POST	Full agent task (manages own lifecycle)
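
A client for the endpoints in the table can be written with the standard library alone. The sketch below is hedged: the base URL and port are assumptions (the text above does not state them), and the response is assumed to be JSON. Only the endpoint paths, methods, and request bodies come from the table.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: adjust to the port your compose file exposes


def build_request(method: str, endpoint: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a request for one endpoint; a payload, if given, is sent as a JSON body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(f"{BASE_URL}{endpoint}", data=data, method=method, headers=headers)


def call(method: str, endpoint: str, payload: Optional[dict] = None, timeout: float = 30.0) -> dict:
    """Send the request and decode the response, assuming the API answers with JSON."""
    with urllib.request.urlopen(build_request(method, endpoint, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building a request does not touch the network, so this runs without a container.
request = build_request("POST", "/click", {"element_id": 0})
print(request.get_method(), request.full_url, request.data)
```

With the container running, a session would follow the table's order: `call("POST", "/start", {"url": "..."})`, then `call("GET", "/state")`, then actions such as `/click` or `/type`, and finally `call("POST", "/stop")`.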

Test Results

Tested across 25 real sites with 6 different LLMs. 355 unit tests. Passed fingerprint checks on bot.sannysoft.com and nowsecure.nl. Zero bot detections across 2,241 stress tests. Full results below.

v0.7.0 live test — 25 sites, Hermes 9B local model (2026-03-31):

#	Site	Result	Time
1	The Guardian	PASS	44s
2	Reuters	FAIL	2s (stale context)
3	TechCrunch	PASS	181s
4	PyPI	PASS	44s
5	npm / npmcharts	PASS	119s
6	Regex101	FAIL	457s (custom code editor)
7	Python docs	PASS	249s
8	Wayback Machine	PASS	150s
9	CodePen	PASS	25s
10	Reddit	PASS	63s
11	GitLab	PASS	34s
12	WordPress.com	PASS	75s
13	Twitch	PASS	52s
14	Discord	PASS	55s
15	Spotify	PASS	27s
16	Dev.to	PASS	99s
17	Disqus	PASS	78s
18	Etsy	PASS	151s
19	eBay UK	PASS	16s
20	Argos	PASS	56s
21	Reed.co.uk	PASS	43s
22	Glassdoor UK	PASS	34s
23	Rightmove	PASS	19s
24	Ticketmaster UK	PASS	38s
25	TotalJobs	PASS	144s

23/25 (92%). Zero browser crashes. Both failures are agent logic, not browser stability.

Detailed test breakdown

Login/signup tests (v0.4.0, code path + LLM brain):

Site	Type	Fields Filled	Result
the-internet.herokuapp.com	Login	Username, Password	Logged in
GitHub	Login (React)	Email, Password	Form filled
OrangeHRM	Login (SPA)	Username, Password	Logged in
Parabank	Signup	FirstName, LastName, Username, Password	Account created
MongoDB Atlas	Signup (5 fields)	FirstName, LastName, Email, Password	All filled
Stripe	Signup	Full name, Email, Password	All filled
Twilio	Signup (4 fields)	FirstName, LastName, Email, Password	All filled
Ghost	Signup	Name, Email, Password	All filled
Zapier	Signup (4 fields)	FirstName, LastName, Email, Password	All filled
Postman	Signup (3 fields)	Email, Username, Password	All filled
nopCommerce	Signup (5 fields)	FirstName, LastName, Email, Password, ConfirmPassword	All filled
Supabase	Signup	Email, Password	All filled
PlanetScale	Signup	Email, Password, Confirm	All filled
Clerk	Signup	Email, Password	All filled
Wandb	Signup	Email, Password	All filled

15 login/signup sites tested on v0.4, zero bot detections, zero form failures.

Overnight stress test (7 hours, 3 cloud APIs):

Provider	Tests	Pass Rate
OpenAI GPT-4o-mini	180	100%
Claude Sonnet	1,159	99.9%
Kimi Moonshot	902	96.7%
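
As a quick arithmetic check, the per-provider counts in the table add up to the 2,241 stress tests quoted elsewhere in this document. The snippet below also computes a test-weighted pass rate; because the table's percentages are already rounded, treat that figure as approximate.

```python
# (provider, tests, pass rate in percent), copied from the table above.
runs = [
    ("OpenAI GPT-4o-mini", 180, 100.0),
    ("Claude Sonnet", 1159, 99.9),
    ("Kimi Moonshot", 902, 96.7),
]

total_tests = sum(tests for _, tests, _ in runs)
# Weight each provider's pass rate by the number of tests it ran.
weighted_rate = sum(tests * rate for _, tests, rate in runs) / total_tests

print(total_tests)
print(round(weighted_rate, 1))
```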

Anti-bot systems bypassed: Cloudflare (X.com, Reddit, Indeed), DataDome (Amazon), PerimeterX (Walmart, Zillow), Akamai (Nike), Meta (Instagram, Facebook), custom (LinkedIn, Booking.com, TikTok, Craigslist, GitHub).

Small model (Phi-3.5-mini 3.8B): 15/15 bot-protected sites passed. Logged into ProtonMail. Created Reddit account with email verification.

6 LLMs tested:

Model	Size	Pass Rate
Qwen3.5-122B	122B	100%
Qwen3-Coder	45B	100%
Phi-3.5-mini	3.8B	100%
Claude Sonnet	Cloud	99.9%
Kimi Moonshot	Cloud	96.7%
GPT-4o-mini	Cloud	100%

Configuration

# Tool API — drive the browser step by step
Fantoma(
    llm_url=None,           # Optional — only needed for extract() and field labelling
    headless=True,
    proxy=None,
    browser="camoufox",
    captcha_api=None,
    captcha_key=None,
    email_imap=None,
    verification_callback=None,
    timeout=300,
)

# Convenience API — describe a task, the agent does it
Agent(
    llm_url="http://localhost:8080/v1",  # Required for Agent
    escalation=None,
    escalation_keys=None,
    max_steps=50,
    timeout=300,
    sensitive_data=None,
    **fantoma_kwargs,        # All Fantoma params passed through
)

CLI Commands

fantoma setup              # Guided setup wizard
fantoma test               # Quick check
fantoma test full           # Test against 10 real sites
fantoma test fingerprint    # Validate anti-detection (7 checks)
fantoma run "task"          # Run a task
fantoma logs               # View recent activity and errors
fantoma logs --trace        # List saved Playwright traces
fantoma                    # Interactive mode

Interactive mode: /help, /run, /session, /act, /read, /observe, /tab, /switch, /status, /history, /logs, /quit

All activity is logged to ~/.fantoma/fantoma.log — check it with fantoma logs or /logs in interactive mode.

Architecture

fantoma/
├── browser_tool.py      # Fantoma class — the browser tool (start, stop, click, type, login, extract)
├── agent.py             # Agent class — convenience wrapper with run() for vibe coders
├── session.py           # Encrypted session persistence
├── cli.py               # CLI + interactive mode (uses Agent)
├── config.py            # Settings
├── dom/                 # Page reading (ARIA tree + raw DOM fallback)
├── browser/             # Browser engine, anti-detection, forms, CAPTCHA, consent
├── captcha/             # Detection + solving (PoW, API, human fallback)
├── llm/                 # Thin OpenAI-compatible client (for field labelling + extract)
└── resilience/          # Escalation chain (used by Agent only)

Example Scripts

File	What it does
`examples/simple_search.py`	Search Hacker News
`examples/local_llm.py`	Ollama / llama.cpp / vLLM
`examples/data_extraction.py`	Structured data extraction
`examples/form_filling.py`	Fill and submit forms
`examples/multi_tab.py`	Signup with email verification
`examples/with_proxy.py`	Browse through a proxy
`examples/escalation.py`	Local model + cloud fallback

Contributing

Contributions welcome. Fork, branch, test, PR.

Acknowledgments

Built on top of these projects:

Camoufox — anti-detect browser (hardened Firefox with fingerprint rotation)
Patchright — patched Chromium (optional)
Playwright — browser automation framework
httpx — HTTP client for LLM API calls

Inspired by these projects and research:

browser-use — the leading open-source browser agent. Fantoma's credential placeholder injection pattern was informed by their approach. Reimplemented from scratch to fit Fantoma's code-first architecture.
WebVoyager — web agent benchmark. Tree diffing (marking new elements with * prefix) was inspired by their set-of-marks approach, adapted for DOM-only operation without screenshots.
Playwright — iframe frame traversal and ARIA snapshot APIs used for iframe element extraction.

License

MIT — Steam Vibe Ltd

This version

0.7.0

Apr 1, 2026

Download files

Source Distribution

fantoma-0.7.0.tar.gz (362.6 kB)

Uploaded Apr 1, 2026 Source

Built Distribution

fantoma-0.7.0-py3-none-any.whl (108.9 kB)

Uploaded Apr 1, 2026 Python 3

File details

Details for the file fantoma-0.7.0.tar.gz.

File metadata

Download URL: fantoma-0.7.0.tar.gz
Upload date: Apr 1, 2026
Size: 362.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fantoma-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`68e4b3c319c06c8372562f6a6cd8db9a1d46c6a05bff5b5022b572faf9cb4869`
MD5	`df29800366e05fdbeafa0ac4a5dda20a`
BLAKE2b-256	`8c20082ebb2774ce634d56b322aef2d8d694dbd04cdd8549318af4e64ef19814`

File details

Details for the file fantoma-0.7.0-py3-none-any.whl.

File metadata

Download URL: fantoma-0.7.0-py3-none-any.whl
Upload date: Apr 1, 2026
Size: 108.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fantoma-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aadf8a9a65da0b2d6cc15c71c92b28ef18393397d03f58ae9253c3addc4c4d99`
MD5	`8a1a0f977b77f22a6a1a979158449643`
BLAKE2b-256	`d45fcc81838e7edca5700c2751a80181f1531c7e7f99ea259803ac7c63e2d274`
