crawlix

One API. Any backend. Full browser automation to lightweight scraping.

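crawlix is a Python browser automation and web scraping library with a unified API across multiple backends. Write your code once, then switch between lightweight HTTP scraping and full browser automation by changing only the `backend=` argument. Browser-only operations such as clicking or taking screenshots still need a browser backend (see the Backend Feature Matrix).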

from crawlix import Browser

# Zero-setup scraping
with Browser() as b:
    page = b.open("https://example.com")
    print(page.find("h1").text)

# Full browser automation — same API
with Browser(backend="playwright") as b:
    page = b.open("https://example.com")
    page.type("#email", "user@example.com")
    page.click("[type=submit]")
    page.wait_for(".dashboard")
    page.screenshot("result.png")

Install

pip install crawlix                      # core: requests + BeautifulSoup
pip install "crawlix[playwright]"        # full browser via Playwright
pip install "crawlix[selenium]"          # full browser via Selenium
pip install "crawlix[async]"             # async support via httpx
pip install "crawlix[full]"              # everything above
pip install "crawlix[termux]"            # for Termux/Android (no Playwright)

[!TIP] pip install crawlix with no extras installs only the core. Optional backends are imported on demand, and a missing backend raises an error with a clear install hint.

Why crawlix?

Problem	Solution
Rewriting code when switching from HTTP to browser scraping	Same API: change `backend=`, not your code
Heavy dependencies for small tasks	Minimal core: depends only on requests + bs4
Bot detection blocking your scrapers	Stealth by default: realistic headers, UA rotation
Remembering which backend does what	Auto-detect: picks the best available backend
Confusing error messages	Helpful errors: `BackendError` tells you exactly what to install

# Auto-detect picks the best backend installed on your system
# Priority: playwright > selenium > requests+bs4
with Browser() as b:
    print(b.backend_name)  # "requests" — or "playwright" if installed
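
The priority order in the comment above can be pictured as a first-match search over installed packages. The following is a minimal sketch of that idea, not crawlix's actual implementation: the names `BACKEND_MODULES`, `is_installed` and `pick_backend` are hypothetical, and the backend-to-module mapping is an assumption. Only `importlib.util.find_spec` is a real standard-library call.

```python
from importlib.util import find_spec

# Hypothetical mapping from backend name to the module that must be importable.
# The order mirrors the stated priority: playwright > selenium > requests+bs4.
BACKEND_MODULES = [
    ("playwright", "playwright"),
    ("selenium", "selenium"),
    ("requests", "requests"),
]


def is_installed(module_name):
    """Return True if the top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def pick_backend(available=is_installed):
    """Return the first backend whose module is available, in priority order."""
    for backend, module in BACKEND_MODULES:
        if available(module):
            return backend
    raise RuntimeError("no backend available; install requests and beautifulsoup4")
```

Passing the availability check as a parameter keeps the selection logic testable without installing or uninstalling anything, for example `pick_backend(lambda m: m == "requests")`.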

Quick Start

Scrape a page

from crawlix import Browser

with Browser() as b:
    page = b.open("https://news.ycombinator.com")
    for item in page.find_all(".titleline > a"):
        print(item.text, item.attr("href"))

Extract data from APIs

from crawlix import get, fetch

data = get("https://api.github.com/users/keyreyla").json()
html = fetch("https://example.com")

Automate a login flow

with Browser(backend="playwright") as b:
    page = b.open("https://github.com/login")
    page.type("#login_field", "username")   # placeholder credentials
    page.type("#password", "password")
    page.click("[type=submit]")
    page.wait_for(".dashboard-sidebar")
    print("Logged in:", page.url)

Async usage

import asyncio
from crawlix.async_api import AsyncBrowser, aget

async def main():
    async with AsyncBrowser() as b:
        page = await b.open("https://example.com")
        print(page.title)

    page = await aget("https://api.github.com/users/keyreyla")
    print(page.url)

asyncio.run(main())
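
Async fetching pays off mainly when several requests run concurrently. Below is a minimal, library-independent sketch of bounded concurrency using only `asyncio`. The helper `fetch_all` and the stand-in coroutine `fake_fetch` are hypothetical names introduced for illustration. In real use you would presumably pass a coroutine function such as `aget`, though how crawlix behaves under concurrency is not documented here.

```python
import asyncio


async def fetch_all(urls, fetch, limit=5):
    """Run fetch(url) for every URL, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(bounded(u) for u in urls))


async def fake_fetch(url):
    """Stand-in for a real fetch coroutine; yields control once, then returns."""
    await asyncio.sleep(0)
    return f"fetched {url}"


urls = ["https://example.com/a", "https://example.com/b"]
results = asyncio.run(fetch_all(urls, fake_fetch))
print(results)
```

The semaphore caps how many requests are open at once, which is usually kinder to the target site than launching everything simultaneously.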

API at a Glance

Browser

Browser(
    backend="auto",   # "playwright", "selenium", "requests", "httpx"
    headless=True,
    stealth=True,
    timeout=30,
    proxy=None,       # "http://user:pass@host:port"
    locale="en-US",
    user_agent=None,
)

b.open(url)          # -> Page
b.new_page()         # -> Page
b.close()
b.backend_name       # -> str
b.supports_js        # -> bool

Page

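Interaction methods (click, type, wait_for) return the Page itself, so calls can be chained. Query methods and properties return the values noted below: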

page.find(selector)           # -> Element | None
page.find_all(selector)       # -> list[Element]
page.click(selector)          # -> Page
page.type(selector, text)     # -> Page
page.wait_for(selector)       # -> Page
page.screenshot(path=None)    # -> bytes
page.html                     # -> str
page.text                     # -> str
page.json()                   # -> dict
page.links()                  # -> list[str]
page.tables()                 # -> list[list[list[str]]]
page.evaluate("document.title")  # -> any
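
Chaining works because each interaction method returns the object it was called on. The toy class below illustrates that pattern only. `RecordingPage` is a hypothetical stand-in that records calls instead of driving a browser, and it is not part of crawlix.

```python
class RecordingPage:
    """Toy stand-in for a page object; it only records the calls it receives."""

    def __init__(self):
        self.actions = []

    def type(self, selector, text):
        self.actions.append(("type", selector, text))
        return self  # returning self is what makes chaining possible

    def click(self, selector):
        self.actions.append(("click", selector))
        return self

    def wait_for(self, selector):
        self.actions.append(("wait_for", selector))
        return self


page = RecordingPage()
page.type("#email", "user@example.com").click("[type=submit]").wait_for(".dashboard")
print(page.actions)
```

Assuming the real Page behaves as listed above, the same one-line chain should read identically against a crawlix page.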

Element

el.text               # -> str
el.attr(name)         # -> str
el.attrs              # -> dict
el.find(selector)     # -> Element | None
el.click()            # -> Element
el.is_visible()       # -> bool
el.bounding_box()     # -> dict
if el:                # an Element is always truthy, so this tests for None
    ...

Backend Feature Matrix

Feature	requests	playwright	selenium	httpx
JS execution	-	✅	✅	-
Click/type/hover	-	✅	✅	-
Screenshot/PDF	-	✅	✅	-
Network intercept	-	✅	-	-
Async	-	✅	-	✅
Wait/retry	-	✅	✅	-
File upload	-	✅	✅	-
Proxy	✅	✅	✅	✅
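
When a script needs a particular combination of features, the matrix can be consulted programmatically. The sketch below transcribes the matrix into a plain dictionary. The short feature keys (`"js"`, `"interact"`, and so on) and the helper `backends_supporting` are names invented for this example, not crawlix identifiers.

```python
# Feature support transcribed from the matrix above (keys are illustrative).
FEATURES = {
    "requests": {"proxy"},
    "playwright": {
        "js", "interact", "screenshot", "intercept",
        "async", "wait", "upload", "proxy",
    },
    "selenium": {"js", "interact", "screenshot", "wait", "upload", "proxy"},
    "httpx": {"async", "proxy"},
}


def backends_supporting(*needed):
    """Return backends, in matrix column order, that support every needed feature."""
    wanted = set(needed)
    return [name for name, feats in FEATURES.items() if wanted <= feats]


print(backends_supporting("js", "async"))
print(backends_supporting("upload"))
```

A lookup like this makes it easy to fail early with a readable message instead of discovering an unsupported operation halfway through a crawl.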

Examples

Proxy

with Browser(proxy="http://user:pass@proxy:8080") as b:
    page = b.open("https://ipinfo.io/json")
    print(page.json()["ip"])
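
Proxy credentials containing characters such as `@` or `:` generally have to be percent-encoded before they are embedded in a URL. The helper below is a small sketch using the standard library's `urllib.parse.quote`. The function `build_proxy_url` is hypothetical, and whether every crawlix backend decodes such credentials is an assumption to verify for your setup.

```python
from urllib.parse import quote


def build_proxy_url(host, port, user=None, password=None, scheme="http"):
    """Build scheme://user:pass@host:port, percent-encoding the credentials."""
    auth = ""
    if user is not None:
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    return f"{scheme}://{auth}{host}:{port}"


print(build_proxy_url("proxy", 8080, "user", "p@ss:word"))
# -> http://user:p%40ss%3Aword@proxy:8080
```

The resulting string has the same shape as the `proxy=` value shown above.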

Table extraction

with Browser() as b:
    page = b.open("https://en.wikipedia.org/wiki/Python_(programming_language)")
    for row in page.tables()[0]:
        print(row)
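
Since `page.tables()` is documented as returning nested lists of strings, a common next step is turning one table into records keyed by its header row. The following sketch assumes the first row is the header. `rows_to_records` and the sample data are illustrative and not part of crawlix.

```python
def rows_to_records(table):
    """Convert a table (list of rows of cell strings) into a list of dicts.

    The first row is treated as the header. Rows shorter than the header are
    padded with empty strings; extra cells beyond the header are ignored.
    """
    if not table:
        return []
    header, *body = table
    records = []
    for row in body:
        padded = list(row) + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


sample = [
    ["Name", "First appeared"],
    ["Python", "1991"],
    ["Rust"],  # a ragged row with a missing cell
]
print(rows_to_records(sample))
```

Real-world HTML tables often have merged or missing cells, so padding ragged rows avoids silently dropping columns.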

File upload

with Browser(backend="playwright") as b:
    page = b.open("https://example.com/upload")
    page.upload("#file-input", "/path/to/file.pdf")
    page.click("#submit")
    page.wait_for(".success")

Network intercept

with Browser(backend="playwright") as b:
    page = b.open("https://example.com")
    page.intercept("**/api/**", lambda req: print(req.url))

Exceptions

from crawlix.exceptions import (
    CrawlixError,       # base — catch-all
    BackendError,       # backend unavailable or op not supported
    TimeoutError,       # wait exceeded timeout (importing this name shadows the built-in TimeoutError)
    NavigationError,    # page failed to load
    SelectorError,      # invalid selector or element not found
    NetworkError,       # connection error
    JavaScriptError,    # JS evaluation failed
)

[!NOTE] BackendError always includes an install hint. For example, calling screenshot() on the requests backend raises: BackendError: screenshot() requires a browser backend. Install: pip install crawlix[playwright]
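
The install-hint behaviour described in the note can be sketched as a guard that runs before a browser-only operation. Everything below is a stand-in written for illustration: the local `BackendError` class mimics the one in `crawlix.exceptions`, and `require_browser` is a hypothetical helper, not the library's code.

```python
class BackendError(Exception):
    """Stand-in for illustration only; the real class lives in crawlix.exceptions."""


def require_browser(operation, backend_name, supports_js):
    """Raise a BackendError with an install hint if the backend has no browser."""
    if not supports_js:
        raise BackendError(
            f"{operation}() requires a browser backend "
            f"(current: {backend_name}). Install: pip install crawlix[playwright]"
        )


try:
    require_browser("screenshot", "requests", supports_js=False)
except BackendError as exc:
    message = str(exc)
    print(message)
```

Putting the remedy in the exception message means a caller who only logs `str(exc)` still sees what to install.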

Development

git clone https://github.com/keyreyla/crawlix.git
cd crawlix
python -m venv .venv && source .venv/bin/activate
pip install -e ".[full]"
pip install pytest ruff mypy
pytest

Release history

Version	Released
0.3.0	May 17, 2026
0.2.1	May 17, 2026
0.2.0	May 16, 2026
0.1.2	May 16, 2026
0.1.1 (this version)	May 16, 2026
0.1.0	May 16, 2026

