Playwright-based browser automation service with HTTP API and Docker support

browsy

What is browsy?

browsy is a service that lets you run browser automation tasks without managing browser instances yourself. It provides:

Simple Job Definition: Write Playwright-powered automation tasks in Python
HTTP API: Queue jobs and retrieve results through HTTP endpoints
Docker Ready: Run everything in containers without worrying about browser dependencies
Queue System: Jobs are processed in order, with automatic retries and status tracking
Extensible: Create any browser automation task - from screenshots and PDFs to complex scraping operations

Think of it as a way to turn your Playwright scripts into HTTP services that can be called from anywhere.

Quick Start

Download files

To get started, download the necessary files using the following script:

curl -LsSf https://raw.githubusercontent.com/mbroton/browsy/main/scripts/get.sh | sh

This script will download the docker-compose file and example jobs. Once downloaded, navigate to the browsy/ directory.

Alternatively, you can clone the repository directly and navigate to the quickstart directory.

Start browsy

To start the service, run:

docker compose up --build --scale worker=3

You can adjust the number of workers by modifying the --scale worker parameter.

Visit http://localhost:8000/docs to access the interactive API documentation provided by FastAPI.

Defining custom jobs

A job is any class that inherits from browsy.BaseJob. browsy automatically discovers these classes in the jobs/ directory.

Here's an example implementation:

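from browsy import BaseJob, Page


class ScreenshotJob(BaseJob):
    # Job type name used when submitting jobs through the API
    NAME = "screenshot"

    # Job parameters, validated by Pydantic
    url: str | None = None
    html: str | None = None
    full_page: bool = False

    async def execute(self, page: Page) -> bytes:
        # Load the target page from a URL, or render the supplied HTML
        if self.url:
            await page.goto(self.url)
        elif self.html:
            await page.set_content(self.html)
        # Return the screenshot as raw image bytes
        return await page.screenshot(full_page=self.full_page)

    async def validate_logic(self) -> bool:
        # Valid only when exactly one of url or html is provided
        return bool(self.url) != bool(self.html)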

Class Definition: The ScreenshotJob class inherits from BaseJob, which itself is based on Pydantic's BaseModel. This provides automatic data validation and serialization.
Job Name: The NAME attribute uniquely identifies the job type when making API calls.
Parameters: Defined as class attributes, these are automatically validated by Pydantic during API calls. This ensures that input data meets the expected types and constraints before processing.
Validation Logic: The validate_logic method runs during API calls to verify that the job's input parameters satisfy specific conditions. This validation occurs before the job is submitted for execution, allowing for early detection of configuration errors.
Execution Method: The execute method carries out the browser automation using a Playwright Page object. Workers use this method to execute jobs.
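
The check in validate_logic is an exclusive-or: the job is accepted only when exactly one of url or html is set. The standalone sketch below mirrors that expression outside of browsy so the accepted and rejected combinations are easy to see. The helper name exactly_one_source is made up for illustration and is not part of browsy.

```python
from typing import Optional


def exactly_one_source(url: Optional[str], html: Optional[str]) -> bool:
    """Mirror of `bool(self.url) != bool(self.html)` from ScreenshotJob."""
    # bool() treats None and "" as False, so an empty string counts as "not set"
    return bool(url) != bool(html)


# Print the result for each combination of inputs
combinations = [
    ("https://example.com", None),            # only url
    (None, "<h1>Hello</h1>"),                 # only html
    ("https://example.com", "<h1>Hello</h1>"),  # both
    (None, None),                             # neither
]
for url, html in combinations:
    print(f"url={url!r}, html={html!r} -> valid={exactly_one_source(url, html)}")
```

Only the first two combinations pass. Supplying both values or neither is rejected, which is why execute never has to choose between url and html.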

Refer to Playwright's documentation for more details on what you can do with page.

Client

To interact with the service using Python, you can use the browsy client:

pip install browsy

Here's how you can use it:

from browsy import BrowsyClient

client = BrowsyClient("http://127.0.0.1:8000")  # port 8000, as used in the quick start
job_id = client.submit_job("screenshot", {
    "url": "https://example.com",
    "full_page": True
})
screenshot = client.get_result(job_id=job_id)

with open("screenshot.png", "wb") as f:
    f.write(screenshot)

This example demonstrates how to submit a screenshot job, retrieve the result, and save it locally.

API

You can explore and interact with the API using the Swagger UI documentation provided by FastAPI. Visit http://localhost:8000/docs to access it.

If you prefer to interact with the service directly via HTTP, here are example requests:

Submit a job

POST /api/v1/jobs

Example request (the parameters object holds the job's parameters, as in the job definition shown above):

{
  "name": "screenshot",
  "parameters": {
    "url": "https://example.com",
    "full_page": true
  }
}

Example response:

{
  "id": 1,
  "name": "screenshot",
  "input": {
    "url": "https://example.com",
    "html": null,
    "full_page": true
  },
  "status": "pending",
  "created_at": "2024-12-30T15:02:04.720000",
  "updated_at": null,
  "worker": null
}
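
If you are not using the browsy client, the request above can be sent with Python's standard library alone. This is a minimal sketch, not the official client: the base URL assumes the quick start setup on localhost:8000, and the function names build_submit_request and submit_job are made up for this example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: quick start default


def build_submit_request(name: str, parameters: dict,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /api/v1/jobs request without sending it."""
    payload = json.dumps({"name": name, "parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/jobs",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_job(name: str, parameters: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_submit_request(name, parameters, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Inspect the request that would be sent for the screenshot example
request = build_submit_request(
    "screenshot", {"url": "https://example.com", "full_page": True}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the service running, submit_job("screenshot", {...}) should return a dictionary shaped like the example response, and its "id" field is the value to use in the status and result endpoints.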

Check job status

GET /api/v1/jobs/{job_id}

Example response:

{
  "id": 1,
  "name": "screenshot",
  "input": {
    "url": "https://example.com",
    "html": null,
    "full_page": true
  },
  "status": "done",
  "created_at": "2024-12-30T15:06:39.204000",
  "updated_at": "2024-12-30T15:06:44.743000",
  "worker": "worker_lze4nFMy"
}
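
The timestamps in the status response are ISO 8601 strings, so they can be parsed with the standard library. The sketch below computes the time between created_at and updated_at for the example response. The helper name time_to_last_update is made up for illustration, and treating that interval as "queue wait plus execution" is an assumption about how browsy sets updated_at.

```python
from datetime import datetime, timedelta
from typing import Optional


def time_to_last_update(job: dict) -> Optional[timedelta]:
    """Return updated_at - created_at, or None if the job has not been updated yet."""
    if job.get("updated_at") is None:
        return None
    created = datetime.fromisoformat(job["created_at"])
    updated = datetime.fromisoformat(job["updated_at"])
    return updated - created


# The example status response from above, as a Python dictionary
job = {
    "id": 1,
    "name": "screenshot",
    "status": "done",
    "created_at": "2024-12-30T15:06:39.204000",
    "updated_at": "2024-12-30T15:06:44.743000",
    "worker": "worker_lze4nFMy",
}

print(job["status"], time_to_last_update(job))
```

For a job that is still pending, updated_at is null in the response and the helper returns None.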

Retrieve job result

To retrieve the result of a job, use the following endpoint:

GET /api/v1/jobs/{job_id}/result

Status Codes:

200: The job is complete, and the output is available. The response type is application/octet-stream.
202: The job is pending or currently in progress.
204: The job is complete or has failed, and there is no output available.
404: No job exists with the provided ID.

Response Headers:

X-Job-Status: Indicates the current status of the job.
X-Job-Last-Updated: Shows the last time the job's status was updated.
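
The status codes above map naturally onto a polling loop: keep asking while the answer is 202, stop on 200, 204, or 404. The following is a minimal sketch using only the standard library. The base URL assumes the quick start setup, and the function names, polling interval, and attempt limit are illustrative choices, not part of browsy.

```python
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: quick start default


def fetch_result(job_id, base_url=BASE_URL):
    """GET /api/v1/jobs/{job_id}/result and return (status_code, body, headers)."""
    url = f"{base_url}/api/v1/jobs/{job_id}/result"
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read(), dict(response.headers)
    except urllib.error.HTTPError as error:
        # urllib raises for non-2xx responses such as 404
        return error.code, b"", dict(error.headers)


def wait_for_result(job_id, fetch=fetch_result, interval=1.0, max_attempts=30):
    """Poll until the job finishes; return its output bytes, or None if there is no output."""
    for _ in range(max_attempts):
        status, body, _headers = fetch(job_id)
        if status == 200:
            return body   # job complete, output available
        if status == 204:
            return None   # job complete or failed, no output
        if status == 404:
            raise LookupError(f"no job with id {job_id}")
        if status != 202:
            raise RuntimeError(f"unexpected status code {status}")
        time.sleep(interval)  # 202: still pending or in progress
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

The fetch parameter exists so the loop can be exercised without a running service. In normal use, wait_for_result(job_id) is enough, and the X-Job-Status header returned by fetch_result can be logged to see why a 204 was returned.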

How it works

[Figure: flow diagram]

1. Define jobs using Playwright's API
2. Send job requests through HTTP
3. Workers execute jobs in Docker containers
4. Get results when ready

Documentation

For detailed setup and usage, check out the documentation.

License

MIT License - see LICENSE for details.

Project details

Release history

0.0.8 (Jan 13, 2025)
0.0.7 (Jan 13, 2025)
0.0.6 (Jan 12, 2025)
0.0.5 (Jan 6, 2025)
0.0.4 (Jan 3, 2025, this version)
0.0.3 (Dec 29, 2024)
0.0.2 (Dec 29, 2024)
0.0.1 (Dec 29, 2024)

Download files

Source Distribution

browsy-0.0.4.tar.gz (213.0 kB)

Uploaded Jan 3, 2025 Source

Built Distribution

browsy-0.0.4-py3-none-any.whl (12.8 kB)

Uploaded Jan 3, 2025 Python 3

File details

Details for the file browsy-0.0.4.tar.gz.

File metadata

Download URL: browsy-0.0.4.tar.gz
Upload date: Jan 3, 2025
Size: 213.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for browsy-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`4d504c1a4f39fca09b70066da5ce423f88da96fbe4513484ba0dbfc4c92aac1b`
MD5	`b6a89a8f6479dbca8afc2308eed7ac42`
BLAKE2b-256	`96e740a7aa7d5856dd6410d385664e06a0f4af7a85e5e716956a4e99dbf1a620`
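
To check a downloaded archive against the SHA256 digest listed above, the standard library's hashlib is enough. This is a minimal sketch: the function names are made up for illustration, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib

# SHA256 digest listed above for browsy-0.0.4.tar.gz
EXPECTED_SHA256 = "4d504c1a4f39fca09b70066da5ce423f88da96fbe4513484ba0dbfc4c92aac1b"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, matches_expected("browsy-0.0.4.tar.gz") should return True for an unmodified download of the source distribution.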

Provenance

The following attestation bundles were made for browsy-0.0.4.tar.gz:

Publisher: publish-pypi-docker.yml on mbroton/browsy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: browsy-0.0.4.tar.gz
- Subject digest: 4d504c1a4f39fca09b70066da5ce423f88da96fbe4513484ba0dbfc4c92aac1b
- Sigstore transparency entry: 159266650
- Sigstore integration time: Jan 3, 2025
Source repository:
- Permalink: mbroton/browsy@c727193291fd6e6f039828a2ea43cf90fd1832c9
- Branch / Tag: refs/tags/v0.0.4
- Owner: https://github.com/mbroton
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi-docker.yml@c727193291fd6e6f039828a2ea43cf90fd1832c9
- Trigger Event: release

File details

Details for the file browsy-0.0.4-py3-none-any.whl.

File metadata

Download URL: browsy-0.0.4-py3-none-any.whl
Upload date: Jan 3, 2025
Size: 12.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for browsy-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2352af54d41a9efb60dd4c49acabe06450fbe4e68d0fcb5dc866d722c95cb94b`
MD5	`912b5d7fb2a0b909c6e52ef937d7c953`
BLAKE2b-256	`bca46f6c4c35dcca79ac7b7c4a20b158d0a2595660aac4ca4244ac0cfee4ec07`

Provenance

The following attestation bundles were made for browsy-0.0.4-py3-none-any.whl:

Publisher: publish-pypi-docker.yml on mbroton/browsy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: browsy-0.0.4-py3-none-any.whl
- Subject digest: 2352af54d41a9efb60dd4c49acabe06450fbe4e68d0fcb5dc866d722c95cb94b
- Sigstore transparency entry: 159266651
- Sigstore integration time: Jan 3, 2025
Source repository:
- Permalink: mbroton/browsy@c727193291fd6e6f039828a2ea43cf90fd1832c9
- Branch / Tag: refs/tags/v0.0.4
- Owner: https://github.com/mbroton
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi-docker.yml@c727193291fd6e6f039828a2ea43cf90fd1832c9
- Trigger Event: release
