Quokka is a Python library built on top of Playwright that simplifies browser automation and web scraping. Its API covers navigating web pages, extracting data, and interacting with page elements. Quokka supports asynchronous and parallel execution, which makes it suitable for both I/O-bound and CPU-bound workloads.

Project description

Quokka - Browser Automation Library with Playwright

Quokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and manipulation tasks. It provides a convenient facade for various browser interactions, making it easier to navigate web pages, extract data, and interact with page elements.

Key Features

  • Asynchronous and Parallel Execution: Quokka is fully asynchronous. On top of Playwright, it runs multiple processes, each containing a single coroutine, to execute work in parallel. This architecture handles both I/O-bound and CPU-bound workloads well when enough CPU and memory are available.
  • Multi-threaded Crawling with Ease: Quokka's BaseCrawler class provides a crawler template that lets you convert a single-threaded crawler into a multi-threaded one.
  • Easy Browser Management: Quokka's Agent class provides a streamlined interface for managing browser instances, including starting, stopping, and page navigation.
  • Data Extraction: With the data_extractor module, Quokka allows you to easily extract data from web pages using customizable selectors and extraction patterns.
  • Page Interaction: The page_interactor module lets you interact with web page elements by clicking, typing, and scrolling.
  • Custom Hooks: Quokka supports customizable hooks, allowing you to extend and customize the behavior of the Agent class to fit your specific needs.
  • Extensible: Quokka exposes Playwright's playwright and page instances, enabling users to extend the library's functionality as required.
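
The process-per-coroutine model described in the first bullet can be sketched with the standard library alone. This is only an illustration of the general pattern, not Quokka's actual implementation: fetch_title, worker, and run_parallel are hypothetical names, and asyncio.sleep stands in for real browser I/O.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def fetch_title(url: str) -> str:
    """Hypothetical stand-in for browser work; sleeps instead of loading a page."""
    await asyncio.sleep(0.01)
    return f"title of {url}"


def worker(url: str) -> str:
    """Entry point for one process: runs a single coroutine on its own event loop."""
    return asyncio.run(fetch_title(url))


def run_parallel(urls: list[str], max_workers: int = 2) -> list[str]:
    """Fan the URLs out across processes; results come back in input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, urls))


if __name__ == "__main__":
    print(run_parallel(["https://example.com/a", "https://example.com/b"]))
```

Each process owns its own event loop, so a slow or CPU-heavy task in one worker does not block coroutines running in the others.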

Installation

pip install quokka-web

Getting Started

Quokka's intuitive API makes browser automation a straightforward process. Here's a simple example:

from quokka_web import Agent


async def main():
    agent = await Agent.instantiate(headless=True)
    await agent.start()

    # Your automation code here

    await agent.stop()


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
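
If the automation code raises, the stop() call in the example above is never reached. One way to guard against that is a small async context manager. The sketch below assumes only the start() and stop() coroutines shown in the example; running and FakeAgent are hypothetical names introduced here, and FakeAgent exists solely so the snippet runs without launching a browser.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def running(agent):
    """Start `agent`, yield it, and always stop it, even if the body raises."""
    await agent.start()
    try:
        yield agent
    finally:
        await agent.stop()


class FakeAgent:
    """Hypothetical stand-in with the same start()/stop() shape as the example."""

    def __init__(self):
        self.events = []

    async def start(self):
        self.events.append("start")

    async def stop(self):
        self.events.append("stop")


async def demo():
    agent = FakeAgent()
    try:
        async with running(agent):
            raise RuntimeError("automation step failed")
    except RuntimeError:
        pass  # the failure is expected; stop() has already run
    return agent.events


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['start', 'stop']
```

With a real agent, the body of the async with block would hold the automation code, and the browser would be shut down whether or not that code succeeds.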

Documentation

For detailed usage instructions, examples, and customization options, please refer to the Documentation.

Examples

Base Crawler Example:

from quokka_web import BaseCrawler, Debugger


class MyCrawler(BaseCrawler):
    async def _crawl(self, *args, **kwargs):
        # Core crawling logic using browser_agent
        ...


async def main():
    crawler = await MyCrawler.instantiate(debug_tool=Debugger(verbose=True))
    await crawler.start()
    await crawler.crawl()
    await crawler.stop()


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

Contributing

Contributions to Quokka are welcome! Please read our Contribution Guidelines for more information on how to contribute to the project.

License

This project is licensed under the MIT License.

Download files

Source Distribution

quokka-web-0.0.2.0.tar.gz (17.3 kB)

Uploaded Source

Built Distribution

quokka_web-0.0.2.0-py3-none-any.whl (22.9 kB)

Uploaded Python 3
