Quokka is a Python library built on top of Playwright that simplifies browser automation and web scraping. It provides an API for navigating web pages, extracting data, and interacting with page elements. Quokka supports asynchronous and parallel execution, which makes it suitable for both I/O-bound and CPU-bound workloads.
Quokka - Browser Automation Library with Playwright
Quokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and manipulation tasks. It provides a convenient facade for various browser interactions, making it easier to navigate web pages, extract data, and interact with page elements.
Key Features
- Asynchronous and Parallel Execution: Quokka operates entirely in an asynchronous manner. Leveraging the power of Playwright, it utilizes multiple processes, each containing a single coroutine, for efficient parallel execution. This architecture excels in handling both IO and CPU-bound workloads when ample resources are available.
- Multi-threaded Crawling with Ease: Quokka's `BaseCrawler` class enables users to effortlessly transition from single-threaded to multi-threaded crawling. By taking advantage of the provided crawler template, you can seamlessly convert a single-threaded crawler into a multi-threaded one.
- Easy Browser Management: Quokka's `Agent` class provides a streamlined interface for managing browser instances, including starting, stopping, and page navigation.
- Data Extraction: With the `data_extractor` module, Quokka allows you to easily extract data from web pages using customizable selectors and extraction patterns.
- Page Interaction: The `page_interactor` module enables you to interact with web page elements, such as clicking, typing, and scrolling, making automation tasks a breeze.
- Custom Hooks: Quokka supports customizable hooks, allowing you to extend and customize the behavior of the `Agent` class to fit your specific needs.
- Extensible: Quokka exposes Playwright's `playwright` and `page` instances, enabling users to extend the library's functionality as required.
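The first feature above describes several processes, each driving a single coroutine. A minimal standard-library sketch of that general pattern follows. It does not use Quokka itself: `fetch_title`, `run_worker`, and `run_in_processes` are hypothetical names, and `asyncio.sleep` stands in for real browser work.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def fetch_title(url: str) -> str:
    # Stand-in for real browser work; an actual crawler would await page calls here.
    await asyncio.sleep(0.01)
    return f"title of {url}"


def run_worker(url: str) -> str:
    # Each worker process starts its own event loop and runs exactly one coroutine.
    return asyncio.run(fetch_title(url))


def run_in_processes(urls: list[str], max_workers: int = 2) -> list[str]:
    # Fan the URLs out across processes; results come back in input order.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_worker, urls))
```

On platforms that spawn worker processes, `run_in_processes` should be called from under an `if __name__ == "__main__":` guard.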
Installation
pip install quokka-web
Getting Started
Quokka's intuitive API makes browser automation a straightforward process. Here's a simple example:
    import asyncio

    from quokka_web import Agent


    async def main():
        agent = await Agent.instantiate(headless=True)
        await agent.start()
        # Your automation code here
        await agent.stop()


    if __name__ == "__main__":
        asyncio.run(main())
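If the automation code raises, the `stop()` call in the example above is never reached. One possible way to guard against that is a `try`/`finally` wrapper. The sketch below uses a hypothetical `FakeAgent` stand-in that only mirrors the `instantiate`, `start`, and `stop` calls shown above, so it runs without a browser. `run_with_agent` and `example_task` are likewise illustrative helpers, not part of Quokka.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in mirroring the Agent calls used in this README."""

    def __init__(self, headless: bool) -> None:
        self.headless = headless
        self.running = False

    @classmethod
    async def instantiate(cls, headless: bool = True) -> "FakeAgent":
        return cls(headless)

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False


async def run_with_agent(task, agent_cls=FakeAgent):
    # Start the agent, run the task, and always stop the agent afterwards.
    agent = await agent_cls.instantiate(headless=True)
    await agent.start()
    try:
        return await task(agent)
    finally:
        await agent.stop()


async def example_task(agent) -> str:
    # Placeholder for real automation steps.
    return "running" if agent.running else "stopped"
```

Passing the real `Agent` class as `agent_cls` should work as long as its interface matches the calls shown in the example above; that is an assumption, not something verified here.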
Documentation
For detailed usage instructions, examples, and customization options, please refer to the Documentation.
Examples
Base Crawler Example:
    import asyncio

    from quokka_web import BaseCrawler, Debugger


    class MyCrawler(BaseCrawler):
        async def _crawl(self, *args, **kwargs):
            # Core crawling logic using browser_agent
            ...


    async def main():
        crawler = await MyCrawler.instantiate(debug_tool=Debugger(verbose=True))
        await crawler.start()
        await crawler.crawl()
        await crawler.stop()


    if __name__ == "__main__":
        asyncio.run(main())
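The example overrides `_crawl` but calls `crawl`, which suggests a template-method design in which the base class wraps the subclass hook. The sketch below illustrates that pattern with a hypothetical `SketchCrawler` base class. It is an assumption about the general shape, not Quokka's actual implementation, and `EchoCrawler` and `crawl_all` are illustrative names. It also shows how several crawler instances might be run concurrently with `asyncio.gather`.

```python
import asyncio


class SketchCrawler:
    """Hypothetical base class; not Quokka's BaseCrawler."""

    def __init__(self) -> None:
        self.started = False

    async def start(self) -> None:
        self.started = True

    async def stop(self) -> None:
        self.started = False

    async def crawl(self, *args, **kwargs):
        # Public entry point: check state, then delegate to the subclass hook.
        if not self.started:
            raise RuntimeError("call start() before crawl()")
        return await self._crawl(*args, **kwargs)

    async def _crawl(self, *args, **kwargs):
        raise NotImplementedError


class EchoCrawler(SketchCrawler):
    async def _crawl(self, url: str) -> str:
        # Stand-in for real page work.
        await asyncio.sleep(0)
        return f"crawled {url}"


async def crawl_all(urls):
    # One crawler per URL, run concurrently on a single event loop.
    crawlers = [EchoCrawler() for _ in urls]
    await asyncio.gather(*(c.start() for c in crawlers))
    try:
        return await asyncio.gather(*(c.crawl(u) for c, u in zip(crawlers, urls)))
    finally:
        await asyncio.gather(*(c.stop() for c in crawlers))
```

Keeping the hook private and the entry point public lets the base class add checks or logging around every crawl without each subclass repeating them.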
Contributing
Contributions to Quokka are welcome! Please read our Contribution Guidelines for more information on how to contribute to the project.
License
This project is licensed under the MIT License.