A simple web scraping and automation tool

Project description

RakuScraper

RakuScraper is an easy-to-use Python library for automated web scraping. The name "Raku" comes from the Japanese word for "easy," and the library aims to live up to it by simplifying website automation and scraping. Its clean, intuitive API lets you collect the data you need without the hassle.

Installation

Install RakuScraper with pip:

pip install raku-scraper
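
To confirm that the install succeeded, one option is to ask Python's standard `importlib.metadata` module for the installed version. This is a minimal sketch that relies only on the standard library; the helper name `installed_version` is hypothetical and is not part of RakuScraper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "raku-scraper") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper, not part of RakuScraper.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("raku-scraper is not installed in this environment")
    else:
        print(f"raku-scraper {version} is installed")
```

If the check reports that the package is missing, make sure pip installed into the same interpreter or virtual environment that you are running.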

Features

  • Ease of Use: A user-friendly API that makes web scraping a breeze.
  • Flexibility: Easily customizable for a wide range of web scraping tasks.
  • Robustness: Handles errors gracefully and keeps the scraping process going.

Quick Start

Here’s a quick example of RakuScraper in use. It accepts the cookie banner on a webpage, then scrapes the page title before and after switching the page language.

# Import the necessary classes from RakuScraper
from raku_scraper import ScrapingTask, RakuScraper

# Create a scraping task for a specific URL
task = ScrapingTask(url="https://www.example.com", title="Example Page", description="A test page")

# Add steps to the scraping task
task.add_action(step_id="accept_cookies", selector="#cookie_accept", action="click", description="Accept cookies")
task.add_selector(step_id="page_title", selector="h1.title", attribute="text", description="Get the title of the page")
task.add_action(step_id="change_language", selector="#language_toggle", action="click", description="Change language")
task.add_selector(step_id="page_title_after_language_change", selector="h1.title", attribute="text", description="Get the title of the page after language change")

# Create a RakuScraper instance and execute the scraping task
scraper = RakuScraper(task, headless=False)  # Pass headless=True to run in headless mode
results = scraper.scrape()

# Print the scraped data
print("Results:", results)
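
For longer tasks, it may be convenient to keep the steps as plain data and add them in a loop. The sketch below uses only the `add_action` and `add_selector` keyword arguments shown in the example above. The `apply_steps` helper and the `RecordingTask` stand-in are hypothetical and exist only so that the snippet runs without a browser; with the real library you would pass a `ScrapingTask` instead.

```python
from typing import Any, Dict, Iterable, List, Tuple


def apply_steps(task: Any, steps: Iterable[Dict[str, str]]) -> Any:
    """Add each step to `task` (hypothetical helper, not part of RakuScraper).

    A step that has an "action" key goes to add_action; any other step
    goes to add_selector. Keyword names mirror the Quick Start example.
    """
    for step in steps:
        if "action" in step:
            task.add_action(**step)
        else:
            task.add_selector(**step)
    return task


class RecordingTask:
    """Stand-in for ScrapingTask that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def add_action(self, step_id: str, selector: str, action: str, description: str) -> None:
        self.calls.append(("action", step_id, selector, action, description))

    def add_selector(self, step_id: str, selector: str, attribute: str, description: str) -> None:
        self.calls.append(("selector", step_id, selector, attribute, description))


STEPS = [
    {"step_id": "accept_cookies", "selector": "#cookie_accept",
     "action": "click", "description": "Accept cookies"},
    {"step_id": "page_title", "selector": "h1.title",
     "attribute": "text", "description": "Get the title of the page"},
]

recorded = apply_steps(RecordingTask(), STEPS)
for call in recorded.calls:
    print(call)
```

Keeping steps as data makes it easier to reuse or reorder them across tasks. Whether `ScrapingTask` accepts any arguments beyond those shown in the Quick Start is not stated here, so the sketch sticks to those.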

Contributing

We welcome contributions from the community. To contribute, open an issue or create a pull request. For major changes, please open an issue first to discuss what you would like to change.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

raku-scraper-0.1.1.tar.gz (16.2 kB)

Uploaded Source

Built Distribution

raku_scraper-0.1.1-py3-none-any.whl (16.7 kB)

Uploaded Python 3

File details

Details for the file raku-scraper-0.1.1.tar.gz.

File metadata

  • Download URL: raku-scraper-0.1.1.tar.gz
  • Upload date:
  • Size: 16.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.11

File hashes

Hashes for raku-scraper-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        d9d5ac042558ee3062162035339c138bef9614547e9605ad266a29114bd989dd
MD5           65777db5a620eda0fb0a503804ebe607
BLAKE2b-256   d04efc409e3dbdd9dcd4f88bdbc2f51a93c9db2cad65ab26f4a56349ccf712b5

File details

Details for the file raku_scraper-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: raku_scraper-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 16.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.11

File hashes

Hashes for raku_scraper-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        e1e05125b27eac69f0c761d145835b1577ba3b23ae227cf9dbf44933141f8477
MD5           327c2f30a1b556c60fb547649c1731ee
BLAKE2b-256   417338f83f718b0f3b8579cb5e11f97774387e728243444a86e2bcd23d219baf
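
The SHA256 digests listed for the two files can be used to check a download before installing it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches` are hypothetical, and the script assumes the downloaded files sit in the current working directory.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings for version 0.1.1.
EXPECTED_SHA256 = {
    "raku-scraper-0.1.1.tar.gz":
        "d9d5ac042558ee3062162035339c138bef9614547e9605ad266a29114bd989dd",
    "raku_scraper-0.1.1-py3-none-any.whl":
        "e1e05125b27eac69f0c761d145835b1577ba3b23ae227cf9dbf44933141f8477",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, expected in EXPECTED_SHA256.items():
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found, skipped")
        elif matches(path, expected):
            print(f"{name}: OK")
        else:
            print(f"{name}: MISMATCH")
```

A mismatch usually means the download is incomplete or is not the file that was published, so it is safer to download it again than to install it.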
