A simple web scraping and automation tool

RakuScraper

RakuScraper is a Python library for automated web scraping. The name comes from the Japanese word "raku," meaning "easy," and the library aims to live up to it: you describe a scraping task as a short sequence of steps (clicks and element lookups), and RakuScraper runs them and returns the collected data.

Installation

Install RakuScraper with pip:

pip install raku-scraper
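
To confirm that the install succeeded, you can ask the standard library which version of the distribution is present. This is a minimal sketch that uses only `importlib.metadata`; the helper name `installed_version` is made up for illustration and is not part of RakuScraper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version("raku-scraper"))
```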

Features

  • Ease of Use: A user-friendly API that makes web scraping a breeze.
  • Flexibility: Easily customizable for a wide range of web scraping tasks.
  • Robustness: Built to handle errors gracefully, so a failed step does not stop the rest of the scraping process.
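
To illustrate what "handle errors and continue" can look like, here is a small standalone sketch in plain Python. It is not RakuScraper's implementation; `run_steps` and the stand-in steps are hypothetical names used only to show the pattern of recording a failure and moving on to the next step.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step is a (step_id, zero-argument callable) pair. Hypothetical shape,
# chosen for this illustration only.
Step = Tuple[str, Callable[[], Any]]


def run_steps(steps: List[Step]) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run every step in order; record failures instead of stopping."""
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for step_id, func in steps:
        try:
            results[step_id] = func()
        except Exception as exc:  # keep going after any ordinary failure
            errors[step_id] = f"{type(exc).__name__}: {exc}"
    return results, errors


def _missing_element() -> str:
    # Stand-in for a step whose selector matches nothing on the page.
    raise LookupError("no element matches #cookie_accept")


demo_results, demo_errors = run_steps([
    ("accept_cookies", _missing_element),
    ("page_title", lambda: "Example Domain"),
])
print(demo_results)
print(demo_errors)
```

The second step still runs even though the first one raised, and the caller can inspect both the collected values and the per-step error messages afterwards.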

Quick Start

The example below accepts a page's cookie banner, reads the page title, switches the page language, and reads the title again so the two values can be compared.

# Import the necessary classes from RakuScraper
from raku_scraper import ScrapingTask, RakuScraper

# Create a scraping task for a specific URL
task = ScrapingTask(url="https://www.example.com", title="Example Page", description="A test page")

# Add steps to the scraping task
task.add_action(step_id="accept_cookies", selector="#cookie_accept", action="click", description="Accept cookies")
task.add_selector(step_id="page_title", selector="h1.title", attribute="text", description="Get the title of the page")
task.add_action(step_id="change_language", selector="#language_toggle", action="click", description="Change language")
task.add_selector(step_id="page_title_after_language_change", selector="h1.title", attribute="text", description="Get the title of the page after language change")

# Create a RakuScraper instance and execute the scraping task
scraper = RakuScraper(task, headless=False)  # headless=False shows the browser window; pass True to run without one
results = scraper.scrape()

# Print the scraped data
print("Results:", results)
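
For longer tasks, it can be convenient to keep the steps as data and add them in a loop. The sketch below assumes only the two methods and keyword arguments shown in the example above (`add_action` and `add_selector`); the `apply_steps` helper and the `"kind"` key are hypothetical conveniences, not part of RakuScraper.

```python
from typing import Any, Dict, List


def apply_steps(task: Any, steps: List[Dict[str, str]]) -> Any:
    """Add each step to `task` by calling add_action or add_selector."""
    for step in steps:
        fields = dict(step)        # copy so the caller's dict is left untouched
        kind = fields.pop("kind")  # hypothetical key: "action" or "selector"
        if kind == "action":
            task.add_action(**fields)
        elif kind == "selector":
            task.add_selector(**fields)
        else:
            raise ValueError(f"unknown step kind: {kind!r}")
    return task


# The same four steps as in the example above, expressed as data.
STEPS = [
    {"kind": "action", "step_id": "accept_cookies",
     "selector": "#cookie_accept", "action": "click",
     "description": "Accept cookies"},
    {"kind": "selector", "step_id": "page_title",
     "selector": "h1.title", "attribute": "text",
     "description": "Get the title of the page"},
    {"kind": "action", "step_id": "change_language",
     "selector": "#language_toggle", "action": "click",
     "description": "Change language"},
    {"kind": "selector", "step_id": "page_title_after_language_change",
     "selector": "h1.title", "attribute": "text",
     "description": "Get the title of the page after language change"},
]

# Usage with RakuScraper installed (not executed here):
#   task = apply_steps(ScrapingTask(url="https://www.example.com"), STEPS)
```

Keeping the steps in a list makes it easy to load them from a configuration file or reuse them across several URLs.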

Contributing

We welcome contributions from the community. If you'd like to contribute, feel free to open an issue or create a pull request. For major changes, please open an issue first to discuss what you would like to change.

Download files

Source Distribution

raku-scraper-0.1.1.tar.gz (16.2 kB)

Uploaded Source

Built Distribution

raku_scraper-0.1.1-py3-none-any.whl (16.7 kB)

Uploaded Python 3
