A simple web scraping and automation tool

RakuScraper

RakuScraper is a powerful yet easy-to-use Python library for automated web scraping. The name "Raku" comes from the Japanese word for "easy." True to its name, RakuScraper aims to simplify website automation and scraping: its clean, intuitive API lets you collect the data you need without the hassle.

Installation

Install RakuScraper with pip:

pip install raku-scraper
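
To confirm that the install succeeded, you can ask Python's packaging metadata for the installed version. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this illustration and is not part of RakuScraper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "raku-scraper" is the distribution name used in the pip command above.
version = installed_version("raku-scraper")
if version is None:
    print("raku-scraper is not installed; run: pip install raku-scraper")
else:
    print(f"raku-scraper {version} is installed")
```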

Features

  • Ease of Use: A user-friendly API that makes web scraping a breeze.
  • Flexibility: Easily customizable for a wide range of web scraping tasks.
  • Robustness: Built to handle errors gracefully and continue the scraping process.
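
To illustrate what "handle errors gracefully and continue" can mean in practice, here is a small standalone sketch. It does not use RakuScraper and says nothing about how the library is implemented internally; `run_steps`, the report layout, and the two example steps are all hypothetical. The idea is simply that a failure in one step is recorded and the remaining steps still run.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step is a (step_id, callable) pair; this shape is an assumption for the sketch.
Step = Tuple[str, Callable[[], Any]]


def run_steps(steps: List[Step]) -> Dict[str, Dict[str, Any]]:
    """Run every step in order; a failing step is recorded, not re-raised."""
    report: Dict[str, Dict[str, Any]] = {}
    for step_id, func in steps:
        try:
            report[step_id] = {"ok": True, "value": func()}
        except Exception as exc:  # deliberately broad so later steps still run
            report[step_id] = {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return report


def missing_cookie_banner() -> None:
    # Simulates a page that has no cookie banner to click.
    raise LookupError("no element matches #cookie_accept")


report = run_steps([
    ("accept_cookies", missing_cookie_banner),
    ("page_title", lambda: "Example Domain"),
])
for step_id, outcome in report.items():
    print(step_id, outcome)
```

Recording the failure per step, rather than aborting, keeps the data that was collected successfully and makes it easy to see afterwards which step went wrong.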

Quick Start

Here’s a quick example that shows RakuScraper in action. It accepts the cookie banner on a page, scrapes the page title, switches the page language, and then scrapes the title again.

# Import the necessary classes from RakuScraper
from raku_scraper import ScrapingTask, RakuScraper  

# Create a scraping task for a specific URL
task = ScrapingTask(url="https://www.example.com", title="Example Page", description="A test page")

# Add steps to the scraping task
task.add_action(step_id="accept_cookies", selector="#cookie_accept", action="click", description="Accept cookies")
task.add_selector(step_id="page_title", selector="h1.title", attribute="text", description="Get the title of the page")
task.add_action(step_id="change_language", selector="#language_toggle", action="click", description="Change language")
task.add_selector(step_id="page_title_after_language_change", selector="h1.title", attribute="text", description="Get the title of the page after language change")

# Create a RakuScraper instance and execute the scraping task
scraper = RakuScraper(task, headless=False)  # headless=False shows the browser window; set it to True to run without one
results = scraper.scrape()

# Print the scraped data
print("Results:", results)
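
The example above prints `results` without describing its structure. As a sketch only, assuming `results` behaves like a dictionary keyed by the `step_id` values passed to `add_selector` (this is an assumption, not documented behavior), you could check whether the title changed after the language switch like this. The helper `title_changed` and the sample data are hypothetical.

```python
from typing import Any, Mapping, Optional


def title_changed(
    results: Mapping[str, Any],
    before_id: str = "page_title",
    after_id: str = "page_title_after_language_change",
) -> Optional[bool]:
    """Compare two scraped titles; return None if either one is missing."""
    before = results.get(before_id)
    after = results.get(after_id)
    if before is None or after is None:
        return None
    return str(before).strip() != str(after).strip()


# Stand-in data for illustration only; this is not real RakuScraper output.
sample_results = {
    "page_title": "Welcome",
    "page_title_after_language_change": "Willkommen",
}
print(title_changed(sample_results))
```

If the real return value has a different shape, only the two lookups inside `title_changed` would need to change.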

Contributing

We welcome contributions from the community. If you'd like to contribute, feel free to open an issue or create a pull request. For major changes, please open an issue first to discuss what you would like to change.

Download files

Download the file for your platform.

Source Distribution

raku-scraper-0.1.0.tar.gz (16.2 kB)

Uploaded Source

Built Distribution

raku_scraper-0.1.0-py3-none-any.whl (16.6 kB)

Uploaded Python 3

File details

Details for the file raku-scraper-0.1.0.tar.gz.

File metadata

  • Download URL: raku-scraper-0.1.0.tar.gz
  • Upload date:
  • Size: 16.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.11

File hashes

Hashes for raku-scraper-0.1.0.tar.gz
Algorithm Hash digest
SHA256 d1d97a16870b1d323a31b30f9b993b4eede9609ad1a9cda3fa1052f5fa547311
MD5 1a5fc364f05f469e4b0a23e5dd47f3c1
BLAKE2b-256 4e2ce2eab24d16cc985a3f7bf1b22a758e8d4df626295571831330abbcb52127
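
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names and the local file path are assumptions for illustration, and the file is expected to have been downloaded to the current directory beforehand.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest copied from the table above for the source distribution.
EXPECTED_SDIST_SHA256 = "d1d97a16870b1d323a31b30f9b993b4eede9609ad1a9cda3fa1052f5fa547311"

sdist = Path("raku-scraper-0.1.0.tar.gz")  # assumed local download location
if sdist.exists():
    print("SHA256 OK" if matches_sha256(sdist, EXPECTED_SDIST_SHA256) else "SHA256 MISMATCH")
else:
    print(f"{sdist} not found; download it first")
```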

File details

Details for the file raku_scraper-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: raku_scraper-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 16.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.11

File hashes

Hashes for raku_scraper-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 635a47ad99755d3ee49efcff596d9f80bbb01a17fbeeedde847be4674a639ed2
MD5 abd0504f5dcf0dd98c44877f6e4e130d
BLAKE2b-256 4b7aabd6703da1040767855ab9c8ac8bb6875605c0b7bf0c1f5c38649f7a2174
