A simple web scraping and automation tool
RakuScraper
RakuScraper is a powerful yet easy-to-use Python library for automated web scraping. The name "Raku" is inspired by the Japanese word for "easy." True to its name, RakuScraper aims to simplify website automation and scraping. With a clean and intuitive API, you can collect the data you need without the hassle.
Installation
You can easily install RakuScraper via pip:
pip install raku-scraper
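To confirm that the install succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is made up for illustration and is not part of RakuScraper.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("raku-scraper")
    print(version if version else "raku-scraper is not installed")
```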
Features
- Ease of Use: A user-friendly API that makes web scraping a breeze.
- Flexibility: Easily customizable for a wide range of web scraping tasks.
- Robustness: Built to handle errors gracefully and continue the scraping process.
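The "handle errors and continue" idea from the list above can be sketched in plain Python. This is not RakuScraper's implementation, only a generic illustration of the pattern: each step runs in its own `try` block, failures are recorded, and the remaining steps still execute. The names `run_steps` and `missing_element` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A step is an identifier plus a zero-argument callable that returns scraped text.
Step = Tuple[str, Callable[[], str]]


def run_steps(steps: List[Step]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run each step in order; record failures instead of aborting the run."""
    results: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for step_id, func in steps:
        try:
            results[step_id] = func()
        except Exception as exc:  # broad on purpose: one bad step must not stop the rest
            errors[step_id] = f"{type(exc).__name__}: {exc}"
    return results, errors


def missing_element() -> str:
    """Stand-in for a step whose selector matches nothing on the page."""
    raise LookupError("selector '#cookie_accept' matched nothing")


if __name__ == "__main__":
    results, errors = run_steps([
        ("accept_cookies", missing_element),
        ("page_title", lambda: "Example Domain"),
    ])
    print("Results:", results)
    print("Errors:", errors)
```

Keeping results and errors in separate dictionaries lets the caller decide afterwards whether a failed step (such as a cookie banner that never appeared) actually matters.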
Quick Start
Here’s a quick example that demonstrates the simplicity and power of RakuScraper. In this example, we automate accepting cookies on a webpage, then scrape the page title before and after changing the page's language.
# Import the necessary classes from RakuScraper
from raku_scraper import ScrapingTask, RakuScraper
# Create a scraping task for a specific URL
task = ScrapingTask(url="https://www.example.com", title="Example Page", description="A test page")
# Add steps to the scraping task
task.add_action(step_id="accept_cookies", selector="#cookie_accept", action="click", description="Accept cookies")
task.add_selector(step_id="page_title", selector="h1.title", attribute="text", description="Get the title of the page")
task.add_action(step_id="change_language", selector="#language_toggle", action="click", description="Change language")
task.add_selector(step_id="page_title_after_language_change", selector="h1.title", attribute="text", description="Get the title of the page after language change")
# Create a RakuScraper instance and execute the scraping task
scraper = RakuScraper(task, headless=False) # Set headless to True for headless mode
results = scraper.scrape()
# Print the scraped data
print("Results:", results)
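When a task has many steps, it may be convenient to keep them as plain data and register them in a loop. The sketch below relies only on the `add_action` and `add_selector` keyword arguments shown in the example above; the `kind` key and the `apply_steps` helper are our own conventions, not part of RakuScraper.

```python
from typing import Any, Dict, List

# Step definitions mirroring the Quick Start example, kept as plain data.
# The "kind" key is a convention of this sketch, not a RakuScraper field.
STEPS: List[Dict[str, str]] = [
    {"kind": "action", "step_id": "accept_cookies", "selector": "#cookie_accept",
     "action": "click", "description": "Accept cookies"},
    {"kind": "selector", "step_id": "page_title", "selector": "h1.title",
     "attribute": "text", "description": "Get the title of the page"},
]


def apply_steps(task: Any, steps: List[Dict[str, str]]) -> None:
    """Register each step on `task` via the two methods used in the Quick Start."""
    for step in steps:
        # Everything except our own "kind" marker is passed through as keywords.
        kwargs = {key: value for key, value in step.items() if key != "kind"}
        if step["kind"] == "action":
            task.add_action(**kwargs)
        elif step["kind"] == "selector":
            task.add_selector(**kwargs)
        else:
            raise ValueError(f"unknown step kind: {step['kind']!r}")
```

With RakuScraper installed, calling `apply_steps(task, STEPS)` on the `task` object from the example should register the same first two steps as the explicit calls.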
Contributing
We welcome contributions from the community. If you'd like to contribute, feel free to open an issue or create a pull request. For major changes, please open an issue first to discuss what you would like to change.