Python package, scraping recipes from all over the internet
Project description
A simple web scraping tool for recipe sites.
pip install recipe-scrapers
then:
from recipe_scrapers import scrape_me
# Pass the URL as a string; it can be a URL from any site listed below.
scraper = scrape_me('https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/')
scraper.title()
scraper.total_time()
scraper.yields()
scraper.ingredients()
scraper.instructions()
scraper.image()
scraper.links()
Note: scraper.links() returns a list of dictionaries containing all of the <a> tag attributes. The attribute names are the dictionary keys.
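As a minimal sketch of how these calls might be combined, the helpers below collect the scraped fields into a plain dictionary and pull the `href` values out of the `links()` result. `recipe_to_dict` and `link_hrefs` are hypothetical names used only for illustration and are not part of recipe-scrapers; they rely only on the methods listed above and on the list-of-dictionaries shape described in the note.

```python
def recipe_to_dict(scraper):
    """Collect the fields shown above into a plain dict.

    `scraper` is assumed to be the object returned by scrape_me().
    """
    return {
        "title": scraper.title(),
        "total_time": scraper.total_time(),
        "yields": scraper.yields(),
        "ingredients": scraper.ingredients(),
        "instructions": scraper.instructions(),
        "image": scraper.image(),
    }


def link_hrefs(links):
    """Return the href of every link dict that has a non-empty one.

    `links` is assumed to be a list of dicts keyed by <a> attribute name,
    as returned by scraper.links().
    """
    return [link["href"] for link in links if link.get("href")]
```

With a scraper obtained from `scrape_me`, the calls would be `recipe_to_dict(scraper)` and `link_hrefs(scraper.links())`.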
Scrapers available for:
Contribute
Part of the reason this project is open source is that when a site changes its design, the scraper for it has to be updated.
If you spot a design change (or anything else) that breaks the scraper for a given site, please open an issue as soon as possible.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer.
If you want a scraper for a new site added:
- Open an issue providing the site name, as well as a recipe link from it.
- If you are a developer and want to code the scraper on your own:
  - If Schema is available on the site - you can do this
  - Otherwise, scrape the HTML - like this
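The difference between the two approaches can be illustrated independently of this package. The sketch below is not how recipe-scrapers implements its scrapers; it uses only the standard library, and the names `RecipePageParser`, `find_recipe` and `extract_title` are hypothetical. It assumes the site embeds schema.org Recipe data as JSON-LD in a `<script type="application/ld+json">` tag, and otherwise falls back to the first `<h1>` in the HTML.

```python
import json
from html.parser import HTMLParser


class RecipePageParser(HTMLParser):
    """Collect JSON-LD script bodies and the text of the first <h1>."""

    def __init__(self):
        super().__init__()
        self.json_ld_blocks = []
        self.h1_text = None
        self._in_json_ld = False
        self._in_h1 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "h1" and self.h1_text is None:
            self._in_h1 = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld or self._in_h1:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.json_ld_blocks.append("".join(self._buffer))
            self._in_json_ld = False
        elif tag == "h1" and self._in_h1:
            self.h1_text = "".join(self._buffer).strip()
            self._in_h1 = False


def find_recipe(json_ld_blocks):
    """Return the first object typed as schema.org Recipe, or None."""
    for block in json_ld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        # JSON-LD may be a single object, a list, or wrapped in "@graph".
        if isinstance(data, dict):
            graph = data.get("@graph")
            candidates = graph if isinstance(graph, list) else [data]
        elif isinstance(data, list):
            candidates = data
        else:
            continue
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type")
            if types == "Recipe" or (isinstance(types, list) and "Recipe" in types):
                return item
    return None


def extract_title(html):
    """Prefer the schema.org name; fall back to the first <h1>."""
    parser = RecipePageParser()
    parser.feed(html)
    parser.close()
    recipe = find_recipe(parser.json_ld_blocks)
    if recipe and recipe.get("name"):
        return recipe["name"]
    return parser.h1_text
```

The schema route tends to survive design changes because it does not depend on the page layout, while the HTML fallback is tied to whatever tags the site currently uses.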
For Devs / Contribute
Assuming you have python3 installed, navigate to the directory where you want this project to live and run these commands:
git clone git@github.com:hhursev/recipe-scrapers.git &&
cd recipe-scrapers &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip install -r requirements.txt &&
coverage run -m unittest &&
coverage report
Special thanks to:
All the contributors who helped improve the package. You are awesome!
Hashes for recipe_scrapers-8.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ad9e31fabafc38d823e7ff4a28c30611761354f691b9812bdf6d315e9e33c252
MD5 | a8ff487b93c78ae2ae38e0b7e8942424
BLAKE2b-256 | 9642ef130c866da4866835f55a25c4ae4bb58bb16e6b0aeb69320452a1e436e0