Python package, scraping recipes from all over the internet
Project description
A simple web scraping tool for recipe sites.
pip install recipe-scrapers
then:
from recipe_scrapers import scrape_me
# pass the URL as a string; it can be a URL from any site listed below
scraper = scrape_me('http://allrecipes.com/Recipe/Apple-Cake-Iv/Detail.aspx')
scraper.title()
scraper.total_time()
scraper.yields()
scraper.ingredients()
scraper.instructions()
scraper.image()
scraper.links()
Note: scraper.links() returns a dictionary object containing all of the <a> tag attributes. The attribute names are the dictionary keys.
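To make the note above concrete, here is a minimal sketch of how <a> tag attributes map onto dictionary keys. It uses only the standard library and is not how recipe-scrapers implements links(); the AnchorCollector class and the sample HTML are hypothetical. The sketch keeps one dictionary per tag, so check the exact return type of links() against the version you have installed.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Hypothetical helper: collect the attributes of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.anchors = []  # one dict of attributes per <a> tag

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the names become dict keys.
        if tag == "a":
            self.anchors.append(dict(attrs))


# Made-up markup, for illustration only.
html = '<p><a href="/recipes/apple-cake" class="card">Apple cake</a></p>'

collector = AnchorCollector()
collector.feed(html)
print(collector.anchors)
```

Each attribute of the tag (here href and class) ends up as a key, which is the property the note describes.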
Scrapers available for:
Contribute
Part of the reason I want this open sourced is that whenever a site changes its design, the scraper for it has to be updated.
If you spot a design change (or anything else) that breaks the scraper for a given site, please open an issue as soon as possible.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer.
If you want a scraper for a new site added
Open an issue with the site name and a link to a recipe from it.
If you are a developer and want to code the scraper on your own, this is a wonderful example of how to do it.
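As a rough illustration of why a per-site scraper breaks when a site is redesigned, here is a self-contained sketch. It does not use the recipe-scrapers base classes and is not the project's actual scraper layout; ExampleSiteScraper, the CSS class names, and the sample page are all hypothetical. Only the method names title() and ingredients() mirror the usage shown earlier.

```python
from html.parser import HTMLParser


class ExampleSiteScraper(HTMLParser):
    """Hypothetical scraper for an imaginary site; not part of recipe-scrapers."""

    # Site-specific markup (assumed). These names are what stops matching
    # when the site changes its design.
    TITLE_CLASS = "recipe-title"
    INGREDIENT_CLASS = "ingredient"

    def __init__(self, html):
        super().__init__()
        self._field = None  # field currently being captured, if any
        self._title = ""
        self._ingredients = []
        self.feed(html)

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h1" and self.TITLE_CLASS in classes:
            self._field = "title"
        elif tag == "li" and self.INGREDIENT_CLASS in classes:
            self._field = "ingredient"
            self._ingredients.append("")

    def handle_data(self, data):
        # Text is only kept while inside a tag we decided to capture.
        if self._field == "title":
            self._title += data
        elif self._field == "ingredient":
            self._ingredients[-1] += data

    def handle_endtag(self, tag):
        if tag in ("h1", "li"):
            self._field = None

    def title(self):
        return self._title.strip()

    def ingredients(self):
        return [item.strip() for item in self._ingredients]


# Made-up page, for illustration only.
page = """
<h1 class="recipe-title">Apple Cake</h1>
<ul>
  <li class="ingredient">2 apples</li>
  <li class="ingredient">1 cup flour</li>
</ul>
"""

scraper = ExampleSiteScraper(page)
print(scraper.title())
print(scraper.ingredients())

# After a (simulated) redesign the selectors no longer match,
# so the scraper silently returns an empty title.
redesigned = ExampleSiteScraper('<h2 class="headline">Apple Cake</h2>')
print(repr(redesigned.title()))
```

The site-specific knowledge lives in two class names, so a fix for a redesigned site is usually a small, local change; that is the kind of PR the sections above ask for.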
For Devs / Contribute
Assuming you have python3 installed, navigate to the directory where you want this project to live and run these commands:
git clone git@github.com:hhursev/recipe-scrapers.git &&
cd recipe-scrapers &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip install -r requirements.txt &&
coverage run tests.py &&
coverage report
Special thanks to:
All the contributors who helped improve the package. You are awesome!
Hashes for recipe_scrapers-5.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 77c11691fc08bfeb66eb35f983435e9af8959d8bf4731eefcb298c54a4c265e7
MD5 | c783e5c9f0968103b69509f92770f2a2
BLAKE2b-256 | ba24af96dc3a72c8fab1d2beec046b0ee3440c5c2935ca6c2ed8ad7981feeef5