Python package, scraping recipes from all over the internet
A simple web scraping tool for recipe sites.
pip install recipe-scrapers
from recipe_scrapers import scrape_me

# give the url as a string, it can be url from any site listed below
scraper = scrape_me('https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/')

# Q: What if the recipe site I want to extract information from is not listed below?
# A: You can give it a try with the wild_mode option! If there is Schema/Recipe available it will work just fine.
scraper = scrape_me('https://www.feastingathome.com/tomato-risotto/', wild_mode=True)

scraper.title()
scraper.total_time()
scraper.yields()
scraper.ingredients()
scraper.instructions()
scraper.image()
scraper.host()
scraper.links()
scraper.nutrients()  # if available
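Since some fields (nutrients, for example) are only present on some sites, it can be handy to gather everything into a plain dictionary and tolerate the gaps. The sketch below is not part of recipe-scrapers: `recipe_to_dict` and `FIELDS` are hypothetical names, and catching a broad `Exception` is an assumption, because the exceptions the package raises for missing fields are not described here.

```python
# Hypothetical helper, not part of recipe-scrapers.
FIELDS = ("title", "total_time", "yields", "ingredients",
          "instructions", "image", "host", "nutrients")


def recipe_to_dict(scraper, fields=FIELDS):
    """Call each accessor on a scraper-like object and collect the results.

    Fields whose accessor is missing or raises are stored as None.
    """
    result = {}
    for name in fields:
        accessor = getattr(scraper, name, None)
        if accessor is None:
            result[name] = None
            continue
        try:
            result[name] = accessor()
        except Exception:  # assumption: any failure means "not available"
            result[name] = None
    return result
```

Passing the object returned by `scrape_me(...)` should give a dictionary keyed by field name, with `None` wherever a field could not be read.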
Note: scraper.links() returns a list of dictionaries containing all of the <a> tag attributes. The attribute names are the dictionary keys.
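As an illustration of working with that shape, here is a small sketch that picks out the `href` values pointing at a given host. The function name `hrefs_for_host` is hypothetical, and the sketch assumes only that each entry is a dictionary of `<a>` attributes, some of which may have no `href` at all.

```python
from urllib.parse import urlparse


def hrefs_for_host(links, host):
    """Return the href values whose URL host is `host` or a subdomain of it.

    `links` is a list of dicts of <a> tag attributes. Entries without an
    href, and relative hrefs (which carry no host), are skipped.
    """
    matches = []
    for attrs in links:
        href = attrs.get("href")
        if not href:
            continue
        netloc = urlparse(href).netloc
        if netloc == host or netloc.endswith("." + host):
            matches.append(href)
    return matches
```

For example, `hrefs_for_host(scraper.links(), scraper.host())` would be one way to keep only the links that stay on the recipe's own site, assuming `host()` returns a bare domain name.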
Scrapers available for:
Part of the reason I want this open sourced is that when a site changes its design, the scraper for it has to be updated.
If you spot a design change (or anything else) that stops the scraper from working for a given site, please open an issue as soon as possible.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer.
If you want a scraper for a new site added:
Open an issue that gives us the site name and a link to a recipe on it.
You are a developer and want to code the scraper on your own:
If Schema is available on the site - you can do this
Otherwise, scrape the HTML - like this
Generating a new scraper class:
python generate.py <ClassName> <URL>
- ClassName: The name of the new scraper class.
- URL: The URL of an example recipe from the target site. The content will be stored in test_data to be used with the test class.
For Devs / Contribute
Assuming you have python3 installed, navigate to the directory where you want this project to live and run these commands:
git clone git@github.com:hhursev/recipe-scrapers.git &&
cd recipe-scrapers &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip install -r requirements-dev.txt &&
pre-commit install &&
python -m coverage run -m unittest &&
python -m coverage report
In case you want to run a single unittest for a newly developed scraper
python -m coverage run -m unittest tests.test_myscraper
- How do I know if a website has a Recipe Schema? Run in python shell:
from recipe_scrapers import scrape_me

scraper = scrape_me('<url of a recipe from the site>', wild_mode=True)

# if no error is raised - there's schema available:
scraper.title()
scraper.instructions()  # etc.
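If you already have a page's HTML and want a rough check without the package, a sketch like the following looks for a schema.org `Recipe` object in the page's JSON-LD blocks. This is not how recipe-scrapers detects a schema; the names `has_recipe_jsonld`, `_JsonLdCollector` and `_has_recipe` are hypothetical, and the sketch only covers JSON-LD, so a site that publishes its recipe as microdata or RDFa would be reported as having none.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def _has_recipe(node):
    """Recursively look for an object whose @type is or includes "Recipe"."""
    if isinstance(node, dict):
        node_type = node.get("@type")
        types = node_type if isinstance(node_type, list) else [node_type]
        if "Recipe" in types:
            return True
        return any(_has_recipe(value) for value in node.values())
    if isinstance(node, list):
        return any(_has_recipe(item) for item in node)
    return False


def has_recipe_jsonld(html):
    """Return True if any JSON-LD block in `html` contains a Recipe object."""
    parser = _JsonLdCollector()
    parser.feed(html)
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than failing
        if _has_recipe(data):
            return True
    return False
```

A `True` result suggests `wild_mode` is worth trying; a `False` result is inconclusive, for the reasons given above.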
Special thanks to:
All the contributors who helped improve the package. You are awesome!