Scrapes recipes and converts them to txt or markdown

Project description

Introduction

recipe2txt is a CLI program: you pass it URLs of recipes and it produces a formatted cookbook containing those recipes. Highlights include:

  • asynchronous fetching of recipes
  • formatted output as either a txt or a markdown file
  • local caching of recipes
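
As an illustration of what asynchronous fetching with a bounded number of simultaneous connections can look like, here is a minimal sketch using only the standard library. It is not the program's actual implementation: fake_fetch, fetch_all and run_demo are hypothetical names, and the network request is replaced by a short sleep so the example runs offline.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request (hypothetical; no network access)."""
    await asyncio.sleep(0.01)
    return f"<html>recipe from {url}</html>"


async def fetch_all(urls: list[str], connections: int = 4) -> dict[str, str]:
    """Fetch all URLs concurrently, with at most `connections` in flight."""
    limit = asyncio.Semaphore(connections)

    async def fetch_one(url: str) -> tuple[str, str]:
        # The semaphore blocks here once `connections` fetches are running.
        async with limit:
            return url, await fake_fetch(url)

    pairs = await asyncio.gather(*(fetch_one(u) for u in urls))
    return dict(pairs)


def run_demo() -> dict[str, str]:
    """Fetch ten placeholder URLs, four at a time."""
    urls = [f"https://example.com/recipe/{i}" for i in range(10)]
    return asyncio.run(fetch_all(urls, connections=4))
```

All ten coroutines are started at once, but the semaphore lets only four of them past the fetch step at any moment, which keeps the load on the remote sites bounded.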

The program is a wrapper for the recipe-scrapers-library. Please visit their README.md if you would like to know which websites are supported.

WARNING

THIS SOFTWARE IS AT AN EARLY DEVELOPMENT STAGE.

BE CAREFUL WHEN SETTING THE --output FLAG: ANY EXISTING FILE WITH THE SAME NAME WILL BE OVERWRITTEN.

TESTED ONLY ON KUBUNTU 23.04.

Usage

Install with pip install recipe2txt. You can run the program as either recipe2txt or re2txt.

usage: recipes2txt [-h] [--file [FILE ...]] [--output OUTPUT] [--verbosity {debug,info,warning,error,critical}]
                   [--connections CONNECTIONS] [--cache {only,new,default}] [--debug] [--timeout TIMEOUT]
                   [--markdown] [--user-agent USER_AGENT] [--erase-appdata ERASE_APPDATA]
                   [url ...]

Scrapes URLs of recipes into text files

positional arguments:
  url                   URLs whose recipes should be added to the recipe-file (default: '[]')

options:
  -h, --help            show this help message and exit
  --file [FILE ...], -f [FILE ...]
                        Text-files containing URLs whose recipes should be added to the recipe-file (default:
                        '[]')
  --output OUTPUT, -o OUTPUT
                        Specifies an output file. THIS WILL OVERWRITE ANY EXISTING FILE WITH THE SAME NAME.
                        (default: '/home/pc/sciebo/Dokumente/Programming/recipe2txt/recipes')
  --verbosity {debug,info,warning,error,critical}, -v {debug,info,warning,error,critical}
                        Sets the 'chattiness' of the program (default: 'critical')
  --connections CONNECTIONS, -con CONNECTIONS
                        Sets the number of simultaneous connections (default: '4')
  --cache {only,new,default}, -c {only,new,default}
                        Controls how the program should handle its cache: With 'only' no new data will be
                        downloaded, the recipes will be generated from data that has been downloaded previously.
                        If a recipe is not in the cache, it will not be written into the final output. 'new' will
                        make the program ignore any saved data and download the requested recipes even if they
                        have already been downloaded. Old data will be replaced by the new version, if it is
                        available. The 'default' will fetch and merge missing data with the data already saved,
                        only inserting new data into the cache where there was none previously. (default:
                        'default')
  --debug, -d           Activates debug-mode: Changes the directory for application data (default: 'False')
  --timeout TIMEOUT, -t TIMEOUT
                        Sets the number of seconds the program waits for an individual website to respond, eg.
                        sets the connect-value of aiohttp.ClientTimeout. (default: '10.0')
  --markdown, -m        Generates markdown-output instead of '.txt' (default: 'False')
  --user-agent USER_AGENT, -ua USER_AGENT
                        Sets the user-agent to be used for the requests. (default: 'Mozilla/5.0 (Windows NT 10.0;
                        Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0')
  --erase-appdata ERASE_APPDATA
                        Erases all data- and cache-files
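
The three --cache modes described in the help text above can be sketched as a small decision function. This is only one reading of that description, not the program's code: resolve, Recipe and FIELDS are hypothetical names, the notion of a "complete" recipe (a non-empty title and ingredients) is an assumption, and so is the fallback to the old entry when a 'new' download fails.

```python
from typing import Callable, Optional

Recipe = dict[str, str]
# Hypothetical set of fields that make a cached recipe "complete".
FIELDS = ("title", "ingredients")


def resolve(url: str, mode: str, cache: dict[str, Recipe],
            fetch: Callable[[str], Optional[Recipe]]) -> Optional[Recipe]:
    """Return the recipe for `url` according to the cache mode, updating `cache`."""
    cached = cache.get(url)

    if mode == "only":
        # Never download; a recipe missing from the cache is simply skipped.
        return cached

    if mode == "new":
        # Ignore saved data and download again; replace the old entry
        # only if the download produced something.
        fresh = fetch(url)
        if fresh is not None:
            cache[url] = fresh
            return fresh
        return cached  # assumption: fall back to the old entry

    if mode == "default":
        if cached is not None and all(cached.get(f) for f in FIELDS):
            return cached  # nothing missing, no download needed
        fresh = fetch(url) or {}
        # Existing non-empty values win; downloaded values only fill gaps.
        merged = {**fresh, **{k: v for k, v in (cached or {}).items() if v}}
        if not merged:
            return None
        cache[url] = merged
        return merged

    raise ValueError(f"unknown cache mode: {mode!r}")
```

In this sketch 'only' never calls fetch, 'new' always does, and 'default' calls it only when the cached entry is missing or incomplete.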

When first run, the program generates the config file recipe2txt.toml (use recipe2txt --help to locate it). Every option listed above has a counterpart in that file. Uncomment [^1] the corresponding line and change the value after the = sign to set the value the program uses when the option is not specified on the command line.

[^1]: Remove the leading #

Examples

recipe2txt www.example-url.com/tastyrecipe www.other-example-url.org/deliciousmeal -o ~/Documents/great-recipes.txt

Development

Tools

nox

This project (ab)uses nox as its test runner and task runner. Install nox from PyPI (e.g. pipx install nox). Use nox --list to get an overview of the routines the noxfile provides. For example, to create the development environment, use nox -s dev.

mypy

This project uses mypy for type checking. The configuration file contains all relevant settings, so a simple call to mypy from the current directory should be sufficient to typecheck the project.
