Scrape glosbe dicts given a head words file
scrape-glosbe-dict
Scrape a glosbe dict
Install it
pip install scrape-glosbe-dict
# pip install git+https://github.com/ffreemt/scrape-glosbe-dict
# poetry add git+https://github.com/ffreemt/scrape-glosbe-dict
# git clone https://github.com/ffreemt/scrape-glosbe-dict && cd scrape-glosbe-dict
Use it
scrape-glosbe-dict head-word-file # default english-chinese
# or python -m scrape_glosbe_dict head-word-file
# scrape-glosbe-dict head-word-file -f de # german-chinese
Head word file format: one word or phrase per line; empty lines are ignored.
Output is saved to a TSV file.
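The input and output formats described above can be sketched in a few lines of Python. This is an illustration only, not the package's own code: the function names `read_head_words` and `write_tsv` are hypothetical, and the two-column layout (head word, definition) is an assumption, since the exact TSV columns are not documented here.

```python
import csv
from pathlib import Path


def read_head_words(path):
    """Return the non-empty lines of a head word file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def write_tsv(rows, path):
    """Write rows as tab-separated values (assumed layout: head word, definition)."""
    with open(path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        writer.writerows(rows)
```

A head word file containing `apple`, an empty line, and `take off` would yield the list `["apple", "take off"]`.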
Docs
python -m scrape_glosbe_dict --help
Usage: python -m scrape_glosbe_dict [OPTIONS] head-word-file

Arguments:
  head-word-file  Head word file, one word/phrase per line, each will be used
                  to fetch corresponding definitions from https://glosbe.com/.
                  [required]

Options:
  -f, --from-lang TEXT  Source language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        from_lang='en'.  [default: en]
  -t, --to-lang TEXT    Target language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        to_lang='zh'.  [default: zh]
  -v, --verbose         Show output in the process.
  -V, --version         Show version info and exit.
  --help                Show this message and exit.
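To make the relation between the language options and glosbe URLs concrete, here is a minimal sketch. The help text only states that https://glosbe.com/en/zh corresponds to from_lang='en' and to_lang='zh'; appending the head word as a further path segment is an assumption about the site's layout, and `glosbe_url` is a hypothetical helper, not part of this package.

```python
from urllib.parse import quote

BASE_URL = "https://glosbe.com"


def glosbe_url(word, from_lang="en", to_lang="zh"):
    """Build a lookup URL; the trailing word segment is an assumed layout."""
    # safe="" percent-encodes every reserved character, including "/",
    # so a phrase cannot accidentally add extra path segments.
    parts = (quote(part, safe="") for part in (from_lang, to_lang, word))
    return f"{BASE_URL}/" + "/".join(parts)
```

With the defaults, `glosbe_url("hello")` gives `https://glosbe.com/en/zh/hello`; passing `from_lang="de"` switches the source language in the same way the `-f de` option does.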
Miscellany
- A retry mechanism (via pypi tenacity) is built-in to fetch info from glosbe. Refer to the source file for details.
- Local cache (via pypi joblib) is used so that you can interrupt anytime and continue later.
- Scraping is often frowned upon and sometimes can result in your IP being banned from the website. Use this package at your own discretion.
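The retry and cache ideas from the first two notes can be illustrated with the standard library alone. This is a stand-in sketch, not the package's code: the package itself relies on tenacity and joblib, whereas the hypothetical `retry` decorator and `scrape_all` helper below use a plain loop with exponential backoff and a JSON file as the cache.

```python
import functools
import json
import time
from pathlib import Path


def retry(attempts=3, base_delay=0.5):
    """Retry the wrapped function, doubling the delay after each failure."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


def scrape_all(words, fetch, cache_path):
    """Fetch each word once; results are saved after every word so that an
    interrupted run can be resumed without refetching finished words."""
    cache_file = Path(cache_path)
    if cache_file.exists():
        cache = json.loads(cache_file.read_text(encoding="utf-8"))
    else:
        cache = {}
    for word in words:
        if word in cache:
            continue  # already fetched in an earlier run
        cache[word] = fetch(word)
        cache_file.write_text(json.dumps(cache, ensure_ascii=False), encoding="utf-8")
    return cache
```

Here `fetch` is any callable that takes a head word and returns a JSON-serializable result; wrapping it with `@retry()` combines both mechanisms.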