
Scrape Glosbe dictionaries given a head word file

Project description

scrape-glosbe-dict

pytest | python | Code style: black | License: MIT | PyPI version

Scrape a glosbe dict

Install it

pip install scrape-glosbe-dict

# pip install git+https://github.com/ffreemt/scrape-glosbe-dict
# poetry add git+https://github.com/ffreemt/scrape-glosbe-dict
# git clone https://github.com/ffreemt/scrape-glosbe-dict && cd scrape-glosbe-dict

Use it

scrape-glosbe-dict head-word-file  # default: English-Chinese

# or python -m scrape_glosbe_dict head-word-file

# scrape-glosbe-dict head-word-file -f de  # German-Chinese

Head word file format: one word or phrase per line; empty lines are ignored.

Output is saved to a TSV file.
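
The input and output conventions described above can be sketched in a few lines of Python. This is an illustration only, not the package's own code: the function names `read_head_words` and `write_tsv` are hypothetical, and the two-column (head word, definition) layout of the TSV file is an assumption, since the exact columns are not documented here.

```python
import csv
from pathlib import Path


def read_head_words(path):
    """Return the non-empty lines of a head word file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def write_tsv(rows, path):
    """Write rows as tab-separated lines (assumed layout: head word, definition)."""
    with open(path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
```

A head word file containing `hello`, a blank line, and `good morning` would yield the list `["hello", "good morning"]`.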

Docs

python -m scrape_glosbe_dict --help
Usage: python -m scrape_glosbe_dict [OPTIONS] head-word-file

Arguments:
  head-word-file  Head word file, one word/phrase per line, each will be used
                  to fetch corresponding definitions from https://glosbe.com/.
                  [required]

Options:
  -f, --from-lang TEXT  Source language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        from_lang='en'.  [default: en]
  -t, --to-lang TEXT    Target language, check https://glosbe.com/ for valid
                        value, e.g. https://glosbe.com/en/zh implies
                        to_lang='zh'.  [default: zh]
  -v, --verbose         Show output in the process.
  -V, --version         Show version info and exit.
  --help                Show this message and exit.
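
As the help text notes, the language codes map onto the path of a glosbe URL. A minimal sketch of that mapping follows. The helper names `pair_url` and `word_url` are hypothetical, and appending the head word as a further path segment is an assumption about the site's URL layout, not something stated by the package.

```python
from urllib.parse import quote

BASE_URL = "https://glosbe.com"


def pair_url(from_lang="en", to_lang="zh"):
    """Build the language-pair URL, e.g. https://glosbe.com/en/zh."""
    return f"{BASE_URL}/{from_lang}/{to_lang}"


def word_url(word, from_lang="en", to_lang="zh"):
    """Build a per-word URL (assumed layout: the word as the last path segment)."""
    # Percent-encode the word so spaces and non-ASCII characters are URL-safe.
    return f"{pair_url(from_lang, to_lang)}/{quote(word, safe='')}"
```

With the defaults, `pair_url()` gives `https://glosbe.com/en/zh`, and `pair_url("de")` corresponds to running the command with `-f de`.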

Miscellany

  • A retry mechanism (via the tenacity package from PyPI) is built in for fetching data from glosbe. Refer to the source file for details.
  • A local cache (via the joblib package from PyPI) is used, so you can interrupt a run at any time and continue later.
  • Scraping is often frowned upon and can sometimes get your IP banned from the website. Use this package at your own discretion.
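
The retry and cache ideas in the first two points can be illustrated with the standard library alone. The sketch below is a stand-in, not the package's implementation: it uses a hand-written `retry` decorator in place of tenacity and a JSON file in place of joblib's cache, and the names `retry`, `scrape_all`, and `fetch` are hypothetical.

```python
import functools
import json
import time
from pathlib import Path


def retry(attempts=3, delay=0.0):
    """Retry the wrapped function on any exception, up to `attempts` calls."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: let the error propagate
                    time.sleep(delay)
        return wrapper
    return decorator


def scrape_all(words, fetch, cache_file):
    """Fetch each word once, saving results to disk after every word."""
    cache_path = Path(cache_file)
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    else:
        cache = {}
    for word in words:
        if word in cache:
            continue  # already fetched in an earlier (possibly interrupted) run
        cache[word] = fetch(word)
        cache_path.write_text(json.dumps(cache, ensure_ascii=False), encoding="utf-8")
    return cache
```

Because the cache is written after every word, an interrupted run loses at most the word that was in flight, and a later run skips everything already stored.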

Download files

Source Distribution

scrape-glosbe-dict-0.1.0.tar.gz (5.9 kB)

Built Distribution

scrape_glosbe_dict-0.1.0-py3-none-any.whl (6.1 kB, Python 3)
