jplookup is a Python tool that looks up Japanese words on Wiktionary, scrapes their pitch accent, phonetic pronunciation, definitions and example sentences, and turns them into straightforward flashcards for Anki.

Project description

jplookup

jplookup is a Python tool designed to scrape pitch accent, phonetic pronunciation, definitions and example sentences from Wiktionary and turn them into straightforward flashcards for Anki.

Features

Pitch Accent

Anki cards made with jplookup can mark the pitch accent. A solid dot is placed above the high-pitched mora that is immediately followed by a low-pitched mora.


When a marked high pitch is sustained because the mora is part of a digraph and/or has a lengthened vowel, CSS is used to render a line indicating this.

If there's no dot present, then the word follows the standard pitch accent.
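The marking rule described above can be sketched roughly as follows. This is not jplookup's actual renderer: the helper names `split_morae` and `mark_accent`, the `accent-dot` CSS class, and the convention of passing the accent as a 1-based mora index (0 meaning no mark) are all assumptions made for illustration.

```python
# Hypothetical sketch: wrap the accented mora in a span that CSS could decorate
# with a dot. Small kana combine with the preceding kana into one mora ("きょ").
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def split_morae(kana: str) -> list[str]:
    """Split a kana string into morae, attaching small kana to the previous one."""
    morae: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def mark_accent(kana: str, downstep: int) -> str:
    """Wrap the mora just before the pitch drop; downstep=0 leaves the word unmarked."""
    morae = split_morae(kana)
    if not 0 <= downstep <= len(morae):
        raise ValueError("downstep is outside the word")
    parts = []
    for index, mora in enumerate(morae, start=1):
        if index == downstep:
            parts.append(f'<span class="accent-dot">{mora}</span>')
        else:
            parts.append(mora)
    return "".join(parts)


# Example: assume the pitch drops after the second mora of "おりる".
print(mark_accent("おりる", 2))
```

A stylesheet would still be needed to draw the dot itself. The sustained-pitch line mentioned above is not covered by this sketch.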


Scrapes Japanese word data

jplookup.scrape("猫") returns a list of dictionary objects. The first dictionary in the list contains the primary results.

Any remaining dictionaries in the list come from page redirects whose contents could not be linked back to the primary results dictionary through mutual matching components.
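Since the first element holds the primary results and any later elements come from unmatched redirects, a caller might separate the two like this. The helper `split_results` is hypothetical and the sample list is placeholder data, not real scraper output.

```python
from typing import Optional


def split_results(results: list[dict]) -> tuple[Optional[dict], list[dict]]:
    """Return (primary, extras) from a scrape-style result list."""
    if not results:
        return None, []
    return results[0], results[1:]


# With the package installed and network access, the list would come from:
#   import jplookup
#   results = jplookup.scrape("猫")
# Placeholder data is used here so the sketch runs on its own.
results = [{"placeholder": "primary"}, {"placeholder": "from a redirect"}]
primary, extras = split_results(results)
print(primary, len(extras))
```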


jplookup seeks out parts of speech; under each one there are pronunciations, definitions, synonyms and antonyms. Each pronunciation will generally have the kana, the IPA, the pitch accent, and the furigana. Each definition is a dictionary and can contain example sentences.
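The exact key names in the returned dictionaries are not listed here, so one way to explore a result is to print an outline of its nesting. The `outline` helper below is a generic sketch, and the sample dictionary uses made-up keys purely to show the output shape.

```python
def outline(value, indent: int = 0) -> list[str]:
    """Describe nested dicts and lists as indented 'key: type' lines."""
    pad = "  " * indent
    lines: list[str] = []
    if isinstance(value, dict):
        for key, item in value.items():
            lines.append(f"{pad}{key}: {type(item).__name__}")
            if isinstance(item, (dict, list)):
                lines.extend(outline(item, indent + 1))
    elif isinstance(value, list) and value:
        # Only the first element is described, assuming list items share a shape.
        first = value[0]
        lines.append(f"{pad}[0]: {type(first).__name__}")
        if isinstance(first, (dict, list)):
            lines.extend(outline(first, indent + 1))
    return lines


# Hypothetical keys; the real ones come from jplookup.scrape(...)[0].
sample = {"part-of-speech": {"pronunciations": [{"kana": "ねこ"}]}}
print("\n".join(outline(sample)))
```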


Anki Integration

The program outputs a text file that can easily be read into Anki. Its fields are:

  • Key Term
  • Kana
  • Kanji
  • Definitions
  • IPA
  • Pretty Kana (HTML rendering)
  • Pretty Kanji (HTML rendering)
  • Usage Notes
  • Counter Noun
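As a rough illustration of how a file with these fields could be produced, the sketch below writes one tab-separated row per card. Anki can import tab-separated text, but the delimiter, the field order, and the `write_cards` helper are assumptions here rather than a description of jplookup's own output format.

```python
import csv
import io

# Field names taken from the list above; the order is assumed to match it.
FIELDS = [
    "Key Term", "Kana", "Kanji", "Definitions", "IPA",
    "Pretty Kana", "Pretty Kanji", "Usage Notes", "Counter Noun",
]


def write_cards(cards, stream) -> None:
    """Write each card dict as one tab-separated line; missing fields stay empty."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card.get(name, "") for name in FIELDS])


buffer = io.StringIO()
write_cards([{"Key Term": "猫", "Kana": "ねこ"}], buffer)
print(repr(buffer.getvalue()))
```

To write to disk instead, pass a file opened with `open(path, "w", encoding="utf-8", newline="")`.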

Handles Terms Linking to Other Pages

When Wiktionary links to a different page for an alternative spelling, the information gathered from that redirect is filtered through the original spelling so that only the relevant information is provided.

  • "撮る" redirects to the Wiktionary page for "とる" and grabs any definitions that are either specified as fitting with "撮る" or definitions with no context/kanji specification at all.
  • "取る" redirects to the Wiktionary page for "とる" and grabs any definitions that are either specified as fitting with "取る" or definitions with no context/kanji specification at all.
  • "とる" (the hiragana directly) goes to the Wiktionary page for "とる" and grabs all definitions regardless of context specification.
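The three cases above amount to a simple filter, sketched below under assumed data shapes. The `filter_definitions` helper and the `spellings` key are hypothetical, and the definition texts are placeholders rather than Wiktionary content.

```python
def filter_definitions(definitions, searched_term, page_title):
    """Keep the definitions relevant to the spelling that was searched."""
    # Searching the page's own title (e.g. the hiragana) keeps everything.
    if searched_term == page_title:
        return list(definitions)
    # Otherwise keep entries tagged with the searched spelling or not tagged at all.
    return [
        d for d in definitions
        if not d.get("spellings") or searched_term in d["spellings"]
    ]


definitions = [
    {"text": "definition A", "spellings": ["取る"]},
    {"text": "definition B", "spellings": ["撮る"]},
    {"text": "definition C", "spellings": []},
]

for term in ("撮る", "取る", "とる"):
    kept = filter_definitions(definitions, term, "とる")
    print(term, [d["text"] for d in kept])
```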

Installation

Clone the repository and install the required dependencies (bs4 and jaconv):

git clone https://github.com/travisgk/jplookup.git
cd jplookup
pip install -r requirements.txt
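After installing, a quick check like the following can confirm that the two dependencies named above are importable. The `missing_dependencies` helper is only an illustrative sketch and is not part of jplookup.

```python
from importlib.util import find_spec


def missing_dependencies(modules=("bs4", "jaconv")):
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_dependencies()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All dependencies found.")
```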


Download files

Source Distribution

jplookup-1.0.1.tar.gz (38.2 kB)

Uploaded Source

Built Distribution

jplookup-1.0.1-py3-none-any.whl (48.2 kB)

Uploaded Python 3

File details

Details for the file jplookup-1.0.1.tar.gz.

File metadata

  • Download URL: jplookup-1.0.1.tar.gz
  • Upload date:
  • Size: 38.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.6

File hashes

Hashes for jplookup-1.0.1.tar.gz
Algorithm Hash digest
SHA256 97da298bf0ac5927c1d88646ac6c547305878e0488a15c5828d79ca401b67b15
MD5 4e9b4769954b4b95c506a1160ecc43e3
BLAKE2b-256 4743215bbe3de5b49789337b8fc2b009c02087b4184a9109722cf6be53ca1958

File details

Details for the file jplookup-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: jplookup-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 48.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.6

File hashes

Hashes for jplookup-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f3b5d01b986ec2751e0faf08683c927b8e51cd008906ce4dae9fc5bf350aeedf
MD5 4b668c01249dc4c77e4722d374688a60
BLAKE2b-256 d384828c24f367e2ba187040c1a2c621306076beb36d443f05e509a33b3a1d56
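A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib` module. The helper names `sha256_of` and `matches` are illustrative, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected_hex.lower()


# Digest copied from the listing above for the wheel file.
WHEEL_SHA256 = "f3b5d01b986ec2751e0faf08683c927b8e51cd008906ce4dae9fc5bf350aeedf"
# Usage, assuming the wheel was saved to the current directory:
#   matches("jplookup-1.0.1-py3-none-any.whl", WHEEL_SHA256)
```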
