Build and query a local Japanese dictionary SQLite database.

Kotodex Package Guide

Kotodex builds a local SQLite database from Japanese dictionary sources and exposes a compact Python API for vocabulary, kanji, example sentence, and combined dictionary lookups.

Use it when you want Japanese dictionary data available locally without parsing XML or CSV source files at runtime.

Install
Build A Database
Python Quick Start
Vocabulary Lookups
Kanji Lookups
Example Sentences
Combined Queries
Result Objects
JSON Export
CLI Usage
Lemma Helper
Provenance And Licensing
Typical Workflows

Install

Default install, using the Sudachi core dictionary:

pip install kotodex

Optional Sudachi dictionary sizes:

pip install "kotodex[small]"
pip install "kotodex[core]"
pip install "kotodex[full]"

Development install from a checkout:

pip install -e ".[dev]"

Build A Database

Kotodex queries a local SQLite database. Build one before using the API.

kotodex update
kotodex rebuild --db /home/user/jisho.db
kotodex status --db /home/user/jisho.db

Default paths:

Source cache: ~/.cache/kotodex/sources
Database: ~/.local/share/kotodex/jisho.db

Use --db whenever you want to control the database location explicitly.
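The default locations above can also be resolved from Python, for example to check that a database exists before opening it. This is a minimal sketch and not part of Kotodex: `resolve_db_path` is a hypothetical helper, and only the two default paths are taken from this guide.

```python
from pathlib import Path

# Default locations as documented above.
DEFAULT_CACHE_DIR = Path.home() / ".cache" / "kotodex" / "sources"
DEFAULT_DB_PATH = Path.home() / ".local" / "share" / "kotodex" / "jisho.db"


def resolve_db_path(explicit=None):
    """Return the explicit path if given, otherwise the documented default.

    Hypothetical helper for illustration; Kotodex may resolve paths differently.
    """
    if explicit:
        return Path(explicit).expanduser()
    return DEFAULT_DB_PATH


db_path = resolve_db_path("/home/user/jisho.db")
state = "exists" if db_path.exists() else "is missing; build it with kotodex rebuild"
print(db_path, state)
```

Checking for the file up front gives a clearer error than a failed query against a database that was never built.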

Python Quick Start

from kotodex import Jisho

with Jisho("/home/user/jisho.db") as j:
    print(j.imi("食べる"))
    print(j.kanji("食"))
    print(j.examples("食べる"))
    print(j.query("食べる"))

Jisho can also be managed manually:

from kotodex import Jisho

j = Jisho("/home/user/jisho.db")
try:
    result = j.imi("猫")
    print(result.meaning)
finally:
    j.close()
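The two forms above should behave the same way: the with statement calls close() for you, even when the block raises. The sketch below shows the general context manager pattern using a hypothetical `Handle` class around the standard sqlite3 module; it is not Kotodex's implementation of Jisho.

```python
import sqlite3


class Handle:
    """Hypothetical stand-in for a database-backed object with close()."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.closed = False

    def close(self):
        self.conn.close()
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always release the connection, even if the block raised.
        self.close()
        return False  # do not suppress exceptions


with Handle(":memory:") as h:
    value = h.conn.execute("SELECT 1").fetchone()[0]

print(value, h.closed)
```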

Vocabulary Lookups

Use imi() for meaning-focused vocabulary lookup. lookup() is an alias for imi().

with Jisho("/home/user/jisho.db") as j:
    result = j.imi("待つ")

print(result.found)
print(result.word)
print(result.reading)
print(result.meaning)
print(result.jlpt)
print(result.pos)

Return more ranked matches with zenbu=True:

with Jisho("/home/user/jisho.db") as j:
    result = j.imi("食べ%る", zenbu=True)

for entry in result.entries:
    print(entry.word, entry.reading, entry.meaning)

Use SQL wildcards in vocabulary lookups:

j.imi("猫%", zenbu=True)
j.imi("%する", zenbu=True)
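To see how these patterns match, the sketch below runs them through SQLite's LIKE operator on an in-memory table. It assumes Kotodex passes wildcards through to LIKE, which this guide implies but does not state; the `words` table and its rows are made up for illustration and do not reflect the real schema.

```python
import sqlite3

# Toy table; the real Kotodex schema is not documented here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("猫",), ("猫背",), ("勉強する",), ("食べる",), ("食べすぎる",)],
)


def like(pattern):
    """Return words matching a LIKE pattern, in insertion order."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ORDER BY rowid", (pattern,)
    )
    return [row[0] for row in rows]


print(like("猫%"))     # prefix match
print(like("%する"))   # suffix match
print(like("食べ%る"))  # fixed start and end, anything in between
print(like("猫_"))     # exactly one character after 猫
```

In SQL LIKE, % matches any run of characters, including none, and _ matches exactly one character.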

Use romaji input:

result = j.imi("taberu", romaji=True)
print(result.word)
print(result.reading)
print(result.reading_romaji)

Include linked example sentences on vocabulary entries:

result = j.imi("食べる", examples=2)

for sentence in result.examples:
    print(sentence.japanese)
    print(sentence.english)

Derived verb forms are recovered when direct lookup fails:

result = j.imi("待てる")

print(result.word)             # 待つ
print(result.reading)          # まつ
print(result.surface_reading)  # まてる
print(result.origin)           # 待つ
print(result.derivation)       # potential

Direct matches always win. Derivation metadata is only present when Kotodex had to recover a dictionary base form.
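The "direct match first, derived form second" rule can be sketched with two small tables. Everything here is a toy: `DICTIONARY`, `DERIVED_FORMS`, and `lookup` are hypothetical names, and Kotodex's actual derivation logic is not shown in this guide.

```python
# Toy data, not Kotodex's dictionary or its derivation rules.
DICTIONARY = {"待つ": "to wait", "食べる": "to eat"}
DERIVED_FORMS = {"待てる": ("待つ", "potential")}


def lookup(text):
    """Return a direct match if one exists, else try the derived-form table."""
    if text in DICTIONARY:
        # Direct match: no derivation metadata is attached.
        return {"word": text, "meaning": DICTIONARY[text],
                "origin": None, "derivation": None}
    if text in DERIVED_FORMS:
        base, kind = DERIVED_FORMS[text]
        if base in DICTIONARY:
            # Recovered base form: record where it came from.
            return {"word": base, "meaning": DICTIONARY[base],
                    "origin": base, "derivation": kind}
    return None


print(lookup("待つ"))
print(lookup("待てる"))
```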

Kanji Lookups

Look up kanji directly:

result = j.kanji("猫")

print(result.literal)
print(result.meaning)
print(result.readings)
print(result.kun_readings)
print(result.on_readings)
print(result.stroke_count)
print(result.jlpt)

Look up every kanji in a string:

result = j.kanji("日本語")

for kanji in result.results:
    print(kanji.literal, kanji.meaning)

Search by radicals:

j.kanji(radicals=["氵", "木"])
j.kanji(radicals=["氵", "木"], radical_match="any")
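The difference between the two calls can be pictured as set operations. The sketch below assumes the default mode requires every listed radical ("all"), which the guide implies but does not state. The `COMPONENTS` table is simplified toy data, and `search_by_radicals` is a hypothetical function, not the Kotodex implementation.

```python
# Toy component table for illustration only; not real decomposition data.
COMPONENTS = {
    "沐": {"氵", "木"},
    "海": {"氵"},
    "林": {"木"},
}


def search_by_radicals(radicals, radical_match="all"):
    """Return kanji whose components contain all, or any, of the radicals."""
    wanted = set(radicals)
    if radical_match not in ("all", "any"):
        raise ValueError("radical_match must be 'all' or 'any'")
    matches = []
    for kanji, parts in COMPONENTS.items():
        if radical_match == "all":
            hit = wanted <= parts      # every requested radical is present
        else:
            hit = bool(wanted & parts)  # at least one requested radical is present
        if hit:
            matches.append(kanji)
    return matches


print(search_by_radicals(["氵", "木"]))
print(search_by_radicals(["氵", "木"], radical_match="any"))
```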

Search by stroke count:

j.kanji(strokes=9)
j.kanji(strokes=(8, 10))
j.kanji(strokes=range(8, 11))
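The three call styles can be read as one inclusive range. The sketch below normalizes each to a (low, high) pair, assuming the tuple form is inclusive on both ends so that (8, 10) and range(8, 11) select the same kanji. That reading is an assumption, and `stroke_bounds` is a hypothetical helper rather than a Kotodex function.

```python
def stroke_bounds(strokes):
    """Normalize a stroke filter to an inclusive (low, high) pair."""
    if isinstance(strokes, int):
        # A single count matches only that count.
        return (strokes, strokes)
    if isinstance(strokes, range):
        if len(strokes) == 0 or strokes.step != 1:
            raise ValueError("expected a non-empty range with step 1")
        # range(8, 11) covers 8, 9, 10, so the upper bound is the last item.
        return (strokes[0], strokes[-1])
    low, high = strokes
    if low > high:
        raise ValueError("lower bound exceeds upper bound")
    return (low, high)


print(stroke_bounds(9))
print(stroke_bounds((8, 10)))
print(stroke_bounds(range(8, 11)))
```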

Search by JLPT level:

j.kanji(jlpt="N5", zenbu=True)

Include examples and similar kanji:

result = j.kanji("食", examples=2)

print(result.first.examples)
print(result.similar)

Disable similar kanji if you only need the base record:

j.kanji("食", include_similar=False)

Example Sentences

Use examples() to retrieve Tatoeba-linked example sentences.

result = j.examples("食べる", limit=5)

for sentence in result.sentences:
    print(sentence.japanese)
    print(sentence.english)
    print(sentence.attribution)

Filter by difficulty or JLPT level:

j.examples("食べる", difficulty="N5")
j.examples("食べる", jlpt="N5")

Use romaji input:

j.examples("taberu", romaji=True)

Combined Queries

Use query() when you want vocabulary, kanji, names, example sentences, and provenance in one result.

result = j.query("食べる")

print(result.found)
print(result.vocabulary)
print(result.kanji)
print(result.names)
print(result.examples)
print(result.provenance)

Combined queries also inherit vocabulary derivation metadata:

result = j.query("待てる")

print(result.vocabulary[0].word)  # 待つ
print(result.origin)              # 待つ
print(result.derivation)          # potential
print(result.surface_reading)     # まてる

Result Objects

All result objects support:

result.to_dict()
result.to_json(indent=2)

`ImiLookupResult`

Common fields and shortcuts:

query: original query text
lemma: Sudachi dictionary form when available
found: True when at least one entry was returned
count: number of returned entries
entries: list of ImiEntry
first: first entry or None
word, reading, meaning, meanings, jlpt, pos: shortcuts for the first entry
origin, derivation, surface_reading: populated for derived-form recovery
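The shortcut fields can be pictured as properties that delegate to the first entry. The classes below are a hypothetical sketch of that pattern and are not Kotodex's own. In particular, returning None from a shortcut when there are no entries is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    word: str
    reading: str
    meaning: str


@dataclass
class LookupResult:
    query: str
    entries: List[Entry] = field(default_factory=list)

    @property
    def found(self) -> bool:
        return bool(self.entries)

    @property
    def count(self) -> int:
        return len(self.entries)

    @property
    def first(self) -> Optional[Entry]:
        return self.entries[0] if self.entries else None

    @property
    def word(self) -> Optional[str]:
        # Shortcut: delegate to the first entry when there is one.
        return self.first.word if self.first else None


hit = LookupResult("猫", [Entry("猫", "ねこ", "cat")])
miss = LookupResult("zzz")
print(hit.found, hit.count, hit.word)
print(miss.found, miss.count, miss.word)
```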

`ImiEntry`

Vocabulary entry fields:

word
reading
definitions
meaning
jlpt
pos
reading_romaji
common
priority
example
examples
entry_id
ent_seq
source
extra

`KanjiLookupResult`

Common fields and shortcuts:

query
found
count
results
first
literal, meaning, meanings, readings
kun_readings, on_readings
stroke_count, strokes
jlpt
radicals
similar

`KanjiEntry`

Kanji entry fields:

literal
meanings
on_readings
kun_readings
radicals
stroke_count
grade
jlpt
freq
radical_classical
on_romaji
kun_romaji
similar
examples
extra

`ExampleLookupResult`

Example lookup fields:

query
found
count
sentences
difficulty
lemma
first

`ExampleSentence`

Sentence fields:

tatoeba_id
japanese
english
attribution
difficulty
japanese_romaji
source

`QueryResult`

Combined query fields:

query
lemma
found
vocabulary
kanji
names
examples
provenance
raw
origin
derivation
surface_reading

JSON Export

Use JSON export for API responses, notebooks, scripts, and debugging.

print(j.imi("食べる").to_json(indent=2))
print(j.kanji("食").to_json(indent=2))
print(j.examples("食べる").to_json(indent=2))
print(j.query("食べる").to_json(indent=2))

Japanese text is written unescaped by default (ensure_ascii=False). If you pass your own JSON settings, keep that flag to avoid \uXXXX escapes:

j.query("食べる").to_json(indent=2, ensure_ascii=False)
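The effect of ensure_ascii is easiest to see with the standard json module on its own. This sketch assumes to_json() forwards the flag to json.dumps, which this guide does not confirm; the `record` dictionary is a stand-in for a real result.

```python
import json

# Stand-in for a result dictionary.
record = {"word": "猫", "reading": "ねこ"}

escaped = json.dumps(record, ensure_ascii=True)    # non-ASCII becomes \uXXXX
readable = json.dumps(record, ensure_ascii=False)  # Japanese text kept as-is

print(escaped)
print(readable)
```

Both strings decode to the same data; the flag only changes how readable the serialized text is.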

CLI Usage

Download or refresh source files:

kotodex update
kotodex update --force
kotodex update --cache-dir /tmp/kotodex-sources

Build a database:

kotodex rebuild --db /home/user/jisho.db
kotodex rebuild --force --db /home/user/jisho.db
kotodex rebuild --db /home/user/jisho.db --cache-dir /tmp/kotodex-sources

Check database and source status:

kotodex status --db /home/user/jisho.db

Vocabulary lookup:

kotodex imi 食べる --db /home/user/jisho.db
kotodex imi taberu --romaji --db /home/user/jisho.db
kotodex imi '食べ%る' --zenbu --db /home/user/jisho.db
kotodex imi 食べる --examples 2 --json --db /home/user/jisho.db

Kanji lookup and search:

kotodex kanji 食 --db /home/user/jisho.db
kotodex kanji 日本語 --db /home/user/jisho.db
kotodex kanji --radical 氵 --radical 木 --db /home/user/jisho.db
kotodex kanji --radical 氵 --radical 木 --radical-match any --db /home/user/jisho.db
kotodex kanji --strokes 9 --db /home/user/jisho.db
kotodex kanji --strokes 8-10 --db /home/user/jisho.db
kotodex kanji --jlpt N5 --zenbu --json --db /home/user/jisho.db

Example sentences:

kotodex examples 食べる --limit 5 --db /home/user/jisho.db
kotodex examples 食べる --difficulty N5 --db /home/user/jisho.db
kotodex examples taberu --romaji --json --db /home/user/jisho.db

Combined query:

kotodex query 食べる --db /home/user/jisho.db
kotodex query 食べる --examples 10 --json --db /home/user/jisho.db

Lemma Helper

Use Sudachi-based normalization directly when you only need the dictionary form.

from kotodex.lemma import get_lemma

print(get_lemma("待てる"))
print(get_lemma("食べました"))

Choose a Sudachi dictionary size:

get_lemma("食べました", dict_type="small")
get_lemma("食べました", dict_type="core")
get_lemma("食べました", dict_type="full")

Provenance And Licensing

Kotodex stores source provenance in the generated database.

with Jisho("/home/user/jisho.db") as j:
    print(j.notice())
    print(j.provenance())

Important licensing notes:

Source-derived content comes from EDRDG and Tatoeba and has attribution obligations.
Generated databases may be subject to CC BY-SA 4.0 due to EDRDG-derived content.
Per-sentence Tatoeba attribution is exposed as ExampleSentence.attribution.
Use Jisho.notice() and Jisho.provenance() to inspect the generated database metadata.

Typical Workflows

Build once, query many times:

kotodex update
kotodex rebuild --db ./jisho.db

from kotodex import Jisho

with Jisho("./jisho.db") as j:
    print(j.imi("猫").meaning)

Create a local JSON lookup endpoint:

from kotodex import Jisho

def lookup_json(text: str) -> str:
    with Jisho("./jisho.db") as j:
        return j.query(text).to_json(indent=2)

Export study data:

from kotodex import Jisho

with Jisho("./jisho.db") as j:
    result = j.imi("食べ%る", zenbu=True)
    rows = [(entry.word, entry.reading, entry.meaning, entry.jlpt) for entry in result.entries]
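Rows shaped like the ones above can be written out with the standard csv module. The sketch below uses stand-in tuples instead of a live database, so the values, including the JLPT level, are illustrative only.

```python
import csv
import io

# Stand-in rows shaped like (word, reading, meaning, jlpt); values are illustrative.
rows = [
    ("食べる", "たべる", "to eat", "N5"),
    ("食べすぎる", "たべすぎる", "to overeat", None),
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["word", "reading", "meaning", "jlpt"])
writer.writerows(rows)  # None is written as an empty field

csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the file with newline="" and encoding="utf-8" and pass it to csv.writer in place of the buffer.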

This version: 0.3.0 (May 5, 2026)

