A comprehensive, openly-licensed Japanese language database
Kotobase
A Comprehensive Japanese Language Database
Kotobase is a Python package that aggregates several openly licensed Japanese language data sources into one SQLite database and exposes simple programmatic access to it
Quickstart
Install
```shell
pip install kotobase
```
Get The Database File
```shell
# Get The Latest Release From GitHub
kotobase db pull

# Download Sources & Build Locally
kotobase db build
```
Access Data
CLI
# Runs Comprehensive Lookup Across All Sources
```shell
kotobase lookup all 日本語
```
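For scripting, the CLI output can also be consumed from Python. The sketch below assumes that the --json flag listed under Features can be appended to the lookup all command, and that the command then prints a single JSON document to stdout. Neither the flag position nor the shape of the payload is confirmed here, and build_lookup_command and lookup_json are hypothetical helper names.

```python
import json
import subprocess


def build_lookup_command(query: str) -> list[str]:
    """Build the argument list for a JSON lookup."""
    # Assumption: --json is accepted after the query argument.
    return ["kotobase", "lookup", "all", query, "--json"]


def lookup_json(query: str):
    """Run the CLI and return the decoded JSON payload."""
    completed = subprocess.run(
        build_lookup_command(query),
        capture_output=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting issues with non-ASCII queries such as 日本語.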
Python
```python
from kotobase import Kotobase

kb = Kotobase()
result = kb("日本語")
```
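Because the data ships as a single SQLite file, it can also be inspected with the standard library alone. The following is a minimal sketch, not part of the Kotobase API: list_tables is a hypothetical helper, and the location of the database file is not stated here, so the path has to be supplied by the caller.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse writes and fail if the file is missing.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Opening the file read-only is a deliberate choice for exploration, since it rules out accidental changes to a database that other tools depend on.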
Features
| Comprehensive Lookups | A Single lookup all Query Aggregates Data From All Sources |
| Organized Data | Every Source Is Fully Extracted Into A Normalized SQLite Schema & Exposed As Typed, Serializable DTOs |
| Example Sentences | Search Tatoeba Example Sentences + Their English Translation By Text |
| Wildcard Search | Match Written / Reading Forms With * & % Wildcard Patterns |
| CLI | A Typer + Rich CLI With Readable, Panelled Output & --json For Scripting |
| Self-Contained | A Single SQLite (~400MB) File + Optional Audio Pack (~150MB) With No Server / Network Access Needed At Query Time |
| Easy Database Management | Pull A Pre-Built Database From GitHub Releases Or Build It Locally + Manage The Cache From The CLI |
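To illustrate the wildcard feature, here is one way a * pattern could be mapped onto SQLite's LIKE operator. This is a sketch under stated assumptions rather than Kotobase's actual implementation: the forms table, its text column, and the helpers to_like_pattern and search_forms are all made up for the example.

```python
import sqlite3


def to_like_pattern(pattern: str) -> str:
    """Map the user-facing * wildcard onto SQL LIKE's % wildcard."""
    return pattern.replace("*", "%")


def search_forms(conn: sqlite3.Connection, pattern: str) -> list[str]:
    """Return every stored form that matches the wildcard pattern."""
    rows = conn.execute(
        "SELECT text FROM forms WHERE text LIKE ?",
        (to_like_pattern(pattern),),  # bound parameter, never string-formatted
    ).fetchall()
    return [text for (text,) in rows]


# Hypothetical in-memory table standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forms (text TEXT)")
conn.executemany(
    "INSERT INTO forms VALUES (?)",
    [("日本",), ("日本語",), ("英語",), ("語学",)],
)

ends_with_go = search_forms(conn, "*語")      # forms ending in 語
starts_with_nihon = search_forms(conn, "日本%")  # % is passed through unchanged
```

Since % is already the LIKE wildcard, only * needs translating, which is why both spellings can be supported with a single replacement.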
Help
| Documentation | Full API + CLI Reference |
| Examples | Curated Usage Examples |
| Changelog | Changes Between Versions |
Data Sources & Licenses
Every source is openly licensed
The compiled database is a derived work of the sources below, and each row keeps its source and license where appropriate
See the Third-Party Notices for the full attribution text
| Source | Provides | License |
|---|---|---|
| JMdict | Dictionary Entries → Written + Reading Forms / Senses, Glosses, Part-Of-Speech / Register / Field / Dialect Tags, Priorities | CC BY-SA 4.0 |
| JMNedict | Proper Names → People / Places / Organisations + Their Types | CC BY-SA 4.0 |
| KanjiDic2 | Kanji Profiles → Readings / Meanings / Stroke Counts / Grades / Frequencies / SKIP / Dictionary References | CC BY-SA 4.0 |
| KRADFILE / RADKFILE | Kanji To Radical Decomposition For Radical Search | CC BY-SA 4.0 |
| JmdictFurigana | Per-Form Furigana Segmentation | CC BY-SA 4.0 |
| KanjiVG | Stroke-Order SVG Paths | CC BY-SA 3.0 |
| Tatoeba | Example Sentences With Japanese To English Translations | CC BY 2.0 FR |
| Tanos JLPT | JLPT Vocabulary / Kanji / Grammar Study Lists | CC BY 4.0 |
| Kanji Alive | Word Pronunciation Audio (Optional Download) | CC BY 4.0 |
Contributing
- All contributions are welcome
- See CONTRIBUTING for local setup, commands, and PR conventions
File details
Details for the file kotobase-0.3.0.tar.gz.
File metadata
- Download URL: kotobase-0.3.0.tar.gz
- Upload date:
- Size: 333.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2ef4b62b2619e3ca7d34cffc0ec64f9e0d522eae916793d429e562ed1e23be45 |
| MD5 | 9396a5d35cad54acf2d9fb9e1cfff495 |
| BLAKE2b-256 | ecb0b4f05296c657544884834ffc3408c43eee73de45304d0977dcfea24e8488 |
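A downloaded archive can be checked against the published SHA256 digest with the standard library. The sketch below is one way to do that; sha256_of and matches_published_digest are hypothetical helper names, and the local path of the downloaded file is left to the caller.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for kotobase-0.3.0.tar.gz.
EXPECTED_SHA256 = "2ef4b62b2619e3ca7d34cffc0ec64f9e0d522eae916793d429e562ed1e23be45"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected.lower()
```

The same helper works for the wheel by passing its published digest as the expected argument.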
Provenance
The following attestation bundles were made for kotobase-0.3.0.tar.gz:
Publisher: release.yaml on svdC1/kotobase
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: kotobase-0.3.0.tar.gz
  - Subject digest: 2ef4b62b2619e3ca7d34cffc0ec64f9e0d522eae916793d429e562ed1e23be45
- Sigstore transparency entry: 1997715836
- Sigstore integration time:
- Permalink: svdC1/kotobase@f8ec56dabad247ca46297d52d149e2be64ebeff7
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/svdC1
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@f8ec56dabad247ca46297d52d149e2be64ebeff7
- Trigger Event: release
File details
Details for the file kotobase-0.3.0-py3-none-any.whl.
File metadata
- Download URL: kotobase-0.3.0-py3-none-any.whl
- Upload date:
- Size: 351.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4ec89ceaddd230cdde56a3b33e6c8d21b30b02f8c7e29960ac6b2d7e6079aa7 |
| MD5 | a562631e56c8e569ad088201d8395134 |
| BLAKE2b-256 | bd7b402588aff10386c296ec01dc79cbee3e367524601ca8fda4dba43a2dc43c |
Provenance
The following attestation bundles were made for kotobase-0.3.0-py3-none-any.whl:
Publisher: release.yaml on svdC1/kotobase
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: kotobase-0.3.0-py3-none-any.whl
  - Subject digest: b4ec89ceaddd230cdde56a3b33e6c8d21b30b02f8c7e29960ac6b2d7e6079aa7
- Sigstore transparency entry: 1997715933
- Sigstore integration time:
- Permalink: svdC1/kotobase@f8ec56dabad247ca46297d52d149e2be64ebeff7
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/svdC1
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@f8ec56dabad247ca46297d52d149e2be64ebeff7
- Trigger Event: release