A comprehensive, openly-licensed Japanese language database

Kotobase

A Comprehensive Japanese Language Database

Kotobase is a Python package that aggregates several openly licensed Japanese-language data sources into one SQLite database and exposes simple programmatic access to it

Quickstart

Install

pip install kotobase

Get The Database File

# Get The Latest Release From GitHub
kotobase db pull

# Download Sources & Build Locally
kotobase db build
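
Because the result of either command is a plain SQLite file, it can also be inspected with Python's standard sqlite3 module. The sketch below lists the tables in such a file, opened read-only. The location of the file is not stated here, so the path is left to the caller, and no table names are assumed.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: Path) -> list[str]:
    """Return the names of all tables in a SQLite file, opened read-only."""
    # A file: URI with mode=ro prevents accidental writes and refuses to
    # create the file if it does not exist.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables(Path("kotobase.db")) would print nothing by itself and simply return the table names. The file name kotobase.db is a placeholder, not the documented cache location.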

Access Data

CLI

# Runs Comprehensive Lookup Across All Sources
kotobase lookup all 日本語
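
For scripting, the same lookup can be driven from Python and parsed as JSON. This is a minimal sketch: the lookup command is the one shown above and --json is the flag listed under Features, but appending it after the query is an assumption, and the shape of the returned JSON is not described here, so it is returned unmodified.

```python
import json
import subprocess


def build_lookup_command(query: str) -> list[str]:
    """Build the argument list for a JSON lookup."""
    # `kotobase lookup all <query>` comes from the quickstart; the position
    # of `--json` at the end of the command line is an assumption.
    return ["kotobase", "lookup", "all", query, "--json"]


def lookup_json(query: str):
    """Run the CLI and parse its standard output as JSON."""
    completed = subprocess.run(
        build_lookup_command(query),
        capture_output=True,
        encoding="utf-8",  # decode stdout/stderr as UTF-8 text
        check=True,        # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)
```

lookup_json("日本語") requires the kotobase executable on PATH and a database file already pulled or built.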

Python

from kotobase import Kotobase

kb = Kotobase()

result = kb("日本語")

Features

Comprehensive Lookups: One "lookup all" Query Aggregates Data From All Sources
Organized Data: Every Source Is Fully Extracted Into A Normalized SQLite Schema & Exposed As Typed, Serializable DTOs
Example Sentences: Search Tatoeba Example Sentences + Their English Translations By Text
Wildcard Search: Match Written / Reading Forms With * & % Wildcard Patterns
CLI: A Typer + Rich CLI With Readable, Panelled Output & --json For Scripting
Self-Contained: A Single SQLite File (~400MB) + Optional Audio Pack (~150MB) With No Server / Network Access Needed At Query Time
Easy Database Management: Pull Pre-Built Databases From GitHub Releases Or Build Them Locally + Manage The Cache From The CLI
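
To illustrate how * and % wildcard patterns can map onto SQLite, here is a minimal sketch that translates a pattern into a LIKE expression and runs it against an in-memory table. The forms table and its text column are hypothetical, and this is not necessarily how Kotobase implements wildcard search internally.

```python
import sqlite3


def to_like_pattern(pattern: str) -> str:
    # `*` and `%` both mean "any run of characters". A literal `_` or
    # backslash is escaped so LIKE does not treat it as a metacharacter.
    out = []
    for ch in pattern:
        if ch in ("*", "%"):
            out.append("%")
        elif ch in ("_", "\\"):
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def search_forms(conn: sqlite3.Connection, pattern: str) -> list[str]:
    # Hypothetical schema: forms(text). The pattern is bound as a parameter,
    # never interpolated into the SQL string.
    sql = "SELECT text FROM forms WHERE text LIKE ? ESCAPE '\\' ORDER BY text"
    return [text for (text,) in conn.execute(sql, (to_like_pattern(pattern),))]


def demo_connection() -> sqlite3.Connection:
    # Small in-memory table used only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE forms (text TEXT NOT NULL)")
    words = ["日本", "日本語", "日本人", "本日", "英語"]
    conn.executemany("INSERT INTO forms (text) VALUES (?)", [(w,) for w in words])
    return conn
```

With the demo table, search_forms(demo_connection(), "日本*") matches the three entries that start with 日本, and "%語" matches the two that end with 語.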

Help

Documentation: Full API + CLI Reference
Examples: Curated Usage Examples
Changelog: Changes Between Versions

Data Sources & Licenses

Every source is openly licensed

The compiled database is a derived work of the sources below, and each row keeps its source and license where appropriate

See the Third-Party Notices for the full attribution text

| Source | Provides | License |
| --- | --- | --- |
| JMdict | Dictionary Entries → Written + Reading Forms / Senses, Glosses, Part-Of-Speech / Register / Field / Dialect Tags, Priorities | CC BY-SA 4.0 |
| JMNedict | Proper Names → People / Places / Organisations + Their Types | CC BY-SA 4.0 |
| KanjiDic2 | Kanji Profiles → Readings / Meanings / Stroke Counts / Grades / Frequencies / SKIP / Dictionary References | CC BY-SA 4.0 |
| KRADFILE / RADKFILE | Kanji To Radical Decomposition For Radical Search | CC BY-SA 4.0 |
| JmdictFurigana | Per-Form Furigana Segmentation | CC BY-SA 4.0 |
| KanjiVG | Stroke-Order SVG Paths | CC BY-SA 3.0 |
| Tatoeba | Example Sentences With Japanese To English Translations | CC BY 2.0 FR |
| Tanos JLPT | JLPT Vocabulary / Kanji / Grammar Study Lists | CC BY 4.0 |
| Kanji Alive | Word Pronunciation Audio (Optional Download) | CC BY 4.0 |
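
Since rows keep their source and license, downstream code can collect attribution from whatever it displays. The sketch below shows one way to do that. SourcedRow and its field names are hypothetical and do not reflect Kotobase's actual DTOs; only the source and license strings are taken from the table above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SourcedRow:
    """Hypothetical DTO: a value plus the provenance it was derived from."""

    value: str
    source: str
    license: str


def attribution_lines(rows: list[SourcedRow]) -> list[str]:
    """Return one 'Source (License)' line per distinct source, sorted."""
    pairs = {(row.source, row.license) for row in rows}
    return [f"{source} ({license_})" for source, license_ in sorted(pairs)]


def example_rows() -> list[SourcedRow]:
    # Illustrative rows only; the values are not read from a real database.
    return [
        SourcedRow("日本語", "JMdict", "CC BY-SA 4.0"),
        SourcedRow("日", "KanjiDic2", "CC BY-SA 4.0"),
        SourcedRow("語", "JMdict", "CC BY-SA 4.0"),
    ]
```

Each row also serializes to a plain dictionary with asdict, which keeps the source and license alongside the value when results are exported.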

Contributing

  • All contributions are welcome

  • See CONTRIBUTING for local setup, commands, and PR conventions
