Kanji Database

KanjiDB

Kanji database builder and REST API.

KanjiDB aims to help you build your own kanji database by compiling information from various existing sources into a single JSON file. Its plugin system lets you write your own plugins to collect and add new data to kanjis, or arrange existing plugins to meet your needs. Its goal is to be flexible enough to let you export all the information you need to build your own app (database, viewer, Anki deck builder, ...) and progress in learning Japanese. KanjiDB also comes with a REST API that lets you retrieve this information and build services upon it.

Online demo

You can test the REST API online at kanjidb.jeremymorosi.com/api/v1/doc.

The documentation is generated by aiohttp_swagger.

Install

Using pip:

pip install kanjidb

Show help:

python -m kanjidb -h

Usage:  kanjidb COMMAND [OPTIONS]

A kanji database accessible via REST API

Options:
  -v, --version            Print version information and quit
  -h, --help               Show this help

Commands:
  build       Build kanji database from sources
  run         Run local server and REST API

Run 'kanjidb COMMAND --help' for more information on a command.

Generating a JSON database

Create a kanjis.txt file containing one UTF-8 encoded kanji per line. This is the list of kanjis that will be included in our database:

一
二
三
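
If you would rather generate kanjis.txt from a script, the following is a minimal sketch. The helper name write_kanji_list is hypothetical and not part of KanjiDB; it only assumes the format described above (UTF-8, one kanji per line, "\n" as separator).

```python
from pathlib import Path


def write_kanji_list(kanjis, path):
    """Write one kanji per line, UTF-8 encoded, and return the path.

    Hypothetical helper, not part of KanjiDB.
    """
    path = Path(path)
    # newline="\n" keeps the separator identical on every platform.
    with path.open("w", encoding="utf-8", newline="\n") as f:
        # No trailing newline, so splitting on "\n" yields exactly
        # the kanji that were passed in.
        f.write("\n".join(kanjis))
    return path
```

For example, write_kanji_list(["一", "二", "三"], "kanjis.txt") produces the three-line file shown above.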

Now, create a config.yml file containing:

run:
- kanjidic2:
    kd2_file: path/to/kanjidic2.xml
    inputs:
    - type: stream
      encoding: utf8
      separator: "\n"
      path: path/to/kanjis.txt
    outputs:
    - type: stream
      indent: 4
      path: path/to/db.json

In this configuration:

  • kanjidic2: is a plugin that generates a JSON dict with data from a Kanjidic2 XML file.
  • path/to/kanjidic2.xml: is the path to a Kanjidic2 XML file (download here).
  • path/to/kanjis.txt: is the path to the kanjis.txt file.
  • path/to/db.json: is the destination of the generated JSON database.
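
If you need several such configurations, one option is to render config.yml from a template. The sketch below is only an illustration: render_config and CONFIG_TEMPLATE are hypothetical names, and the template simply reproduces the configuration shown above with the three paths substituted. It does no YAML quoting, so it assumes paths without special YAML characters.

```python
from textwrap import dedent

# Same structure as the config.yml above; only the three paths vary.
CONFIG_TEMPLATE = dedent("""\
    run:
    - kanjidic2:
        kd2_file: {kd2_file}
        inputs:
        - type: stream
          encoding: utf8
          separator: "\\n"
          path: {kanji_list}
        outputs:
        - type: stream
          indent: 4
          path: {db_file}
    """)


def render_config(kd2_file, kanji_list, db_file):
    """Return the text of a config.yml for the given paths (hypothetical helper)."""
    return CONFIG_TEMPLATE.format(
        kd2_file=kd2_file, kanji_list=kanji_list, db_file=db_file
    )
```

Writing the returned text to config.yml gives a file you can pass to the build command.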

Run the following command:

python -m kanjidb build config.yml

This generates a db.json file containing the JSON database. Depending on your configuration this file can be quite big, so here is only an excerpt of what you would obtain:

{
    "一": {
        "meanings": [{"m_lang": "", "value": "one"}]
    },
    "二": {
        "meanings": [{"m_lang": "", "value": "two"}]
    },
    "三": {
        "meanings": [{"m_lang": "", "value": "three"}]
    }
}
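
Once the database is generated, it can be read with any JSON library. Here is a minimal Python sketch that looks up meanings; the meanings function is a hypothetical helper, and the sample only mirrors the excerpt above, so a real db.json may contain more fields.

```python
import json

# Sample mirroring the excerpt above; a real db.json may hold more fields.
SAMPLE_DB = """
{
    "一": {"meanings": [{"m_lang": "", "value": "one"}]},
    "二": {"meanings": [{"m_lang": "", "value": "two"}]},
    "三": {"meanings": [{"m_lang": "", "value": "three"}]}
}
"""


def meanings(db, kanji, lang=""):
    """Return the meaning strings of a kanji for a given m_lang value.

    Returns an empty list when the kanji is not in the database.
    """
    entry = db.get(kanji, {})
    return [
        m["value"]
        for m in entry.get("meanings", [])
        if m.get("m_lang", "") == lang
    ]


db = json.loads(SAMPLE_DB)
for kanji in db:
    print(kanji, meanings(db, kanji))
```

To work on the real file, replace json.loads(SAMPLE_DB) with json.load on a file opened with encoding="utf-8".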

You can read more about the kanjidic2 plugin and its configuration here.

Running the REST API

Now we will run a local server with a REST API that lets us query information from the generated db.json file.

First, create a config.cnf file containing:

[service]
port = 8080
base-url = /api/v1
swagger-yml = /path/to/swagger.yml
swagger-url = /api/v1/doc
db-file = /path/to/db.json

Replace:

  • /path/to/swagger.yml: with the path to your local swagger.yml file.
  • /path/to/db.json: with the path to your generated db.json file.
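
To sanity-check such a file before starting the server, you could parse it with Python's standard configparser module. This is only a sketch: it assumes config.cnf is INI-compatible, which the example above suggests but which is not confirmed here, and load_service_settings and doc_url are hypothetical helpers rather than KanjiDB functions.

```python
import configparser

# Same content as the config.cnf above.
SAMPLE_CNF = """
[service]
port = 8080
base-url = /api/v1
swagger-yml = /path/to/swagger.yml
swagger-url = /api/v1/doc
db-file = /path/to/db.json
"""


def load_service_settings(text):
    """Parse the [service] section into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["service"]
    return {
        "port": section.getint("port"),
        "base_url": section["base-url"],
        "swagger_yml": section["swagger-yml"],
        "swagger_url": section["swagger-url"],
        "db_file": section["db-file"],
    }


def doc_url(settings, host="localhost"):
    """Build the local documentation URL from the parsed settings."""
    return f"http://{host}:{settings['port']}{settings['swagger_url']}"


settings = load_service_settings(SAMPLE_CNF)
print(doc_url(settings))
```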

Now run:

python -m kanjidb run /path/to/config.cnf/directory/

You should see:

======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)

Meaning the service is up and ready.
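
If you start the server from a script, you may want to confirm that the port accepts connections before sending requests. The following sketch uses only the standard socket module; is_listening is a hypothetical helper, and it checks TCP reachability only, not that the REST API itself answers correctly.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example: is_listening("127.0.0.1", 8080) once the server is running.
```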

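The service then listens on the configured port (8080 here), and the Swagger documentation is served under the configured swagger-url (/api/v1/doc).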

Note that this repository also includes default config.cnf, swagger.yml and db.json files that you can use to run the server. Simply check out this repository and run:

python -m kanjidb run etc

Running with Docker

You can build a Docker image by downloading this repository and running:

docker build -t kanjidb:latest .

Next, run the Docker image as:

docker run \
 -v /path/to/etc:/etc/service \
 -v /path/to/log:/var/log/service \
 -p 8080:8080 \
 -it kanjidb:latest

Where:

  • /path/to/etc: is the path to the service directory containing config.cnf.
  • /path/to/log: is the path to the directory where you want to store logs.
  • 8080: is the public port to access the REST API.
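
When launching the container from a Python script, the same command can be assembled as an argument list. This is a sketch under the assumptions of the command above (image tag kanjidb:latest, container port 8080, mount points /etc/service and /var/log/service); docker_run_args is a hypothetical helper. Docker expects absolute host paths for bind mounts, so pass absolute directories.

```python
def docker_run_args(etc_dir, log_dir, public_port=8080, image="kanjidb:latest"):
    """Build the 'docker run' argument list shown above (hypothetical helper)."""
    return [
        "docker", "run",
        "-v", f"{etc_dir}:/etc/service",      # directory containing config.cnf
        "-v", f"{log_dir}:/var/log/service",  # directory receiving the logs
        "-p", f"{public_port}:8080",          # host port -> container port 8080
        "-it", image,
    ]


args = docker_run_args("/path/to/etc", "/path/to/log")
print(" ".join(args))
```

The resulting list can be passed to subprocess.run(args, check=True) to start the container.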

You should see:

======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)

Meaning the service is up and ready.

Testing

The test directory contains many tests that you can run with:

python setup.py test

Or with coverage:

coverage run --source=kanjidb setup.py test
