Katsuyo Text

A Japanese conjugation form converter

Motivation

An attempt to see whether the conjugation transformations of Japanese grammar can be expressed as program logic.

⚠CAUTION

Behavior is currently unstable. Updates will be made as needed.

How to Use

Appending

from katsuyo_text.katsuyo_text_helper import (
    Hitei,
    KakoKanryo,
    DanteiTeinei,
)
from katsuyo_text.spacy_katsuyo_text_detector import SpacyKatsuyoTextSourceDetector
import spacy


nlp = spacy.load("ja_ginza")
src_detector = SpacyKatsuyoTextSourceDetector()


doc = nlp("今日は旅行に行く")
sent = next(doc.sents)
katsuyo_text = src_detector.try_detect(sent[-1])

katsuyo_text
# => KatsuyoText(gokan='行', katsuyo=GodanKatsuyo(renyo_ta='っ', mizen_u='こ', meirei='け', katei='け', rentai='く', shushi='く', renyo='き', mizen='か'))

print(katsuyo_text + Hitei())
# => 行かない
print(katsuyo_text + Hitei() + KakoKanryo())
# => 行かなかった
print(katsuyo_text + Hitei() + KakoKanryo() + DanteiTeinei())
# => 行かなかったです
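
To illustrate the idea behind the example above, here is a minimal, self-contained sketch of how a stem (gokan) plus a table of conjugated endings can be chained into 行かない and 行かなかった. It does not use katsuyo_text at all: `ToyKatsuyoText`, `add_nai`, and `add_ta` are hypothetical names invented for this illustration, and the sketch ignores sound changes such as た becoming だ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKatsuyoText:
    """Hypothetical stand-in: a stem plus the endings this sketch needs."""

    gokan: str     # stem
    mizen: str     # irrealis ending, used before ない
    renyo_ta: str  # continuative ending, used before た
    shushi: str    # terminal (dictionary form) ending

    def __str__(self) -> str:
        return self.gokan + self.shushi


def add_nai(src: ToyKatsuyoText) -> ToyKatsuyoText:
    """Negate: attach な to the mizen form; the result conjugates like an i-adjective."""
    return ToyKatsuyoText(
        gokan=src.gokan + src.mizen + "な",
        mizen="かろ",
        renyo_ta="かっ",
        shushi="い",
    )


def add_ta(src: ToyKatsuyoText) -> str:
    """Past/perfect: attach た to the renyo_ta form (toy version, no た/だ alternation)."""
    return src.gokan + src.renyo_ta + "た"


iku = ToyKatsuyoText(gokan="行", mizen="か", renyo_ta="っ", shushi="く")
ikanai = add_nai(iku)

print(ikanai)          # 行かない
print(add_ta(ikanai))  # 行かなかった
print(add_ta(iku))     # 行った
```

Because the negated form carries its own ending table, the next auxiliary can be attached without knowing that the original word was a godan verb.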

Conversion

from katsuyo_text.katsuyo_text_helper import (
    Teinei,
    Dantei,
    DanteiTeinei,
)
from katsuyo_text.spacy_sentence_converter import SpacySentenceConverter
import spacy


nlp = spacy.load("ja_ginza")
converter = SpacySentenceConverter(
    conversions_dict={
        Teinei(): None,
        DanteiTeinei(): Dantei(),
    }
)


doc = nlp("今日は旅行に行きました")
sent = next(doc.sents)
print(converter.convert(sent))
# => 今日は旅行に行った

doc = nlp("今日は最高の日でした")
sent = next(doc.sents)
print(converter.convert(sent))
# => 今日は最高の日だった
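
Judging from the two outputs above, a key mapped to `None` appears to drop that helper, while a key mapped to another helper swaps it. The following sketch models only that reading, using plain strings instead of helper objects. `apply_conversions` and the way each sentence ending is split into labels are assumptions made for illustration, not the library's internals.

```python
from typing import Dict, List, Optional


def apply_conversions(
    helpers: List[str], conversions: Dict[str, Optional[str]]
) -> List[str]:
    """Drop helpers mapped to None, swap helpers mapped to a replacement, keep the rest."""
    result: List[str] = []
    for name in helpers:
        if name not in conversions:
            result.append(name)  # no rule for this helper: keep it
            continue
        replacement = conversions[name]
        if replacement is not None:
            result.append(replacement)  # swap for the replacement
        # a None replacement means the helper is removed
    return result


conversions = {"Teinei": None, "DanteiTeinei": "Dantei"}

# Assumed decomposition of 行きました: polite + past
print(apply_conversions(["Teinei", "KakoKanryo"], conversions))
# ['KakoKanryo']

# Assumed decomposition of 日でした: polite copula + past
print(apply_conversions(["DanteiTeinei", "KakoKanryo"], conversions))
# ['Dantei', 'KakoKanryo']
```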

Customization

Conjugation transformations that are not grammatically valid as a direct attachment are realized through a bridge.

from katsuyo_text.katsuyo_text import TaigenText, JODOUSHI_NAI

TaigenText("大丈夫") + JODOUSHI_NAI
# error => katsuyo_text.katsuyo_text.KatsuyoTextError: Unsupported katsuyo_text in merge of <class 'katsuyo_text.katsuyo_text.Nai'>: 大丈夫 type: <class 'katsuyo_text.katsuyo_text.TaigenText'>

from katsuyo_text.katsuyo_text_helper import Hitei
TaigenText("大丈夫") + Hitei()
# => KatsuyoText(gokan='大丈夫ではな', katsuyo=KeiyoushiKatsuyo(katei='けれ', rentai='い', shushi='い', renyo_ta='かっ', renyo='く', mizen='かろ'))

TaigenText("大丈夫") + Hitei() == Hitei().bridge(TaigenText("大丈夫"))
# => True

The bridge is customizable.

from katsuyo_text.katsuyo_text import KatsuyoText, TaigenText, KAKUJOSHI_GA
from katsuyo_text.katsuyo_text_helper import Hitei
from katsuyo_text.katsuyo import KEIYOUSHI

nai = KatsuyoText(gokan="な", katsuyo=KEIYOUSHI)
custom_hitei = Hitei(bridge=lambda src: src + KAKUJOSHI_GA + nai)

TaigenText("耐性") + custom_hitei
# => KatsuyoText(gokan='耐性がな', katsuyo=KeiyoushiKatsuyo(katei='けれ', rentai='い', shushi='い', renyo_ta='かっ', renyo='く', mizen='かろ'))
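
The examples above suggest a fallback pattern: try the direct merge first, fall back to the bridge when that is unsupported, and raise an error when there is no bridge either. The sketch below models that pattern with plain strings. `ToyHelper`, `ToyMergeError`, and the exact ordering are assumptions made for illustration and are not taken from the katsuyo_text source.

```python
from typing import Callable, Optional


class ToyMergeError(Exception):
    """Hypothetical error raised when neither merge nor bridge applies."""


class ToyHelper:
    def __init__(self, bridge: Optional[Callable[[str], str]] = None) -> None:
        self.bridge = bridge

    def try_merge(self, pre: str) -> Optional[str]:
        # Toy rule: only text already ending in the mizen ending か merges directly
        if pre.endswith("か"):
            return pre + "ない"
        return None

    def merge(self, pre: str) -> str:
        merged = self.try_merge(pre)
        if merged is not None:
            return merged
        if self.bridge is not None:
            # Direct merge is unsupported: let the bridge supply the connecting words
            return self.bridge(pre)
        raise ToyMergeError(f"Unsupported text in merge: {pre}")


default_hitei = ToyHelper(bridge=lambda src: src + "ではない")
custom_hitei = ToyHelper(bridge=lambda src: src + "がない")

print(default_hitei.merge("行か"))    # 行かない
print(default_hitei.merge("大丈夫"))  # 大丈夫ではない
print(custom_hitei.merge("耐性"))     # 耐性がない
```

Keeping the bridge as a constructor argument means the same helper class can serve different fallbacks without subclassing.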

Custom conjugation transformations can be implemented with IKatsuyoTextHelper.

from typing import Optional
from katsuyo_text.katsuyo_text_helper import IKatsuyoTextHelper
from katsuyo_text.katsuyo_text import (
    TaigenText,
    KatsuyoTextError,
    IKatsuyoTextSource,
    SetsuzokujoshiText,
    KURU,
    SETSUZOKUJOSHI_KARA,
    JUNTAIJOSHI_NO,
    JODOUSHI_DA_DANTEI,
)


class JunsetsuKakutei(IKatsuyoTextHelper[SetsuzokujoshiText]):
    def try_merge(self, pre: IKatsuyoTextSource) -> Optional[SetsuzokujoshiText]:
        try:
            # Attach から directly when the source supports it
            return pre + SETSUZOKUJOSHI_KARA
        except KatsuyoTextError:
            # The direct merge is unsupported for this source
            return None


KURU
# => KatsuyoText(gokan='', katsuyo=KaGyoHenkakuKatsuyo(meirei='こい', katei='くれ', rentai='くる', shushi='くる', renyo='き', mizen='こ'))
KURU + JunsetsuKakutei()
# => SetsuzokujoshiText(gokan='くるから', katsuyo=None)

custom_junsetsu_kakutei = JunsetsuKakutei(bridge=lambda src: src + JODOUSHI_DA_DANTEI + SETSUZOKUJOSHI_KARA)

TaigenText("症状") + JunsetsuKakutei()
# error => katsuyo_text.katsuyo_text.KatsuyoTextError: Unsupported katsuyo_text in merge of <class '__main__.JunsetsuKakutei'>: 症状 type: <class 'katsuyo_text.katsuyo_text.TaigenText'> katsuyo: <class 'NoneType'>
TaigenText("症状") + custom_junsetsu_kakutei
# => SetsuzokujoshiText(gokan='症状だから', katsuyo=None)
