This product is a bugfix for Plone's Unicode splitter when handling Japanese text.
It monkey-patches the following functions:
Products.CMFPlone.UnicodeSplitter.splitter.bigram
Products.CMFPlone.UnicodeSplitter.splitter.process_unicode
Products.CMFPlone.UnicodeSplitter.splitter.process_unicode_glob
Details
bigram
    return [u[i : i + 2] for i in range(len(u) - limit)]
to
    if len(u) == 1:
        return [u]
    else:
        return [u[i:i + 2] for i in range(len(u) - limit)]
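The effect of this change is easiest to see with a one-character input. The following is a minimal sketch that copies the two bodies shown above into standalone functions; the names `bigram_original` and `bigram_patched` are made up for the comparison.

```python
def bigram_original(u, limit):
    # Body before the patch.
    return [u[i:i + 2] for i in range(len(u) - limit)]


def bigram_patched(u, limit):
    # Body after the patch: keep a lone character instead of dropping it.
    if len(u) == 1:
        return [u]
    else:
        return [u[i:i + 2] for i in range(len(u) - limit)]


# With limit=1 the original returns nothing for a single character,
# because range(len(u) - limit) is range(0).
print(bigram_original("語", 1))    # []
print(bigram_patched("語", 1))     # ['語']

# Longer strings are split identically by both versions.
print(bigram_patched("日本語", 1))  # ['日本', '本語']
```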
process_unicode
    swords = [g.group() for g in pattern.finditer(word)]
    for sword in swords:
        if not rx_all.match(sword[0]):
            yield sword
        else:
            yield from bigram(sword, 0)
to
    swords = [g.group() for g in pattern.finditer(word)]
    for sword in swords:
        if not rx_all.match(sword[0]):
            yield sword
        else:
            for x in bigram(sword, 1): # modified
                yield x
process_unicode_glob
    if i == len(swords) - 1:
        limit = 1
    else:
        limit = 0
to
    limit = 1
Installation
Install c2.patch.jasplitter by adding it to your buildout:
    [buildout]
    ...
    eggs =
        c2.patch.jasplitter
and then running bin/buildout.
Contribute
Issue Tracker: https://bitbucket.org/cmscom/c2.patch.jasplitter/admin/issues
Source Code: https://bitbucket.org/cmscom/c2.patch.jasplitter
Support
If you are having issues, please let us know on the issue tracker.
License
The project is licensed under the GPLv2.
Contributors
Manabu TERADA, terada@cmscom.jp
Changelog
1.0b1 (2024-02-06)
Remove includeDependencies for Plone 6 compatibility. [terapyon]
1.0a5 (2023-08-24)
Upload PyPI. [terapyon]
1.0a4 (2023-08-17)
Support Python 3. [terapyon]
1.0a3 (2017-03-09)
Support consecutive CJK and ASCII words. [terapyon]
Fix missing packaging for MANIFEST. [terapyon]
1.0a2 (2016-11-17)
Package bugfix. [terapyon]
1.0a1 (unreleased)
Initial release. [terapyon]
Source Distribution
Hashes for c2.patch.jasplitter-1.0b1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 50fc39066c73d891f0a7a32307b6b10a3a5075ac85a1340b85d71a8ff2675ca6
MD5 | 676c2e6e10bf41dbe922649246ae29ef
BLAKE2b-256 | 85642bd28e415772e0e7d2d49a2b20067cae274969f768b060aa2595cedc4b3d