An NLP tool that transforms numbers into Chinese characters

Project description

num2chinese

An NLP tool that transforms numbers into Chinese

num2chinese uses regular expressions to parse alphanumeric literals and transform them into readable Chinese characters.

Why it matters

  • Chinese pronunciation has many exceptions.
  • How a digit in a Chinese number is read depends on its context: 2 is 兩 in 兩百 (200) but 二 in 一百二十 (120).
  • Handling Chinese number pronunciation correctly takes many rules. Don't reinvent the wheel!

Examples

  • $120 : 美金一百二十
  • 200塊 : 兩百塊
  • 12121212個蘋果 : 一千兩百一十二萬一千兩百一十二個蘋果
  • 2002002支 : 兩百萬兩千零二支
  • 9487 : 九四八七
  • 080080123 : 零八零零八零一二三
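
The examples above show two reading styles: a positional reading with units (百, 千, 萬) and a digit-by-digit reading. The following is a minimal sketch of how both could be produced with a regular expression. It is not the num2chinese implementation. The function names (`read_positional`, `read_digits`, `normalize_sketch`) and the rule for choosing a style (positional after `$` or before a Chinese character, digit-by-digit otherwise) are assumptions made for illustration only, and the sketch handles values below 10^8.

```python
import re

DIGITS = "零一二三四五六七八九"


def _section(n: int) -> str:
    """Read 1..9999 positionally with 千/百/十 (hypothetical helper)."""
    result = ""
    zero_pending = False
    for unit_value, unit_char in ((1000, "千"), (100, "百"), (10, "十"), (1, "")):
        d, n = divmod(n, unit_value)
        if d == 0:
            # Remember an inner zero; it is only written if a digit follows.
            if result:
                zero_pending = True
            continue
        if zero_pending:
            result += "零"
            zero_pending = False
        # Assumed rule: 2 is read 兩 before 百 and 千, 二 elsewhere.
        result += ("兩" if d == 2 and unit_value >= 100 else DIGITS[d]) + unit_char
    return result


def read_positional(n: int) -> str:
    """Read 0 <= n < 10**8 with units, e.g. 200 -> 兩百."""
    if not 0 <= n < 10**8:
        raise ValueError("this sketch only supports values below 10**8")
    if n == 0:
        return "零"
    high, low = divmod(n, 10000)
    out = ""
    if high:
        out += ("兩" if high == 2 else _section(high)) + "萬"
    if low:
        if high and low < 1000:
            out += "零"  # e.g. 1000001 -> 一百萬零一
        out += _section(low)
    # A leading 一十 is conventionally shortened to 十 (12 -> 十二).
    return out[1:] if out.startswith("一十") else out


def read_digits(digits: str) -> str:
    """Read each digit on its own, e.g. '9487' -> 九四八七."""
    return "".join(DIGITS[int(c)] for c in digits)


def normalize_sketch(text: str) -> str:
    """Replace every digit run in text using the assumed selection rule."""
    def repl(m: re.Match) -> str:
        dollar, digits = m.group(1), m.group(2)
        following = m.string[m.end():m.end() + 1]
        fits = len(digits) <= 8 and not digits.startswith("0")
        if dollar and fits:
            return "美金" + read_positional(int(digits))
        if fits and following and "\u4e00" <= following <= "\u9fff":
            return read_positional(int(digits))
        return ("美金" if dollar else "") + read_digits(digits)

    return re.sub(r"(\$?)([0-9]+)", repl, text)


print(normalize_sketch("2002002支"))
```

A real normalizer needs many more rules than this (dates, phone numbers, decimals, larger units such as 億), which is the reason to use a dedicated library rather than a hand-written heuristic.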

Usage

# Import path assumed; the original snippet omits the import.
from num2chinese import Normalizer

text = '12121212個蘋果'
normalizer = Normalizer()
text_normalized = normalizer.normalize(text)
print(text_normalized)
# result is '一千兩百一十二萬一千兩百一十二個蘋果'

Installation

pip install num2chinese

Requirements

python>=3.6,<4.0

License

MIT license

Download files

Source Distribution

num2chinese-0.0.2.tar.gz (7.2 kB)

Uploaded Source

Built Distribution

num2chinese-0.0.2-py3-none-any.whl (7.6 kB)

Uploaded Python 3
