opencc_pyo3

opencc_pyo3 is a Python extension module powered by Rust and PyO3, providing fast and accurate conversion between different Chinese text variants using OpenCC algorithms.

Features

  • Convert between Simplified, Traditional, Hong Kong, and Taiwan Chinese text, and between Traditional Chinese and Japanese Kanji.
  • Fast and memory-efficient, leveraging Rust's performance.
  • Easy-to-use Python API.
  • Supports punctuation conversion and automatic detection of the text variant (Traditional or Simplified).

Supported Conversion Configurations

  • s2t, t2s, s2tw, tw2s, s2twp, tw2sp, s2hk, hk2s, t2tw, tw2t, t2twp, tw2tp, t2hk, hk2t, t2jp, jp2t
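A minimal sketch of checking a configuration name against the list above before constructing a converter. The names `SUPPORTED_CONFIGS` and `validate_config` are hypothetical helpers for illustration and are not part of opencc_pyo3; how the library itself handles an unknown configuration is not stated here.

```python
# Configuration names copied from the list above.
SUPPORTED_CONFIGS = frozenset({
    "s2t", "t2s", "s2tw", "tw2s", "s2twp", "tw2sp", "s2hk", "hk2s",
    "t2tw", "tw2t", "t2twp", "tw2tp", "t2hk", "hk2t", "t2jp", "jp2t",
})


def validate_config(name: str) -> str:
    """Return the normalized config name, or raise ValueError if unsupported.

    Hypothetical helper: it only checks membership in the documented list.
    """
    key = name.strip().lower()
    if key not in SUPPORTED_CONFIGS:
        raise ValueError(
            f"Unsupported config {name!r}; expected one of {sorted(SUPPORTED_CONFIGS)}"
        )
    return key
```

The returned value can then be passed on, for example as OpenCC(validate_config(user_choice)).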

Installation

Build and install the Python wheel using maturin:

# In project root
maturin build --release
pip install ./target/wheels/opencc_pyo3-<version>-cp<pyver>-abi3-<platform>.whl

Or, for development (may require an active virtual environment):

maturin develop -r

See build.txt for detailed build and install instructions.
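After either install route, a quick way to confirm the module is importable from the current interpreter is sketched below. The helper name `is_installed` is hypothetical; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Prints False when the wheel was installed into a different environment.
    print("opencc_pyo3 importable:", is_installed("opencc_pyo3"))
```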

Usage

Python

from opencc_pyo3 import OpenCC

text = "“春眠不觉晓,处处闻啼鸟。”"
opencc = OpenCC("s2t")
converted = opencc.convert(text, punctuation=True)
print(converted)  # 「春眠不覺曉,處處聞啼鳥。」

CLI

You can also use the CLI interface:

python -m opencc_pyo3 -i input.txt -o output.txt -c s2t --punct
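When calling the CLI from another Python program, the command line can be assembled as a list. This is a sketch that uses only the flags shown above (-i, -o, -c, --punct); `build_cli_args` is a hypothetical helper, not part of the package.

```python
import sys
from typing import List


def build_cli_args(input_path: str, output_path: str,
                   config: str = "s2t", punct: bool = False) -> List[str]:
    """Build the argument list for `python -m opencc_pyo3` using the documented flags."""
    args = [
        sys.executable, "-m", "opencc_pyo3",
        "-i", input_path,
        "-o", output_path,
        "-c", config,
    ]
    if punct:
        # Request punctuation conversion as well.
        args.append("--punct")
    return args
```

The resulting list can be handed to subprocess.run(args, check=True), which avoids shell quoting problems with file names.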

API

Class: OpenCC

  • OpenCC(config: str = "s2t")
    • config: Conversion configuration (see above).
  • convert(input: str, punctuation: bool = False) -> str
    • Convert text with optional punctuation conversion.
  • zho_check(input: str) -> int
    • Detects which Chinese variant the input text is written in and returns an integer code.
    • 1 - Traditional, 2 - Simplified, 0 - others
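The return codes of zho_check can be used to decide whether a conversion is needed at all. The sketch below assumes only the API listed above; `CODE_LABELS`, `describe_code`, and `convert_if_simplified` are hypothetical helpers, and `converter` is any object exposing `zho_check` and `convert` with those signatures (for example an OpenCC("s2t") instance).

```python
# Labels for the documented zho_check return codes.
CODE_LABELS = {0: "other", 1: "traditional", 2: "simplified"}


def describe_code(code: int) -> str:
    """Map a zho_check result to a readable label ("unknown" for undocumented codes)."""
    return CODE_LABELS.get(code, "unknown")


def convert_if_simplified(converter, text: str, punctuation: bool = False) -> str:
    """Convert only when the text is detected as Simplified (code 2).

    Text detected as Traditional or "other" is returned unchanged.
    """
    if converter.zho_check(text) == 2:
        return converter.convert(text, punctuation=punctuation)
    return text
```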

Development

Benchmarks

Package: opencc_pyo3
Python 3.13.5 (tags/v3.13.5:6cb20a2, Jun 11 2025, 16:15:46) [MSC v.1943 64 bit (AMD64)]
Platform: Windows-11-10.0.26100-SP0
Processor: Intel64 Family 6 Model 191 Stepping 2, GenuineIntel

BENCHMARK RESULTS


| Method         | Config | TextSize | Mean     | StdDev   | Min      | Max       | Ops/sec | Chars/sec  |
|----------------|--------|----------|----------|----------|----------|-----------|---------|------------|
| Convert_Small  | s2t    | 100      | 0.118 ms | 0.097 ms | 0.049 ms | 0.811 ms  | 8,499   | 849,910    |
| Convert_Medium | s2t    | 1,000    | 0.250 ms | 0.036 ms | 0.211 ms | 0.509 ms  | 4,004   | 4,003,531  |
| Convert_Large  | s2t    | 10,000   | 0.845 ms | 0.060 ms | 0.775 ms | 1.420 ms  | 1,184   | 11,835,419 |
| Convert_XLarge | s2t    | 100,000  | 4.755 ms | 0.152 ms | 4.515 ms | 5.680 ms  | 210     | 21,030,543 |
| Convert_Small  | s2tw   | 100      | 0.141 ms | 0.027 ms | 0.096 ms | 0.321 ms  | 7,111   | 711,093    |
| Convert_Medium | s2tw   | 1,000    | 0.392 ms | 0.030 ms | 0.355 ms | 0.623 ms  | 2,552   | 2,552,127  |
| Convert_Large  | s2tw   | 10,000   | 1.271 ms | 0.044 ms | 1.191 ms | 1.474 ms  | 787     | 7,869,452  |
| Convert_XLarge | s2tw   | 100,000  | 6.317 ms | 0.139 ms | 6.004 ms | 7.250 ms  | 158     | 15,831,322 |
| Convert_Small  | s2twp  | 100      | 0.204 ms | 0.028 ms | 0.132 ms | 0.380 ms  | 4,911   | 491,118    |
| Convert_Medium | s2twp  | 1,000    | 0.598 ms | 0.039 ms | 0.527 ms | 0.747 ms  | 1,671   | 1,671,296  |
| Convert_Large  | s2twp  | 10,000   | 1.942 ms | 0.061 ms | 1.823 ms | 2.223 ms  | 515     | 5,149,357  |
| Convert_XLarge | s2twp  | 100,000  | 9.937 ms | 0.173 ms | 9.542 ms | 10.707 ms | 101     | 10,063,174 |
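The harness that produced these numbers is not shown here. The sketch below is one way to collect comparable statistics for any str-to-str callable, under the assumption that Ops/sec is 1000 divided by the mean in milliseconds and Chars/sec is Ops/sec times TextSize, which is consistent with the rows above (for example 4,004 × 1,000 ≈ 4,003,531). The function name `benchmark` and the default of 50 runs are illustrative choices.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(func: Callable[[str], str], text: str, runs: int = 50) -> Dict[str, float]:
    """Time func(text) `runs` times and summarize the results in the table's units."""
    if runs < 2:
        raise ValueError("runs must be at least 2 to compute a standard deviation")

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        func(text)
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(samples_ms)
    # Guard against a zero mean on very coarse timers.
    ops_per_sec = 1000.0 / mean_ms if mean_ms > 0 else float("inf")
    return {
        "mean_ms": mean_ms,
        "stddev_ms": statistics.stdev(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
        "ops_per_sec": ops_per_sec,
        "chars_per_sec": ops_per_sec * len(text),
    }
```

With the package installed, a call such as benchmark(OpenCC("s2t").convert, sample_text) would measure one configuration; results will vary with hardware and input text.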

Throughput vs Size

[Figure: chart of throughput versus text size]

License

MIT


Powered by Rust, PyO3, OpenCC and opencc-fmmseg.

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • opencc_pyo3-0.6.2-cp38-abi3-win_amd64.whl (1.5 MB): CPython 3.8+, Windows x86-64
  • opencc_pyo3-0.6.2-cp38-abi3-manylinux_2_34_x86_64.whl (1.8 MB): CPython 3.8+, manylinux (glibc 2.34+), x86-64
  • opencc_pyo3-0.6.2-cp38-abi3-macosx_11_0_arm64.whl (1.6 MB): CPython 3.8+, macOS 11.0+, ARM64
