A package for processing complex text with mixed Chinese and English characters

Complex Text Tools

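A Python package for processing complex text that mixes Chinese and English characters. It removes extra spaces and computes text length according to specific rules.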

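Features

  • Removes extra spaces between Chinese characters
  • Removes extra spaces between Chinese and English characters
  • Handles spacing around punctuation correctly
  • Computes text length according to specific rules (Chinese characters, English words, numbers, equations, and so on)
  • Fixes punctuation in Chinese text (converts English punctuation to Chinese punctuation)
  • Processes mixed-language text efficiently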

Installation

pip install complex-text-tools

Usage

Remove extra spaces

from complex_text_tools import remove_extra_spaces

text = "这 是  中文 测试  文本 ,  mixed  English  text  here , 还 有   symbols :  ;  !  "
clean_text = remove_extra_spaces(text)
print(clean_text)
# Output: "这是中文测试文本,mixed English text here,还有 symbols:;!"
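
To make the spacing rules concrete, here is a minimal sketch of how this kind of cleanup can be done with regular expressions. It is not the package's implementation: the function name `collapse_spaces_sketch`, the choice of the CJK Unified Ideographs range as "Chinese characters", and the punctuation set are all assumptions made for illustration, so its results may differ from `remove_extra_spaces`.

```python
import re

# Assumption: "Chinese character" means the CJK Unified Ideographs block only.
CJK = r"\u4e00-\u9fff"
# Assumption: these marks should carry no surrounding spaces.
PUNCT = r",.;:!?,。;:!?"


def collapse_spaces_sketch(text: str) -> str:
    """Illustrative only; not the logic used by complex_text_tools."""
    # Collapse every run of whitespace to one space and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Drop a space that sits directly between two Chinese characters.
    text = re.sub(rf"(?<=[{CJK}]) (?=[{CJK}])", "", text)
    # Drop a single space on either side of a punctuation mark.
    text = re.sub(rf" ?([{PUNCT}]) ?", r"\1", text)
    return text


print(collapse_spaces_sketch("这 是  中文 测试 , mixed  English  text !"))
# 这是中文测试,mixed English text!
```

The lookbehind and lookahead keep the Chinese characters themselves out of the match, so consecutive single-character gaps are all removed in one pass.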

Count effective text length

from complex_text_tools import count_eff_len

text = "这是一段包含 English words 和 123.45 数字的 mixed 文本"
result = count_eff_len(text)
print(result)
# Output: 15
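
The exact counting rules are not spelled out here, so the following is only a sketch of one plausible rule set, not the package's algorithm. It assumes each Chinese character, each run of ASCII letters, and each number (with an optional decimal part) counts as one unit; it does not attempt to handle equations. The name `count_units_sketch` is hypothetical, and its totals need not match what `count_eff_len` returns.

```python
import re

# Assumed token rules (hypothetical, not taken from the package):
#   number  -> digits with an optional decimal part, counted as 1
#   word    -> a run of ASCII letters, counted as 1
#   Chinese -> each CJK Unified Ideograph, counted as 1
TOKEN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+|[\u4e00-\u9fff]")


def count_units_sketch(text: str) -> int:
    """Count tokens under the assumed rules above; spaces are ignored."""
    return len(TOKEN.findall(text))


print(count_units_sketch("价格 is 3.5 元"))
# 5  (价, 格, is, 3.5, 元)
```

Listing the number pattern first matters: it lets `3.5` match as one token instead of being split into `3` and `5`.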

Fix punctuation

from complex_text_tools import fix_punctuation

text = "这是中文文本,但使用了英文标点.这看起来不太自然,对吗?"
fixed_text = fix_punctuation(text)
print(fixed_text)
# Output: "这是中文文本,但使用了英文标点。这看起来不太自然,对吗?"
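
As a rough illustration of the idea behind this kind of conversion, the sketch below replaces an ASCII punctuation mark with its full-width Chinese counterpart only when the mark directly follows a Chinese character. The mapping table, the "follows a Chinese character" condition, and the name `to_chinese_punct_sketch` are assumptions for this example and are not taken from `fix_punctuation`.

```python
import re

# Assumed mapping from ASCII punctuation to full-width Chinese punctuation.
TO_CHINESE = {",": ",", ".": "。", "?": "?", "!": "!", ":": ":", ";": ";"}

# Match an ASCII mark only when it directly follows a Chinese character,
# so decimals such as 3.14 and English sentences are left alone.
PATTERN = re.compile(r"(?<=[\u4e00-\u9fff])[,.?!:;]")


def to_chinese_punct_sketch(text: str) -> str:
    """Illustrative only; not the logic used by complex_text_tools."""
    return PATTERN.sub(lambda m: TO_CHINESE[m.group()], text)


print(to_chinese_punct_sketch("你好,世界.pi is 3.14, ok?"))
# 你好,世界。pi is 3.14, ok?
```

A rule this simple misses marks that follow a closing quote or bracket in Chinese text, which a real implementation would likely need to cover.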

License

This project is licensed under the MIT License. See the LICENSE file for details.

Download files

Source Distribution

complex_text_tools-0.2.4.tar.gz (6.5 kB view details)

Uploaded Source

Built Distribution

complex_text_tools-0.2.4-py3-none-any.whl (5.6 kB view details)

Uploaded Python 3

File details

Details for the file complex_text_tools-0.2.4.tar.gz.

File metadata

  • Download URL: complex_text_tools-0.2.4.tar.gz
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.8.10

File hashes

Hashes for complex_text_tools-0.2.4.tar.gz
Algorithm Hash digest
SHA256 8a2fcb1ec77bd8b8c2dd117f2c2167ac84970798217ebb87a89b6835d93f9c1d
MD5 a51f376f690f2e45c1aa3cd7722dda32
BLAKE2b-256 552e8fffd82df359667d0bcc0172ecfed678609c26816e4240c0193734798257
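
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do that; the helper names `sha256_of` and `matches_published_hash` are made up for this example, and the path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read 64 KiB at a time so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of(Path(path)) == expected_hex.strip().lower()


# Example usage (assumes the archive was downloaded to the current directory):
# matches_published_hash(
#     "complex_text_tools-0.2.4.tar.gz",
#     "8a2fcb1ec77bd8b8c2dd117f2c2167ac84970798217ebb87a89b6835d93f9c1d",
# )
```

`pip` can perform the same check itself when a requirements file pins hashes and is installed with `--require-hashes`.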

File details

Details for the file complex_text_tools-0.2.4-py3-none-any.whl.

File hashes

Hashes for complex_text_tools-0.2.4-py3-none-any.whl
Algorithm Hash digest
SHA256 e8f8278aec99282cadbffcca77e3901186b0d3beb80c189f59f9c5c8a6624c5e
MD5 124aa16430d7377f3afeca78f8e340a4
BLAKE2b-256 c85e26b447d49d17c14c81045d685a716958f4adac47d446910017eed09884c5
