Google's langdetect modified for Chinese texts
Project description
langdetect_zh
Installation
$ pip install langdetect_zh
Supported Python versions 2.7, 3.4+.
Languages
langdetect_zh supports 2 languages out of the box (ISO 639-1 language codes with a region suffix):
zh-cn, zh-tw
Basic usage
To get the most likely language code directly:
>>> from langdetect_zh import detect
>>> detect("这是一段中文文本")
'zh-cn'
To find out the probabilities for the top languages:
>>> from langdetect_zh import detect_langs
>>> detect_langs("这是一段中文文本")
[zh-cn:0.999997316441747]
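When the top probability is low, it can be safer to treat the result as "unknown" than to trust it. The sketch below is one way to do that. It assumes the objects returned by detect_langs expose `.lang` and `.prob` attributes, as in upstream langdetect; this is not confirmed for langdetect_zh. `Candidate`, `pick_language`, `detect_or_none`, and the `min_prob` threshold are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the result objects. Upstream langdetect returns
# objects with .lang and .prob; langdetect_zh is assumed to do the same.
Candidate = namedtuple("Candidate", ["lang", "prob"])


def pick_language(candidates, min_prob=0.9):
    """Return the most probable language code, or None if it is below min_prob."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.prob)
    return best.lang if best.prob >= min_prob else None


def detect_or_none(text, min_prob=0.9):
    """Run langdetect_zh on text and apply the threshold (requires the package)."""
    # Imported lazily so pick_language can be used without the package installed.
    from langdetect_zh import detect_langs
    return pick_language(detect_langs(text), min_prob)
```

The threshold of 0.9 is arbitrary; a suitable value depends on how costly a wrong label is for your application.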
NOTE
The language detection algorithm is non-deterministic: if you run it on a text that is too short or too ambiguous, you might get different results every time you run it.
To enforce consistent results, run the following code before the first language detection:
from langdetect_zh import DetectorFactory
DetectorFactory.seed = 0
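To check whether a given text is affected by this, you can call the detector several times and compare the answers. The helper below is a minimal sketch; `results_are_stable` and `check_with_langdetect_zh` are hypothetical names, and the detector is passed in as a parameter so the check can also be tried with any other callable.

```python
def results_are_stable(detect_fn, text, runs=5):
    """Return True if detect_fn gives the same answer for text on every run."""
    results = {detect_fn(text) for _ in range(runs)}
    return len(results) == 1


def check_with_langdetect_zh(text, runs=5):
    """Seed the detector as described above, then test stability (requires the package)."""
    # Imported lazily so results_are_stable works without the package installed.
    from langdetect_zh import DetectorFactory, detect
    DetectorFactory.seed = 0  # set before the first detection
    return results_are_stable(detect, text, runs)
```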
Original project
This package is a modified version of langdetect. The specific change is that, when the input is purely Chinese text, it distinguishes Simplified Chinese from Traditional Chinese.
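To give a feel for what telling the two scripts apart involves, here is a deliberately naive sketch that counts characters whose simplified and traditional forms differ. It is not the method used by langdetect_zh or langdetect; the character sets are a tiny hand-picked sample, and `guess_script` is a hypothetical name.

```python
# Small hand-picked sample of characters that differ between the two scripts.
# A real detector would need far broader coverage than this.
SIMPLIFIED_ONLY = set("这来说为们时国书学发")
TRADITIONAL_ONLY = set("這來說為們時國書學發")


def guess_script(text):
    """Return 'zh-cn', 'zh-tw', or None when the evidence is absent or tied."""
    simplified = sum(ch in SIMPLIFIED_ONLY for ch in text)
    traditional = sum(ch in TRADITIONAL_ONLY for ch in text)
    if simplified == traditional:
        return None
    return "zh-cn" if simplified > traditional else "zh-tw"
```

Text made up only of characters shared by both scripts gives no evidence either way, which is one reason short inputs are hard to classify.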
File details
Details for the file langdetect_zh-1.0.4.tar.gz.
File metadata
- Download URL: langdetect_zh-1.0.4.tar.gz
- Upload date:
- Size: 59.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 35e8a3e52ae40566b2999d1440a7f1730769b4bb6a1c61eb47f8a2344ff15e28 |
| MD5 | 7392aa2b9b9d6d51baa7215092f8649f |
| BLAKE2b-256 | d9dccf9c7298121a4ffa6b76cfe28081dff7de83f1e19075a99e8e5c63fef08a |
File details
Details for the file langdetect_zh-1.0.4-py3-none-any.whl.
File metadata
- Download URL: langdetect_zh-1.0.4-py3-none-any.whl
- Upload date:
- Size: 62.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 579a5020da6bb071dba117acedffee210841d9290d617f506396f59b48bf6509 |
| MD5 | ea043c700549e44b82f7e21043ad3ba9 |
| BLAKE2b-256 | eec1a65aec56af3e148b4b9e2d46da3c258ebfe7c056e09a89895ebdb1c5ec0e |