cChardet is a high-speed universal character encoding detector.
Reason this release was yanked:
Does not import
Project description
cChardet
NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet since the original project is no longer maintained.
To install:
pip install faust-cchardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.
Supported Languages/Encodings
International (Unicode)
UTF-8
UTF-16BE / UTF-16LE
UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431
Arabic
ISO-8859-6
WINDOWS-1256
Bulgarian
ISO-8859-5
WINDOWS-1251
Chinese
ISO-2022-CN
BIG5
EUC-TW
GB18030
HZ-GB-2312
Croatian
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Czech
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Danish
ISO-8859-1
ISO-8859-15
WINDOWS-1252
English
ASCII
Esperanto
ISO-8859-3
Estonian
ISO-8859-4
ISO-8859-13
Windows-1252
Windows-1257
Finnish
ISO-8859-1
ISO-8859-4
ISO-8859-9
ISO-8859-13
ISO-8859-15
WINDOWS-1252
French
ISO-8859-1
ISO-8859-15
WINDOWS-1252
German
ISO-8859-1
WINDOWS-1252
Greek
ISO-8859-7
WINDOWS-1253
Hebrew
ISO-8859-8
WINDOWS-1255
Hungarian
ISO-8859-2
WINDOWS-1250
Irish Gaelic
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Italian
ISO-8859-1
ISO-8859-3
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Japanese
ISO-2022-JP
SHIFT_JIS
EUC-JP
Korean
ISO-2022-KR
EUC-KR / UHC
Lithuanian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Latvian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Maltese
ISO-8859-3
Polish
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Portuguese
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Romanian
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
Russian
ISO-8859-5
KOI8-R
WINDOWS-1251
MAC-CYRILLIC
IBM866
IBM855
Slovak
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Slovene
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
M
Example
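# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not str.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with the guessed encoding and a confidence score.
result = chardet.detect(msg)
print(result)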
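Once an encoding has been detected, the usual next step is to decode the bytes with it. The sketch below shows one way to do that defensively. It is an illustration only: `decode_detected`, its `fallback` and `min_confidence` parameters, and the 0.5 threshold are hypothetical and not part of cChardet. The detector is passed in as a callable, assumed to return a dict with `encoding` and `confidence` keys as in the example above.

```python
from typing import Callable, Optional


def decode_detected(
    data: bytes,
    detect: Callable[[bytes], dict],
    fallback: str = "utf-8",
    min_confidence: float = 0.5,
) -> str:
    """Decode `data` with the encoding reported by `detect` (hypothetical helper).

    Falls back to `fallback` with replacement characters when detection
    returns no encoding, the confidence is low, Python does not know the
    codec name, or the guessed encoding fails to decode the data.
    """
    result = detect(data)
    encoding: Optional[str] = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    if encoding and confidence >= min_confidence:
        try:
            return data.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown codec name or wrong guess: use the fallback below
    return data.decode(fallback, errors="replace")


if __name__ == "__main__":
    try:
        import cchardet
    except ImportError:
        print("cchardet is not installed; skipping the demo")
    else:
        sample = "千夜一夜物語は、イスラム世界における説話集である。".encode("shift_jis")
        print(decode_detected(sample, cchardet.detect))
```

Passing the detector as an argument keeps the helper usable with either `cchardet.detect` or `chardet.detect`. Not every encoding name in the list above necessarily maps to a Python codec, which is why `LookupError` is handled.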
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
RAM: DDR4-3200 64GB
Platform: Ubuntu 20.04 amd64
Python 3.9.0
Library | Request (call/s) |
---|---|
chardet v3.0.4 | 0.46 |
cchardet v2.1.7 | 1404.05 |
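To make the "call/s" figures concrete, here is a minimal sketch of how such a throughput number could be measured and how the two results compare. It is not the project's `tests/bench.py`, whose contents are not shown here; `calls_per_second` and `speedup` are hypothetical helpers.

```python
import time
from typing import Callable


def calls_per_second(func: Callable[[bytes], object], payload: bytes, repeat: int = 100) -> float:
    """Call `func(payload)` `repeat` times and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(payload)
    elapsed = time.perf_counter() - start
    return repeat / elapsed if elapsed > 0 else float("inf")


def speedup(fast: float, slow: float) -> float:
    """Ratio between two throughput figures."""
    return fast / slow


if __name__ == "__main__":
    # Figures taken from the results table: cchardet v2.1.7 vs chardet v3.0.4.
    print(f"speedup: about {speedup(1404.05, 0.46):.0f}x")
```

With the reported numbers, the ratio works out to roughly 3,000 times more calls per second for cchardet on that machine and sample set.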
LICENSE
See COPYING file.
Contact
Platform
Supported
Windows i686, x86_64
Linux i686, x86_64
macOS x86_64
Not supported
CHANGES
2.x.x
2.1.7 (2020-10-27)
support Python 3.9
drop support for Python 3.5
2.1.6 (2020-03-17)
drop support for Python 2.7
support GitHub Actions
update dev-dependencies
2.1.5 (2019-09-27)
update language models (uchardet)
add ISO-8859-2 test (currently disabled)
support Python 3.8
drop support for Python 3.4
2.1.4 (2018-09-27)
disable LTO because it degraded performance
2.1.3 (2018-09-26)
support Python 3.7
2.1.2 (2018-09-26)
enable LTO for wheel builds
update Cython
2.1.1 (2017-07-01)
fix different results being returned for different chunk sizes
fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior
include COPYING in package
2.1.0 (2017-05-15)
2.0.1 (2017-04-25)
2.0.0 (2017-04-06)
Improve tests
2.0a4 (2017-04-05)
Update uchardet repo (Fix buffer overflow)
2.0a3 (2017-03-29)
Implement UniversalDetector (like chardet)
2.0a2 (2017-03-28)
Update uchardet repo (Fix memory leak)
2.0a1 (2017-03-28)
Replace uchardet-enhanced with uchardet
Remove Detector class
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
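The 2.0a3 entry above mentions a `UniversalDetector` modeled on chardet's. The sketch below shows how such a detector is typically driven incrementally. It assumes the object exposes `reset()`, `feed()`, `close()`, `done`, and `result` as chardet's `UniversalDetector` does; that interface is an assumption here, and `detect_incrementally` is a hypothetical helper, not part of the package.

```python
from typing import Iterable, Optional


def detect_incrementally(chunks: Iterable[bytes], detector) -> Optional[dict]:
    """Feed chunks to a chardet-style detector until it reports it is done.

    `detector` is assumed to provide reset(), feed(), close(), `done`,
    and `result`, mirroring chardet's UniversalDetector.
    """
    detector.reset()
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            break  # the detector is confident enough; stop reading input
    detector.close()
    return detector.result
```

With the library installed, one would presumably pass `cchardet.UniversalDetector()` as `detector` and an iterator over file chunks as `chunks`, which avoids loading a large file into memory before detection.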
Download files
Built Distributions
Hashes for faust_cchardet-2.1.9-cp39-cp39-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 90bcac2a8b460ca1c8db50f5191d32bd4832ac7394763a62afedbe6958c2a945 |
MD5 | 807cbd7454256a91de4ce156600d7934 |
BLAKE2b-256 | 9ed1f70f69eed9e35258fe5440907247ac7984ee6d43755f963bac62b0c8b181 |
Hashes for faust_cchardet-2.1.9-cp39-cp39-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e17904faac8afbdf9cc1a2284be340490180ec6e38a7ce469a15566f4e339600 |
MD5 | df56365845a1f5aa5ffee6de876f2c7d |
BLAKE2b-256 | f67ae99015f91d9403932eafd500c59248f6a0a1bc7023865b43da15b18e2751 |
Hashes for faust_cchardet-2.1.9-cp39-cp39-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1779db0a551a14688dd2d4358496bbb9cfe3a90854471d098f1e4c7af04650a2 |
MD5 | 1b19281b1b63cce797e7222685c231bf |
BLAKE2b-256 | 7986dbf76659c1b4a2f81bab616aa4a5548f93645c62ec538b4f49953e293dcb |
Hashes for faust_cchardet-2.1.9-cp39-cp39-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e49c91ecb00f61c7b7354d5ffd70effe8656ce260fa437463716d90cbaa1f32c |
MD5 | ae2a8b95ad8a52a2785dda08ba8d7ad6 |
BLAKE2b-256 | 0d60c611b43556bb82075717c5b8d237afda015f2f165e93ebeafc748a0511c4 |
Hashes for faust_cchardet-2.1.9-cp39-cp39-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | b841cfb1ed7b7f179f05c76e783f21cb08811b913e3824f84015041cf92218b9 |
MD5 | a443ab757a273cbb2d6769046221d3c7 |
BLAKE2b-256 | c0514561b8386ccbe0188c337d1a39bbfb4e3efb7147e4cb15f4b9ec2e98b470 |
Hashes for faust_cchardet-2.1.9-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 99be731b155b589f817555f3ba1e91cc9950f02a80d42c69b7d32fea1cf88ccb |
MD5 | bfd01ca28db370bca32eda164a45e805 |
BLAKE2b-256 | cf43caa04ef0074db1ddfb2fbf610779b2ef1a780ace2c37f19de08b49c2206c |
Hashes for faust_cchardet-2.1.9-cp38-cp38-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 4cae1dd56f6ce33529067703d37f73ca8aadf4a7b4cd071d9c9bdb449dd60395 |
MD5 | 28d457c9f06c560e0cc7aa182194dc22 |
BLAKE2b-256 | 8e1aac69c9aa512b92a21fdf157b47b0ed3b70fa428fd5e6ee27d7468f9542e5 |
Hashes for faust_cchardet-2.1.9-cp38-cp38-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e9922e51fcefc86f46de8f71127c0fc3daf0ddc3d4c3a7de28edd9c48e9ec433 |
MD5 | d75f1a770cc3bb3b3be6b3f4e8c25170 |
BLAKE2b-256 | 2df4c38c31703c482059bd3ec60731a59dc87c8c5582a3a7d531133a9324dc97 |
Hashes for faust_cchardet-2.1.9-cp38-cp38-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 7104cd952fd1f8fba572905d0353789ba261acc2d557432c810e8f81e636a3c0 |
MD5 | 8c251ba1d24f476f1acdc9c01b369993 |
BLAKE2b-256 | ed851d043d1e2ebcf1a5ff3442368b216f6a3818ca9b64f0c39958e2b3b233f0 |
Hashes for faust_cchardet-2.1.9-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 0d22cc08b37132e14e9d7f062b0b1ea56871a04cecb06ed25ade597d3ce120f5 |
MD5 | 066a8ba3da0fd31e3b4b0a13e7832409 |
BLAKE2b-256 | 7b49d49ee81a5eca593ad0219c764c787cc1bbe2467a0cd253fb5b20e193b3af |
Hashes for faust_cchardet-2.1.9-cp38-cp38-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 58b726dd041e2c0e23c57dd9a6f5913b3e8649c31b2ab32a489d581db9cdd7df |
MD5 | ae1ae9908aedefa312457f143497165f |
BLAKE2b-256 | b7b38765078164ec83b8d913f8acd678222640eae364b289ff77e586292d1460 |
Hashes for faust_cchardet-2.1.9-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3be22ad91034bacfbc106ce8949381ec283f899dceacdc12ae0730c71539a15f |
MD5 | fe5f8e127a491d3c199e63917f314a82 |
BLAKE2b-256 | 6409ae21b3dc512d8bd3ec0a043d4ad44d1e7699ae9516caf6702e43490e9174 |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 0baf583612df66fa0707445e38ad59f93cf74168d00169a78833ca34e3e9f61b |
MD5 | d19b88e457c00c4226dc9a4ea9d023cc |
BLAKE2b-256 | b0d7a803239c6afa8c22e114af9c7d2491ec98e3f671ec9e4936b31c4f6942b9 |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 0a6e9afa3fcf420d393725311b113f6f57b1440be36e1e1c8f8a7473b87b4a13 |
MD5 | 0f5a57a426b057513ce4332517adffa8 |
BLAKE2b-256 | 74989e9d1e45f4b426857b1f938a2523a39485a4940b5f418b02212f54558738 |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 266350790e1566b895139b584c4daeba9ea06e96b77466a6ea32a319c3e6a19f |
MD5 | 9b1da337494be0e1ab46f1c4c8aefba9 |
BLAKE2b-256 | d723c5e85c7ceeecf7dd0cdce81c878cb1245f5726305ae12a876b0b847e1383 |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | bf4e46dd58b6e3d2f49d4c81a07f23af3f911c30caef1ee742561d831ee70bff |
MD5 | 498e96fc06102961a6e5084f1d88ce98 |
BLAKE2b-256 | 18401df9b337d8c3594dc7be2480a3d31751e8a5244fbc02f5f7eabad870011e |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | e6ecdd647d034023fbd4cb79f7fa4ab9f024e036aa0b1007e4db81dd070e8642 |
MD5 | abc7085e7c7252b8aa7c549051aca070 |
BLAKE2b-256 | 3aa52a4b1996f8982b24acbf90b8ffec57d683ba477f50f04c29f87eca913626 |
Hashes for faust_cchardet-2.1.9-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | c439f894b07752550f7b8606c402ed2d38d62817fba11f308dd772726be23ea6 |
MD5 | bde39326b09d7909bca5c0d77c09d69b |
BLAKE2b-256 | b42510bc3f5eafb6ee75b9a2a1fca831b8223c6201313d42df9b01e7a2eba7f3 |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6f9578b9e7c22cf1f7829e1160b37e7b9687fb4cd15d8c25c943ac7f0a95de33 |
MD5 | 4ec8882a9cf084603c9f6e5d9901644c |
BLAKE2b-256 | 4573183ae9e9aab5ff0fa9c673682dc7ce78a1491a6cff1af3aa0664ac2b2b0c |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 008d32e88b7f4df095cdc9e48752522a0387b0ddf72df1c838548233df517cff |
MD5 | 58d3724ac267737e3752a48eb832da35 |
BLAKE2b-256 | 89f32891a68a1a62c6059e1351700ce5abdcbcbe90dd6b8cfa9c925b37248947 |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8245629c90970e359bcfce6a4d7ebc30fed20fa0431f0d9a98f0ca8f3fdabfe4 |
MD5 | 902f61f6906cd4c9d142b2f7874773ba |
BLAKE2b-256 | 1c859122dafb10bbfc9c8dfb9796731d644d608e6d158c00a5ff02eca8424841 |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6cc91008a2e03681bd99f1bd893ea6f39a294ae3d07fb342e6edae5cf88549af |
MD5 | e6ab27c6972272beb87e5f06cac325cb |
BLAKE2b-256 | 0aad130a5d3b0f3a0b580256b5fdd4be9997cda233189587e6cc2c915b2e8929 |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 2819eaa2af53a7bdf8fbb42cee8b087d925487524c965d9d08c8f526fe9dbd87 |
MD5 | f3efa5275a0195a324462794d750152b |
BLAKE2b-256 | 5727e5a37d4ac4670e224d17a9a553662f75fb5a6fdba913b9bad6bf4f3012df |
Hashes for faust_cchardet-2.1.9-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8358119c0e22dfe227187cabeacaecc7f5bcb5fcec7734ba7f6788cf4c98d96c |
MD5 | ab31da187300c65722404c8170fecc46 |
BLAKE2b-256 | cce7ffa8a91c245f1cb1e43b19167461b8033f4261b1e95eb3f3abb00744fae4 |
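The digests listed above can be used to check a downloaded wheel before installing it. The following is a minimal sketch using only the standard library; `file_digest` and `matches` are hypothetical helper names, and the `"blake2b-256"` label is mapped here to BLAKE2b with a 32-byte digest, which is an assumption about how that column is computed.

```python
import hashlib
import hmac


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of the file at `path`, reading it in chunks."""
    if algorithm == "blake2b-256":
        # Assumed mapping: BLAKE2b truncated to a 256-bit (32-byte) digest.
        h = hashlib.blake2b(digest_size=32)
    else:
        h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def matches(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the file's digest with an expected hex string, case-insensitively."""
    return hmac.compare_digest(file_digest(path, algorithm), expected_hex.lower())
```

For example, after downloading `faust_cchardet-2.1.9-cp39-cp39-manylinux2014_aarch64.whl`, calling `matches(path, "90bcac2a8b460ca1c8db50f5191d32bd4832ac7394763a62afedbe6958c2a945")` should return `True` if the file is intact. pip can perform the same check itself when a requirements file pins hashes and `--require-hashes` is used.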