cChardet is a high-speed universal character encoding detector.
Reason this release was yanked: it does not import.
cChardet
NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet since the original project is no longer maintained.
To install:
pip install faust-cchardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.
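Because a release can install cleanly and still fail at import time, it may be worth checking the import right after installing. The following is a minimal sketch, not part of the project; the helper name `try_import` is hypothetical, and the `__version__` attribute is read defensively in case the module does not expose it.

```python
import importlib


def try_import(module_name):
    """Return (module, None) on success or (None, error message) on ImportError."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return None, str(exc)
    return module, None


# "cchardet" is the import name used in the Example section of this README.
module, error = try_import("cchardet")
if module is None:
    print(f"cchardet is not importable: {error}")
else:
    # __version__ is assumed to exist; fall back to "unknown" if it does not.
    print("cchardet imported, version:", getattr(module, "__version__", "unknown"))
```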
Supported Languages/Encodings
International (Unicode)
UTF-8
UTF-16BE / UTF-16LE
UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431
Arabic
ISO-8859-6
WINDOWS-1256
Bulgarian
ISO-8859-5
WINDOWS-1251
Chinese
ISO-2022-CN
BIG5
EUC-TW
GB18030
HZ-GB-2312
Croatian
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Czech
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Danish
ISO-8859-1
ISO-8859-15
WINDOWS-1252
English
ASCII
Esperanto
ISO-8859-3
Estonian
ISO-8859-4
ISO-8859-13
Windows-1252
Windows-1257
Finnish
ISO-8859-1
ISO-8859-4
ISO-8859-9
ISO-8859-13
ISO-8859-15
WINDOWS-1252
French
ISO-8859-1
ISO-8859-15
WINDOWS-1252
German
ISO-8859-1
WINDOWS-1252
Greek
ISO-8859-7
WINDOWS-1253
Hebrew
ISO-8859-8
WINDOWS-1255
Hungarian
ISO-8859-2
WINDOWS-1250
Irish Gaelic
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Italian
ISO-8859-1
ISO-8859-3
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Japanese
ISO-2022-JP
SHIFT_JIS
EUC-JP
Korean
ISO-2022-KR
EUC-KR / UHC
Lithuanian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Latvian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Maltese
ISO-8859-3
Polish
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Portuguese
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Romanian
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
Russian
ISO-8859-5
KOI8-R
WINDOWS-1251
MAC-CYRILLIC
IBM866
IBM855
Slovak
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Slovene
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
M
Example
# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample in binary mode so detect() receives raw bytes.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()
result = chardet.detect(msg)
print(result)
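A common next step is to decode the bytes with the detected encoding. The sketch below assumes `detect()` returns a dict with an `encoding` key that may be `None`, as in chardet; the helper `decode_bytes` and its `fallback` parameter are hypothetical and not part of cChardet. The detector is passed in as a callable so the helper can be exercised without the extension installed.

```python
def decode_bytes(data, detect=None, fallback="utf-8"):
    """Decode bytes using the encoding reported by a detect()-style callable."""
    if detect is None:
        # Imported lazily so the helper can be tested with a stand-in detector.
        import cchardet
        detect = cchardet.detect
    result = detect(data)
    # Assumption: the result is a dict whose "encoding" value may be None.
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back and replace bad bytes.
        return data.decode(fallback, errors="replace")
```

With the sample above, `decode_bytes(msg)` would return the text as a `str`, provided Python has a codec for the reported encoding name.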
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
RAM: DDR4-3200 64GB
Platform: Ubuntu 20.04 amd64
Python 3.9.0
Library | Request (call/s) |
---|---|
chardet v3.0.4 | 0.46 |
cchardet v2.1.7 | 1404.05 |
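For readers who want to reproduce a calls-per-second figure on their own data, a rate like the one in the table can be measured roughly as follows. This is an illustrative sketch, not the project's `tests/bench.py`; the function name `calls_per_second` and the `repeat` parameter are hypothetical.

```python
import time


def calls_per_second(detect, data, repeat=100):
    """Call detect(data) `repeat` times and return the measured call rate."""
    start = time.perf_counter()
    for _ in range(repeat):
        detect(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very fast callables or coarse clocks.
    return repeat / elapsed if elapsed > 0 else float("inf")
```

Passing `chardet.detect` and `cchardet.detect` with the same bytes gives two comparable rates; absolute numbers depend on the hardware and the input.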
LICENSE
See COPYING file.
Contact
Platform
Support
Windows i686, x86_64
Linux i686, x86_64
macOS x86_64
Do not Support
CHANGES
2.x.x
2.1.7 (2020-10-27)
support Python 3.9
drop support for Python 3.5
2.1.6 (2020-03-17)
drop support for Python 2.7
support GitHub Actions
update dev-dependencies
2.1.5 (2019-09-27)
update language models (uchardet)
add iso8859-2 test but disabled it
support Python 3.8
drop support for Python 3.4
2.1.4 (2018-09-27)
disable LTO because it degraded performance
2.1.3 (2018-09-26)
support Python 3.7
2.1.2 (2018-09-26)
enable LTO for wheel builds
update Cython
2.1.1 (2017-07-01)
fix different results being returned for different chunk sizes
fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior
include COPYING in package
2.1.0 (2017-05-15)
2.0.1 (2017-04-25)
2.0.0 (2017-04-06)
Improve tests
2.0a4 (2017-04-05)
Update uchardet repo (Fix buffer overflow)
2.0a3 (2017-03-29)
Implement UniversalDetector (like chardet)
2.0a2 (2017-03-28)
Update uchardet repo (Fix memory leak)
2.0a1 (2017-03-28)
Replace uchardet-enhanced with uchardet
Remove Detector class
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
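The 2.0a3 entry above mentions a UniversalDetector modeled on chardet's. Incremental detection with such an object might look like the sketch below. It assumes the chardet-style interface (`feed()`, `close()`, `done`, `result`); the helper `detect_incrementally` is hypothetical and accepts any detector with that interface.

```python
def detect_incrementally(chunks, detector=None):
    """Feed byte chunks to a chardet-style detector and return its result."""
    if detector is None:
        # Assumption: cchardet exposes UniversalDetector with the chardet-like API.
        from cchardet import UniversalDetector
        detector = UniversalDetector()
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            # The detector is confident enough; stop reading further input.
            break
    detector.close()
    return detector.result
```

For a large file, `chunks` could be `iter(lambda: f.read(4096), b"")` on a file opened in binary mode, so the whole file need not be loaded when the detector finishes early.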
Built Distributions
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 747b067dccdf7c2f97cdf628df0ffdea137de0ec5712a8c97b8ee71a29be67c4 |
MD5 | b41ce598dc0ec905b7ce582d79225866 |
BLAKE2b-256 | c9039659d72b8d2bdc527a4026f9927519a777316de467bc55d89c4b2ea7dba6 |
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 9b944dc7fef35ece04a1eb8307a5b6ef5347e4bc9ac8b82a7e2158c46bd1dbc3 |
MD5 | f85fd7f60ee194a67d3dab0a85d1cabb |
BLAKE2b-256 | 0e6f2500411b83678400a231bb40a9ab90c2427d719a8cd9630a67ecf1f4adc7 |
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 436fa5be6e8fd68191e2ef0d8bb7005dc159ae142a716f6e941c1185bb59d2b0 |
MD5 | 6e01ee7615158ed89a34c0e4e1c17af8 |
BLAKE2b-256 | b62bfacd81a487adbfb9ac52f358153ad24069b5b0e34676c457473c27d993d2 |
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 80d67c68e442c7be84e6b3ffc278f9494c9c6b5c6f8e728c98cf7c67bd89fdc2 |
MD5 | 819d8848161554501471f9d629ab5404 |
BLAKE2b-256 | 32649a9976325eefc8e0e37dca5338ff5e6c13637766ddbd3903f14fd9bae066 |
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 86e55aee107738f9eb90147eb19b6e0c9c1ab536cec84eec65920f94ed810651 |
MD5 | fb45e390487e3a2e0759d793ac06ebd5 |
BLAKE2b-256 | e0c12d151370c99db023d6d4723c20d9ff0d49b722af580ad10667f72cd1a087 |
Hashes for faust_cchardet-2.1.9rc5-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | cb6f5ca7fab812270d1c0a864f35f457a86dfdd61ac09c64c077908da0612ad9 |
MD5 | 21bb503ab54614305713cd23acae3e80 |
BLAKE2b-256 | 994ed7b7e491c1deb981fd581c3fc39b179b4349548cdc6a16e2cc49296a27f7 |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | f794fbdfa6335a0cff01b532060fabfa247da6ee7830f848513f7f7df1d8ff5e |
MD5 | 5b7839e5f943fe62656b3a849bc97b16 |
BLAKE2b-256 | 62492255be58a7821df0eb05d479e7847e0e7fb0c7fddb60c094f361ed48be4d |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | cca35f2b67ed1dd2dd01e2c41d4a36d6a42d8009fe454eb3ba100db394a1254a |
MD5 | 9bacde3a5426efe3bab53738582c706b |
BLAKE2b-256 | 22b69776b5e715f35bb6eed5a6cc084981915c047b0f3a0de11ec51a82914bf0 |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | d3c572466565fc3b29586ffcc625949d2e5a4563013d89859f3a7c192622944d |
MD5 | ff21b656aa31daf660c5867e4ecf47ea |
BLAKE2b-256 | 1d4f056bb12d408b56c4c0cb69d94050d2ad6ae0ecf67d39a457cff4e4abcab9 |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1dd812769d35d4dece61f08b7a5b15cc0ae70dc745889bff705dcbe25cffb45b |
MD5 | 0ecfcd84deb30bb7e3c49664b6318dd4 |
BLAKE2b-256 | b48245bbcddde860edf4a02e2b9975888b91ecbdc424e932367fbc8bf242a3ac |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3984f9c9c50e5a528fdd7963512b4e9741c9a08a14de7fbaefd3fd520d740899 |
MD5 | ad9ffc8771d02582d0db9daca08b96ea |
BLAKE2b-256 | f6cfceea36cb3d34802c376db2e512dad2c0fb2184df2c2c71891e6effca0390 |
Hashes for faust_cchardet-2.1.9rc5-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 83bc7d6bf0d7f143f853a5a8625e317b6441d4e9496a66075bfb41b07d122783 |
MD5 | 2a7bcb9195f8d500e5733b73b0f2929b |
BLAKE2b-256 | 996b698cf52baa31d8561b58828267d5f86b5594d58126db842f107bb177937a |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 5ea3f4bfc3cd465149e24ac19d77ea8a7bdf71aaa8721adddf0f86740fab183e |
MD5 | 3c66d93132839260be86d8a9754e0d2b |
BLAKE2b-256 | 82a3b8c9c456e89f73ec6baeff3119bfc3b5ed8e7153395c28749beb57bce9c9 |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e7536a2f1195550868f876ffeb0f35cbfb72afb811737ecf9747a73275da48b9 |
MD5 | 562a4f5bb55e936c609472b7a3e13279 |
BLAKE2b-256 | 0d33994ad5f20e47929ca708d04d9a89331aa1f5b6390478575c7f3deb744bb8 |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1c67c4ffd49026fc3d4be348a789a50372090e4e1ad66155e67d957c2220fab8 |
MD5 | 1e5935e77cc225dd9eacdc71a72c7394 |
BLAKE2b-256 | 260d8131eb2d491e260374dfcb92e874fdbcf5fcab3f88f2fbbf0d760958d5b6 |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 9d5a0289d82d4147479310e448b315e6d83e59dd9fc12aa28935072e054892b5 |
MD5 | 76c769eda1126bf942e18e3eaad0c2d7 |
BLAKE2b-256 | 07a0e8d719678c8524639611e43c823c27cab109bc9c2320c14b69a09cc525f2 |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 95df6842f10faa6c94dfd0f05c86123475a02424fdd4cfa4c0493a838fd88f9d |
MD5 | 8fec2788aea25e303f1fe6afe6876a30 |
BLAKE2b-256 | a4468aebeef47b6664acd963f447b982da380048ef2bddaef528e44250a2725f |
Hashes for faust_cchardet-2.1.9rc5-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1ae7ca8715711b9aecb04fd4f245d70c612567292a5c7a63a3c41e07dd77ae69 |
MD5 | aafcc7d1098fa2c2462773134ca74614 |
BLAKE2b-256 | 4dfa47da186e1c1cb497ae75704985f9c94501725074b90ca989ce0a2010d610 |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-manylinux2014_aarch64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e45119fef4f7ae850e43624f49e4a5e78918014fc1b319e336e8ae7c8ae401af |
MD5 | e12256396625005cb779df94cd444324 |
BLAKE2b-256 | dba5948a61a34c16f68df895621de77c24141a6dc8d3da9a91c00c3f5b8d7524 |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 938443542dc84deff55b1995a6481d5aa682f346212c5100f4ffc50bd809147e |
MD5 | 76815ca6f083f48395adbfadd116b0cf |
BLAKE2b-256 | 3d023386fe7359095a3281e265247ea3f2bccba355bf8dc1ee39abee733c2aea |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-manylinux2010_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 62216851077ef29921308a1c0e0633cd43e8769471bddcb6b6c57178f3e5f719 |
MD5 | 471d522a65f72159583739653285771c |
BLAKE2b-256 | d06448256cae481ffc1f5f0bf66afef19a9d11cd8695acfa60d908ac8802d7fb |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 39dc84b87ccc8cd7b74afc471ee99c14af4a141c7d855405edc9f4eb55290dda |
MD5 | 9b803d963f87f3b2e1e6b0ac6f88b49a |
BLAKE2b-256 | 1ef886c22f24e8c452f03129e00e57a1341e6a4e48a4f0e2be13e7507a9d7a13 |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | efbc7150c6fc999a7507246bff9a5907a3c3cbe9e4786d6f228c1c24283d221c |
MD5 | dcea5cb740d48f43de1f66c0b5faa5bd |
BLAKE2b-256 | 7baab33f1e74143447419d078a54b45d499e906b23b2ba1d3d2b58af205864f9 |
Hashes for faust_cchardet-2.1.9rc5-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | ae6d48cd44578cf0475c748d423a4bc692b67cc5d9e50cf74d1b05b0504490d3 |
MD5 | bef860c208c205e55e052080e62bf353 |
BLAKE2b-256 | 36287d3dd1266f631867d22531e00fd99a955bd45d5d821df76eade2533f1a46 |