cChardet is a high-speed universal character encoding detector.
cChardet
NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet, which is no longer maintained.
To install:
pip install faust-cchardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.
Supported Languages/Encodings
International (Unicode)
UTF-8
UTF-16BE / UTF-16LE
UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431
Arabic
ISO-8859-6
WINDOWS-1256
Bulgarian
ISO-8859-5
WINDOWS-1251
Chinese
ISO-2022-CN
BIG5
EUC-TW
GB18030
HZ-GB-2312
Croatian
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Czech
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Danish
ISO-8859-1
ISO-8859-15
WINDOWS-1252
English
ASCII
Esperanto
ISO-8859-3
Estonian
ISO-8859-4
ISO-8859-13
Windows-1252
Windows-1257
Finnish
ISO-8859-1
ISO-8859-4
ISO-8859-9
ISO-8859-13
ISO-8859-15
WINDOWS-1252
French
ISO-8859-1
ISO-8859-15
WINDOWS-1252
German
ISO-8859-1
WINDOWS-1252
Greek
ISO-8859-7
WINDOWS-1253
Hebrew
ISO-8859-8
WINDOWS-1255
Hungarian
ISO-8859-2
WINDOWS-1250
Irish Gaelic
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Italian
ISO-8859-1
ISO-8859-3
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Japanese
ISO-2022-JP
SHIFT_JIS
EUC-JP
Korean
ISO-2022-KR
EUC-KR / UHC
Lithuanian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Latvian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Maltese
ISO-8859-3
Polish
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Portuguese
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Romanian
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
Russian
ISO-8859-5
KOI8-R
WINDOWS-1251
MAC-CYRILLIC
IBM866
IBM855
Slovak
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Slovene
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
M
Example
# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not str.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()
result = chardet.detect(msg)
print(result)
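Once an encoding has been detected, the usual next step is to decode the bytes with it. The following is a minimal sketch, not part of the project: `decode_detected` and its `fallback` parameter are hypothetical names, and it assumes `detect()` returns a dictionary with an `encoding` key whose value may be `None` or a name for which Python has no codec.

```python
try:
    import cchardet  # provided by `pip install faust-cchardet`
except ImportError:
    cchardet = None  # lets the helper be used with another detect() callable


def decode_detected(raw, detect=None, fallback="utf-8"):
    """Decode `raw` bytes with the encoding that `detect(raw)` reports.

    Hypothetical helper: `fallback` is used when no encoding is reported
    or when Python has no codec for the reported name.
    """
    if detect is None:
        if cchardet is None:
            raise RuntimeError("cchardet is not installed")
        detect = cchardet.detect
    result = detect(raw)
    encoding = result.get("encoding") or fallback
    try:
        # errors="replace" keeps going if a few bytes do not fit the guess.
        text = raw.decode(encoding, errors="replace")
    except LookupError:
        # The reported name is not a codec Python knows about.
        text = raw.decode(fallback, errors="replace")
    return text, result
```

With the sample file from the example above, `text, result = decode_detected(msg)` would return the decoded string alongside the raw detection result.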
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
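The contents of tests/bench.py are not shown here. As a rough illustration of how a calls-per-second figure could be obtained, the sketch below times repeated calls to any `detect`-style function. The name `calls_per_second` and the `repeat` parameter are hypothetical, and the project's own script may measure differently.

```python
import time


def calls_per_second(detect, data, repeat=10):
    """Return how many detect(data) calls complete per second.

    Hypothetical helper for illustration only; `repeat` controls how many
    calls are timed in one run.
    """
    start = time.perf_counter()
    for _ in range(repeat):
        detect(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very coarse clock.
    return repeat / elapsed if elapsed > 0 else float("inf")
```

Calling `calls_per_second(cchardet.detect, msg)` and `calls_per_second(chardet.detect, msg)` on the same bytes would give two figures that can be compared directly.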
Results
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
RAM: DDR4-3200 64GB
Platform: Ubuntu 20.04 amd64
Python 3.9.0
Library | Request (call/s) |
---|---|
chardet v3.0.4 | 0.46 |
cchardet v2.1.7 | 1404.05 |
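To put the two throughput figures in perspective, the arithmetic below derives the relative speedup and the approximate time per call. It uses only the numbers reported above, so it applies to the listed machine and library versions, not to cChardet in general.

```python
# Throughput figures copied from the results above (calls per second).
chardet_rate = 0.46
cchardet_rate = 1404.05

# Ratio of the two rates, and the time one call takes at each rate.
speedup = cchardet_rate / chardet_rate
chardet_seconds_per_call = 1 / chardet_rate
cchardet_ms_per_call = 1000 / cchardet_rate

print(f"speedup: about {speedup:.0f}x")
print(f"chardet: about {chardet_seconds_per_call:.2f} s per call")
print(f"cchardet: about {cchardet_ms_per_call:.2f} ms per call")
```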
LICENSE
See COPYING file.
Contact
Platform
Supported
Windows i686, x86_64
Linux i686, x86_64
macOS x86_64
Not supported
CHANGES
2.x.x
2.1.7 (2020-10-27)
support Python 3.9
drop support for Python 3.5
2.1.6 (2020-03-17)
drop support for Python 2.7
support GitHub Actions
update dev-dependencies
2.1.5 (2019-09-27)
update language models (uchardet)
add iso8859-2 test but disabled it
support Python 3.8
drop support for Python 3.4
2.1.4 (2018-09-27)
disable LTO because it degraded performance
2.1.3 (2018-09-26)
support Python 3.7
2.1.2 (2018-09-26)
enable LTO for wheel builds
update Cython
2.1.1 (2017-07-01)
fix different results with different chunk sizes
fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior
include COPYING in package
2.1.0 (2017-05-15)
2.0.1 (2017-04-25)
2.0.0 (2017-04-06)
Improve tests
2.0a4 (2017-04-05)
Update uchardet repo (Fix buffer overflow)
2.0a3 (2017-03-29)
Implement UniversalDetector (like chardet)
2.0a2 (2017-03-28)
Update uchardet repo (Fix memory leak)
2.0a1 (2017-03-28)
Replace uchardet-enhanced with uchardet
Remove Detector class
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests