Universal encoding detector. This library is faster than chardet.
Project description
cChardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to charsetdetect.
Supported codecs
Big5
EUC-JP
EUC-KR
GB18030
HZ-GB-2312
IBM855
IBM866
ISO-2022-CN
ISO-2022-JP
ISO-2022-KR
ISO-8859-2
ISO-8859-5
ISO-8859-7
ISO-8859-8
KOI8-R
Shift_JIS
TIS-620
UTF-8
UTF-16BE
UTF-16LE
UTF-32BE
UTF-32LE
WINDOWS-1250
WINDOWS-1251
WINDOWS-1252
WINDOWS-1253
WINDOWS-1255
EUC-TW
X-ISO-10646-UCS-4-2143
X-ISO-10646-UCS-4-3412
x-mac-cyrillic
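The names in this list are the detector's labels, and not every one of them is guaranteed to be a codec name that Python itself can decode. As a minimal sketch (the helper names `python_codec_name` and `split_by_codec_support` are hypothetical and not part of cChardet), the standard `codecs.lookup` call can be used to check which detected names map to a codec in the running interpreter:

```python
import codecs


def python_codec_name(detected_name):
    """Return Python's canonical codec name for a detected label, or None.

    Hypothetical helper: relies only on the standard library's codec registry.
    """
    try:
        return codecs.lookup(detected_name).name
    except LookupError:
        # The interpreter has no codec registered under this label.
        return None


def split_by_codec_support(names):
    """Partition labels into {label: python_codec} and a list of unknown labels."""
    supported, unsupported = {}, []
    for name in names:
        codec = python_codec_name(name)
        if codec is None:
            unsupported.append(name)
        else:
            supported[name] = codec
    return supported, unsupported


supported, unsupported = split_by_codec_support(["UTF-8", "Shift_JIS", "EUC-TW"])
print(supported)
print(unsupported)
```

A label that ends up in `unsupported` can still be reported by the detector; it only means the bytes need a fallback or a third-party codec to be decoded.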
Requirements
Example
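# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the file as raw bytes; detection works on bytes, not on decoded text.
with open(r"src/tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns the guessed encoding for the given bytes.
result = chardet.detect(msg)
print(result)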
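A common next step is to decode the bytes with the detected encoding. The sketch below assumes the result is a dict with `encoding` and `confidence` keys, as in chardet-style detectors; the helper `decode_detected`, its `min_confidence` threshold, and the UTF-8 fallback are illustrative choices, not part of cChardet. The detector is passed in as a callable, so the helper works with `cchardet.detect` or any compatible function.

```python
def decode_detected(data, detect, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using the encoding reported by a detect() callable.

    Hypothetical helper. `detect` is assumed to return a dict with
    "encoding" (str or None) and "confidence" (float or None).
    """
    result = detect(data)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # Fall back when the detector is unsure or reports no encoding at all.
    if encoding is None or confidence < min_confidence:
        encoding = fallback

    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # Python has no codec registered under the reported name.
        return data.decode(fallback, errors="replace")


# Stand-in detector so the sketch runs without cchardet installed.
def fake_detect(data):
    return {"encoding": "SHIFT_JIS", "confidence": 0.99}


print(decode_detected("千夜一夜物語".encode("shift_jis"), fake_detect))
```

With cChardet installed, the same call would be `decode_detected(msg, chardet.detect)` using the names from the example above.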
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz
RAM: DDR3 1600 MHz 16 GB
Platform: Ubuntu 16.04 amd64
Python 2.7.12
Library | Request (call/s) |
---|---|
chardet | 0.26 |
cchardet | 1408.73 |
Python 3.5.2
Library | Request (call/s) |
---|---|
chardet | 0.28 |
cchardet | 1380.40 |
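The results are reported in calls per second. The contents of tests/bench.py are not shown here, so the following is only a sketch of how such a rate could be measured, not the project's benchmark; the function `calls_per_second` and its `min_seconds` parameter are hypothetical.

```python
import time


def calls_per_second(func, payload, min_seconds=0.2):
    """Call func(payload) repeatedly for at least min_seconds; return calls/s.

    Hypothetical helper for illustration, not the code in tests/bench.py.
    """
    if min_seconds <= 0:
        raise ValueError("min_seconds must be positive")

    calls = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        func(payload)
        calls += 1
        elapsed = time.perf_counter() - start
    return calls / elapsed


# Stand-in workload so the sketch runs without chardet or cchardet installed.
rate = calls_per_second(len, b"sample bytes", min_seconds=0.05)
print("%.2f call/s" % rate)
```

To compare the two libraries, the same function could be called once with `chardet.detect` and once with `cchardet.detect` on identical input bytes.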
License
The MIT License: src/cchardet
Licenses of other bundled libraries: see the src/ext directory.
Thanks
Contact
CHANGES
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
Download files
Source Distribution
Built Distributions
Hashes for cchardet-1.1.3-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 94876472946bf5de63abfa5630403353c0c2decc696d1da681021b33c471bc59 |
MD5 | 4c3a5997cc0e72f4392b84876bccfbb1 |
BLAKE2b-256 | 24870d36f2adbc7a6bf9721c8b0677d688c4c3702ca490ba485c373c33af9a34 |
Hashes for cchardet-1.1.3-cp36-cp36m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 558ffc9e653b07ac72fe37a594a4151ed389b3d19fb72286beee3eae592a4391 |
MD5 | 366ff14fe8703b43743668b10187c94b |
BLAKE2b-256 | 44fceb1b9ceec790985890971fe1237a2ba9aa6df6e262305fcf3eb34e0af0f3 |
Hashes for cchardet-1.1.3-cp36-cp36m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | c966fa91c53e36db3e1663f7c98070cb936a15b2d98a34a66b5833b147b9bc5a |
MD5 | 3f9f7c8cfc71b43b71ccb61bc822f3a0 |
BLAKE2b-256 | 765278fd5ac898bdd2a6260e47432b592b1b1c0ef971014fe6d6088b0b382a4d |
Hashes for cchardet-1.1.3-cp36-cp36m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 41ccfb1b5b0e9e8a02c6ccdd264df955fd5dfff414887ce160f3f7aa52b6f286 |
MD5 | 64b9a67946b0c69b44c2733ef094b794 |
BLAKE2b-256 | 5159380ecf08e86a300f53b2a0363581555785b3157d194180e499c19c471b7c |
Hashes for cchardet-1.1.3-cp35-cp35m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | fb64c00ac0184a1b6c93b19c7bd47218d60d8eba3c00b15ac802979444e5a1bb |
MD5 | 0da49f9b53a45cac7ed260d6695fddcc |
BLAKE2b-256 | b4eddc53e92499037a092ce840fe61fa4294205fee9a90bbc5dc212e5eeb751b |
Hashes for cchardet-1.1.3-cp35-cp35m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 45735175ae5e0504258a0f7cced071d5a60ea3e50598921dd3f7c4aa824b1037 |
MD5 | 827e1b036410d9895c99c64800813751 |
BLAKE2b-256 | 5b78d475ccab1295c6ea48af028414daebb37e7a22cb5f528747905f058bd8f7 |
Hashes for cchardet-1.1.3-cp35-cp35m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8da0a24ee36f7a407dab16e3b3e1ff87b58af34bd273089dbacd3f61ce3c002d |
MD5 | 5e6b46edbbfff5daf257479188c6cc8d |
BLAKE2b-256 | 81343ec71f11b0790813bd46df5ed040b83d671e1d6abed8bef9ca0dea073558 |
Hashes for cchardet-1.1.3-cp35-cp35m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | adebe077aec746a3144fab8cebf88766476a0f4881a50c71abb592be9112eb0b |
MD5 | 506fdab500cef37186378b3acc15a4e4 |
BLAKE2b-256 | 3eac6d9be15ee801575edd0efcd5323f2c7fdf4aee69bbcea7d8e2ff9d52401b |
Hashes for cchardet-1.1.3-cp34-cp34m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 20437fc0c812b3e167ddf71239ad6a9072aa1f8e1d3a1271330e446ae2edf4cb |
MD5 | 09f31198f4036a84b91f7f38c479dac2 |
BLAKE2b-256 | 1d3eed5c893c83f08fa4c8adfb53606059cfe2f5dab15f8ceb99235309a42dc3 |
Hashes for cchardet-1.1.3-cp34-cp34m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 665cf8a7f22224a4740ad992677f97b68b074284dad08ebc9dfd8fd5b4fe8d75 |
MD5 | d92f719f890229dc3de17176f676c85f |
BLAKE2b-256 | 2bca6f4ff795d524589e8076f833701ca0dea86da1687efc6719b79c29474df9 |
Hashes for cchardet-1.1.3-cp34-cp34m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 4bc4df2e8a8e418e4ad07f8114dcab066d3d83049a3d9f9d424abab1a0f45493 |
MD5 | 9d1585090b5ccb700878ebdeab2d7fa8 |
BLAKE2b-256 | d8f0181da08e086d62a8a57ab30d8ba0d42ddbb8b278365acbdd7205e5326794 |
Hashes for cchardet-1.1.3-cp34-cp34m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 04ec9fe899d55d5bb358ced4ba578f5242dc1fd73ac7ed3fb9f55672298ab0e0 |
MD5 | f7f8023a9943e695dc274e66f23d0aaa |
BLAKE2b-256 | 081be194b6a4517255ab0c6f18d88d7faf4686349904f57b8d2aca3ceb698c46 |
Hashes for cchardet-1.1.3-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 0597b2a4fcecca07bafd0afd27794c8f79160c8c8b3ac46aeff18d650c5a248e |
MD5 | 33d33ca6cf8546ae10a39219ee3b3f97 |
BLAKE2b-256 | d7466659f454405d1f557edc186991d36969210a27346d8d89bfdc2ab90eef40 |
Hashes for cchardet-1.1.3-cp27-cp27mu-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | d1272d39154a0314b76c33898d09cacd7d435654285fbcecc97b88250a26f1ce |
MD5 | 938030306acb70eb4352d3db6fed796c |
BLAKE2b-256 | 55e5de9e72ed7e27d91cd5551790b926a79c44cc294bafcc71415f71bb45815b |
Hashes for cchardet-1.1.3-cp27-cp27m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1b785adb648ed9a087e58f556cedb0cf90e3acece8f0a8efdd1c58d29778c428 |
MD5 | 0c568d4cfbab81e5c0505cf70bcf749c |
BLAKE2b-256 | 6adff735780ba8c037578c69a01b21964aa5e0147e3c6111dedd36079445f8e1 |
Hashes for cchardet-1.1.3-cp27-cp27m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | e88a932383e43de11c75567d3391e11a159ef8ed2524ab4f71a17020f3b7834f |
MD5 | 1b5c965f51f94b69bb298f8188c8d330 |
BLAKE2b-256 | 590ad163639985bf83d1dc4a20d95d6587e0f9c26d317e23ee7d0268441a6040 |
Hashes for cchardet-1.1.3-cp27-cp27m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | b885af572379d02ddda08f1f2ebe0b945d30c508072b53199c51e054b8d58478 |
MD5 | fbcf0f538d3ec9ddcc432921a406546e |
BLAKE2b-256 | 6dc5ebedd3b9dcd43ea7c23c3372d511d4528607089471cbfc211df00d156f52 |
Hashes for cchardet-1.1.3-cp27-cp27m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | fa0b85dca774101bd621ee29bd0b8b25ca1bb1affa463e145d3daf1d950debd7 |
MD5 | 97c97b54914b0a9c66d16ccc16f0aed0 |
BLAKE2b-256 | bff75ea66fea430b9c7581de57c4e5ee40879c5dd8861e0fb4a6c32505050a8e |