Universal encoding detector. This library is faster than chardet.
cChardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to charsetdetect.
Supported codecs
Big5
EUC-JP
EUC-KR
GB18030
HZ-GB-2312
IBM855
IBM866
ISO-2022-CN
ISO-2022-JP
ISO-2022-KR
ISO-8859-2
ISO-8859-5
ISO-8859-7
ISO-8859-8
KOI8-R
Shift_JIS
TIS-620
UTF-8
UTF-16BE
UTF-16LE
UTF-32BE
UTF-32LE
WINDOWS-1250
WINDOWS-1251
WINDOWS-1252
WINDOWS-1253
WINDOWS-1255
EUC-TW
X-ISO-10646-UCS-4-2143
X-ISO-10646-UCS-4-3412
x-mac-cyrillic
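A detected name is not guaranteed to be a codec name that Python itself can decode. The sketch below uses only the standard library to check which of the names listed above `codecs.lookup` accepts on the running interpreter. The helper name `python_codec_or_none` is hypothetical and not part of cChardet, and the outcome may differ between Python versions.

```python
import codecs

# Encoding names copied from the list above.
SUPPORTED = [
    "Big5", "EUC-JP", "EUC-KR", "GB18030", "HZ-GB-2312", "IBM855", "IBM866",
    "ISO-2022-CN", "ISO-2022-JP", "ISO-2022-KR", "ISO-8859-2", "ISO-8859-5",
    "ISO-8859-7", "ISO-8859-8", "KOI8-R", "Shift_JIS", "TIS-620", "UTF-8",
    "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "WINDOWS-1250",
    "WINDOWS-1251", "WINDOWS-1252", "WINDOWS-1253", "WINDOWS-1255", "EUC-TW",
    "X-ISO-10646-UCS-4-2143", "X-ISO-10646-UCS-4-3412", "x-mac-cyrillic",
]


def python_codec_or_none(name):
    """Return Python's canonical codec name for `name`, or None if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# Names from the list that this interpreter has no codec for.
unsupported = [n for n in SUPPORTED if python_codec_or_none(n) is None]
print("No built-in Python codec for:", unsupported)
```

Names that end up in `unsupported` need separate handling (for example, a fallback encoding) before calling `bytes.decode`.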
Requirements
Example
# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample file as raw bytes.
with open(r"src/tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# Detect the encoding of the bytes and print the result.
result = chardet.detect(msg)
print(result)
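One way the detection result might be consumed is sketched below. It assumes Python 3 and assumes the result is a chardet-style dict with `encoding` and `confidence` keys. The helper `decode_detected` and the `fake_result` values are hypothetical stand-ins, so the snippet runs without cChardet installed.

```python
def decode_detected(data, result, fallback="utf-8"):
    """Decode `data` using a detect()-style result dict.

    Falls back to `fallback` when no encoding was detected or when
    Python has no codec for the detected name.
    """
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # The detected name is not a codec Python knows about.
        return data.decode(fallback, errors="replace")


# Illustrative input: Shift_JIS bytes and a made-up detection result
# (not measured output from cChardet).
sample = "千夜一夜物語".encode("shift_jis")
fake_result = {"encoding": "SHIFT_JIS", "confidence": 0.99}

text = decode_detected(sample, fake_result)
print(text)
```

In real use, `fake_result` would be replaced by the value returned from `chardet.detect(msg)`.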
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz
RAM: DDR3 1600MHz 16GB
Platform: Ubuntu 16.04 amd64
Python 2.7.12
Library | Request (call/s) |
---|---|
chardet | 0.26 |
cchardet | 1408.73 |
Python 3.5.2
Library | Request (call/s) |
---|---|
chardet | 0.28 |
cchardet | 1380.40 |
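The contents of tests/bench.py are not shown here, so the following is only a sketch of how a calls-per-second figure like the ones above could be measured, and of the speedup the reported numbers imply. The function names `calls_per_second` and `speedup` are hypothetical.

```python
import time


def calls_per_second(func, data, repeat=100):
    """Call func(data) `repeat` times and return the average calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    return repeat / elapsed if elapsed > 0 else float("inf")


def speedup(fast, slow):
    """Ratio between two calls-per-second figures."""
    return fast / slow


# Ratios implied by the figures reported above.
print(round(speedup(1408.73, 0.26)))  # Python 2.7.12
print(round(speedup(1380.40, 0.28)))  # Python 3.5.2
```

With the reported figures, cchardet comes out roughly 5,400 times faster than chardet on Python 2.7.12 and roughly 4,900 times faster on Python 3.5.2. To time a real detector, pass its `detect` function and a bytes payload to `calls_per_second`.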
License
MIT License: applies to src/cchardet.
Other libraries' licenses: see the src/ext directory.
Thanks
Contact
CHANGES
1.x.x
none
1.1.1
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
Built Distributions
Hashes for cchardet-1.1.1-cp35-cp35m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 23bcb645ae49ba1dd21dc4e9daa44f7afbf401030f0f3d85490b5494af1bb19f |
MD5 | fdb823750710be40b1c19be09f97358f |
BLAKE2b-256 | 2a390a67d8482f3560daf130567a2124bb2a09b10fe3baba50de184b5cfc3372 |
Hashes for cchardet-1.1.1-cp35-cp35m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 5ab9bdde38bef4ea875fde9091ff4e8461686bfcfe25078622dd82e71755ebbe |
MD5 | bb1539cd6fdf20deea6b4103c58c309d |
BLAKE2b-256 | d5c78525ee6f855b9ffb01581f8038e7789dbc63ccced209cb4d723f6f541c8c |
Hashes for cchardet-1.1.1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e2c330c47cf2668f2b2d054deba47f2d7c497f816e37b07a0dbe117b1a25d49d |
MD5 | b806c0fb3680a0e43f95d8bea231746d |
BLAKE2b-256 | 47906331a26c9d8578187a92c645ef16904f11bf5ecd77a7b1b479e13a87a15e |
Hashes for cchardet-1.1.1-cp35-cp35m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | c087acc5cddb90429bb5d20770e45de03ce76519aca3905d5882a48f3b1efc9e |
MD5 | e6b979b8a31fe41150f7bfb75d3ab9be |
BLAKE2b-256 | 0bd8bba9286dece92e89c61257a10df190b4e467b82fe47d0b2270dfb1f8ccd6 |
Hashes for cchardet-1.1.1-cp34-cp34m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8c0b476f7ea0337982636cf10b5cefd4cad90446e5605f91dcba20cc88365016 |
MD5 | 3ca3096c5243672af3675691cc64da64 |
BLAKE2b-256 | f5d10bd534a99cac7a99d5a84445bff9aa2b3a56fb6b30a3ced3eac55faa8d07 |
Hashes for cchardet-1.1.1-cp34-cp34m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 623613fe4940679230881d9f16459013a4c5a7cb8b9b1d592d871893a7bc071e |
MD5 | b0df7f224d7488c462065eac577deab7 |
BLAKE2b-256 | e1aa7b80d28f2c9903fff84404eb31b8a78dd22ad8a04ed665820780e620cd02 |
Hashes for cchardet-1.1.1-cp34-cp34m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | a6ca59c776fc6287138047d5a3b1dadaf97d9e1774912e02ae49468e6c7c9ccb |
MD5 | 314f43bec4d9d76356ab4eb0de0493c5 |
BLAKE2b-256 | ffd2a3aec93aea41fa60d8d2892db3cb30d42cbe042011c8af2bf9487af88b08 |
Hashes for cchardet-1.1.1-cp34-cp34m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 5b40be88887c795977b518e63e02da5705f4bb1b81c07b6e48d227fe7c738fd1 |
MD5 | 0fd7553caa8ae03772b55049bbb2fd4c |
BLAKE2b-256 | 3d055573941fb233ae1daa49e8e1996626a608a0f4cd5234a6c5e0225188bdc5 |
Hashes for cchardet-1.1.1-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 4c7669b0f5c68d5597d30ac16ffdafad980b3364a74a6954e25d8c35e5b9b04c |
MD5 | ddd2b2664925213c1c6ef481295ebdc2 |
BLAKE2b-256 | b2df3fa1aea31d165cc2005d47482974a6aa41340678af44b8d7a7a0bd2e0dc1 |
Hashes for cchardet-1.1.1-cp27-cp27mu-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | a8caeb556bf2e7c5fbbc93721ef59c756c7a54295754c1d9d2fb7c297f786ae2 |
MD5 | 70f00b876fbb2f1f84ca1df93a95cf52 |
BLAKE2b-256 | 38386e53156499e136aef7179ae649d61592bbc3fa1c552f263870810c48932a |
Hashes for cchardet-1.1.1-cp27-cp27m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 1094a4a2ccc0c9f576032d366ae37c51186d28c94d67ad903bda524025fccb5d |
MD5 | 4470678091ecd4cec1c57a63cf9055e3 |
BLAKE2b-256 | 89155da6c6a5fc4deec952d02d28a566e5bc34990f9f0530130da9b289134c3c |
Hashes for cchardet-1.1.1-cp27-cp27m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 707c518a03662fbf365234d244bf0726e18bbe7d1f705085b19bf2198e1fe5c7 |
MD5 | 90c23ac358fc4e101a41eea7242b73cc |
BLAKE2b-256 | 2630e83c5c7a295224ee6ac9b8b6372362b1dc16c77bc3391d59ab23fb49b7b0 |
Hashes for cchardet-1.1.1-cp27-cp27m-manylinux1_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 57e52276480a1a6029c1e7e9d10dbc628ef27b8a49bcf3f80fa40aaf09ceeaf4 |
MD5 | 3fccfb8dcfaa0722288e64a6f9c139ff |
BLAKE2b-256 | abcd413069cef5f9aaa9a4614b9e1ca979bfd841d937e016495368c547610fc1 |
Hashes for cchardet-1.1.1-cp27-cp27m-manylinux1_i686.whl
Algorithm | Hash digest |
---|---|
SHA256 | 7a46f263344d87e49319dc71e9bfb427aec694daedeb7dd2b27d344864858c8d |
MD5 | 395ba426882db88e21c7f7091d1214fa |
BLAKE2b-256 | 1a9a88f60179a46175b22b88e831ffafe9a26780841d700837f9c3a913cf7256 |