Universal encoding detector. This library is faster than chardet.
cChardet
cChardet is a high-speed universal character encoding detector. It is a binding to charsetdetect.
Supported codecs
Big5
EUC-JP
EUC-KR
GB18030
HZ-GB-2312
IBM855
IBM866
ISO-2022-CN
ISO-2022-JP
ISO-2022-KR
ISO-8859-2
ISO-8859-5
ISO-8859-7
ISO-8859-8
KOI8-R
Shift_JIS
TIS-620
UTF-8
UTF-16BE
UTF-16LE
UTF-32BE
UTF-32LE
WINDOWS-1250
WINDOWS-1251
WINDOWS-1252
WINDOWS-1253
WINDOWS-1255
EUC-TW
X-ISO-10646-UCS-4-2143
X-ISO-10646-UCS-4-3412
x-mac-cyrillic
Requirements
Example
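# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the file as raw bytes; detection runs on bytes, not on decoded text.
with open(r"src/tests/testdata/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# Guess the encoding of the byte string and print the result.
result = chardet.detect(msg)
print(result)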
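The example above prints the detection result without using it. As a minimal sketch, assuming the result is a dict with `encoding` and `confidence` keys (as chardet's is), it could be used to decode the bytes. The helper `decode_detected`, its `min_confidence` threshold, and the `fallback` codec are hypothetical names for illustration and are not part of cChardet.

```python
def decode_detected(data, result, fallback="utf-8", min_confidence=0.5):
    """Decode bytes using a detection result, or a fallback codec.

    `result` is assumed to look like
    {"encoding": "SHIFT_JIS", "confidence": 0.99}.
    This helper is illustrative and not part of cChardet.
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    # Ignore missing or low-confidence guesses.
    if not encoding or confidence < min_confidence:
        encoding = fallback
    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # The detected name is not a codec Python knows; use the fallback.
        return data.decode(fallback, errors="replace")


# Stand-in for chardet.detect(msg); the values are made up for illustration.
sample = "千夜一夜物語".encode("shift_jis")
text = decode_detected(sample, {"encoding": "SHIFT_JIS", "confidence": 0.99})
```

Passing `errors="replace"` keeps a wrong guess from raising an exception, at the cost of replacement characters in the output; stricter callers may prefer to let the error surface.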
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz
RAM: DDR3 1600 MHz 16GB
Platform: Ubuntu 16.04 amd64
Python 2.7.12
Library | Request (call/s) |
---|---|
chardet | 0.26 |
cchardet | 1408.73 |
Python 3.5.2
Library | Request (call/s) |
---|---|
chardet | 0.28 |
cchardet | 1380.40 |
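The results are reported as calls per second. The contents of `tests/bench.py` are not shown here, so the following is only a sketch of how such a rate could be measured and how the speedup implied by the reported figures can be computed. The function `calls_per_second` and its `repeat` parameter are hypothetical, and `time.perf_counter` requires Python 3.3 or later.

```python
import time


def calls_per_second(func, data, repeat=100):
    """Return how many times per second func(data) completes.

    Illustrative only; this is not the project's bench.py.
    """
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    # Guard against a timer resolution too coarse for very fast calls.
    return repeat / elapsed if elapsed > 0 else float("inf")


# Speedup implied by the reported figures (cchardet rate / chardet rate).
speedup_py27 = 1408.73 / 0.26
speedup_py35 = 1380.40 / 0.28
print("Python 2.7.12: about %.0fx" % speedup_py27)
print("Python 3.5.2:  about %.0fx" % speedup_py35)
```

To compare the two libraries on the same input, one could call `calls_per_second(chardet.detect, msg)` and `calls_per_second(cchardet.detect, msg)` with both packages installed.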
License
The MIT License: src/cchardet
Other libraries' licenses: see the src/ext directory.
Thanks
Contact
CHANGES
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
Download files
Built Distributions
Hashes for cchardet-1.1.0-cp35-cp35m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 4cbf3a771064dc2e9a8fd6cce5ca49757ba4de0004de1336a98361d093f51ee4 |
MD5 | c636a5d530388749ddc4b22c84bda085 |
BLAKE2b-256 | 39c16b96fc0a90f92d8f95599e92cc4f8736ef00d4bd526491c28dce95a2a7fe |
Hashes for cchardet-1.1.0-cp35-cp35m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | 5614e275322fe51539f558faba656001ed5178483b410c4420a4cecdbcddb2cf |
MD5 | 7903f64c57fc7444f2fe63410c7f1579 |
BLAKE2b-256 | aafd078fd935d79a9cb3eb94736a32ede77ddb6676a949af97b1c1463c1eb100 |
Hashes for cchardet-1.1.0-cp34-cp34m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 8b7af1f99d2b95db5279abbb21962f2d51bb19323c5a0df4d89e76bac437a008 |
MD5 | aec6ced0c62ff10de2dab9e40f03393d |
BLAKE2b-256 | 5a82f2798fceeb6229885fda0521a167c540438cad0d20544d89142a23d760ff |
Hashes for cchardet-1.1.0-cp34-cp34m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | f989c2aad52fee2065ffcb23fd1834a8fb65b52b325ecd960059b9572a9d5f58 |
MD5 | 28175c0bc2cb3b814339c344634c2f56 |
BLAKE2b-256 | 87912960fdd209132dcef426699935ff93bf0e12be92b6fe163b955075c593e2 |
Hashes for cchardet-1.1.0-cp27-cp27m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 38ee772a5e6fc2e8c1b4cdd2aee7bd5dedd9313d145a107b200c0d4a066ec071 |
MD5 | 4b06d33d2cad708eef19c1cdb1a66ffc |
BLAKE2b-256 | 2ffa483ebd45ad7913d53da44e69319021643c03ecf9ed13d6cda4d881c41a8c |
Hashes for cchardet-1.1.0-cp27-cp27m-win32.whl
Algorithm | Hash digest |
---|---|
SHA256 | fdd92e61974e1fbe9e1537da08750c0a1f1bbc0c429ea6b42710572908106414 |
MD5 | 1dfdea2e0e2111d3d75e798e52b00cc2 |
BLAKE2b-256 | ad919641301b6d49855108b44e8e4cf35fb306eb9676c6f0b54adb117555418a |
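The digests listed above can be checked against a downloaded file. The following is a minimal sketch using Python's standard `hashlib`; the helper names `file_digest` and `matches` are hypothetical, and it assumes that "BLAKE2b-256" means BLAKE2b with a 32-byte digest (available in `hashlib` from Python 3.6).

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    `algorithm` is "sha256", "md5", or "blake2b-256" (assumed here to be
    BLAKE2b with a 32-byte digest).
    """
    if algorithm == "blake2b-256":
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with an expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, with the first wheel saved in the current directory, `matches("cchardet-1.1.0-cp35-cp35m-win_amd64.whl", "4cbf3a771064dc2e9a8fd6cce5ca49757ba4de0004de1336a98361d093f51ee4")` should return `True` if the download is intact.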