cChardet is a high-speed universal character encoding detector.
Reason this release was yanked:
Does not import
Project description
cChardet
NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet since the original project is no longer maintained.
To install:
pip install faust-cchardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.
Supported Languages/Encodings
International (Unicode)
UTF-8
UTF-16BE / UTF-16LE
UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431
Arabic
ISO-8859-6
WINDOWS-1256
Bulgarian
ISO-8859-5
WINDOWS-1251
Chinese
ISO-2022-CN
BIG5
EUC-TW
GB18030
HZ-GB-2312
Croatian
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Czech
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Danish
ISO-8859-1
ISO-8859-15
WINDOWS-1252
English
ASCII
Esperanto
ISO-8859-3
Estonian
ISO-8859-4
ISO-8859-13
Windows-1252
Windows-1257
Finnish
ISO-8859-1
ISO-8859-4
ISO-8859-9
ISO-8859-13
ISO-8859-15
WINDOWS-1252
French
ISO-8859-1
ISO-8859-15
WINDOWS-1252
German
ISO-8859-1
WINDOWS-1252
Greek
ISO-8859-7
WINDOWS-1253
Hebrew
ISO-8859-8
WINDOWS-1255
Hungarian
ISO-8859-2
WINDOWS-1250
Irish Gaelic
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Italian
ISO-8859-1
ISO-8859-3
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Japanese
ISO-2022-JP
SHIFT_JIS
EUC-JP
Korean
ISO-2022-KR
EUC-KR / UHC
Lithuanian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Latvian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Maltese
ISO-8859-3
Polish
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Portuguese
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Romanian
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
Russian
ISO-8859-5
KOI8-R
WINDOWS-1251
MAC-CYRILLIC
IBM866
IBM855
Slovak
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Slovene
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
M
Example
# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not str.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()
    result = chardet.detect(msg)
    print(result)
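Once an encoding has been detected, the usual next step is to decode the bytes. The following is a minimal sketch of that step, not part of cChardet itself: `decode_with_detection` and its `fallback` parameter are hypothetical names, and the result is assumed to be a dict with an `encoding` key, as in the example above. Because some names in the supported list may have no matching Python codec, the sketch checks the name with `codecs.lookup` before using it.

```python
import codecs


def decode_with_detection(data, result, fallback="utf-8"):
    """Decode bytes using a detect()-style result dict (hypothetical helper).

    `result` is assumed to look like {"encoding": ..., "confidence": ...},
    where "encoding" may be None if nothing was detected.
    """
    name = result.get("encoding")
    try:
        # codecs.lookup raises LookupError when Python has no codec for the name.
        codec = codecs.lookup(name).name if name else fallback
    except LookupError:
        codec = fallback
    # errors="replace" avoids an exception if the guess turns out to be wrong.
    return data.decode(codec, errors="replace")


# Demonstration with a hand-written result dict, so it runs without cchardet.
sample = "こんにちは".encode("shift_jis")
print(decode_with_detection(sample, {"encoding": "SHIFT_JIS", "confidence": 0.99}))
```

In real use, the second argument would be the dict returned by `chardet.detect(msg)`.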
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
RAM: DDR4-3200 64GB
Platform: Ubuntu 20.04 amd64
Python 3.9.0
Library | Request (call/s) |
---|---|
chardet v3.0.4 | 0.46 |
cchardet v2.1.7 | 1404.05 |
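A calls-per-second figure like the one above can be measured along the following lines. This is only a sketch under stated assumptions, not the project's `tests/bench.py`: `calls_per_second` and `min_seconds` are hypothetical names, and a stand-in detector is used so the snippet runs without either library installed.

```python
import time


def calls_per_second(detect, payload, min_seconds=0.2):
    """Call detect(payload) repeatedly for at least min_seconds; return calls/s."""
    calls = 0
    start = time.perf_counter()
    while True:
        detect(payload)
        calls += 1
        elapsed = time.perf_counter() - start
        # Stop once the time budget is used up (and elapsed is non-zero).
        if elapsed >= min_seconds and elapsed > 0:
            break
    return calls / elapsed


# Stand-in detector; in practice pass chardet.detect or cchardet.detect.
def fake_detect(data):
    return {"encoding": "ASCII", "confidence": 1.0}


print("%.2f call/s" % calls_per_second(fake_detect, b"hello world"))
```

Running both libraries on the same payload and the same machine keeps the two rates comparable.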
LICENSE
See COPYING file.
Contact
Platform
Supported
Windows i686, x86_64
Linux i686, x86_64
macOS x86_64
Not supported
CHANGES
2.x.x
2.1.7 (2020-10-27)
support Python 3.9
drop support for Python 3.5
2.1.6 (2020-03-17)
drop support for Python 2.7
support GitHub Actions
update dev-dependencies
2.1.5 (2019-09-27)
update language models (uchardet)
add iso8859-2 test but disabled it
support Python 3.8
drop support for Python 3.4
2.1.4 (2018-09-27)
disable LTO because it degraded performance
2.1.3 (2018-09-26)
support Python 3.7
2.1.2 (2018-09-26)
enable LTO for wheel builds
update Cython
2.1.1 (2017-07-01)
fix differing results with different chunk sizes
fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior
include COPYING in package
2.1.0 (2017-05-15)
2.0.1 (2017-04-25)
2.0.0 (2017-04-06)
Improve tests
2.0a4 (2017-04-05)
Update uchardet repo (Fix buffer overflow)
2.0a3 (2017-03-29)
Implement UniversalDetector (like chardet)
2.0a2 (2017-03-28)
Update uchardet repo (Fix memory leak)
2.0a1 (2017-03-28)
Replace uchardet-enhanced with uchardet
Remove Detector class
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
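The 2.0a3 entry above mentions a UniversalDetector modelled on chardet's. A minimal sketch of incremental detection with such an object follows. `detect_stream` is a hypothetical helper, and the detector is only assumed to expose `feed()`, `done`, `close()` and `result` in the way chardet's UniversalDetector does.

```python
def detect_stream(chunks, detector):
    """Feed byte chunks to a chardet-style detector and return its result.

    `detector` is assumed to provide feed(), done, close() and result.
    """
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:  # detector is already confident; stop reading early
            break
    detector.close()
    return detector.result


# Usage sketch, assuming the package imports correctly:
# from cchardet import UniversalDetector
# with open(path, "rb") as f:
#     chunks = iter(lambda: f.read(4096), b"")
#     print(detect_stream(chunks, UniversalDetector()))
```

Feeding in chunks avoids loading a large file into memory when the first few kilobytes are enough to decide.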