cChardet is a high-speed universal character encoding detector.
Reason this release was yanked:
Does not import
cChardet
NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet since the original project is no longer maintained.
To install:
pip install faust-cchardet
cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.
Supported Languages/Encodings
International (Unicode)
UTF-8
UTF-16BE / UTF-16LE
UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431
Arabic
ISO-8859-6
WINDOWS-1256
Bulgarian
ISO-8859-5
WINDOWS-1251
Chinese
ISO-2022-CN
BIG5
EUC-TW
GB18030
HZ-GB-2312
Croatian
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Czech
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Danish
ISO-8859-1
ISO-8859-15
WINDOWS-1252
English
ASCII
Esperanto
ISO-8859-3
Estonian
ISO-8859-4
ISO-8859-13
Windows-1252
Windows-1257
Finnish
ISO-8859-1
ISO-8859-4
ISO-8859-9
ISO-8859-13
ISO-8859-15
WINDOWS-1252
French
ISO-8859-1
ISO-8859-15
WINDOWS-1252
German
ISO-8859-1
WINDOWS-1252
Greek
ISO-8859-7
WINDOWS-1253
Hebrew
ISO-8859-8
WINDOWS-1255
Hungarian
ISO-8859-2
WINDOWS-1250
Irish Gaelic
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Italian
ISO-8859-1
ISO-8859-3
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Japanese
ISO-2022-JP
SHIFT_JIS
EUC-JP
Korean
ISO-2022-KR
EUC-KR / UHC
Lithuanian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Latvian
ISO-8859-4
ISO-8859-10
ISO-8859-13
Maltese
ISO-8859-3
Polish
ISO-8859-2
ISO-8859-13
ISO-8859-16
Windows-1250
IBM852
MAC-CENTRALEUROPE
Portuguese
ISO-8859-1
ISO-8859-9
ISO-8859-15
WINDOWS-1252
Romanian
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
Russian
ISO-8859-5
KOI8-R
WINDOWS-1251
MAC-CYRILLIC
IBM866
IBM855
Slovak
Windows-1250
ISO-8859-2
IBM852
MAC-CENTRALEUROPE
Slovene
ISO-8859-2
ISO-8859-16
Windows-1250
IBM852
M
Example
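# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() expects bytes, not str.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with "encoding" and "confidence" keys.
result = chardet.detect(msg)
print(result)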
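The detection result is usually only a first step; the bytes still have to be decoded. The following is a minimal sketch of that step, not part of cChardet itself. The helper name `decode_detected` and its `detect` and `fallback` parameters are hypothetical, and the sketch assumes that the `"encoding"` value may be `None` or a label Python has no codec for (for example `X-ISO-10646-UCS-4-34121`).

```python
from typing import Callable, Optional


def decode_detected(
    data: bytes,
    detect: Optional[Callable[[bytes], dict]] = None,
    fallback: str = "utf-8",
) -> str:
    """Decode bytes using the encoding reported by a detect()-style callable.

    Hypothetical helper: `detect` defaults to cchardet.detect, but any
    callable returning a dict with an "encoding" key can be passed in.
    """
    if detect is None:
        # Imported lazily so the helper can be exercised with a stub detector.
        import cchardet

        detect = cchardet.detect

    result = detect(data)
    # Assumption: "encoding" may be None when the detector cannot decide.
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: fall back and replace bad bytes.
        return data.decode(fallback, errors="replace")
```

Passing the detector as an argument keeps the helper testable without the compiled extension; in normal use it can be called as `decode_detected(msg)`.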
Benchmark
$ cd src/
$ pip install chardet
$ python tests/bench.py
Results
CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
RAM: DDR4-3200 64GB
Platform: Ubuntu 20.04 amd64
Python 3.9.0
Library | Request (call/s) |
---|---|
chardet v3.0.4 | 0.46 |
cchardet v2.1.7 | 1404.05 |
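The contents of `tests/bench.py` are not reproduced on this page, so the following is only a sketch of how a calls-per-second figure like the ones above might be measured. The function name `calls_per_second` and the `repeat` parameter are hypothetical; the two throughput numbers are taken from the results table.

```python
import time
from typing import Callable


def calls_per_second(detect: Callable[[bytes], object], data: bytes, repeat: int = 100) -> float:
    """Return how many detect(data) calls complete per second (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(repeat):
        detect(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from very fast callables on coarse clocks.
    return repeat / elapsed if elapsed > 0 else float("inf")


# Ratio of the two figures reported in the results table.
speedup = 1404.05 / 0.46
print(f"cchardet handled roughly {speedup:.0f}x more calls per second in that run")
```

To compare the two libraries on the same input, the helper could be called once with `chardet.detect` and once with `cchardet.detect`, using the same sample bytes for both.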
LICENSE
See COPYING file.
Contact
Platform
Supported
Windows i686, x86_64
Linux i686, x86_64
macOS x86_64
Not supported
CHANGES
2.x.x
2.1.7 (2020-10-27)
support Python 3.9
drop support for Python 3.5
2.1.6 (2020-03-17)
drop support for Python 2.7
support GitHub Actions
update dev-dependencies
2.1.5 (2019-09-27)
update language models (uchardet)
add iso8859-2 test but disabled it
support Python 3.8
drop support for Python 3.4
2.1.4 (2018-09-27)
disable LTO because it degraded performance
2.1.3 (2018-09-26)
support Python 3.7
2.1.2 (2018-09-26)
enable LTO for wheel builds
update Cython
2.1.1 (2017-07-01)
fix different results being returned for different chunk sizes
fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior
include COPYING in package
2.1.0 (2017-05-15)
2.0.1 (2017-04-25)
2.0.0 (2017-04-06)
Improve tests
2.0a4 (2017-04-05)
Update uchardet repo (Fix buffer overflow)
2.0a3 (2017-03-29)
Implement UniversalDetector (like chardet)
2.0a2 (2017-03-28)
Update uchardet repo (Fix memory leak)
2.0a1 (2017-03-28)
Replace uchardet-enhanced with uchardet
Remove Detector class
1.1.3 (2017-02-26)
Support AArch64
1.1.2 (2017-01-08)
Support Python 3.6
1.1.1 (2016-11-05)
Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)
Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)
Support manylinux1 wheel
1.1.0 (2016-10-17)
Add Detector class
Improve unit tests
Download files
Built Distributions
Hashes for faust_cchardet-2.1.11-cp311-cp311-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | bf7f5b9a17c4b6f2f4361ac96506b0be19ac56eddd76abe8c1946f7f2cfcd45c |
MD5 | 65a3d585d593213ab2958398554a5fea |
BLAKE2b-256 | 0f1c610894990102206f794ac0fc7c2c1ad2bc5ce300dd682fd1ab7ece7e1500 |
Hashes for faust_cchardet-2.1.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 9b951883621f17f0a435e140843d1c69421f69edb2d220efec32d006ecd04d35 |
MD5 | e552f667b65343d7d184ab2c82e6d4fe |
BLAKE2b-256 | 367c49139f5c62c1bf70983eeb84eaa26a5675e45dc63a6558e42719460438d3 |
Hashes for faust_cchardet-2.1.11-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | cb31ac7bf1ccfa3e44c9832283585d3654aa2b270225becd982d96da420b3fa7 |
MD5 | a67773fe841971879eb075c653ca2ec4 |
BLAKE2b-256 | 2ba37fc1cab5937c88b78dc2e19557d445ac66cb7800bcfa35afc7a11d651a0a |
Hashes for faust_cchardet-2.1.11-cp310-cp310-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 28a26fa9d0e5ddc6cb5dc740884f30c49f766ac02c138ec2123043b54bf1648b |
MD5 | 1457605b67f0a424d1b5317d638b7cef |
BLAKE2b-256 | 29eb723795236a09815604c64bbaba8cd3cccce0e7e3141a5ac1bf4d76932572 |
Hashes for faust_cchardet-2.1.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3a800bf8bed2ef6b02277581cd44b8b8237eda633ee3f7880efa9fb61e6247ab |
MD5 | 9b698119ff78c266c61d42366e397c44 |
BLAKE2b-256 | e48dff1db3c745a733d5ebdf47a2f7b3414f5ca2b21bcc38283daa63ffe1f5b0 |
Hashes for faust_cchardet-2.1.11-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | eee2e875a8f1ee1afcd5089075b27b44f6f3e8547a27cefd055b875b49b7f8e7 |
MD5 | 06c47ce6f8c364e35b20dc13cb0385b7 |
BLAKE2b-256 | ac03d33891543a771d39c828499ef4c34d988611f1fc1035c4c59a2c7953d17f |
Hashes for faust_cchardet-2.1.11-cp39-cp39-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | a59963b8fe685e28bb3a17656a1ee97b205eea19eef616e4c8617d69296e06e1 |
MD5 | db5832e83bfe4d4a7ceceac69ca8a381 |
BLAKE2b-256 | d31037e9fb40f56c3f3943c368f383888014a29cbc167161e7ac617d9effbf3d |
Hashes for faust_cchardet-2.1.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | ea15abe311ba8c5b10d7dcb038b77175909016de0cd7134396cd21b9387a15c8 |
MD5 | 550c501cdc003f43bc8e4463e53b282d |
BLAKE2b-256 | 3c7f42d0f7523f96d885194f360c9450d054390ced1fa8b8a4f96224f60abe42 |
Hashes for faust_cchardet-2.1.11-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e52f7776eabb103e4e2e1cd9b8d9876e21d369e66ed1ab951bfcf9f5cd340bfb |
MD5 | fa83bfd450bf85be778d6ae05f1b7a5e |
BLAKE2b-256 | dfd7fdaf92f59f8cd16ef3897f9901523e4cdb92fbea018e244187132fe5ae08 |
Hashes for faust_cchardet-2.1.11-cp38-cp38-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 280fd87db318f828d5ac7b4ed24cab273409cd60f057843c1d28a023b42408d5 |
MD5 | c66732a5ca76ae43f047aaadc536d8ff |
BLAKE2b-256 | f51bc38747b9d36aab5c632a95f8962848c10272f95d6edde758f90575e72455 |
Hashes for faust_cchardet-2.1.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 6b442e88861ebe5a8c5d6714539f0c5697ab522b2a5d10afa47dbe8b35d2d9ce |
MD5 | c24efa0ef856f0d0254675a462978540 |
BLAKE2b-256 | ad07e0fb289c05ec818e32cd73cd93797e979f53ec9fb228ad25518eca5dbc0b |
Hashes for faust_cchardet-2.1.11-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 79476fa24930adc6b59c21bf838a10842e99e8ffdcac4c2576ac6343ab1d49bb |
MD5 | fda9f51efe1b28092b58f1079bbbc019 |
BLAKE2b-256 | 6eaa030dde57e71f1c0962cbbee16eb90c61f2be6f86be5af560dd36e9b433c8 |
Hashes for faust_cchardet-2.1.11-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 427262d5a4364693a7f475fba6b6a39503591d28e5339741b2641638e04f880c |
MD5 | dbdc4b20063dd20038290e8d103be24f |
BLAKE2b-256 | 4d6e89889e9f93b7c2528ce271e7eb9f40a5768c6c269e040390b7b0b043adfa |
Hashes for faust_cchardet-2.1.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | af1fb3d06aba6ec80e693d9c00f4cd141f612455d8f3d33cf02cecb05c0551d5 |
MD5 | 44acc59a01972cf82b3a134dd42d1435 |
BLAKE2b-256 | 50b4f18287fdc96a9d65ac61e3968de12389671f939ea9010362c63b7b40cdb7 |
Hashes for faust_cchardet-2.1.11-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | f6d2f0afcb913c26b877d53dbc42e0b16ff7ea280b751e1a34eeb1fed1252b84 |
MD5 | d54223607453798360903c8d2769c927 |
BLAKE2b-256 | 966348bb562d3fc41c59ae6a70b87d113f8b58f642bf5e4bacaac0a59cabb89f |
Hashes for faust_cchardet-2.1.11-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e753d58476d479bf975bf275801f51d0ee692c0568e8388c809d382d8632753c |
MD5 | 08316c4e8905a03c4ee42d30d9567151 |
BLAKE2b-256 | 6571834f9f800d70b274579d76c7f071a15419c2dd1f364088cb7e264cfa208a |
Hashes for faust_cchardet-2.1.11-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | 4b53db42f8ec3ff10400bc5d5f81d52ab42c54162f8189cdb3a463e484796413 |
MD5 | e0f4849047c29029d6578b9e9d3788cd |
BLAKE2b-256 | a2f58f6d9a00415070d200b5b62cae75a3da6e2784f1712c72b662862404ece5 |
Hashes for faust_cchardet-2.1.11-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm | Hash digest |
---|---|
SHA256 | e9cbfd70d8aba378e5a31ee1c273079a5310b35fce0e86271bf269b954229eab |
MD5 | dd5fc64ffdf0ac1b49824a03025c28e4 |
BLAKE2b-256 | 74b004ca5a1977445460c932e3a05c60170c65f92cb8247e879c8ccef6852cbe |
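The SHA256 digests listed above can be compared against a downloaded wheel before installing it. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_published_digest` are hypothetical and not part of cChardet or pip.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_published_digest(wheel_path, "bf7f5b9a...")` would be called with the local path of the downloaded `cp311` Windows wheel and the full SHA256 value from its table; the digest is abbreviated here only for readability.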