cChardet is a high-speed universal character encoding detector.

Reason this release was yanked:

Does not import

Project description

cChardet

NOTICE: This is a fork of the original project at https://github.com/PyYoshi/cChardet, which is no longer maintained.

To install:

pip install faust-cchardet

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

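import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not str.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with "encoding" and "confidence" keys.
result = chardet.detect(msg)
print(result)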
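
A common next step after detection is decoding the bytes with the reported encoding. The sketch below shows one way to do that. It assumes detect() returns a dict whose "encoding" value may be None when nothing is detected. The names decode_bytes and fallback are hypothetical helpers for illustration, not part of cChardet.

```python
from typing import Callable


def decode_bytes(
    data: bytes,
    detect: Callable[[bytes], dict],
    fallback: str = "utf-8",
) -> str:
    """Decode data with the encoding reported by a detect()-style callable."""
    encoding = detect(data).get("encoding")
    try:
        return data.decode(encoding or fallback)
    except (LookupError, UnicodeDecodeError):
        # Python may not know the reported codec name, or the guess may be
        # wrong; fall back and replace undecodable bytes instead of failing.
        return data.decode(fallback, errors="replace")


# Usage, assuming the package imports cleanly:
#   import cchardet
#   text = decode_bytes(msg, cchardet.detect)
```

Passing the detector in as a callable keeps the helper independent of which detection library is installed.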

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py

Results

CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz

RAM: DDR4-3200 64GB

Platform: Ubuntu 20.04 amd64

Python 3.9.0

                    Request (call/s)
chardet v3.0.4      0.46
cchardet v2.1.7     1404.05
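
For readers who want to reproduce a calls-per-second figure on their own data, the following is a minimal sketch of how such a rate could be measured. It is not the project's tests/bench.py. The function name calls_per_second and the stand-in workload are assumptions made for illustration.

```python
import time
from typing import Callable


def calls_per_second(func: Callable[[bytes], object], payload: bytes, repeat: int = 100) -> float:
    """Return how many times per second func(payload) completes."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(payload)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast calls or coarse timers.
    return repeat / elapsed if elapsed > 0 else float("inf")


# Stand-in workload so the sketch runs without either library installed;
# swap in chardet.detect or cchardet.detect to compare them.
rate = calls_per_second(lambda data: data.decode("utf-8"), b"hello world" * 100)
print(f"{rate:.2f} call/s")
```

For reference, the two figures reported above correspond to a ratio of roughly 3,050 (1404.05 / 0.46).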

LICENSE

See COPYING file.

Contact

Platform

Support

  • Windows i686, x86_64

  • Linux i686, x86_64

  • macOS x86_64

Do not Support

CHANGES

2.x.x

2.1.7 (2020-10-27)

  • support Python 3.9

  • drop support for Python 3.5

2.1.6 (2020-03-17)

  • drop support for Python 2.7

  • support GitHub Actions

  • update dev-dependencies

2.1.5 (2019-09-27)

  • update language models (uchardet)

  • add an ISO-8859-2 test, but leave it disabled

  • support Python 3.8

  • drop support for Python 3.4

2.1.4 (2018-09-27)

  • disable LTO because it degraded performance

2.1.3 (2018-09-26)

  • support Python 3.7

2.1.2 (2018-09-26)

  • enable LTO for wheel builds

  • update Cython

2.1.1 (2017-07-01)

  • fix differing results with different chunk sizes

  • fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior

  • include COPYING in package

2.1.0 (2017-05-15)

2.0.1 (2017-04-25)

  • fix an issue where UTF-8 with a BOM would not be detected as UTF-8-SIG (fix #28)

  • allow NULL bytes to be passed to feed() / detect() (fix #27)

2.0.0 (2017-04-06)

  • Improve tests

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)
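
The UniversalDetector entry above allows detection to run incrementally, without loading a whole file into memory. The sketch below assumes the class mirrors chardet's interface (feed(), close(), and the done and result attributes), as the changelog entry suggests. The helper detect_stream and the file name in the usage comment are hypothetical.

```python
from typing import Iterable


def detect_stream(detector, chunks: Iterable[bytes]) -> dict:
    """Feed chunks to a UniversalDetector-style object and return its result."""
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            # The detector is confident enough; skip the remaining chunks.
            break
    detector.close()
    return detector.result


# Usage, assuming the package imports cleanly:
#   import cchardet
#   with open("sample.txt", "rb") as f:  # hypothetical file name
#       chunks = iter(lambda: f.read(4096), b"")
#       result = detect_stream(cchardet.UniversalDetector(), chunks)
```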

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests

Download files

Source Distribution

faust-cchardet-2.1.10rc0.tar.gz (302.5 kB)

Uploaded Source

Built Distributions

faust_cchardet-2.1.10rc0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp311-cp311-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.11 macOS 10.9+ x86-64

faust_cchardet-2.1.10rc0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp310-cp310-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.10 macOS 10.9+ x86-64

faust_cchardet-2.1.10rc0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp39-cp39-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.9 macOS 10.9+ x86-64

faust_cchardet-2.1.10rc0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp38-cp38-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.8 macOS 10.9+ x86-64

faust_cchardet-2.1.10rc0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp37-cp37m-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.7m macOS 10.9+ x86-64

faust_cchardet-2.1.10rc0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199.8 kB)

Uploaded CPython 3.6m manylinux: glibc 2.17+ x86-64

faust_cchardet-2.1.10rc0-cp36-cp36m-macosx_10_9_x86_64.whl (113.0 kB)

Uploaded CPython 3.6m macOS 10.9+ x86-64

File details

Details for the file faust-cchardet-2.1.10rc0.tar.gz.

File metadata

  • Download URL: faust-cchardet-2.1.10rc0.tar.gz
  • Upload date:
  • Size: 302.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.15

File hashes

Hashes for faust-cchardet-2.1.10rc0.tar.gz
Algorithm Hash digest
SHA256 522ecbf3630960c71a6971582d85de056eb902be88812e2ebf9f3e87153cceff
MD5 57ec8b61ce364ae30fc9631867a3a276
BLAKE2b-256 e6e2e06c728f44a1b9f6a0947dfb1d476b003c529f65b97a36a333fac1d64e4c

See more details on using hashes here.
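
A downloaded file can be checked against its published SHA256 digest before installation. The sketch below uses only Python's standard hashlib module. The helper name sha256_matches and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 65536) -> bool:
    """Return True if the file's SHA256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Usage with the SHA256 listed above (the local path is an assumption):
#   sha256_matches(
#       "faust-cchardet-2.1.10rc0.tar.gz",
#       "522ecbf3630960c71a6971582d85de056eb902be88812e2ebf9f3e87153cceff",
#   )
```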

File details

Details for the file faust_cchardet-2.1.10rc0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a3223567ca3659b2bd5af3c5efa2a4778f99012183ce65e411212f0283b3758d
MD5 83ef3f975b973e58a4640a746225f02a
BLAKE2b-256 6bf97d3aeee0045860a0d64b1d6f185f2f7dcf4430757b72edfcecf07ab9a931

File details

Details for the file faust_cchardet-2.1.10rc0-cp311-cp311-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 046b7cfc24c6ee00c35cb8b30d7fe0b776ef883a764fe548ff5b0213caf16bce
MD5 a563c436c395c40029756d7bd90c0cff
BLAKE2b-256 eb53e289e78494c0a3734c29eeaa328c228d619c8f5cc392ea9dfefaa160e3f6

File details

Details for the file faust_cchardet-2.1.10rc0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 85664ce3eb05f196004733a19e85b3703eddd842b8e672d6d4692d2ec9da5312
MD5 04c9f86c4af9cc79e3f9fc87df7c2ad0
BLAKE2b-256 14842e5ed046edab7d376e4203fc409dfe56601e20dedf799d9059538ca72d1c

File details

Details for the file faust_cchardet-2.1.10rc0-cp310-cp310-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 9d9f943ea1e2d822cec72ac4cd077c974dc61f62ebf0ae0a20279e2ca21983e3
MD5 9f7f8a45a8cb5d82eb76676aa8c41e3f
BLAKE2b-256 15650ac35bcc05eb387d4a70308f3726f8f7e5aee48c28ed2550c912959a891d

File details

Details for the file faust_cchardet-2.1.10rc0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 81512166e95dd5472806fa1b26655efe4bb3535d00fccab572ab8be6fd481373
MD5 ef0ace65d684b2721a06b3ee305d7a76
BLAKE2b-256 e38485f7fa62afb5f78b56d1778ea5c4a0b5ea28984813790971437a2bfaf84b

File details

Details for the file faust_cchardet-2.1.10rc0-cp39-cp39-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 5bfc6524f917cec3d5cc204848297e50e862b3a5da3169c8a2e00794be27c2f1
MD5 914360787f4ee5d50e61ae07ee6be397
BLAKE2b-256 a3b96bfdbf1a58055ba64a2c87573018d964ca0005395c754efdd9318abef040

File details

Details for the file faust_cchardet-2.1.10rc0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2a830a6bdd60a390d837a1c6e27b89cfbccc1296e1783c5a51bac483762d72df
MD5 1fca58f8e1df4fe913e149222bb541bb
BLAKE2b-256 0e23cff5ca46f0a6d5caa62059015121ad59923c76fa568e99f86c431dce5312

File details

Details for the file faust_cchardet-2.1.10rc0-cp38-cp38-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 1c9700016cdaa844b5d1a64bcde66531ca26c17e1dc1fb5323bb455a85f04401
MD5 b4d8f2095821d587a2a4b0970b663b12
BLAKE2b-256 7509d568969deb8e5f14497ebee5b63333dfdb749b7be40881cd87ff6282abec

File details

Details for the file faust_cchardet-2.1.10rc0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 afb596519538e9a6a683d751533d6b2392468713c0b369f78fe6c7fbac7d928a
MD5 9d1c845eb8b275805fb6e3b1d64a68be
BLAKE2b-256 b97a7967170d785035e3913d9bf2508d9d22ccf5a87ad79e716683cb0ccb7950

File details

Details for the file faust_cchardet-2.1.10rc0-cp37-cp37m-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 d92c40246f4a5d1ca81e282fb135c48ee281446859c87c062c0cc626e6b12752
MD5 0fad301cede117fc967b49ad741db5de
BLAKE2b-256 8048adc6609410377e831d11f6081e5ff2a37d7d7bc25916ac7f567d39dd6dc3

File details

Details for the file faust_cchardet-2.1.10rc0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 01d108d3c1f5ccac4337d095361341f9dfee09e67be804eff5dec92f279bf735
MD5 eabbc317afffb4fa4c6788c58bf2e747
BLAKE2b-256 65b00603f14f696c6124079d2952e8132b8223d8b3f8dba902cb8cab8b09514e

File details

Details for the file faust_cchardet-2.1.10rc0-cp36-cp36m-macosx_10_9_x86_64.whl.

File hashes

Hashes for faust_cchardet-2.1.10rc0-cp36-cp36m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 60586c12fa0259b0a70659f366a1ad069b681e3e9af9d9b8c8ad83a58ab0d503
MD5 818f0b1b8b9c9f16e9bb8206da79ab13
BLAKE2b-256 8a163d7d6a1ed20ed6acf81cbe3b4a8df8d6b55d425b56c0fb8a58188b44a73f
