cChardet is a high-speed universal character encoding detector.

Project description

cChardet

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-3412 / X-ISO-10646-UCS-4-2143

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

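# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detection works on bytes, not decoded text.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict describing the most likely encoding.
result = chardet.detect(msg)
print(result)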
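
Once an encoding has been detected, the usual next step is to decode the bytes. The following is a minimal sketch, not part of cChardet itself: decode_detected and its detector and fallback parameters are hypothetical names introduced here for illustration. It assumes that detect() returns a dict with an "encoding" key, that this value may be None when nothing is recognised, and that some reported names (EUC-TW, for example) may have no matching Python codec.

```python
import codecs

try:
    import cchardet  # optional, so the sketch can be exercised without the package
except ImportError:
    cchardet = None


def decode_detected(data, detector=None, fallback="utf-8"):
    """Decode bytes with the detected encoding (hypothetical helper).

    Returns a (text, encoding_used) tuple.
    """
    if detector is None:
        if cchardet is None:
            raise RuntimeError("cchardet is not installed; pass a detector callable")
        detector = cchardet.detect

    result = detector(data)
    # Assumption: "encoding" may be None when detection fails.
    encoding = result.get("encoding") or fallback

    # Assumption: not every reported name maps to a Python codec.
    try:
        codecs.lookup(encoding)
    except LookupError:
        encoding = fallback

    return data.decode(encoding, errors="replace"), encoding
```

With cChardet installed, decode_detected(msg) would decode the sample read in the example above. Passing a detector callable makes the helper easy to test and lets chardet.detect be swapped in.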

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600Mhz 16GB

Platform: Ubuntu 16.04 amd64

Python 2.7.13

                   Request (call/s)
chardet v3.0.2                 0.36
cchardet v2.0.1             1396.42

Python 3.6.1

                   Request (call/s)
chardet v3.0.2                 0.35
cchardet v2.0.1             1467.77
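
The "Request (call/s)" figures are throughput: the number of detect() calls completed per second. The contents of tests/bench.py are not reproduced here, so the following is only a rough sketch of how such a figure can be measured. calls_per_second is a hypothetical helper, and the stand-in detector exists only so that the sketch runs without third-party packages.

```python
import time


def calls_per_second(detect, data, repeat=100):
    """Return how many detect(data) calls complete per second (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(repeat):
        detect(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very fast callable.
    return repeat / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in detector; swap in cchardet.detect or chardet.detect
    # to compare the real libraries on the same input.
    def fake_detect(data):
        return {"encoding": "ASCII", "confidence": 1.0}

    print(calls_per_second(fake_detect, b"hello world" * 1000))
```

On the figures reported above, cchardet completes roughly 3,900 to 4,200 times as many calls per second as chardet (1396.42 / 0.36 and 1467.77 / 0.35).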

LICENSE

See COPYING file.

Contact

CHANGES

2.1.1 (2017-07-01)

  • Fix differing results with different chunk sizes

  • Fix assignments to nsSMState in nsCodingStateMachine that resulted in unspecified behavior

  • Include COPYING in the package

2.1.0 (2017-05-15)

2.0.1 (2017-04-25)

  • Fix an issue where UTF-8 with a BOM would not be detected as UTF-8-SIG (fix #28)

  • Pass NULL bytes through to feed() / detect() (fix #27)

2.0.0 (2017-04-06)

  • Improve tests

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • cchardet-2.1.1.tar.gz (645.9 kB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • cchardet-2.1.1-cp36-cp36m-win_amd64.whl (93.5 kB): CPython 3.6m, Windows x86-64
  • cchardet-2.1.1-cp36-cp36m-win32.whl (90.2 kB): CPython 3.6m, Windows x86
  • cchardet-2.1.1-cp36-cp36m-manylinux1_x86_64.whl (202.0 kB): CPython 3.6m
  • cchardet-2.1.1-cp36-cp36m-manylinux1_i686.whl (192.9 kB): CPython 3.6m
  • cchardet-2.1.1-cp35-cp35m-win_amd64.whl (93.5 kB): CPython 3.5m, Windows x86-64
  • cchardet-2.1.1-cp35-cp35m-win32.whl (90.2 kB): CPython 3.5m, Windows x86
  • cchardet-2.1.1-cp35-cp35m-manylinux1_x86_64.whl (201.9 kB): CPython 3.5m
  • cchardet-2.1.1-cp35-cp35m-manylinux1_i686.whl (192.7 kB): CPython 3.5m
  • cchardet-2.1.1-cp34-cp34m-win_amd64.whl (90.6 kB): CPython 3.4m, Windows x86-64
  • cchardet-2.1.1-cp34-cp34m-win32.whl (88.3 kB): CPython 3.4m, Windows x86
  • cchardet-2.1.1-cp34-cp34m-manylinux1_x86_64.whl (202.2 kB): CPython 3.4m
  • cchardet-2.1.1-cp34-cp34m-manylinux1_i686.whl (193.2 kB): CPython 3.4m
  • cchardet-2.1.1-cp27-cp27mu-manylinux1_x86_64.whl (199.9 kB): CPython 2.7mu
  • cchardet-2.1.1-cp27-cp27mu-manylinux1_i686.whl (190.8 kB): CPython 2.7mu
  • cchardet-2.1.1-cp27-cp27m-win_amd64.whl (90.3 kB): CPython 2.7m, Windows x86-64
  • cchardet-2.1.1-cp27-cp27m-win32.whl (87.9 kB): CPython 2.7m, Windows x86
  • cchardet-2.1.1-cp27-cp27m-manylinux1_x86_64.whl (199.9 kB): CPython 2.7m
  • cchardet-2.1.1-cp27-cp27m-manylinux1_i686.whl (190.8 kB): CPython 2.7m

File details

Details for the file cchardet-2.1.1.tar.gz.

File metadata

  • Download URL: cchardet-2.1.1.tar.gz
  • Size: 645.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for cchardet-2.1.1.tar.gz
Algorithm Hash digest
SHA256 9c9269208b9f8d7446dbd970f6544ce48104096efab0f769ee5918066ba1ee7e
MD5 bbfb26239b5129e93c8812efcc54d935
BLAKE2b-256 6374fbf92cd7fe2e603600096098d78f5c5957c5071861298d00084f058e174f

See more details on using hashes here.
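
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: sha256_of_file is a hypothetical helper, and the local file name is an assumption about where the source distribution was saved.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Digest published above for cchardet-2.1.1.tar.gz.
    expected = "9c9269208b9f8d7446dbd970f6544ce48104096efab0f769ee5918066ba1ee7e"
    # Assumed location: the sdist saved in the current directory.
    path = "cchardet-2.1.1.tar.gz"
    if os.path.exists(path):
        print("OK" if sha256_of_file(path) == expected else "MISMATCH")
    else:
        print("file not found:", path)
```

The same function works for any of the wheels below when compared against their respective SHA256 lines.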

File details

Details for the file cchardet-2.1.1-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for cchardet-2.1.1-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 d6c8eb90a9aa77f94e040a75d563f65849ab3b0c8f675b27928a91583648f8f8
MD5 0a67623b6a5f06193fb24c4516c70d46
BLAKE2b-256 ad33216ad3ba6f7982be3f9895bc9059c6b4a769e9522b3656c8ad87e7a49fdf

File details

Details for the file cchardet-2.1.1-cp36-cp36m-win32.whl.

File hashes

Hashes for cchardet-2.1.1-cp36-cp36m-win32.whl
Algorithm Hash digest
SHA256 eb8ee148e9fc13101e0e19ac98552d24b82731fcfddc915eed216c13ebbebec0
MD5 5281e786e102048d9b58b3f2d6c045ce
BLAKE2b-256 2488240c5f53980f74ca58116f08a24be731965083b580320489b8bd4e3e1d9f

File details

Details for the file cchardet-2.1.1-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.1.1-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 07dace80abce108d42a82be5a598797c0c07575741d81e698819bd42d367cdde
MD5 a05634277d3a0b8fc73f34f23f9a1f5a
BLAKE2b-256 f90a330740ba16f34599173fe7567baf4d847f31772bafd99f74c08e608701f6

File details

Details for the file cchardet-2.1.1-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.1.1-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 6dfc76b71f66e002a99efa68efe4366143e8845b54cf5623eb05b5fa8fb030d6
MD5 888a8912d8ef9f68efd37b9c6ddc8171
BLAKE2b-256 e75b68c6fe9bc81d16e0d4b742b36225a2973713316d6bfeccf407f8640a9e3c

File details

Details for the file cchardet-2.1.1-cp35-cp35m-win_amd64.whl.

File hashes

Hashes for cchardet-2.1.1-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 e1c3addf0c7408f76b98bd5f55f3abe844716d47dd6ab0d32eea8caa11a8fa41
MD5 1050bdde4efb3efb11d4e8ef594b6b3a
BLAKE2b-256 4917915c4d7eff8c6d611f9b7fa72dd6809a435c6794e34884380e0fd98bcca1

File details

Details for the file cchardet-2.1.1-cp35-cp35m-win32.whl.

File hashes

Hashes for cchardet-2.1.1-cp35-cp35m-win32.whl
Algorithm Hash digest
SHA256 e32c4a420c6f7c6ea8d8a1fe36c60c70316a4ca1779dba2e00044b61d8ee2017
MD5 3b5e0fda7b75d5328b72221690917851
BLAKE2b-256 1f0778cdac5a666b991273aa57f3d2afe64c5c6ae36a5f11003e014efcd7f399

File details

Details for the file cchardet-2.1.1-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.1.1-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 d12b3f1913068975f9b9431f3cdc44488786523cc6d5467ffcb5bd43d3210157
MD5 0bea04f9dda8b7c724fc4d016f6bb040
BLAKE2b-256 54f65819fdc63c74fd2c28b08498768310215431f9276af51bb8a75ea934875a

File details

Details for the file cchardet-2.1.1-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.1.1-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 36d58862c158de32ace6497e7bafc7f85049b35a3abbd65118baffbe2a1ec1e5
MD5 4bd3e07f14c81b7be3e8ce60f7973bc9
BLAKE2b-256 583a6cc4aba0d3197b61287f566ca700c3eee34ea0e0e8cdfee5a24c84dcfce9

File details

Details for the file cchardet-2.1.1-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for cchardet-2.1.1-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 6e001eb2ff93c4c31a9952cf01c71f5f95c758314032094df5cf086168678b23
MD5 f14cad635cc8db3c54bc880c3ede894d
BLAKE2b-256 1472a623ecf82a368a5b5202fb840fa915e8afd37e923ed33135367c5fc8e22f

File details

Details for the file cchardet-2.1.1-cp34-cp34m-win32.whl.

File hashes

Hashes for cchardet-2.1.1-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 3f70d1c41f0694d1411b47868fdb7c3147fd1bf09c22e6565a765eedfb888989
MD5 0bccac6fbf8c5b3399a3a2e2a5f987bc
BLAKE2b-256 a7c87a23d95e7fbe783c2664b2eb94f0a04c9d3a925e71ac8e6d94bc9c42dc81

File details

Details for the file cchardet-2.1.1-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.1.1-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 69311e20183056b45313475cc05c3e968faa2b14a466a6b0c23780645a462afe
MD5 04ca204c697a01d37b1166d8c58420ce
BLAKE2b-256 a86e993de1f94421ae69bfbe5a4e011d94fc93e9d1fb766e1deff6f428084608

File details

Details for the file cchardet-2.1.1-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.1.1-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 feda07443d732d86c9821671a898107b96ceb00462f405ec1dc08a353a9ddab0
MD5 f29428bfbbde1f4e7a07210a9947e0a2
BLAKE2b-256 ca4d06a3c2618164753deec2b749d1a812b6fb8748fb75bab8210359b5c3e90b

File details

Details for the file cchardet-2.1.1-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 823a981ba75fe8c12a0a0259eb80ec3a657273559f6d7445ba6fe2d2b061c8f9
MD5 1de1507f4c066cd94b65f18ec8cb245a
BLAKE2b-256 feb4e71bd76e37ad9fab2a0b89acd2fadc19d59b0a391db310b9c30b4d7c7983

File details

Details for the file cchardet-2.1.1-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b94a65d3a8cc900058e6aaedc0dde9c99ffe436d8670d156784d7b561b874cf5
MD5 f6e7d33dc1afc272e2cea9730188a7f5
BLAKE2b-256 f2b89b6d5f165cd62efe34938e0ab4d7d17ef6bffe8fe0e15feb19f1cbe2723f

File details

Details for the file cchardet-2.1.1-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 f4e3d0d9a0113cdfbc2fafa995674c1c49ed4166543b454945ca44d6e2148935
MD5 ac01ce5b8ab8b44cffb412730c322cc7
BLAKE2b-256 642a0e3796f3af0924e157b29d0224533db65cd032ec96aa1dafb545433f1861

File details

Details for the file cchardet-2.1.1-cp27-cp27m-win32.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 7187a01130b838cea449904f3aa5c0bee0609fcc0f5f667f4ce08ea99d102ddc
MD5 2da112509c7b0bb96b43727d7e1e6fc1
BLAKE2b-256 d0719841419b316232e39ff9bd7d4b49295c0431e6789635cefb0a381c3d5f28

File details

Details for the file cchardet-2.1.1-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a62b29c8c5a41f5ae95f620746d6db03b86fb259340fd991c9a608aabc60a275
MD5 ad334adb1d988e687b682eadd7664b42
BLAKE2b-256 4021e682c89b09c8e08ad00d73ddfb716ee570859dcaf7fe907255dba289413f

File details

Details for the file cchardet-2.1.1-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.1.1-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 e47d90a8484cc425ca4c13a204901e24e2d0b3e206deef7cf391c10639d33d6b
MD5 1822c6f86310c4667bda01b03b0c69b5
BLAKE2b-256 4e6bcaea582ded84e0c86119adb405ad0ef05e08f6d98db58c7287362f97f850
