cChardet is a high-speed universal character encoding detector.

Project description

cChardet

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

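# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes: detection works on bytes, not decoded text.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# Detect the encoding of the whole buffer and print the result.
result = chardet.detect(msg)
print(result)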
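
The detection result is usually consumed by decoding the bytes with the reported encoding. The following is a minimal sketch, assuming the result is a chardet-style dictionary with an "encoding" key that may be None when nothing is detected. The helper `decode_bytes` and its `fallback` parameter are hypothetical names introduced for illustration and are not part of cChardet.

```python
try:
    import cchardet
except ImportError:  # lets the sketch load even where cchardet is not installed
    cchardet = None


def decode_bytes(data, detect, fallback="utf-8"):
    """Decode `data` using the encoding reported by `detect`.

    `detect` is any chardet-style callable returning a dict with an
    "encoding" key. Returns a (text, encoding_used) tuple.
    """
    encoding = detect(data).get("encoding") or fallback
    try:
        return data.decode(encoding), encoding
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: decode leniently instead.
        return data.decode(fallback, errors="replace"), fallback


if cchardet is not None:
    # Illustrative input only; short samples may be detected less reliably.
    sample = "これは文字コード判定のテストです。".encode("shift_jis")
    text, used = decode_bytes(sample, cchardet.detect)
    print(used)
```

Passing the detector in as a parameter keeps the helper usable with either cchardet or chardet, and the lenient fallback covers encoding names that Python has no codec for.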

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600 MHz 16 GB

Platform: Ubuntu 16.04 amd64

Python 2.7.13

Library            Request (call/s)
chardet v3.0.2     0.36
cchardet v2.0.1    1396.42

Python 3.6.1

Library            Request (call/s)
chardet v3.0.2     0.35
cchardet v2.0.1    1467.77
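
A call/s figure like the ones reported here can be approximated with a simple timing loop. The sketch below is not the project's tests/bench.py, whose contents are not shown; the function `calls_per_second`, the `min_seconds` parameter, and the sample text are assumptions made for illustration. Absolute numbers depend heavily on the input size and the machine.

```python
import importlib
import time


def calls_per_second(detect, data, min_seconds=0.2):
    """Call detect(data) repeatedly for at least `min_seconds`
    (which must be positive) and return the observed calls per second."""
    calls = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        detect(data)
        calls += 1
        elapsed = time.perf_counter() - start
    return calls / elapsed


# Illustrative sample; the real benchmark uses the project's own test files.
sample = ("これは文字コード判定のテストです。" * 100).encode("shift_jis")

for name in ("chardet", "cchardet"):
    try:
        module = importlib.import_module(name)
    except ImportError:
        print(name, "is not installed; skipping")
        continue
    rate = calls_per_second(module.detect, sample)
    print("{}: {:.2f} call/s".format(name, rate))
```

Timing for a fixed minimum duration rather than a fixed call count keeps the run short for a slow detector while still averaging over many calls for a fast one.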

LICENSE

See the COPYING file.

Contact

CHANGES

2.1.0 (2017-05-15)

2.0.1 (2017-04-25)

  • fix an issue where UTF-8 with a BOM would not be detected as UTF-8-SIG (fix #28)

  • pass NULL Byte to feed() / detect() (fix #27)

2.0.0 (2017-04-06)

  • Improve tests

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests

Download files

Source Distribution

  • cchardet-2.1.0.tar.gz (606.1 kB), Source

Built Distributions

  • cchardet-2.1.0-cp36-cp36m-win_amd64.whl (93.3 kB), CPython 3.6m, Windows x86-64
  • cchardet-2.1.0-cp36-cp36m-win32.whl (90.0 kB), CPython 3.6m, Windows x86
  • cchardet-2.1.0-cp36-cp36m-manylinux1_x86_64.whl (202.0 kB), CPython 3.6m
  • cchardet-2.1.0-cp36-cp36m-manylinux1_i686.whl (193.1 kB), CPython 3.6m
  • cchardet-2.1.0-cp35-cp35m-win_amd64.whl (93.3 kB), CPython 3.5m, Windows x86-64
  • cchardet-2.1.0-cp35-cp35m-win32.whl (90.0 kB), CPython 3.5m, Windows x86
  • cchardet-2.1.0-cp35-cp35m-manylinux1_x86_64.whl (201.9 kB), CPython 3.5m
  • cchardet-2.1.0-cp35-cp35m-manylinux1_i686.whl (192.9 kB), CPython 3.5m
  • cchardet-2.1.0-cp34-cp34m-win_amd64.whl (90.5 kB), CPython 3.4m, Windows x86-64
  • cchardet-2.1.0-cp34-cp34m-win32.whl (88.1 kB), CPython 3.4m, Windows x86
  • cchardet-2.1.0-cp34-cp34m-manylinux1_x86_64.whl (202.3 kB), CPython 3.4m
  • cchardet-2.1.0-cp34-cp34m-manylinux1_i686.whl (193.3 kB), CPython 3.4m
  • cchardet-2.1.0-cp27-cp27mu-manylinux1_x86_64.whl (199.9 kB), CPython 2.7mu
  • cchardet-2.1.0-cp27-cp27mu-manylinux1_i686.whl (190.9 kB), CPython 2.7mu
  • cchardet-2.1.0-cp27-cp27m-win_amd64.whl (90.2 kB), CPython 2.7m, Windows x86-64
  • cchardet-2.1.0-cp27-cp27m-win32.whl (87.7 kB), CPython 2.7m, Windows x86
  • cchardet-2.1.0-cp27-cp27m-manylinux1_x86_64.whl (199.9 kB), CPython 2.7m
  • cchardet-2.1.0-cp27-cp27m-manylinux1_i686.whl (190.9 kB), CPython 2.7m
