cChardet

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

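# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not on decoded text.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with "encoding" and "confidence" keys.
result = chardet.detect(msg)
print(result)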
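
The dictionary printed above can be used to decode the bytes. The following is a minimal sketch, assuming the result carries "encoding" and "confidence" keys as in chardet. The names `decode_detected`, `min_confidence`, and `fallback` are hypothetical and not part of cChardet. Some reported names (for example X-ISO-10646-UCS-4-34121) have no matching Python codec, so the sketch falls back instead of failing.

```python
def decode_detected(data, detect=None, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using the encoding reported by a chardet-style detector.

    `detect` is any callable that maps bytes to a dict with "encoding" and
    "confidence" keys. When omitted, cchardet.detect is used.
    """
    if detect is None:
        # Imported lazily so the helper can also be exercised with a stub.
        import cchardet
        detect = cchardet.detect

    result = detect(data)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0

    # Assumption: a missing or low-confidence guess is not worth trusting.
    if encoding is None or confidence < min_confidence:
        encoding = fallback

    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # Python has no codec registered under the reported name.
        return data.decode(fallback, errors="replace")
```

With cChardet installed, `decode_detected(msg)` would decode the sample read in the example. The 0.5 threshold is an arbitrary illustration and should be tuned to the data at hand.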

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600 MHz 16 GB

Platform: Ubuntu 16.04 amd64

Python 2.7.12

            Request (call/s)
chardet     0.26
cchardet    1341.81

Python 3.6.0

            Request (call/s)
chardet     0.26
cchardet    1472.43
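
The "Request (call/s)" figures are throughput numbers: how many detection calls complete per second. The contents of tests/bench.py are not reproduced here, so the following is only a sketch of one way such a figure could be measured. `calls_per_second` and `min_seconds` are hypothetical names, and the sketch uses `time.perf_counter`, which requires Python 3.

```python
import time


def calls_per_second(func, payload, min_seconds=0.2):
    """Call func(payload) repeatedly for about min_seconds and return calls/s."""
    if min_seconds <= 0:
        raise ValueError("min_seconds must be positive")

    calls = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        func(payload)
        calls += 1
        elapsed = time.perf_counter() - start

    # elapsed is at least min_seconds here, so the division is safe.
    return calls / elapsed
```

Passing `cchardet.detect` and `chardet.detect` with the same bytes would give two rates to compare. The absolute numbers depend on the hardware and on the sample file.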

LICENSE

See the COPYING file.

CHANGES

2.0.1 (2017-04-25)

  • Fix an issue where UTF-8 with a BOM was not detected as UTF-8-SIG (fix #28)

  • Pass NULL bytes through to feed() / detect() (fix #27)

2.0.0 (2017-04-06)

  • Improve tests

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests
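
The 2.0a3 entry adds a UniversalDetector "like chardet", and the 2.0.1 entry mentions its feed() method. A minimal sketch of incremental detection follows, assuming the class mirrors chardet's interface (`feed()`, `close()`, `done`, `result`). The helper name `detect_incrementally` is hypothetical.

```python
def detect_incrementally(chunks, detector=None):
    """Feed byte chunks to a chardet-style detector until it is confident.

    `detector` is any object with feed(), close(), done and result.
    When omitted, a cchardet.UniversalDetector is created.
    """
    if detector is None:
        # Imported lazily so the helper can also be exercised with a stub.
        import cchardet
        detector = cchardet.UniversalDetector()

    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:
            # The detector has enough evidence; skip the remaining input.
            break

    # close() finalizes the guess when the input ended before `done` was set.
    detector.close()
    return detector.result
```

For a large file, `chunks` could be an iterator such as `iter(lambda: f.read(4096), b"")` over a file opened in binary mode, which avoids reading the whole file into memory.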
