cChardet is a high-speed universal character encoding detector.

Project description

cChardet

Work In Progress Branch

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M
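The names in the list above are the labels the detector reports, and not every one of them necessarily maps to a codec that ships with Python. The following is a minimal sketch, using only the standard library, of how a reported name might be checked before it is passed to `bytes.decode`. The helper name `python_codec_for` is hypothetical and not part of cChardet.

```python
import codecs


def python_codec_for(name):
    """Return Python's canonical codec name for `name`, or None if unknown.

    `python_codec_for` is an illustrative helper, not a cChardet function.
    """
    try:
        # codecs.lookup() resolves aliases and ignores case.
        return codecs.lookup(name).name
    except LookupError:
        # Python has no codec registered under this label.
        return None


if __name__ == "__main__":
    for label in ("SHIFT_JIS", "WINDOWS-1252", "X-ISO-10646-UCS-4-34121"):
        print(label, "->", python_codec_for(label))
```

A `None` result means the bytes cannot be decoded directly under that label, so the caller would need a fallback encoding or a third-party codec.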

Example

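# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample in binary mode: detect() works on raw bytes, not text.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict holding the guessed encoding and a confidence score.
result = chardet.detect(msg)
print(result)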
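Detection is usually only the first step, because the caller typically wants decoded text. The sketch below shows one way the result dict (with its `encoding` and `confidence` keys) could be used to decode the bytes. The function `decode_bytes`, the `min_confidence` threshold of 0.5, and the UTF-8 fallback are all assumptions made for illustration, not recommendations from the cChardet project.

```python
def decode_bytes(raw, detect=None, min_confidence=0.5, fallback="utf-8"):
    """Decode `raw` with the detected encoding, falling back when unsure.

    `decode_bytes`, `min_confidence` and `fallback` are illustrative names;
    they are not part of cChardet.
    """
    if detect is None:
        # Imported lazily so the helper can be exercised with a stub detector.
        import cchardet
        detect = cchardet.detect

    result = detect(raw)
    encoding = result.get("encoding")
    # Treat a missing confidence value as zero.
    confidence = result.get("confidence") or 0.0

    if not encoding or confidence < min_confidence:
        encoding = fallback

    try:
        return raw.decode(encoding, errors="replace")
    except LookupError:
        # The reported label is not a codec Python knows about.
        return raw.decode(fallback, errors="replace")
```

With cChardet installed, `decode_bytes(msg)` would decode the sample read in the example above. Passing a different callable as `detect` makes it easy to swap in chardet for comparison.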

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600 MHz 16 GB

Platform: Ubuntu 16.04 amd64

Python 2.7.12

Library     Request (call/s)
chardet     0.26
cchardet    1341.81

Python 3.6.0

Library     Request (call/s)
chardet     0.26
cchardet    1472.43
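The figures above are calls per second for each library's `detect` function. The repository's `tests/bench.py` is the authoritative script. What follows is only a rough sketch of how such a number could be measured on Python 3. The function name `calls_per_second` and the `repeat` count are assumptions for illustration.

```python
import time


def calls_per_second(detect, payload, repeat=100):
    """Return how many times per second `detect(payload)` completes.

    A hypothetical timing helper; it is not the project's bench.py.
    """
    start = time.perf_counter()
    for _ in range(repeat):
        detect(payload)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very coarse timer.
    return repeat / elapsed if elapsed > 0 else float("inf")


# Example usage (requires both libraries to be installed):
#   import chardet, cchardet
#   with open("sample.txt", "rb") as f:   # "sample.txt" is a placeholder path
#       payload = f.read()
#   print("chardet ", calls_per_second(chardet.detect, payload, repeat=5))
#   print("cchardet", calls_per_second(cchardet.detect, payload))
```

Results depend heavily on the input size and encoding, so numbers from this sketch are not directly comparable with the table above.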

LICENSE

See COPYING file.

Contact

CHANGES

2.0.0 (2017-04-06)

  • Improve tests

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests

Download files

Source Distribution

  • cchardet-2.0.0.tar.gz (600.8 kB): Source

Built Distributions

  • cchardet-2.0.0-cp36-cp36m-win_amd64.whl (92.0 kB): CPython 3.6m, Windows x86-64
  • cchardet-2.0.0-cp36-cp36m-win32.whl (88.7 kB): CPython 3.6m, Windows x86
  • cchardet-2.0.0-cp36-cp36m-manylinux1_x86_64.whl (200.1 kB): CPython 3.6m
  • cchardet-2.0.0-cp36-cp36m-manylinux1_i686.whl (191.3 kB): CPython 3.6m
  • cchardet-2.0.0-cp35-cp35m-win_amd64.whl (92.0 kB): CPython 3.5m, Windows x86-64
  • cchardet-2.0.0-cp35-cp35m-win32.whl (88.7 kB): CPython 3.5m, Windows x86
  • cchardet-2.0.0-cp35-cp35m-manylinux1_x86_64.whl (199.9 kB): CPython 3.5m
  • cchardet-2.0.0-cp35-cp35m-manylinux1_i686.whl (191.2 kB): CPython 3.5m
  • cchardet-2.0.0-cp34-cp34m-win_amd64.whl (89.0 kB): CPython 3.4m, Windows x86-64
  • cchardet-2.0.0-cp34-cp34m-win32.whl (86.8 kB): CPython 3.4m, Windows x86
  • cchardet-2.0.0-cp34-cp34m-manylinux1_x86_64.whl (200.3 kB): CPython 3.4m
  • cchardet-2.0.0-cp34-cp34m-manylinux1_i686.whl (191.6 kB): CPython 3.4m
  • cchardet-2.0.0-cp27-cp27mu-manylinux1_x86_64.whl (197.8 kB): CPython 2.7mu
  • cchardet-2.0.0-cp27-cp27mu-manylinux1_i686.whl (189.0 kB): CPython 2.7mu
  • cchardet-2.0.0-cp27-cp27m-win_amd64.whl (88.5 kB): CPython 2.7m, Windows x86-64
  • cchardet-2.0.0-cp27-cp27m-win32.whl (86.3 kB): CPython 2.7m, Windows x86
  • cchardet-2.0.0-cp27-cp27m-manylinux1_x86_64.whl (197.8 kB): CPython 2.7m
  • cchardet-2.0.0-cp27-cp27m-manylinux1_i686.whl (189.0 kB): CPython 2.7m

File details

Details for the file cchardet-2.0.0.tar.gz.

File metadata

  • Download URL: cchardet-2.0.0.tar.gz
  • Size: 600.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for cchardet-2.0.0.tar.gz
Algorithm Hash digest
SHA256 9028658dc69c87735f4af81e800a96418f88092a49123e5fc517adc03e130d6f
MD5 4cbe9f264a1af3b54e6ef75dfcf86264
BLAKE2b-256 5a876678a0f74397fb08008cb05ed8c7f2a0d97233b037619d6f580d96ef2a23

See more details on using hashes here.
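As a hedged illustration of how the published digests can be used, the sketch below computes the SHA256 of a downloaded file with the standard library and compares it with an expected hex string. The helper name `sha256_matches` and the chunk size are assumptions for illustration. The expected value would be copied from the "File hashes" entry for the file in question.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the file at `path` has the given SHA256 hex digest.

    `sha256_matches` is an illustrative helper, not part of pip or cChardet.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Example usage, assuming the sdist was downloaded to the current directory:
#   ok = sha256_matches(
#       "cchardet-2.0.0.tar.gz",
#       "9028658dc69c87735f4af81e800a96418f88092a49123e5fc517adc03e130d6f",
#   )
```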

File details

Details for the file cchardet-2.0.0-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0.0-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 0e7b6d8956dccffae81cd73cfad659703b68e1b762a642488b37b8fcd1b760a4
MD5 d6e61e8d45c319dc191c2b153f0ebd2c
BLAKE2b-256 44e924ecd522260d052c4de5040dd04675deafa8f44845c70ca5c03668b54e03

File details

Details for the file cchardet-2.0.0-cp36-cp36m-win32.whl.

File hashes

Hashes for cchardet-2.0.0-cp36-cp36m-win32.whl
Algorithm Hash digest
SHA256 df1c5ce7c2d390423c763029056952cc9ebfae834886fcd78e2a9d01570ea5be
MD5 012c4d2dceaa5e2e639c5e3f6320413d
BLAKE2b-256 a088bddd7e4f472c3d03f417366a7b8d96d9b249d47b5e569e8fa1a85014ea5d

File details

Details for the file cchardet-2.0.0-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 9a8d86713a7ff8edfff5c57541464e5716adeeba85ea2a176e7fbbb5d208ad31
MD5 eec33cab43b98583ac0123ff87abeab7
BLAKE2b-256 7aafb4fa3eba61963adb163b843a358cc73c31fc0535dcfd3489d2fab6a0b20f

File details

Details for the file cchardet-2.0.0-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0.0-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 8967f47d402c55c7621dda0eec5e71d7469f2f044de560252926eb3b57ec23f5
MD5 90fcedfa8b6c431d9560102354a53dff
BLAKE2b-256 bbc7c3bb3b368f6012f375add4f581a741a8add188b802e936f42d6f028607dc

File details

Details for the file cchardet-2.0.0-cp35-cp35m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0.0-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 34665343d6304fd81afc1f5a9525880d19ae71fe0bcd2a461583510041443b0e
MD5 f2964ad454f9dae0e807a594f37d80c7
BLAKE2b-256 7217040c0a32704b12e2f255a7aa4e8a9a46c099e9d59a27bc91b5c503f9185f

File details

Details for the file cchardet-2.0.0-cp35-cp35m-win32.whl.

File hashes

Hashes for cchardet-2.0.0-cp35-cp35m-win32.whl
Algorithm Hash digest
SHA256 f8c0e5c84b06d6b678cf2170c9a74e73bca5d6282c05617cef61928276d0aa97
MD5 8e7836be1d398d9fe603f4c4261ccfb2
BLAKE2b-256 28a20197bd1cdb238a80a010c38296df9d32568cb7066d5aefbe353b82f9692b

File details

Details for the file cchardet-2.0.0-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6b3da8f4aa2a587b23713710abebb425583a0b1617af30271201b08f4136a332
MD5 cd369b6015e620f38754645f913873e7
BLAKE2b-256 a1971310a825fc3bd8f1779d267f5b06205f78631036eec1fd57fcebb27d1fd3

File details

Details for the file cchardet-2.0.0-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0.0-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b11c8743b8b77850da94107c4c073498feb34527fa762fbba5b567c22a301917
MD5 92b4cfaea7261b02f59bbb9502a91f2e
BLAKE2b-256 7a8269ec657dab81384f9fd493f9e483fa40c11ae96bb8ae4e20f7b113f7802e

File details

Details for the file cchardet-2.0.0-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0.0-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 a9780943fe202ebb23fa604808f1b3c76c391d962fdb385bd8e699ee6284eeee
MD5 0cf11344f94383e2e56d1ddb439d0c34
BLAKE2b-256 349014de4a7db064d9cc11d076587e2346f5d5e140289e901a10d10f34d267af

File details

Details for the file cchardet-2.0.0-cp34-cp34m-win32.whl.

File hashes

Hashes for cchardet-2.0.0-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 f06fb13d5d9dbc2bfdad4449c216aa3a029a1762477d314d66de5736e6e853d6
MD5 938fe3e36120ff17467b3e17f651392b
BLAKE2b-256 28c2126bc4a9b3369b571b2aa4a068408d7dc379c100c1996824249120941bf3

File details

Details for the file cchardet-2.0.0-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0.0-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 7c5bf71aae5df7360222729ed5b5f5106bcd21454042924b61a76b2d0da1b53a
MD5 5298836b248f783333b7abf46537ad56
BLAKE2b-256 e1c8b8ce3d7d43de204372f0840d67b8f962167229d04a19f28eb6304663732d

File details

Details for the file cchardet-2.0.0-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0.0-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 2e315dd2221c60246acd12c1a9cc74d38bc42efa842c1fce75386b1f71f5e3a0
MD5 7ffd739d45a5ba1f71bd51f8b328adda
BLAKE2b-256 c0d55fe7575473cb0d395f66570ec6efb98c24d67b6714c774666afe2f6338f6

File details

Details for the file cchardet-2.0.0-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c5f67dc633258843e252f47d4979ad91e092d80097868bdf64d9c1cc3a3cf120
MD5 630a1d434c200bda4154dff95f49ee52
BLAKE2b-256 4ea38b48365927d17ea09f0d201b5bf95325c5dbb9fb026266d57d704c2fd556

File details

Details for the file cchardet-2.0.0-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 776df64e37bd2868d2d8f85ddc8bac039bda25f14cf6f144c532e39f8d907f86
MD5 ed2fe00ee4ae2ac6039b04a7ee2f56c9
BLAKE2b-256 baa824d714d58c38ebaed34bab9efbe5fcebfde09ad9d425c5872ddfadd025ae

File details

Details for the file cchardet-2.0.0-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 5bee513cbe4dbbee42de5fd69950c26133f3a92889d7787b34c8ac01dc9051da
MD5 bbd49a9529c622487bb4d1ce9aee40d5
BLAKE2b-256 c67119706c5a09668be7af8432e5121462b4507372c0f6273bd66eed105093a5

File details

Details for the file cchardet-2.0.0-cp27-cp27m-win32.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 0ea55cbf777c350e0e3cd3f13676f32b18f1865c5458174cf45010d4e81e2b87
MD5 18062a9fe1c9f11c510622e4b129d00c
BLAKE2b-256 8cb7b374bed99a712a780bc44d8abd9205a1d593045033f24b0c43ff7583dbc4

File details

Details for the file cchardet-2.0.0-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e174b0114185a6d2b487b5fd2a1adb18075c39c4a1dc528d02c5f2a3952321ab
MD5 ff979d13ed230b1593d29486fd2d012b
BLAKE2b-256 ff2709088a0fbc4bb7527f16f381070e41357bfa21eb6c6c97df66d65728c631

File details

Details for the file cchardet-2.0.0-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0.0-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 10f8c0877c7a465401a8dc3a3c93eb73fc455b9e075242dc9aff474a087177a9
MD5 cd72c3a630bcf8018172e61cabba918a
BLAKE2b-256 0fbe929162ca9fee3399eda89e59629136c16e1a88a058202a1b487dff8c7d05
