cChardet is a high-speed universal character encoding detector.

Project description

cChardet

Work In Progress Branch

cChardet is a high-speed universal character encoding detector, built as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

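# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detect() works on bytes, not decoded text.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# detect() returns a dict with "encoding" and "confidence" keys.
result = chardet.detect(msg)
print(result)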
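
Once an encoding has been detected, the usual next step is to decode the bytes. The sketch below shows one way to do that; decode_detected and its fallback parameter are hypothetical names introduced here, not part of cChardet. It assumes the detect callable returns a dict with an "encoding" key, as in the example above, and that the reported name may be None or unknown to Python's codec registry.

```python
def decode_detected(data, detect, fallback="utf-8"):
    """Decode bytes using the encoding reported by a detect() callable.

    `detect` is any function that takes bytes and returns a dict with an
    "encoding" key (for example cchardet.detect). This helper is a
    hypothetical illustration, not part of the cChardet API.
    """
    result = detect(data)
    # The detector may report None when it cannot decide.
    encoding = result.get("encoding") or fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Either Python has no codec under the reported name, or the guess
        # was wrong: decode with the fallback and replace undecodable bytes.
        return data.decode(fallback, errors="replace")
```

With cChardet installed, decode_detected(msg, chardet.detect) would decode the sample read in the example above. Passing the detector in as an argument keeps the helper usable with chardet as well.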

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py
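
The contents of tests/bench.py are not reproduced here. As a rough sketch of how a calls-per-second figure like the ones below might be measured, assuming Python 3 and a detect callable such as cchardet.detect, one could time repeated calls on the same input. The function name and the repeat parameter are illustrative assumptions.

```python
import time


def calls_per_second(detect, data, repeat=100):
    """Return how many detect(data) calls complete per second.

    A hypothetical timing helper; it is not the project's bench.py.
    """
    start = time.perf_counter()
    for _ in range(repeat):
        detect(data)
    elapsed = time.perf_counter() - start
    return repeat / elapsed
```

For a slow detector, a small repeat value keeps the run short; for a fast one, a larger value gives a steadier number.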

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600MHz 16GB

Platform: Ubuntu 16.04 amd64

Python 2.7.12

Library     Request (call/s)
chardet     0.26
cchardet    1341.81

Python 3.6.0

Library     Request (call/s)
chardet     0.26
cchardet    1472.43
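
To put the figures above in relative terms, the short calculation below divides the cchardet rate by the chardet rate for each Python version. The numbers are copied from the results; the dictionary layout and the speedup helper are only illustrative.

```python
# Calls per second, copied from the benchmark results above.
results = {
    "Python 2.7.12": {"chardet": 0.26, "cchardet": 1341.81},
    "Python 3.6.0": {"chardet": 0.26, "cchardet": 1472.43},
}


def speedup(row):
    """Ratio of cchardet throughput to chardet throughput."""
    return row["cchardet"] / row["chardet"]


for version, row in results.items():
    print("%s: about %.0fx" % (version, speedup(row)))
# Python 2.7.12: about 5161x
# Python 3.6.0: about 5663x
```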

LICENSE

See COPYING file.

CHANGES

2.0a4 (2017-04-05)

  • Update uchardet repo (Fix buffer overflow)

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests
