cChardet

Work In Progress Branch

cChardet is a high-speed universal character encoding detector, implemented as a binding to uchardet.

Supported Languages/Encodings

  • International (Unicode)

    • UTF-8

    • UTF-16BE / UTF-16LE

    • UTF-32BE / UTF-32LE / X-ISO-10646-UCS-4-34121 / X-ISO-10646-UCS-4-21431

  • Arabic

    • ISO-8859-6

    • WINDOWS-1256

  • Bulgarian

    • ISO-8859-5

    • WINDOWS-1251

  • Chinese

    • ISO-2022-CN

    • BIG5

    • EUC-TW

    • GB18030

    • HZ-GB-2312

  • Croatian

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Czech

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Danish

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • English

    • ASCII

  • Esperanto

    • ISO-8859-3

  • Estonian

    • ISO-8859-4

    • ISO-8859-13

    • Windows-1252

    • Windows-1257

  • Finnish

    • ISO-8859-1

    • ISO-8859-4

    • ISO-8859-9

    • ISO-8859-13

    • ISO-8859-15

    • WINDOWS-1252

  • French

    • ISO-8859-1

    • ISO-8859-15

    • WINDOWS-1252

  • German

    • ISO-8859-1

    • WINDOWS-1252

  • Greek

    • ISO-8859-7

    • WINDOWS-1253

  • Hebrew

    • ISO-8859-8

    • WINDOWS-1255

  • Hungarian

    • ISO-8859-2

    • WINDOWS-1250

  • Irish Gaelic

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Italian

    • ISO-8859-1

    • ISO-8859-3

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Japanese

    • ISO-2022-JP

    • SHIFT_JIS

    • EUC-JP

  • Korean

    • ISO-2022-KR

    • EUC-KR / UHC

  • Lithuanian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Latvian

    • ISO-8859-4

    • ISO-8859-10

    • ISO-8859-13

  • Maltese

    • ISO-8859-3

  • Polish

    • ISO-8859-2

    • ISO-8859-13

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • MAC-CENTRALEUROPE

  • Portuguese

    • ISO-8859-1

    • ISO-8859-9

    • ISO-8859-15

    • WINDOWS-1252

  • Romanian

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

  • Russian

    • ISO-8859-5

    • KOI8-R

    • WINDOWS-1251

    • MAC-CYRILLIC

    • IBM866

    • IBM855

  • Slovak

    • Windows-1250

    • ISO-8859-2

    • IBM852

    • MAC-CENTRALEUROPE

  • Slovene

    • ISO-8859-2

    • ISO-8859-16

    • Windows-1250

    • IBM852

    • M

Example

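# -*- coding: utf-8 -*-
import cchardet as chardet

# Read the sample as raw bytes; detection works on undecoded data.
with open(r"src/tests/samples/wikipediaJa_One_Thousand_and_One_Nights_SJIS.txt", "rb") as f:
    msg = f.read()

# Detect the encoding of the bytes and print the result.
result = chardet.detect(msg)
print(result)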
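
The result printed above can drive the actual decoding step. The following is a minimal sketch, assuming `detect` returns a dict with `encoding` and `confidence` keys (as chardet does) and that `encoding` may be `None` when detection fails. The names `decode_detected`, `min_confidence`, and `fallback` are hypothetical and not part of cChardet.

```python
def decode_detected(data, detect, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using the encoding reported by a chardet-style detect()."""
    result = detect(data)
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    # Fall back when detection failed or the detector is not confident enough.
    if not encoding or confidence < min_confidence:
        encoding = fallback
    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # The detector may report a name that Python's codec registry lacks.
        return data.decode(fallback, errors="replace")


if __name__ == "__main__":
    try:
        import cchardet
    except ImportError:
        cchardet = None  # cChardet is not installed; skip the demo.
    if cchardet is not None:
        sample = "千夜一夜物語".encode("shift_jis")
        print(decode_detected(sample, cchardet.detect))
```

Passing the detector as an argument keeps the helper usable with either `cchardet.detect` or `chardet.detect`. The `LookupError` branch matters because some names in the list of supported encodings may not exist as Python codecs.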

Benchmark

$ cd src/
$ pip install chardet
$ python tests/bench.py
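
The script `tests/bench.py` is not reproduced here. As a rough illustration of how a calls-per-second figure can be measured, here is a minimal sketch. The function `calls_per_second` and its `repeat` parameter are hypothetical, and this is not the project's benchmark code.

```python
import time


def calls_per_second(func, data, repeat=100):
    """Return how many times per second func(data) completes over repeat calls."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(data)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a very coarse clock.
    return repeat / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    sample = ("encoding detection sample " * 1000).encode("utf-8")
    try:
        import cchardet
        print("cchardet: %.2f call/s" % calls_per_second(cchardet.detect, sample))
    except ImportError:
        print("cchardet is not installed")
```

The same function can be pointed at `chardet.detect` to compare the two libraries on identical input.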

Results

CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz

RAM: DDR3 1600 MHz, 16 GB

Platform: Ubuntu 16.04 amd64

Python 2.7.12

            Request (call/s)
chardet     0.26
cchardet    1341.81

Python 3.6.0

            Request (call/s)
chardet     0.26
cchardet    1472.43

LICENSE

See COPYING file.

Contact

CHANGES

2.0a3 (2017-03-29)

  • Implement UniversalDetector (like chardet)

2.0a2 (2017-03-28)

  • Update uchardet repo (Fix memory leak)

2.0a1 (2017-03-28)

1.1.3 (2017-02-26)

  • Support AArch64

1.1.2 (2017-01-08)

  • Support Python 3.6

1.1.1 (2016-11-05)

  • Use len() function (9e61cb9e96b138b0d18e5f9e013e144202ae4067)

  • Remove detect function in _cchardet.pyx (25b581294fc0ae8f686ac9972c8549666766f695)

  • Support manylinux1 wheel

1.1.0 (2016-10-17)

  • Add Detector class

  • Improve unit tests

Download files

Source Distribution

  • cchardet-2.0a3.tar.gz (603.3 kB): Source

Built Distributions

  • cchardet-2.0a3-cp36-cp36m-win_amd64.whl (92.0 kB): CPython 3.6m, Windows x86-64
  • cchardet-2.0a3-cp36-cp36m-win32.whl (88.7 kB): CPython 3.6m, Windows x86
  • cchardet-2.0a3-cp36-cp36m-manylinux1_x86_64.whl (200.0 kB): CPython 3.6m
  • cchardet-2.0a3-cp36-cp36m-manylinux1_i686.whl (191.3 kB): CPython 3.6m
  • cchardet-2.0a3-cp35-cp35m-win_amd64.whl (92.0 kB): CPython 3.5m, Windows x86-64
  • cchardet-2.0a3-cp35-cp35m-win32.whl (88.7 kB): CPython 3.5m, Windows x86
  • cchardet-2.0a3-cp35-cp35m-manylinux1_x86_64.whl (199.9 kB): CPython 3.5m
  • cchardet-2.0a3-cp35-cp35m-manylinux1_i686.whl (191.1 kB): CPython 3.5m
  • cchardet-2.0a3-cp34-cp34m-win_amd64.whl (88.9 kB): CPython 3.4m, Windows x86-64
  • cchardet-2.0a3-cp34-cp34m-win32.whl (86.7 kB): CPython 3.4m, Windows x86
  • cchardet-2.0a3-cp34-cp34m-manylinux1_x86_64.whl (200.3 kB): CPython 3.4m
  • cchardet-2.0a3-cp34-cp34m-manylinux1_i686.whl (191.6 kB): CPython 3.4m
  • cchardet-2.0a3-cp27-cp27mu-manylinux1_x86_64.whl (197.8 kB): CPython 2.7mu
  • cchardet-2.0a3-cp27-cp27mu-manylinux1_i686.whl (188.9 kB): CPython 2.7mu
  • cchardet-2.0a3-cp27-cp27m-win_amd64.whl (88.5 kB): CPython 2.7m, Windows x86-64
  • cchardet-2.0a3-cp27-cp27m-win32.whl (86.2 kB): CPython 2.7m, Windows x86
  • cchardet-2.0a3-cp27-cp27m-manylinux1_x86_64.whl (197.8 kB): CPython 2.7m
  • cchardet-2.0a3-cp27-cp27m-manylinux1_i686.whl (188.9 kB): CPython 2.7m

File details

Details for the file cchardet-2.0a3.tar.gz.

File metadata

  • Download URL: cchardet-2.0a3.tar.gz
  • Upload date:
  • Size: 603.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for cchardet-2.0a3.tar.gz
Algorithm Hash digest
SHA256 233b89281b30a645ff220e6ae642279c8a4d2aea85e5d71f7ebd13ec6ad87082
MD5 948afa6fc3d5ef0ce8dc320b6538cfc4
BLAKE2b-256 5931edb87576473b75e0f0ce5b2266e2fa8b7224ebe52b2843a5dfd72432790e
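
A downloaded archive can be checked against the SHA256 digest listed above. This is a minimal sketch using only the standard library. The helper names `sha256_of_file` and `matches_digest` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest listed for cchardet-2.0a3.tar.gz.
EXPECTED_SHA256 = "233b89281b30a645ff220e6ae642279c8a4d2aea85e5d71f7ebd13ec6ad87082"


def matches_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the archive was saved to the current directory, `matches_digest("cchardet-2.0a3.tar.gz")` returns `True` only when the file is byte-for-byte identical to the published one.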

File details

Details for the file cchardet-2.0a3-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0a3-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 fc786f31a400eaf03661bfc91f3b2dffe5beb7eef647f5c218dbfa9868a5e8bd
MD5 6a52a2a0d0c32fc419144e7b679db364
BLAKE2b-256 383cf6a48a147e5ce0690096a8ac3c1c6bf6b0b31af829f04045d82b765779fa

File details

Details for the file cchardet-2.0a3-cp36-cp36m-win32.whl.

File hashes

Hashes for cchardet-2.0a3-cp36-cp36m-win32.whl
Algorithm Hash digest
SHA256 dd1ec4f8975279d2d1b693904a94b8dd729726c57d5a5de1053e8f6e75e65e0d
MD5 4e37da428caa9ac65701b17f87d86da5
BLAKE2b-256 277e844fd8918cb56a56f113ec1746a15849cf64f9e4277589356a28d15c58b7

File details

Details for the file cchardet-2.0a3-cp36-cp36m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0a3-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 e9dc04e5f777f8c24cf6d7cd920ff6e3ff905265e7f8e76a48f04308775e4160
MD5 00d076c32695022dac3c4af9a6dff00f
BLAKE2b-256 0accefc2c353beccf6b171588d86801eebc779548e1319ed917e986dfe7e2836

File details

Details for the file cchardet-2.0a3-cp36-cp36m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0a3-cp36-cp36m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 d49a696042e25e85dd4da17ab807bc5f4388849f9a8de4e7625e78d9824e3cea
MD5 5aec1e756e9fa1ee88e62d16048daa38
BLAKE2b-256 6d79b595142460773bca207faf461560de058a929820aaab60afc46e65babb7a

File details

Details for the file cchardet-2.0a3-cp35-cp35m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0a3-cp35-cp35m-win_amd64.whl
Algorithm Hash digest
SHA256 bb6fbc58cf34c2b350d18b5c39799626f5529f959d26386d5218fbc07e510453
MD5 8b8f24c2d7707aad1f992c69dea14b30
BLAKE2b-256 738371cc56f6af476ae82d48caacf857d57b14ef8543d03f03255281ab16adde

File details

Details for the file cchardet-2.0a3-cp35-cp35m-win32.whl.

File hashes

Hashes for cchardet-2.0a3-cp35-cp35m-win32.whl
Algorithm Hash digest
SHA256 4fb211d71eda7ae34730bd859d524e2a20347f416907b09e4c0f7610348253cc
MD5 17849c73e1cc1bedc7189f569055c1ff
BLAKE2b-256 ee1cf66587b276260b21a63361cfe24af016288c355e9d2e00f283dd41364f59

File details

Details for the file cchardet-2.0a3-cp35-cp35m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0a3-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 3d295636acef2ad01d2243d9db5cbd7196c8ea29c22dc4ec2e035ee452cff3b6
MD5 7ab6e95a4ede12bd74d773cc05ff9a01
BLAKE2b-256 e8fbf52ee5cc800fa97347d7c69bb1f89519c6aed63c92471cecb6dd1dd68108

File details

Details for the file cchardet-2.0a3-cp35-cp35m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0a3-cp35-cp35m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 b3edb6c97efe9390f8318ce0bc116dd37c0b4effdf009edbc6d1d76d4dbe2094
MD5 15e19844860679aee19ce193e4705722
BLAKE2b-256 a0be6a4d138af2e836d10ec81a8f14d58f672600b425197948a6552bdd50c0dd

File details

Details for the file cchardet-2.0a3-cp34-cp34m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0a3-cp34-cp34m-win_amd64.whl
Algorithm Hash digest
SHA256 17b43800a5442731e9086f08a23867051258d216e425aaa1712be2fdfef334ac
MD5 39f6fc2adb032df9ab35fd78a9005f15
BLAKE2b-256 cba5c349516a363e5b4ad1e12c254b6f53a7c9e2e796ec647b760c795ae53216

File details

Details for the file cchardet-2.0a3-cp34-cp34m-win32.whl.

File hashes

Hashes for cchardet-2.0a3-cp34-cp34m-win32.whl
Algorithm Hash digest
SHA256 18cf651da1a05c4bc30b4d0a79df79ebf378ede354770bbda751dff20fde3ac2
MD5 42caf66c4d42399cd545e8787ac5e787
BLAKE2b-256 f7e579e89ec25bfb366b6c0f1980260aea6a171ae6bcc7afa6ef204e35fe41f7

File details

Details for the file cchardet-2.0a3-cp34-cp34m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0a3-cp34-cp34m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 41c4447c164a2a74c456ab8e13bb9f8d94bef1e7804d6be894f75a7f38703080
MD5 e685af51f946291b9fc6d64157dd2687
BLAKE2b-256 b213bccc49e820e1283655f06889cf1eb60ea50611a61b1d485ac3d65c9798bc

File details

Details for the file cchardet-2.0a3-cp34-cp34m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0a3-cp34-cp34m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 de3c9f373c9ba30db5884fa1b64a694b28455cefb844f8153ffafb8f8b64d6ea
MD5 c24deb671779e5f82eddb38746371b9d
BLAKE2b-256 6d8f675a292471aed4fcd9a71abdf8dcef73d7298aa2d2377cc7b325665103d6

File details

Details for the file cchardet-2.0a3-cp27-cp27mu-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 5992b143f2407f461a7abbea9c9b44cb7e36f7e9e267ad401a4b637d5e630ed7
MD5 c757868809f2927f29d458ef616fe871
BLAKE2b-256 c2676f08d67a346809e647122797a7e02aa999b7d45925b9e2396bdef67404e9

File details

Details for the file cchardet-2.0a3-cp27-cp27mu-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27mu-manylinux1_i686.whl
Algorithm Hash digest
SHA256 52805dc36542270cb42c62a17977448ae0fddf173cc37364a154947131533133
MD5 9e2e34f5600aa4850c82bb3117e995e2
BLAKE2b-256 473057fcabfd69ccc038375f438405345ad0902b73ba76aa5f9fbfeadccde5b0

File details

Details for the file cchardet-2.0a3-cp27-cp27m-win_amd64.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27m-win_amd64.whl
Algorithm Hash digest
SHA256 9d7edb8046a0767918271b412ecb166547009f8a8764ad62facc4c392b2ff334
MD5 7c51e61eaa21e1a3fda8e22c4acbf236
BLAKE2b-256 862d81972e8e3975823e35e2e91df886dbb868c400eb419d837d7e2464f58c03

File details

Details for the file cchardet-2.0a3-cp27-cp27m-win32.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27m-win32.whl
Algorithm Hash digest
SHA256 b2fc980e0e7a24be21adcf0f94a759c01f0cb05ab9effee06de1ad90e8f7c93f
MD5 07bfb91056c507603d0ca754aa199470
BLAKE2b-256 56d3046cc7d6568ef2cf071094c9c52181fae1df4164d0709e4248b4e14a8fd7

File details

Details for the file cchardet-2.0a3-cp27-cp27m-manylinux1_x86_64.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0cb9ff0cc603aa6686f6e03d18d86c1d320d0dc9112d4cdfaa15fa92a8c5af65
MD5 37476757cabe9cddea34fb197c517b6b
BLAKE2b-256 7f778699fdb54446a9b3ebd8dd7b25ffabce302448430d55e957708d1ecab4f4

File details

Details for the file cchardet-2.0a3-cp27-cp27m-manylinux1_i686.whl.

File hashes

Hashes for cchardet-2.0a3-cp27-cp27m-manylinux1_i686.whl
Algorithm Hash digest
SHA256 158e4c64729f65335fe69dde89cc54467ee42e8a1f815ac311f1607df5e19ecb
MD5 a90b216f97b8ff5ec61858b380cb56da
BLAKE2b-256 547379581a760e5cab04e07bc9d0fd1fcb68eb79396545bb1c149da041de97f7
