Universal encoding detector for Python 2 and 3

Chardet: The Universal Character Encoding Detector

Detects
  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)

  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)

  • EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)

  • EUC-KR, ISO-2022-KR (Korean)

  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)

  • ISO-8859-5, windows-1251 (Bulgarian)

  • ISO-8859-1, windows-1252 (Western European languages)

  • ISO-8859-7, windows-1253 (Greek)

  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)

  • TIS-620 (Thai)
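
The UTF-16 and UTF-32 variants in the list above differ in byte order, which a byte order mark (BOM) at the start of the data can reveal. The following is a minimal standard-library sketch of BOM sniffing for the common variants only. It is an illustration, not chardet's implementation; the function name `sniff_bom` and the returned labels are hypothetical.

```python
import codecs

# Check longer marks first: the UTF-32 LE mark begins with the
# same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data):
    """Return a label for the BOM at the start of data, or None.

    Hypothetical helper; the labels are chosen for this example.
    """
    for bom, label in _BOMS:
        if data.startswith(bom):
            return label
    return None


utf16_sample = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
utf32_sample = codecs.BOM_UTF32_LE + "hi".encode("utf-32-le")

print(sniff_bom(utf16_sample))
print(sniff_bom(utf32_sample))
print(sniff_bom(b"plain ascii"))
```

Data without a BOM yields `None` here, which is the case where statistical detection of the kind chardet provides becomes necessary.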

Requires Python 2.7 or 3.5+.

Installation

Install from PyPI:

pip install chardet
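
Once the package is installed, a quick way to check that it works is to call `chardet.detect` on a byte string. The sketch below assumes the result is a dictionary with `encoding` and `confidence` keys, as described in the chardet documentation; the sample bytes and the UTF-8 fallback are choices made for this example.

```python
import chardet

sample = b"hello world"

# detect() takes bytes and returns a dict describing its best guess.
result = chardet.detect(sample)
print(result["encoding"], result["confidence"])

# The reported encoding can be None when no guess is possible, so
# fall back to UTF-8 (an assumption of this example) before decoding.
encoding = result["encoding"] or "utf-8"
text = sample.decode(encoding, errors="replace")
print(text)
```

The confidence value is an estimate, so very short or ambiguous inputs may deserve a manual check.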

Documentation

User documentation is available at https://chardet.readthedocs.io/.

Command-line Tool

chardet comes with a command-line script which reports on the encodings of one or more files:

% chardetect somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0
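
A similar report can be produced from Python code. The sketch below is not the chardetect source; it reads a file in binary mode, passes the bytes to `chardet.detect`, and formats a line in the same style as the output above. The helper name `report_encoding` is hypothetical, and the demo writes a throwaway temporary file so the example is self-contained.

```python
import os
import tempfile

import chardet


def report_encoding(path):
    """Return a one-line report similar to the chardetect output."""
    # Read raw bytes; text mode would decode the file before detection.
    with open(path, "rb") as handle:
        raw = handle.read()
    result = chardet.detect(raw)
    return "{}: {} with confidence {}".format(
        path, result["encoding"], result["confidence"]
    )


# Demo: write a small ASCII file, report on it, then clean up.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"plain ASCII text\n")
try:
    line = report_encoding(tmp.name)
    print(line)
finally:
    os.remove(tmp.name)
```

This version reads each file fully into memory, which is reasonable for small files but may be wasteful for very large ones.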

About

This is a continuation of Mark Pilgrim’s excellent chardet. Previously, two versions had to be maintained: one that supported Python 2.x and one that supported Python 3.x. We’ve since merged with Ian Cordasco’s charade fork, so there is now one coherent version that works on Python 2.7 and 3.5+.

Maintainer: Dan Blanchard
