Universal encoding detector

Project description

Universal character encoding detector

Detects
  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)

  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)

  • EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)

  • EUC-KR, ISO-2022-KR (Korean)

  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)

  • ISO-8859-2, windows-1250 (Hungarian)

  • ISO-8859-5, windows-1251 (Bulgarian)

  • windows-1252 (English)

  • ISO-8859-7, windows-1253 (Greek)

  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)

  • TIS-620 (Thai)
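
The Unicode entries at the top of the list come in several byte-order variants, and the simplest way to tell them apart is the byte order mark (BOM) at the start of the data. The following is a minimal sketch of that idea using only the standard library. It is not chardet's own implementation, the function name `sniff_bom` and the returned labels are hypothetical, and it covers only the standard big-endian and little-endian BOMs rather than every variant listed above.

```python
import codecs

# Check longer BOMs first: the UTF-32 little-endian BOM starts with
# the same two bytes as the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data):
    """Return a label for the BOM at the start of `data`, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

Data without a BOM returns None here; telling apart the legacy encodings in the rest of the list requires statistical analysis of the bytes, which this sketch does not attempt.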

Requires Python 2.1 or later

Command-line Tool

chardet comes with a command-line script which reports on the encodings of one or more files:

% chardetect.py somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0
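
The same report can be produced from Python code. The sketch below assumes a Python 3 environment with a current chardet release installed, where `chardet.detect()` takes a bytes object and returns a dictionary with `encoding` and `confidence` keys. The helper names `format_report` and `describe_file` are hypothetical, and the output format simply mirrors the command-line example above.

```python
def format_report(name, result):
    """Format a detection result like the command-line tool's output."""
    return "%s: %s with confidence %s" % (
        name, result["encoding"], result["confidence"])


def describe_file(path):
    """Detect the encoding of the file at `path` and return a report line."""
    import chardet  # third-party; imported here so format_report works alone

    # Read raw bytes: detection operates on undecoded data.
    with open(path, "rb") as f:
        raw = f.read()
    return format_report(path, chardet.detect(raw))
```

Calling `describe_file("somefile")` would return a line of the form shown in the command-line output; the exact encoding and confidence depend on the file contents and the chardet version.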

Download files

Source Distribution

chardet-1.1.tar.gz (153.9 kB)

File metadata

  • Download URL: chardet-1.1.tar.gz
  • Size: 153.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for chardet-1.1.tar.gz
Algorithm     Hash digest
SHA256        2a9cc3bcba09a9e795efcf63ff1714980beb8dea1660f0931f675f52f4264e5c
MD5           15838de570d0703baf191dcf831cf0de
BLAKE2b-256   11da101ef38e05881445c1dec36dbd0573f9561e357a2da9e2409656e4677ffa
