Read files in all available codecs in your environment, so that you can pick the one that fits best!

Project description

Have you ever seen this?

UnicodeEncodeError: 'XXXXX' codec can't encode character 'XXXXX' in position 15: ordinal ...

Probably more than once, right? :) After spending too much time hunting for the right codec for a file, I wrote BruteCodecChecker. BruteCodecChecker (MIT license) tries to open a file with every codec available in your environment and prints the result for each one. It also works on bytes objects.

If you work, like me, with a lot of text files, it will save you a lot of time.
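
To illustrate the general idea, here is a minimal sketch that uses only the Python standard library. It is not BruteCodecChecker's actual implementation. The helper names `list_codec_names` and `try_decode_all` are made up for this example, and listing the modules of the `encodings` package is only an approximation of "all codecs available in your environment".

```python
import encodings
import pkgutil


def list_codec_names():
    """Return the module names in the stdlib `encodings` package (approximate codec list)."""
    names = {module.name for module in pkgutil.iter_modules(encodings.__path__)}
    names.discard("aliases")  # alias table, not a codec
    return sorted(names)


def try_decode_all(data, errors="strict"):
    """Decode `data` with every candidate codec and keep the ones that succeed."""
    results = {}
    for name in list_codec_names():
        try:
            results[name] = data.decode(name, errors)
        except Exception:
            # Not a text encoding, not available on this platform,
            # or the bytes are invalid for this codec: skip it.
            continue
    return results


if __name__ == "__main__":
    sample = "Hi there!".encode("utf-8-sig")  # UTF-8 with a leading BOM
    for codec, text in try_decode_all(sample).items():
        print(f"{codec:20} {len(text):3} {text!r}")
```

Codecs that fail are simply skipped here, so only the candidates that produced some text remain for visual comparison.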

Install it:

pip install BruteCodecChecker

Try it:

from BruteCodecChecker import CodecChecker

teststuff = b"""This is a test! 

Hi there!

A little test! """

testfilename = "test_utf8.tmp"

# Write the sample text as UTF-8 with a byte order mark (BOM)
with open(testfilename, mode="w", encoding="utf-8-sig") as f:
    f.write(teststuff.decode("utf-8-sig"))

codechecker = CodecChecker()

# Try every codec on the first two lines of the file
codechecker.try_open_file(testfilename, readlines=2).print_results(
    pause_after_interval=1, items_per_interval=10
)

# Try every codec on the whole file
codechecker.try_open_file(testfilename).print_results()

# Try every codec on a bytes object instead of a file
codechecker.try_convert_bytes(teststuff.decode("cp850").encode()).print_results(
    pause_after_interval=1, items_per_interval=10
)

Output


Codec               : palmos                                                       

Mode                : strict

Length              : 32

Converted           : 

Line: 0              This is a test! 

Line: 1                  Hi there!

Codec               : ptcp154                                                      

Mode                : strict

Length              : 32

Converted           : 

Line: 0              п»ҝThis is a test! 

Line: 1                  Hi there!

Codec               : punycode                                                     

Mode                : strict

Codec               : quopri_codec                                                 

Mode                : strict

Codec               : raw_unicode_escape                                           

Mode                : strict

Length              : 32

Converted           : 

Line: 0              This is a test! 

Line: 1                  Hi there!

Codec               : rot_13                                                       

Mode                : strict

Codec               : shift_jis                                                    

Mode                : strict

Codec               : shift_jisx0213                                               

Mode                : strict

Length              : 31

Converted           : 

Line: 0              鬠ソThis is a test! 

Line: 1                  Hi there!

Codec               : shift_jis_2004                                               

Mode                : strict

Length              : 31

Converted           : 

Line: 0              鬠ソThis is a test! 

Line: 1                  Hi there!

Codec               : tis_620                                                      

Mode                : strict

Length              : 32

Converted           : 

Line: 0              ๏ปฟThis is a test! 

Line: 1                  Hi there!

Codec               : undefined                                                    

Mode                : strict

Codec               : unicode_escape                                               

Mode                : strict

Length              : 32

Converted           : 

Line: 0              This is a test! 

Line: 1                  Hi there!

Codec               : utf_16                                                       

Mode                : strict

Codec               : utf_16_be                                                    

Mode                : strict

Codec               : utf_16_le                                                    

Mode                : strict

Codec               : utf_32                                                       

Mode                : strict

Codec               : utf_32_be                                                    

Mode                : strict

Codec               : utf_32_le                                                    

Mode                : strict

Codec               : utf_7                                                        

Mode                : strict

Codec               : utf_8                                                        

Mode                : strict

Length              : 30

Converted           : 

Line: 0              This is a test! 

Line: 1                  Hi there!

Codec               : utf_8_sig                                                    

Mode                : strict

Length              : 29

Converted           : 

Line: 0              This is a test! 

Line: 1                  Hi there!
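
The Length values above differ between codecs. One explanation that is consistent with these numbers is the way each codec treats the three-byte UTF-8 byte order mark (BOM) that `utf-8-sig` writes at the start of the file. The following sketch shows that effect with the standard library only. It does not use BruteCodecChecker, and the sample bytes are a shortened, hypothetical stand-in for the test file.

```python
import codecs

# Hypothetical sample: a UTF-8 BOM followed by plain ASCII text
raw = codecs.BOM_UTF8 + b"Hi there!"

for codec in ("cp1252", "utf_8", "utf_8_sig"):
    text = raw.decode(codec)
    print(f"{codec:10} length={len(text):2} text={text!r}")
```

A single-byte codec such as cp1252 turns the BOM into three visible junk characters, utf_8 keeps it as one invisible character (U+FEFF), and utf_8_sig strips it. That gives lengths of 12, 10 and 9 for this sample, the same 3/1/0 offsets as the 32/30/29 in the output above.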

Download files

Source Distribution

BruteCodecChecker-0.21.tar.gz (6.6 kB)

Uploaded Source

Built Distribution

BruteCodecChecker-0.21-py3-none-any.whl (7.6 kB)

Uploaded Python 3

File details

Details for the file BruteCodecChecker-0.21.tar.gz.

File metadata

  • Download URL: BruteCodecChecker-0.21.tar.gz
  • Upload date:
  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for BruteCodecChecker-0.21.tar.gz
Algorithm Hash digest
SHA256 3023af65fbb433d525bcd4e5c93a87f34dafd2ffb035c7ebf9346e766baed2bc
MD5 0a004f5cf84a3d65ecf7605e4999ac39
BLAKE2b-256 14e198fe9305f40b2982b8905d6c0cb9165ffe12d55e7dfbeedfa1ccb906d6e9

File details

Details for the file BruteCodecChecker-0.21-py3-none-any.whl.

File metadata

File hashes

Hashes for BruteCodecChecker-0.21-py3-none-any.whl
Algorithm Hash digest
SHA256 7e25711794f0780c664d53d6eb619d7124d5a4ab26e4c18d999f4bbf13a447e4
MD5 cce05643bcc9ab68a7bd18d9a242ee7a
BLAKE2b-256 ddda67bc2c7b822ec783af442dde623c6b0daae1c0099b13bba4c8167b313e82
