Classes for charset detection. Uses chardet and the Mozilla Universal Charset Detector.
What is this package
The charset package detects character encodings using the Universal Charset Detector implemented by Mozilla. If the text cannot be converted with the encoding reported by the Universal Charset Detector, the package falls back to the chardet package.
Installation
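The fallback strategy described above can be sketched roughly as follows. This is not the package's actual implementation: `decode_with_fallback` and `chardet_detector` are hypothetical names introduced for illustration, and the detectors are passed in as plain callables. The only third-party call used is `chardet.detect`, which is imported lazily so the sketch loads even if chardet is not installed.

```python
from typing import Callable, Iterable, Optional

# A detector takes raw bytes and returns an encoding name, or None if unsure.
DetectorFn = Callable[[bytes], Optional[str]]


def chardet_detector(data: bytes) -> Optional[str]:
    """Fallback detector backed by chardet (imported only when called)."""
    import chardet
    return chardet.detect(data)["encoding"]


def decode_with_fallback(data: bytes, detectors: Iterable[DetectorFn]) -> str:
    """Try each detector in order; return the first successful decoding.

    Hypothetical helper: it only illustrates the "detect, try to convert,
    otherwise fall back" idea from the description.
    """
    for detect in detectors:
        encoding = detect(data)
        if not encoding:
            continue  # this detector had no guess
        try:
            return data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown codec: try the next detector
    raise ValueError("no detector produced a usable encoding")
```

A primary detector would be listed first and `chardet_detector` last, so chardet is only consulted when the first guess fails to decode the input.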
You can install using pip:
$ pip install charset
Example
In [1]: from charset import Detector, text_to_unicode, text_to_utf8
In [2]: det = Detector()
In [3]: input_text = open('input.txt').read()
In [4]: text1 = text_to_unicode(input_text)
In [5]: text2 = text_to_utf8(input_text)
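For readers unfamiliar with what converting text to UTF-8 involves once an encoding is known, here is a minimal standard-library sketch. It is not taken from the charset package: `to_utf8` is a hypothetical helper, and the source encoding is assumed to be already known rather than detected.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8.

    Hypothetical helper for illustration; a detector would normally
    supply source_encoding.
    """
    # Decode to a Unicode string first, then encode that string as UTF-8.
    return data.decode(source_encoding).encode("utf-8")
```

On Python 3, input for such a helper would be read in binary mode (for example `open('input.txt', 'rb').read()`) so that the raw bytes are preserved for detection.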
Changelog
Version 1.0.1 (2013-11-20)
Modified setup.py.
Added a README.txt.
Added a MANIFEST.in to include data files missing in version 1.0.
Removed dependencies on Cython and setuptools-cython.
Added a dependency on chardet.
Version 1.0 (2013-04-21)
Initial version.
Support for character encoding detection.
Source Distribution
charset-1.0.1.tar.gz (189.7 kB)
File details
Details for the file charset-1.0.1.tar.gz.
File metadata
- Download URL: charset-1.0.1.tar.gz
- Upload date:
- Size: 189.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 320d58f82fa64e95700a04d726b4a5593985270d303f680807fd2faa89327503
MD5 | e2e405b27c41e4ba1f25afd47b2dca12
BLAKE2b-256 | a6a42ba69b918bc2e02f598a9a88fb7314bab3e8e74043872d8d792869e006ec