MDCleaner

A utility to clean and convert MD files to ASCII.

Installation

You can install MDCleaner via pip:

pip install mdcleaner

Usage

After installation, you can use the package in your Python scripts:

from mdcleaner import clean_md

cleaned_content = clean_md("path_to_md_file.md")
print(cleaned_content)

Features

  • Encoding Detection: Automatically detects the file's encoding so that files saved in different encodings are read correctly.
  • ASCII Conversion: Converts non-ASCII characters to their closest ASCII representation using the unidecode library.
  • Template Replacements: Replaces placeholders in the MD file with content you supply.
  • Graceful Error Handling: Warns about placeholders that have no matching replacement and leaves them as-is. Improperly formatted templates also produce a clear warning.
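
To give a feel for what ASCII conversion involves, here is a minimal sketch that uses only the standard library. It is not how MDCleaner does it: the package relies on unidecode, while this stand-in uses Unicode NFKD normalization, and the function name to_ascii_approx is hypothetical.

```python
import unicodedata


def to_ascii_approx(text: str) -> str:
    """Approximate ASCII folding with the standard library (not unidecode)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" drops everything that is not plain ASCII,
    # including the combining marks produced above.
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


print(to_ascii_approx("Café déjà vu"))  # prints: Cafe deja vu
```

Characters that have no decomposition (for example "ß") are simply dropped by this approach, whereas a transliteration library such as unidecode maps them to an ASCII approximation.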

Using templates in your Markdown file

Imagine you have an MD file named sample.md with the following content:

This is a test: {my_variable}

In your script, you can replace the {my_variable} placeholder with a specific value:

from mdcleaner import clean_md

replacements = {'my_variable': 'Hello, World!'}
cleaned_content = clean_md("sample.md", contexts=replacements)
print(cleaned_content)  # Prints: This is a test: Hello, World!

When you pass a dictionary as the contexts parameter, each placeholder inside {} in your MD file is replaced by the value stored under the matching key. Placeholders with no matching key are left unchanged and trigger a warning.
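
The replacement behaviour can be illustrated with a short, self-contained sketch. This is not MDCleaner's implementation: the function fill_placeholders, the regular expression, and the warning text are assumptions made for illustration, and the sketch assumes placeholder names are simple identifiers.

```python
import re
import warnings

# Matches {name} where name is a simple identifier (an assumed syntax).
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_placeholders(text: str, contexts: dict) -> str:
    """Replace {name} with contexts[name]; keep unmatched placeholders as-is."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in contexts:
            return str(contexts[name])
        # No replacement available: warn and return the placeholder untouched.
        warnings.warn(f"No replacement provided for placeholder {{{name}}}")
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text)


result = fill_placeholders(
    "This is a test: {my_variable} and {unknown}",
    {"my_variable": "Hello, World!"},
)
print(result)  # prints: This is a test: Hello, World! and {unknown}
```

Returning the original match for unknown names keeps the output readable and makes missing replacements easy to spot, at the cost of a warning on stderr.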

Encoding Detection

MDCleaner reads a sample of bytes from the start of the file to determine its encoding:

  • By default, it reads the first 1024 bytes.
  • You can specify a different number of bytes using the encoding_detection_bytes parameter.
  • If you set encoding_detection_bytes to 'auto', the entire file will be read to determine its encoding.
  • If no encoding is specified, utf-8 is used as the default.

Example:

clean_md("sample.md", encoding_detection_bytes=500)
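
The idea of sampling bytes to pick an encoding can be sketched with the standard library alone. The README does not say which detection method MDCleaner uses, so this is only an illustration based on trial decoding: the function guess_encoding, its detection_bytes and candidates parameters, and the candidate list are all hypothetical.

```python
import codecs
import os
import tempfile


def guess_encoding(path, detection_bytes=1024, candidates=("utf-8", "latin-1")):
    """Return the first candidate encoding that decodes the sampled bytes."""
    read_all = detection_bytes == "auto"
    with open(path, "rb") as handle:
        # "auto" reads the whole file; otherwise read only the first N bytes.
        sample = handle.read() if read_all else handle.read(detection_bytes)
    for name in candidates:
        decoder = codecs.getincrementaldecoder(name)()
        try:
            # final=False tolerates a multi-byte character cut off by the sample boundary.
            decoder.decode(sample, final=read_all)
            return name
        except UnicodeDecodeError:
            continue
    return "utf-8"  # fallback when no candidate decodes the sample


# Demo: write the same text in two encodings, then guess each one.
with tempfile.TemporaryDirectory() as tmp:
    latin_path = os.path.join(tmp, "latin.md")
    utf8_path = os.path.join(tmp, "utf8.md")
    with open(latin_path, "wb") as f:
        f.write("café au lait".encode("latin-1"))
    with open(utf8_path, "wb") as f:
        f.write("café au lait".encode("utf-8"))
    print(guess_encoding(latin_path))  # prints: latin-1
    print(guess_encoding(utf8_path))   # prints: utf-8
```

A larger sample makes a wrong guess less likely, because a non-ASCII byte that appears late in the file can only influence the result if it falls inside the sample. Dedicated detection libraries use statistical models rather than trial decoding, so their results can differ from this sketch.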

Manual Encoding

If you're certain about the encoding of your file, you can specify it directly using the encoding parameter, which bypasses the automatic detection process:

clean_md("sample.md", encoding="utf-8")

Contributing

If you find any bugs or want to propose a new feature, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE.txt file for details.
