MDCleaner

A utility to clean and convert MD files to ASCII.

Installation

You can install MDCleaner via pip:

pip install mdcleaner

Usage

After installation, you can use the package in your Python script:

from mdcleaner import clean_md

cleaned_content = clean_md("path_to_md_file.md")
print(cleaned_content)

Features

  • Automatically detects file encoding.
  • Converts non-ASCII characters to their closest ASCII representation.
  • Provides warnings for unmatched templates, ensuring placeholders without corresponding variables are retained as-is.
  • Handles improperly formatted templates, such as an unmatched curly brace {, by logging a clear warning and returning the content as-is.
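To illustrate what "closest ASCII representation" can mean, here is a minimal sketch using only the standard library. The helper name to_ascii is hypothetical, and this is not necessarily how MDCleaner performs the conversion internally; it only shows one common approach (Unicode NFKD decomposition followed by dropping anything that has no ASCII form).

```python
import unicodedata


def to_ascii(text):
    """Hypothetical helper: approximate non-ASCII characters with ASCII ones."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # errors="ignore" drops every character with no ASCII form, which
    # includes the combining marks produced by the decomposition above.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Crème brûlée at the café"))  # prints: Creme brulee at the cafe
```

Characters with no decomposition, such as curly quotation marks, are simply dropped by this sketch, so a dedicated transliteration table would be needed to map them to ASCII equivalents.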

Using templates in your Markdown file

Imagine you have an MD file named sample.md with the following content:

This is a test: {my_variable}

In your script, you can replace the {my_variable} placeholder with the value of a variable defined in your script:

from mdcleaner import clean_md

# Read "sample.md" and fill in its {my_variable} placeholder
replacements = {'my_variable': 'Hello, World!'}
cleaned_content = clean_md("sample.md", contexts=replacements)
print(cleaned_content)  # This will print: "This is a test: Hello, World!"

When you pass a dictionary through the contexts option, any placeholder inside {} in your MD file is replaced by the value stored under the matching key. If a placeholder has no matching key, a warning is logged and the placeholder is retained in the output.
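The replacement behavior described above can be sketched as follows. This is an illustration under stated assumptions, not MDCleaner's actual implementation: the function name fill_placeholders and the regular expression are hypothetical, and placeholder names are assumed to look like Python identifiers.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Assumption: a placeholder is {name}, where name looks like an identifier.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def fill_placeholders(text, contexts):
    """Hypothetical helper: replace {name} placeholders, keeping unknown ones."""

    def substitute(match):
        name = match.group(1)
        if name in contexts:
            return str(contexts[name])
        # No value supplied: warn and leave the placeholder untouched.
        logger.warning("No value supplied for placeholder {%s}", name)
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text)


print(fill_placeholders("This is a test: {my_variable}", {"my_variable": "Hello, World!"}))
print(fill_placeholders("Hi {user_name}, role: {role}", {"user_name": "Devon"}))
```

The first call prints the filled-in sentence. The second keeps {role} in the output and logs a warning, because no value for role was supplied. A stray { with no closing brace never matches the pattern, so this sketch passes it through unchanged without a warning.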

Encoding Detection Bytes Param

The encoding_detection_bytes parameter sets how many bytes are read from the MD file before its encoding is determined.

Example

from mdcleaner import clean_md

global_test = "Global Test here"


def greet():
    new_test = "Local Test here"
    context = {'new_test': new_test, 'global_test': global_test}
    print(clean_md(file_path='test.md', contexts=context, encoding_detection_bytes=500))


greet()

In the above example, clean_md reads the first 500 bytes of the file before deciding on its encoding. This is helpful for large files, since only a sample has to be read rather than the whole file. The default for encoding_detection_bytes is 1024.

Additionally, you can pass the string 'auto' as encoding_detection_bytes, which allows it to read the entire file content before making a decision.

from mdcleaner import clean_md

global_test = "Global Test here"

def greet():
    new_test = "Local Test here"
    context = {'new_test': new_test, 'global_test': global_test}
    # 'auto' reads the whole file before deciding on the encoding
    print(clean_md(file_path='test.md', contexts=context, encoding_detection_bytes='auto'))

greet()
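To make the difference between a byte count and 'auto' concrete, here is a standard-library sketch of sample-based detection. It is not MDCleaner's detector: the helper guess_encoding, the byte-order-mark check, and the latin-1 fallback are all assumptions chosen to keep the example self-contained.

```python
import codecs
import os
import tempfile

# Byte-order marks, checked first. UTF-32 comes before UTF-16 because the
# UTF-32 LE mark starts with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(file_path, sample_bytes=1024):
    """Hypothetical helper: guess an encoding from a sample of the file."""
    whole_file = sample_bytes == "auto"
    with open(file_path, "rb") as handle:
        sample = handle.read() if whole_file else handle.read(sample_bytes)

    for bom, name in BOMS:
        if sample.startswith(bom):
            return name

    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # With final=False, a multi-byte character cut off at the end of a
        # partial sample is not treated as an error.
        decoder.decode(sample, final=whole_file)
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"  # crude fallback; a real detector would score candidates


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.md")
    with open(path, "wb") as handle:
        handle.write("Café menu\n".encode("utf-8") * 200)
    print(guess_encoding(path, sample_bytes=500))
    print(guess_encoding(path, sample_bytes="auto"))
```

The trade-off is the one described above: a small sample is cheap but can miss unusual bytes that only appear later in the file, while 'auto' sees everything at the cost of reading the full content.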

Manual Encoding

If you know the file's encoding beforehand, you can specify it with the manual_encoding parameter. This allows the script to bypass encoding detection when reading the MD file. If the manual_encoding value is invalid, the error is caught and the file is read again using encoding detection.

Assuming the Markdown file is encoded as utf-8, pass that value to manual_encoding like this:

from mdcleaner import clean_md

global_test = "Global Test here"

def greet():
    new_test = "Local Test here"
    context = {'new_test': new_test, 'global_test': global_test}
    # manual_encoding bypasses detection unless the value turns out to be invalid
    print(clean_md(file_path='test.md', contexts=context, encoding_detection_bytes='auto', manual_encoding='utf-8'))

greet()
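The "try the manual encoding, then fall back" behavior can be sketched with plain file I/O. This is an illustration only, not MDCleaner's code: read_markdown and its detected_encoding argument are hypothetical, with detected_encoding standing in for the result of whatever detection step would normally run.

```python
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def read_markdown(file_path, manual_encoding=None, detected_encoding="utf-8"):
    """Hypothetical helper: try manual_encoding first, then fall back."""
    if manual_encoding is not None:
        try:
            with open(file_path, encoding=manual_encoding) as handle:
                return handle.read()
        except (LookupError, UnicodeDecodeError) as error:
            # LookupError: unknown codec name. UnicodeDecodeError: the codec
            # exists but does not match the bytes in the file.
            logger.warning("manual_encoding=%r failed (%s); falling back", manual_encoding, error)

    # Stand-in for the detection step: detected_encoding is assumed to be
    # supplied by whichever detector is in use.
    with open(file_path, encoding=detected_encoding, errors="replace") as handle:
        return handle.read()


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.md")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Café menu")
    print(read_markdown(path, manual_encoding="utf-8"))
    print(read_markdown(path, manual_encoding="not-a-codec"))
```

Both calls print the same text. The second logs a warning first, because the codec name is unknown and the fallback path is used instead.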

Contributing

If you find any bugs or want to propose a new feature, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE.txt file for details.
