MDCleaner
A utility to clean and convert MD files to ASCII.
Installation
You can install MDCleaner via pip:
pip install mdcleaner
Usage
After installation, you can use the package in your Python scripts:
from mdcleaner import clean_md
cleaned_content = clean_md("path_to_md_file.md")
print(cleaned_content)
Features
- Encoding Detection: The utility can automatically detect the file's encoding to ensure compatibility with various text files.
- ASCII Conversion: Converts non-ASCII characters to their closest ASCII representation using the unidecode library.
- Template Replacements: Provides an easy way to replace placeholders within the MD files with specific content.
- Graceful Error Handling: Warns about unmatched templates and leaves placeholders without a corresponding replacement as-is. Improperly formatted templates are also handled, with a clear warning.
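To give a feel for what ASCII conversion means in practice, here is a minimal sketch that uses only the standard library. It is not MDCleaner's code and it is not unidecode: the helper name to_ascii_approx is hypothetical, and the technique (Unicode NFKD normalization followed by dropping non-ASCII bytes) only strips accents, whereas unidecode also transliterates characters that have no such decomposition.

```python
import unicodedata


def to_ascii_approx(text: str) -> str:
    """Rough stdlib stand-in for ASCII conversion (hypothetical helper, not unidecode)."""
    # NFKD splits an accented character into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" drops the combining marks, along with any
    # character that has no ASCII decomposition at all.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_approx("Café déjà vu"))  # prints: Cafe deja vu
```

Characters without a decomposition, such as "ß", are simply dropped by this sketch, which is one reason a dedicated transliteration library is the better choice for real documents.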
Using templates in your Markdown file
Imagine you have an MD file named sample.md with the following content:
This is a test: {my_variable}
In your script, you can replace the {my_variable} placeholder with a specific value:
from mdcleaner import clean_md
replacements = {'my_variable': 'Hello, World!'}
cleaned_content = clean_md("sample.md", contexts=replacements)
print(cleaned_content) # This will print: "This is a test: Hello, World!"
When you pass a dictionary as the contexts parameter, any placeholders inside {} in your MD file are replaced by the corresponding values.
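The replacement behavior described in this section, including the warning for placeholders that have no matching entry, could be sketched roughly as follows. This is an illustration under assumptions, not MDCleaner's actual implementation: the function name fill_placeholders and the regular expression are hypothetical, and the sketch only recognizes placeholders made of letters, digits, and underscores.

```python
import re
import warnings

# Matches {name}, where name consists of letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_placeholders(text: str, contexts: dict) -> str:
    """Replace {name} placeholders using contexts (hypothetical helper)."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in contexts:
            return str(contexts[key])
        # Unmatched placeholder: warn and leave the original text untouched.
        warnings.warn(f"No replacement provided for placeholder {{{key}}}")
        return match.group(0)

    return PLACEHOLDER.sub(substitute, text)


print(fill_placeholders("This is a test: {my_variable}",
                        {"my_variable": "Hello, World!"}))
# prints: This is a test: Hello, World!
```

Returning the matched text unchanged for unknown keys is what keeps unmatched placeholders visible in the output instead of silently removing them.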
Encoding Detection
MDCleaner reads a certain number of bytes from the file to determine its encoding:
- By default, it reads the first 1024 bytes.
- You can specify a different number of bytes using the encoding_detection_bytes parameter.
- If you set encoding_detection_bytes to 'auto', the entire file will be read to determine its encoding.
- If no encoding is specified, the default option used is utf-8.
Example:
clean_md("sample.md", encoding_detection_bytes=500)
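To illustrate how the sample size affects detection, the sketch below reads either a fixed number of bytes or the whole file and then applies a deliberately simple guess. Everything here is an assumption for illustration: read_sample and guess_encoding are hypothetical names, the UTF-8-or-Latin-1 heuristic is a stand-in for a real detector, and MDCleaner's own detection logic may work differently.

```python
import codecs


def read_sample(path, encoding_detection_bytes=1024):
    """Return the bytes a detector would inspect (hypothetical helper)."""
    with open(path, "rb") as handle:
        if encoding_detection_bytes == "auto":
            return handle.read()  # inspect the entire file
        return handle.read(encoding_detection_bytes)


def guess_encoding(sample: bytes) -> str:
    """Very rough stand-in heuristic: UTF-8 if the sample decodes, else Latin-1."""
    # An incremental decoder with final=False tolerates a multi-byte
    # character that was cut off at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return "latin-1"
    return "utf-8"
```

A larger sample makes it more likely that the detector sees the non-ASCII bytes that distinguish one encoding from another, at the cost of reading more of the file.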
Manual Encoding
If you're certain about the encoding of your file, you can specify it directly using the encoding parameter, which bypasses the automatic detection process:
clean_md("sample.md", encoding="utf-8")
Contributing
If you find any bugs or want to propose a new feature, please open an issue or submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE.txt file for details.