A configurable XML and HTML formatter.
Markuplift provides flexible, configurable formatting of XML and HTML documents. Unlike basic pretty-printers, Markuplift gives you complete control over how your markup is formatted through user-defined predicates for block vs inline elements, whitespace handling, and custom text content formatters.
Key Features
- Configurable element classification - Define block/inline elements using XPath expressions or Python predicates
- Flexible whitespace control - Normalize, preserve, or strip whitespace on a per-element basis
- External formatter integration - Pipe element text content through external tools (e.g., js-beautify, prettier)
- Comprehensive format options - Control indentation, attribute wrapping, self-closing tags, and more
- CLI and Python API - Use from command line or integrate into your Python applications
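To make "element classification" concrete, here is a minimal sketch of the predicate idea using only the standard library. It is not Markuplift's implementation: the names `matches_any`, `Predicate`, and `PredicateFactory` are hypothetical. The factory shape is an assumption, inferred only from parameter names such as `block_predicate_factory`. The sketch uses ElementTree's limited path syntax rather than full XPath.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Hypothetical type aliases for illustration; not Markuplift's own types.
Predicate = Callable[[ET.Element], bool]
PredicateFactory = Callable[[ET.Element], Predicate]


def matches_any(*paths: str) -> PredicateFactory:
    """Return a factory that, for a given document root, builds a predicate
    which is true for elements matched by any of the given paths."""

    def factory(root: ET.Element) -> Predicate:
        # Evaluate each path once per document and remember matches by identity.
        matched = {id(el) for path in paths for el in root.findall(path)}
        return lambda el: id(el) in matched

    return factory


root = ET.fromstring("<body><div><p>Hello <em>world</em></p></div></body>")
is_block = matches_any(".//div", ".//p")(root)
is_inline = matches_any(".//em")(root)

# Classify every element in document order.
block_tags = [el.tag for el in root.iter() if is_block(el)]
inline_tags = [el.tag for el in root.iter() if is_inline(el)]
print(block_tags, inline_tags)
```

Binding the predicate to a specific document lets the path expressions be evaluated once, after which each per-element test is a cheap set lookup.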
Quick Start
Installation
Install from PyPI using pip:
pip install markuplift
Or using uv (recommended for modern Python development):
uv add markuplift
For development installation with all dependencies:
git clone https://github.com/rob-smallshire/markuplift.git
cd markuplift
uv sync --all-extras
CLI Usage
# Basic formatting
markuplift format input.xml
# Format with custom block elements
markuplift format input.html --block "//div | //section | //article"
# Use external JavaScript formatter for script tags
markuplift format input.html --text-formatter "//script[@type='text/javascript']" "js-beautify"
# Format from stdin, writing the result to a file
cat messy.xml | markuplift format --output formatted.xml
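The `--text-formatter` example above hands an element's text to an external command. Conceptually this resembles the standard-library sketch below, which sends text to a command's stdin and reads the result from stdout. How Markuplift invokes external tools internally is not described here, so treat `pipe_through` as a hypothetical helper. The stand-in command is a one-line Python script, not a real formatter such as js-beautify.

```python
import subprocess
import sys


def pipe_through(command: list[str], text: str) -> str:
    """Send text to the command's stdin and return what it writes to stdout."""
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Stand-in for a real external tool: upper-cases whatever it reads.
stand_in = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
shouted = pipe_through(stand_in, "var x = 1;")
print(shouted)
```

Using `check=True` means a failing tool raises an exception rather than silently producing empty or partial output.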
Python API
from markuplift import Formatter
from markuplift.predicates import html_block_elements, html_inline_elements
# Create formatter with HTML-aware defaults
formatter = Formatter(
    block_predicate_factory=html_block_elements(),
    inline_predicate_factory=html_inline_elements(),
    indent_size=2
)
# Format complex nested HTML (minified input)
messy_html = (
    '<ul><li>Getting Started<ul><li>Installation via <code>pip install markuplift</code>'
    '</li><li>Basic <em>configuration</em> and setup</li></ul></li><li>Advanced Features'
    '<ul><li>Custom <strong>predicates</strong> and XPath</li><li>External formatter <co'
    'de>integration</code></li></ul></li></ul>'
)
formatted = formatter.format_str(messy_html)
print(formatted)
Output:
<ul>
  <li>Getting Started
    <ul>
      <li>Installation via <code>pip install markuplift</code></li>
      <li>Basic <em>configuration</em> and setup</li>
    </ul>
  </li>
  <li>Advanced Features
    <ul>
      <li>Custom <strong>predicates</strong> and XPath</li>
      <li>External formatter <code>integration</code></li>
    </ul>
  </li>
</ul>
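One way to sanity-check output like the above is to confirm that input and output differ only in whitespace. The following is a coarse sketch using the standard library, not part of Markuplift. The helper names `squash` and `shape` are hypothetical. `ET.fromstring` accepts only well-formed XML, so it will reject HTML that relies on unclosed void elements.

```python
import xml.etree.ElementTree as ET


def squash(text):
    """Collapse runs of whitespace and trim the ends; None becomes ''."""
    return " ".join((text or "").split())


def shape(el):
    """Whitespace-insensitive summary of an element and its descendants."""
    return (
        el.tag,
        sorted(el.attrib.items()),
        squash(el.text),
        [shape(child) for child in el],
        squash(el.tail),
    )


minified = "<ul><li>Basic <em>configuration</em> and setup</li><li>Advanced</li></ul>"
indented = (
    "<ul>\n"
    "  <li>Basic <em>configuration</em> and setup</li>\n"
    "  <li>Advanced</li>\n"
    "</ul>"
)

same_content = shape(ET.fromstring(minified)) == shape(ET.fromstring(indented))
print(same_content)
```

Because `squash` trims leading and trailing whitespace, this check would not notice a space being added or removed next to an inline element, so it is a smoke test rather than proof of equivalence.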
Real-World Example
Here's Markuplift formatting a complex article structure with mixed content:
from markuplift import Formatter
from markuplift.predicates import html_block_elements, html_inline_elements
# Real-world messy HTML (imagine this came from a CMS or generator)
messy_html = (
    '<article><h1>Using Markuplift</h1><section><h2>Introduction</h2><p>Markuplift '
    'is a <em>powerful</em> formatter for <strong>XML and HTML</strong>.</p><p>Key '
    'features include:</p><ul><li>Configurable <code>block</code> and <code>inline<'
    '/code> elements</li><li>XPath-based element selection</li><li>Custom text form'
    'atters for <pre><code>code blocks</code></pre></li></ul></section></article>'
)
formatter = Formatter(
    block_predicate_factory=html_block_elements(),
    inline_predicate_factory=html_inline_elements(),
    indent_size=2
)
formatted = formatter.format_str(messy_html)
print(formatted)
Output:
<article>
  <h1>Using Markuplift</h1>
  <section>
    <h2>Introduction</h2>
    <p>Markuplift is a <em>powerful</em> formatter for <strong>XML and HTML</strong>.</p>
    <p>Key features include:</p>
    <ul>
      <li>Configurable <code>block</code> and <code>inline</code> elements</li>
      <li>XPath-based element selection</li>
      <li>Custom text formatters for
        <pre><code>code blocks</code></pre>
      </li>
    </ul>
  </section>
</article>
Advanced Example
Complex HTML form with custom formatting rules:
from markuplift import Formatter
from markuplift.predicates import html_block_elements, html_inline_elements
# HTML form structure (typical from form builders)
messy_form = (
    '<form><fieldset><legend>User Information</legend><div><label>Name: <input type="text" '
    'name="name" required="required"/></label></div><div><label>Email: <input type="email" '
    'name="email"/></label></div><div><label><input type="checkbox" name="subscribe"/> Subs'
    'cribe to <em>newsletter</em></label></div></fieldset><button type="submit">Submit <str'
    'ong>Form</strong></button></form>'
)
formatter = Formatter(
    block_predicate_factory=html_block_elements(),
    inline_predicate_factory=html_inline_elements(),
    indent_size=2
)
formatted = formatter.format_str(messy_form)
print(formatted)
Output:
<form>
  <fieldset>
    <legend>User Information</legend>
    <div>
      <label>Name: <input type="text" name="name" required="required" /></label>
    </div>
    <div>
      <label>Email: <input type="email" name="email" /></label>
    </div>
    <div>
      <label><input type="checkbox" name="subscribe" /> Subscribe to <em>newsletter</em></label>
    </div>
  </fieldset>
  <button type="submit">Submit <strong>Form</strong></button>
</form>
Documentation
- API Documentation - Comprehensive API reference
- User Guide - Detailed usage examples and tutorials
- Predicate Reference - Built-in predicates and custom predicate creation
- CLI Reference - Complete command-line interface documentation
Use Cases
Markuplift is well suited to:
- Web development - Format HTML templates and components with consistent styling
- Data processing - Clean up XML data feeds and configuration files
- Documentation - Standardize markup in documentation systems
- Code generation - Format dynamically generated XML/HTML with precise control
- CI/CD pipelines - Ensure consistent markup formatting across your codebase
- Diffing and version control - Improve readability of markup changes in version control systems
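For the CI/CD use case, a check might look roughly like the sketch below: walk a directory, run each markup file through a formatting function, and report the files whose content would change. `find_unformatted` is a hypothetical helper, not part of Markuplift. The demo uses `str.strip` as a stand-in formatter so that it runs without Markuplift installed.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def find_unformatted(
    root_dir: Path,
    format_str: Callable[[str], str],
    suffixes: Iterable[str] = (".xml", ".html"),
) -> list[Path]:
    """Return files under root_dir whose content changes when formatted."""
    wanted = set(suffixes)
    stale = []
    for path in sorted(root_dir.rglob("*")):
        if path.is_file() and path.suffix in wanted:
            source = path.read_text(encoding="utf-8")
            if format_str(source) != source:
                stale.append(path)
    return stale


# Demo with a stand-in formatter (str.strip) in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ok.xml").write_text("<a/>", encoding="utf-8")
    (root / "bad.xml").write_text("  <a/>  ", encoding="utf-8")
    (root / "notes.txt").write_text("  ignored  ", encoding="utf-8")
    stale_names = [p.name for p in find_unformatted(root, str.strip)]

print(stale_names)
```

In a real pipeline one would presumably pass `formatter.format_str` from the Python API section in place of `str.strip`, and exit with a non-zero status when the returned list is not empty.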
License
Markuplift is released under the MIT License.
Contributing
Contributions are welcome! Please see our Contributing Guide for details on:
- Setting up the development environment
- Running tests and linting
- Submitting pull requests
- Reporting issues
File details
Details for the file markuplift-2.1.1-py3-none-any.whl.
File metadata
- Download URL: markuplift-2.1.1-py3-none-any.whl
- Upload date:
- Size: 25.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.22
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f54c2e4b8224fbdd92887be06610e0e5bcf5743cf0a16fb514adffa80d4b6d0 |
| MD5 | 144b2a80de18b8a0f328bd4f82f4540b |
| BLAKE2b-256 | 2c8e6b91847be29db13f9aaff73d028f764dcf7d4b1c8f686b4ab80d0a74adc6 |