Markuplift

A configurable XML and HTML formatter for Python

Markuplift provides configurable formatting of XML and HTML documents. Unlike basic pretty-printers, which apply one fixed layout to every element, Markuplift lets you decide how markup is formatted: user-defined predicates classify elements as block or inline, whitespace handling is chosen per element, and custom formatters can rewrite text content.

Key Features

  • Configurable element classification - Define block/inline elements using XPath expressions or Python predicates
  • Flexible whitespace control - Normalize, preserve, or strip whitespace on a per-element basis
  • External formatter integration - Pipe element text content through external tools (e.g., js-beautify, prettier)
  • Comprehensive format options - Control indentation, attribute wrapping, self-closing tags, and more
  • CLI and Python API - Use from command line or integrate into your Python applications

Quick Start

Installation

Install from PyPI using pip:

pip install markuplift

Or using uv (recommended for modern Python development):

uv add markuplift

For development installation with all dependencies:

git clone https://github.com/rob-smallshire/markuplift.git
cd markuplift
uv sync --all-extras
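After installing, one way to confirm that the package is visible to your interpreter is to query its distribution metadata. This is a small standard-library sketch; the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if markuplift is installed in this environment, else None.
print(installed_version("markuplift"))
```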

CLI Usage

# Basic formatting
markuplift format input.xml

# Format with custom block elements
markuplift format input.html --block "//div | //section | //article"

# Use external JavaScript formatter for script tags
markuplift format input.html --text-formatter "//script[@type='text/javascript']" "js-beautify"

# Format from stdin, writing the result to formatted.xml
cat messy.xml | markuplift format --output formatted.xml
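If you drive the CLI from a Python script, passing the arguments as a list avoids shell quoting problems with XPath expressions that contain `|` or quotes. The sketch below only uses the `format` subcommand and `--block` option shown above. The helper names are hypothetical, and it assumes that the formatted document is written to standard output when `--output` is not given.

```python
import shutil
import subprocess
from typing import List, Optional


def build_format_command(path: str, block_xpath: Optional[str] = None) -> List[str]:
    """Build the argument list for `markuplift format`, mirroring the commands above."""
    command = ["markuplift", "format", path]
    if block_xpath is not None:
        command += ["--block", block_xpath]
    return command


def format_file(path: str, block_xpath: Optional[str] = None) -> str:
    """Run the CLI and return its standard output (assumed to be the formatted text)."""
    if shutil.which("markuplift") is None:
        raise FileNotFoundError("markuplift executable not found on PATH")
    completed = subprocess.run(
        build_format_command(path, block_xpath),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout


print(build_format_command("input.html", "//div | //section | //article"))
```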

Python API

from markuplift import Formatter
from markuplift.predicates import html_block_elements, tag_in

# Create formatter with HTML-aware defaults
formatter = Formatter(
    block_predicate_factory=html_block_elements(),
    inline_predicate_factory=tag_in("em", "strong", "code", "a"),
    indent_size=2
)

# Format HTML string
messy_html = "<div><p>Hello <em>world</em>!</p><p>Another paragraph.</p></div>"
formatted = formatter.format_str(messy_html)
print(formatted)

Output:

<div>
  <p>Hello <em>world</em>!</p>
  <p>Another paragraph.</p>
</div>

Advanced Example

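from markuplift import Formatter
from markuplift.predicates import matches_xpath, html_block_elements

# Assumption: js_beautify is supplied by you, not by Markuplift. Here it is
# taken from the separate jsbeautifier package (pip install jsbeautifier).
from jsbeautifier import beautify as js_beautify

# Custom formatter with XPath-based rules
formatter = Formatter(
    block_predicate_factory=html_block_elements(),
    inline_predicate_factory=matches_xpath("//code | //kbd | //var"),
    normalize_whitespace_predicate_factory=matches_xpath("//p | //div"),
    preserve_whitespace_predicate_factory=matches_xpath("//pre | //script"),
    text_content_formatters={
        # Text of matching <script> elements is passed through js_beautify
        matches_xpath("//script[@type='text/javascript']"): lambda text, fmt, level: js_beautify(text),
    }
)

# your_html is the HTML string you want to format
result = formatter.format_str(your_html)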
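The lambda in the example above takes three arguments, so a text content formatter can also be an ordinary named function. The sketch below is one possible pure-Python formatter that dedents a block of text and re-indents it one level deeper than its element. It is written under assumptions: the second and third arguments are taken to be the formatter instance and the element's nesting level, and the two-space indent and the leading and trailing newlines in the return value are illustrative choices, not documented Markuplift behaviour.

```python
import textwrap


def reindent_text(text, formatter, level, indent="  "):
    """Dedent text, then indent it one level deeper than the enclosing element.

    `formatter` is unused here; it is accepted only to match the three-argument
    call shape used by the lambda in the example above.
    """
    body = textwrap.dedent(text).strip("\n")
    inner = textwrap.indent(body, indent * (level + 1))
    # Place the closing tag on its own line at the element's own indent level.
    return "\n" + inner + "\n" + indent * level


script_text = "\n      let x = 1;\n      if (x) {\n        run();\n      }\n    "
print(reindent_text(script_text, None, 1))
```

A function like this could be used as a value in the `text_content_formatters` mapping in place of the lambda.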

Documentation

Use Cases

Markuplift is well suited to:

  • Web development - Format HTML templates and components with consistent styling
  • Data processing - Clean up XML data feeds and configuration files
  • Documentation - Standardize markup in documentation systems
  • Code generation - Format dynamically generated XML/HTML with precise control
  • CI/CD pipelines - Ensure consistent markup formatting across your codebase
  • Diffing and version control - Improve readability of markup changes in version control systems
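For the CI/CD and diffing use cases, a check can compare each file with its formatted form and fail when they differ. The sketch below is independent of any particular formatter: `formatting_diff` is a hypothetical helper that accepts any function from string to string, for which `formatter.format_str` from the Python API section would be a natural candidate. The stub formatter exists only to make the example self-contained.

```python
import difflib
from typing import Callable, List


def formatting_diff(
    source: str,
    format_fn: Callable[[str], str],
    name: str = "document",
) -> List[str]:
    """Return a unified diff from source to format_fn(source); empty if unchanged."""
    formatted = format_fn(source)
    return list(
        difflib.unified_diff(
            source.splitlines(),
            formatted.splitlines(),
            fromfile=f"{name} (current)",
            tofile=f"{name} (formatted)",
            lineterm="",
        )
    )


def stub_formatter(text: str) -> str:
    """Stand-in for a real formatter so that this example runs on its own."""
    return text.replace("<a><b/></a>", "<a>\n  <b/>\n</a>")


diff = formatting_diff("<a><b/></a>", stub_formatter)
print("\n".join(diff))
```

In a pipeline, a non-empty diff would typically be printed and turned into a non-zero exit status so that unformatted markup blocks the merge.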

Requirements

  • Python 3.12+
  • Dependencies: lxml, click

License

Markuplift is released under the MIT License.

Contributing

Contributions are welcome! Please see our Contributing Guide for details on:

  • Setting up the development environment
  • Running tests and linting
  • Submitting pull requests
  • Reporting issues
