Process & Manipulate HTML via Python API

Aspose.HTML for Python via .NET is a powerful API for Python that provides headless browser functionality, allowing you to work with HTML documents. With this API, you can easily create new HTML documents or open existing ones from different sources. Once you have the document, you can perform various operations on it, such as removing and replacing HTML nodes, rendering, and converting HTML to other popular formats.

HTML API Features

The following are some popular features of Aspose.HTML for Python via .NET:

General Features

Create, Load, and Read Documents. Create, load, and modify HTML, XHTML, Markdown, or SVG documents with full control over elements, attributes, and structure using a powerful DOM-based API.
Load EPUB and MHTML File Formats. Open, read, and convert EPUB and MHTML documents with full support for their internal structure and linked resources.
Edit Documents. Insert, remove, clone, or replace HTML elements at any level of the DOM tree for granular control over content.
Save HTML Documents. Save documents along with all linked resources like CSS, fonts, and images using customizable saving options.
Navigate HTML. Navigate through documents using either NodeIterator or TreeWalker.
Sandboxing. Configure a Sandbox environment that is independent of the execution machine, ensuring a secure and isolated environment for running and testing.

Data Extraction

DOM Traversal. Navigate and manipulate the DOM tree using W3C-compliant traversal interfaces to inspect and retrieve content from HTML documents.
XPath Queries. Perform high-performance XPath queries to find and extract target content from large HTML documents.
CSS Selector and JavaScript. Use CSS selector queries and JavaScript execution to dynamically locate and extract specific elements.
Extract CSS Styling Information. Retrieve and analyze inline styles, embedded <style> blocks, and external stylesheets within HTML documents.
Extract any Data from HTML Documents. Text, attributes, form values, metadata, tables, links, or media elements: Aspose.HTML for Python via .NET enables the accurate and efficient extraction of any content for processing, analysis, or editing.

Conversion and Rendering

Convert Documents. Convert HTML, XHTML, SVG, MHTML, MD, and EPUB files to a wide range of formats, including PDF, XPS, DOCX, and image formats (PNG, JPEG, BMP, TIFF, GIF, and WEBP).
Custom Conversion Settings. Adjust page size, resolution, stylesheets, resource management, script execution, and other settings during conversion to fine-tune the output.
Markdown Support. Convert HTML to Markdown or vice versa for content migration and Markdown-based workflows.
Timeout Control. Set and control the timeout for the rendering process.

Advanced HTML Features

Monitor DOM Changes. Use MutationObserver to monitor DOM modifications.
HTML Templates. Populate HTML documents with external data sources such as XML and JSON.
Output Streams. Support for both single (PDF, XPS) and multiple (image formats) output file streams.
Check Web Accessibility. Check web documents against WCAG standards using built-in validators and accessibility rule sets.

Supported File Formats

Format	Description	Load	Save
HTML	HyperText Markup Language format	✔️	✔️
XHTML	eXtensible HyperText Markup Language format	✔️	✔️
MHTML	MIME HTML format	✔️	✔️
EPUB	E-book file format	✔️
SVG	Scalable Vector Graphics format	✔️	✔️
MD	Markdown markup language format	✔️	✔️
PDF	Portable Document Format		✔️
XPS	XML Paper Specification format		✔️
DOCX	Microsoft Word Open XML document format		✔️
TIFF	Tagged Image File Format		✔️
JPEG	Joint Photographic Experts Group format		✔️
PNG	Portable Network Graphics format		✔️
BMP	Bitmap Picture format		✔️
GIF	Graphics Interchange Format		✔️
WEBP	Modern image format providing both lossy and lossless compression		✔️
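The Load and Save columns above can be encoded as a small lookup, so a script can reject an obviously unsupported request before calling a converter. This is only a sketch: `FORMATS` and `load_save_ok` are hypothetical names, not part of Aspose.HTML, and EPUB is treated as load-only because its row shows a single check mark. Passing this check means only that the table lists the source as loadable and the target as savable; whether a particular pair is supported is still up to the library.

```python
# Load/save support transcribed from the table above: format -> (load, save).
# Assumption: EPUB is load-only, since its row shows a single check mark.
FORMATS = {
    "HTML": (True, True),
    "XHTML": (True, True),
    "MHTML": (True, True),
    "EPUB": (True, False),
    "SVG": (True, True),
    "MD": (True, True),
    "PDF": (False, True),
    "XPS": (False, True),
    "DOCX": (False, True),
    "TIFF": (False, True),
    "JPEG": (False, True),
    "PNG": (False, True),
    "BMP": (False, True),
    "GIF": (False, True),
    "WEBP": (False, True),
}


def load_save_ok(source: str, target: str) -> bool:
    """Return True if `source` is listed as loadable and `target` as savable."""
    src = FORMATS.get(source.upper())
    dst = FORMATS.get(target.upper())
    return bool(src and dst and src[0] and dst[1])


if __name__ == "__main__":
    for pair in [("epub", "pdf"), ("pdf", "html"), ("html", "webp")]:
        print(pair, load_save_ok(*pair))
```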

Platform Independence

Aspose.HTML for Python via .NET can be used to develop applications on a range of operating systems, including Windows, Linux, and macOS, wherever Python 3.5 or later is installed. You can build both 32-bit and 64-bit Python applications.
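To see which build applies to a given machine, it can help to inspect the running interpreter. The sketch below uses only the standard library; `describe_interpreter` is a hypothetical helper name, and the 3.5 minimum is taken from the paragraph above.

```python
import platform
import struct
import sys

MIN_PYTHON = (3, 5)  # minimum Python version stated above


def describe_interpreter() -> dict:
    """Collect the interpreter facts that decide which build applies."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "system": platform.system(),       # e.g. "Windows", "Linux", "Darwin"
        "machine": platform.machine(),     # e.g. "AMD64", "x86_64", "arm64"
        "bits": struct.calcsize("P") * 8,  # pointer size in bits: 32 or 64
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "=", value)
```

The pointer size reflects the interpreter, not the operating system, so a 32-bit Python on 64-bit Windows reports 32.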

Get Started

Are you ready to give Aspose.HTML for Python via .NET a try?

Simply run pip install aspose-html-net from the Console to fetch the package. If you already have Aspose.HTML for Python via .NET and want to upgrade the version, please run pip install --upgrade aspose-html-net to get the latest version.
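After installing or upgrading, you may want to confirm which version ended up in the environment. A minimal sketch, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional

PACKAGE = "aspose-html-net"  # distribution name used in the pip command above


def installed_version(name: str = PACKAGE) -> Optional[str]:
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else PACKAGE + " is not installed")
```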

You can run the following snippets in your environment to see how Aspose.HTML works, or check out the GitHub Repository or Aspose.HTML for Python via .NET Documentation for other common use cases.

Create a New HTML Document

If you want to create an HTML document programmatically from scratch, use the parameterless constructor:

import aspose.html as ah

# Initialize an empty HTML document
with ah.HTMLDocument() as document:
    # Create a text node and add it to the document
    text = document.create_text_node("Hello, World!")
    document.body.append_child(text)

    # Save the document to a file
    document.save("create-new-document.html")

Source - Create a Document in Python

Extract Images from Website

Here is an example of how to use Aspose.HTML for Python via .NET to find images specified by the <img> element:

import os
import aspose.html as ah
import aspose.html.net as ahnet

# Prepare output directory
output_dir = "output/"
os.makedirs(output_dir, exist_ok=True)

# Open HTML document from URL
with ah.HTMLDocument("https://docs.aspose.com/svg/net/drawing-basics/svg-color/") as doc:
    # Collect all <img> elements
    images = doc.get_elements_by_tag_name("img")

    # Get distinct relative image URLs
    urls = set(img.get_attribute("src") for img in images)

    # Create absolute image URLs
    abs_urls = [ah.Url(url, doc.base_uri) for url in urls]

    for url in abs_urls:
        # Create a network request
        request = ahnet.RequestMessage(url)

        # Send request
        response = doc.context.network.send(request)

        # Check whether a response is successful
        if response.is_success:
            # Parse the URL to get the file name
            file_name = os.path.basename(url.pathname)

            # Save image to the local file system
            with open(os.path.join(output_dir, file_name), "wb") as f:
                f.write(response.content.read_as_byte_array())

Source - Extract Images From Website in Python
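The "create absolute image URLs" and "get the file name" steps in the example above can be tried in isolation. The sketch below uses `urllib.parse` from the standard library as a stand-in, not the Aspose `Url` class; the helper names and the sample URLs are hypothetical. It also drops empty `src` values, a precaution for `<img>` elements that have no `src` attribute.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def absolute_image_urls(base_uri, srcs):
    """Resolve each src against base_uri, dropping empty values and duplicates."""
    return sorted({urljoin(base_uri, s) for s in srcs if s})


def file_name_from_url(url):
    """Take the last path segment of a URL as a local file name."""
    return posixpath.basename(urlsplit(url).path)


if __name__ == "__main__":
    base = "https://example.com/docs/page/"  # hypothetical base URI
    srcs = ["images/a.png", "/static/b.svg", "images/a.png", None]
    for url in absolute_image_urls(base, srcs):
        print(url, "->", file_name_from_url(url))
```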

HTML to PDF in one line of code

Aspose.HTML for Python via .NET allows you to convert HTML to PDF, XPS, Markdown, MHTML, PNG, JPEG, and other file formats. The following snippet converts HTML to PDF with a single line of code:

import aspose.html.converters as conv
import aspose.html.saving as sav

# Convert HTML to PDF
conv.Converter.convert_html("document.html", sav.PdfSaveOptions(), "document.pdf")

Source - Convert HTML to PDF in Python
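When there are many files, the one-line call above is typically wrapped in a loop. The sketch below shows one way to do that with `pathlib`; `convert_folder` is a hypothetical helper, and the conversion itself is passed in as a callable so the sketch runs without the library installed (the demo uses a stand-in that writes an empty file).

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def convert_folder(src_dir: str, dst_dir: str,
                   convert: Callable[[str, str], None]) -> List[Path]:
    """Call convert(input_path, output_path) for every .html file in src_dir."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob("*.html")):
        dst = out / (src.stem + ".pdf")
        convert(str(src), str(dst))
        written.append(dst)
    return written


if __name__ == "__main__":
    def fake_convert(src: str, dst: str) -> None:
        # Stand-in for the real converter call; writes an empty file.
        Path(dst).write_bytes(b"")

    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.html").write_text("<p>a</p>", encoding="utf-8")
        results = convert_folder(tmp, str(Path(tmp, "out")), fake_convert)
        print([p.name for p in results])
```

With the library installed, `convert` could be `lambda src, dst: conv.Converter.convert_html(src, sav.PdfSaveOptions(), dst)`, reusing the call and import aliases from the snippet above.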

Convert HTML to Markdown (MD)

The following snippet demonstrates the conversion from HTML to Git-based Markdown (MD) format:

import aspose.html.converters as conv
import aspose.html.saving as sav

# Prepare HTML code and save it to a file
code = (
    "<h1>Header 1</h1>"
    "<h2>Header 2</h2>"
    "<p>Hello, World!!</p>"
)
with open("document.html", "w", encoding="utf-8") as f:
    f.write(code)

# Call the convert_html() method to convert HTML to Markdown
conv.Converter.convert_html("document.html", sav.MarkdownSaveOptions.git, "output.md")

Source - Creating an HTML Document

Convert EPUB to PDF using SaveOptions

The PdfSaveOptions class provides numerous properties that give you full control over a wide range of parameters and improve the process of converting EPUB to PDF format. In the example, we use the page_setup, jpeg_quality, and css.media_type properties:

import os
import aspose.html.converters as conv
import aspose.html.saving as sav
import aspose.html.drawing as dr
import aspose.html.rendering as rn  # assumed location of the MediaType enum

# Setup directories and define paths
output_dir = "output/"
input_dir = "data/"
os.makedirs(output_dir, exist_ok=True)

document_path = os.path.join(input_dir, "input.epub")
save_path = os.path.join(output_dir, "epub-to-pdf.pdf")

# Open an existing EPUB file for reading
with open(document_path, "rb") as stream:

    # Create an instance of PdfSaveOptions and set page size, JPEG quality, and CSS media type
    options = sav.PdfSaveOptions()
    options.page_setup.any_page = dr.Page(dr.Size(800, 600), dr.Margin(10, 10, 10, 10))
    options.jpeg_quality = 10  # illustrative value; adjust for your size/quality trade-off
    # Assumption: MediaType is exposed as aspose.html.rendering.MediaType, mirroring the .NET namespace
    options.css.media_type = rn.MediaType.PRINT

    # Convert EPUB to PDF
    conv.Converter.convert_epub(stream, options, save_path)

Source - Convert EPUB to PDF in Python

Release history

Version	Release date
26.6.0 (this version)	Jun 26, 2026
26.5.0	May 31, 2026
26.4.0	Apr 26, 2026
26.3.0	Mar 23, 2026
26.2.0	Feb 27, 2026
26.1.0	Jan 29, 2026
25.12.0	Dec 22, 2025
25.11.0	Dec 15, 2025
25.10.0	Nov 17, 2025
25.9.0	Sep 26, 2025
25.8.0	Aug 27, 2025
25.7.0	Jul 23, 2025
25.6.0	Jun 19, 2025
25.5.0	May 16, 2025
25.4.0	Apr 3, 2025
25.3.0	Mar 5, 2025
25.2.0	Feb 19, 2025
25.1.0	Jan 31, 2025
24.12.0	Dec 26, 2024
24.11.0	Nov 30, 2024
24.10.0	Oct 31, 2024
24.9.0	Sep 27, 2024
24.8.0	Aug 29, 2024
24.7.0	Jul 31, 2024
24.6.0	Jun 26, 2024

Download files

Download the file for your platform.

Source Distributions

No source distribution files available for this release.

Built Distributions

File	Size	Uploaded	Tags
aspose_html_net-26.6.0-py3-none-win_amd64.whl	58.8 MB	Jun 26, 2026	Python 3, Windows x86-64
aspose_html_net-26.6.0-py3-none-win32.whl	51.3 MB	Jun 26, 2026	Python 3, Windows x86
aspose_html_net-26.6.0-py3-none-manylinux1_x86_64.whl	83.4 MB	Jun 26, 2026	Python 3
aspose_html_net-26.6.0-py3-none-macosx_11_0_arm64.whl	56.5 MB	Jun 26, 2026	Python 3, macOS 11.0+ ARM64
aspose_html_net-26.6.0-py3-none-macosx_10_14_x86_64.whl	70.6 MB	Jun 26, 2026	Python 3, macOS 10.14+ x86-64
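The file names listed above follow the wheel naming convention, `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, so the platform each file targets can be read from its name. A minimal sketch of splitting a name into those fields; `WheelName` and `parse_wheel_name` are hypothetical names, and the optional build tag that the convention allows is deliberately not handled.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(file_name: str) -> WheelName:
    """Split a five-field wheel file name into its dash-separated parts."""
    suffix = ".whl"
    if not file_name.endswith(suffix):
        raise ValueError("not a wheel file: " + file_name)
    parts = file_name[:-len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated fields, got %d" % len(parts))
    return WheelName(*parts)


if __name__ == "__main__":
    print(parse_wheel_name("aspose_html_net-26.6.0-py3-none-win_amd64.whl"))
```

For these files, `py3` and `none` mean the wheel is not tied to a specific Python 3 minor version or ABI; only the platform tag differs between them.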

File details

Details for the file aspose_html_net-26.6.0-py3-none-win_amd64.whl.

File metadata

Download URL: aspose_html_net-26.6.0-py3-none-win_amd64.whl
Upload date: Jun 26, 2026
Size: 58.8 MB
Tags: Python 3, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.0

File hashes

Hashes for aspose_html_net-26.6.0-py3-none-win_amd64.whl
Algorithm	Hash digest
SHA256	`1d24a01152baaef837c29d59238273b892a2680be94dacf4fcd51b45723017c1`
MD5	`2ac1a97a861c81ffdcdfde56b66980b2`
BLAKE2b-256	`ec44842ca71cc58a441e527ff6cdd1071771b9340c6957321cdc1c977ecdac16`

File details

Details for the file aspose_html_net-26.6.0-py3-none-win32.whl.

File metadata

Download URL: aspose_html_net-26.6.0-py3-none-win32.whl
Upload date: Jun 26, 2026
Size: 51.3 MB
Tags: Python 3, Windows x86
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.0

File hashes

Hashes for aspose_html_net-26.6.0-py3-none-win32.whl
Algorithm	Hash digest
SHA256	`4d0d5583c5ce402d9f08f902f98fd1537f89317e3208262715b0114bde2dc349`
MD5	`2ce2aef9c213b8dc8b1bf225915e009d`
BLAKE2b-256	`4582239d5faeefa3033cadd4ce2b380699f0f177e06a4dfbc94a87a9ea4c6d34`

File details

Details for the file aspose_html_net-26.6.0-py3-none-manylinux1_x86_64.whl.

File metadata

Download URL: aspose_html_net-26.6.0-py3-none-manylinux1_x86_64.whl
Upload date: Jun 26, 2026
Size: 83.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.0

File hashes

Hashes for aspose_html_net-26.6.0-py3-none-manylinux1_x86_64.whl
Algorithm	Hash digest
SHA256	`2ba35cc7ca3781358074fac4c9396fc3f6c7835cfd3970547e5369b0934ef5fa`
MD5	`fd6964fb403d72a706f0ca0259b9450f`
BLAKE2b-256	`5720cd234dd39e1c53e779a0d682a05c973e09d49f73d320e41f50b28da5746e`

File details

Details for the file aspose_html_net-26.6.0-py3-none-macosx_11_0_arm64.whl.

File metadata

Download URL: aspose_html_net-26.6.0-py3-none-macosx_11_0_arm64.whl
Upload date: Jun 26, 2026
Size: 56.5 MB
Tags: Python 3, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.0

File hashes

Hashes for aspose_html_net-26.6.0-py3-none-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`4128e8d7f58b1743e9f75b2705806792c00d0c825cec5f1efacda310217c5d05`
MD5	`e3d5a4f4caf5d923db9ac4501cce1127`
BLAKE2b-256	`5820a26bf9c5cc9e8a04eee96fc5761b59f70af7fbbaab837b6da3ea96d03d4b`

File details

Details for the file aspose_html_net-26.6.0-py3-none-macosx_10_14_x86_64.whl.

File metadata

Download URL: aspose_html_net-26.6.0-py3-none-macosx_10_14_x86_64.whl
Upload date: Jun 26, 2026
Size: 70.6 MB
Tags: Python 3, macOS 10.14+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.0.0 CPython/3.9.0

File hashes

Hashes for aspose_html_net-26.6.0-py3-none-macosx_10_14_x86_64.whl
Algorithm	Hash digest
SHA256	`24c5dc740dfa440d56f46e29fde8930ff9c9ae86badcdb6c9e59bd16629b327f`
MD5	`519072291d60e44b9e9195d71754ba5b`
BLAKE2b-256	`ca8cc0270449e3a1e88d10f1aeba90ac64fc6c3f08be8d63b66c6deee072bd5f`
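The SHA256 digests listed above can be used to check a downloaded wheel before installing it. A minimal sketch using `hashlib` from the standard library; `sha256_of` and `matches` are hypothetical helper names, and the file path in the demo is a placeholder for wherever the wheel was saved.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Placeholder path; the digest is the SHA256 listed above for the win_amd64 wheel.
    wheel = "aspose_html_net-26.6.0-py3-none-win_amd64.whl"
    expected = "1d24a01152baaef837c29d59238273b892a2680be94dacf4fcd51b45723017c1"
    if os.path.exists(wheel):
        print("digest OK" if matches(wheel, expected) else "digest MISMATCH")
    else:
        print("file not found:", wheel)
```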
