
Complete lxml external type annotation

Important note

  • Since 2025.03.04:

    • The BeautifulSoup4 package is now a dependency so that its inline annotations can be used; the types-beautifulsoup4 dependency is dropped.
    • Fixed compatibility with older versions of type checkers, as well as with mypy 1.14+.
  • Since 2025.02.24:

    • Added support for the basedpyright type checker (an enhanced fork of pyright).
  • Since 2024.11.08:

    • pyright and VSCode users receive warnings when certain lxml API usage would result in an exception or undesirable runtime behavior.
    • It is possible to verify that release files indeed come from GitHub and have not been maliciously altered.

Introduction

This repository contains external type annotations for lxml. It can be used by type-checking tools to check code that uses lxml, or used within IDEs like VSCode to facilitate development.

Goal ① : Completion

Coverage of lxml submodules is now complete (except for those intentionally rejected, listed further below), so the annotations are no longer considered partial:

  • lxml.etree
  • lxml.html
    • lxml.html.builder
    • lxml.html.clean (already removed in lxml 5.2.0; this project will follow suit in the future)
    • lxml.html.diff
    • lxml.html.html5parser
    • lxml.html.soupparser
  • lxml.isoschematron
  • lxml.objectify
  • lxml.builder
  • lxml.cssselect
  • lxml.sax
  • lxml.ElementInclude

The following submodules will not be implemented, either because they are irrelevant to type checking or for other reasons:

  • lxml.etree.Schematron (obsolete and superseded by lxml.isoschematron)
  • lxml.usedoctest
  • lxml.html.usedoctest
  • lxml.html.formfill (shouldn't have existed; this functionality belongs in HTTP libraries like requests or httpx)

Check out the project page for future plans and progress.

Goal ② : Support multiple type checkers

Currently the annotations are validated for following type checkers:

pyright and basedpyright are recommended for their greater flexibility and early adoption of newer type checking features. There are plans to support more type checkers in the future.

Goal ③ : Review and test suite

  • All prior lxml-stubs contributions have been reviewed thoroughly, making the annotations coherent across the whole package
  • Runtime checks are performed and compared against static type checker results; this ensures the annotations work in the real world, not just within some cooked-up test suite
  • The existing static test suite has been vastly expanded, and is being migrated to runtime tests
  • The package building infrastructure has been modernized
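
The idea of comparing runtime behaviour against annotations can be sketched as follows. This is only an illustration under assumptions, not the project's actual test harness: make_label and bad_count are hypothetical functions standing in for annotated library APIs, and check_return_annotation is a hypothetical helper that calls a function and checks the real return value against the declared return type.

```python
from typing import Any, Callable, get_type_hints


def make_label(tag: str, count: int) -> str:
    """Hypothetical stand-in for a correctly annotated library function."""
    return f"{tag}:{count}"


def bad_count(tag: str) -> int:
    """Hypothetical function whose annotation does not match its behaviour."""
    return tag  # type: ignore[return-value]  # deliberately wrong


def check_return_annotation(func: Callable[..., Any], *args: Any) -> bool:
    """Call func and report whether the runtime result matches its annotation.

    Only plain classes (str, int, ...) are handled here; generics and
    unions would need a real runtime type checker.
    """
    declared = get_type_hints(func)["return"]
    result = func(*args)
    return isinstance(result, declared)


print(check_return_annotation(make_label, "div", 3))  # annotation holds
print(check_return_annotation(bad_count, "div"))      # annotation is wrong
```

A static checker only sees the annotations, so it would accept callers of bad_count that expect an int; the runtime comparison is what exposes the mismatch.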

Goal ④ : Geared towards users

Docstring

This package tries to provide annotation-specific docstrings for some classes and functions, explaining how they can be used. The following screenshot shows such a docstring in Visual Studio Code:

Stub docstring in VSCode mouseover tooltip

Warnings for exception and wrong code

pyright (and therefore VSCode) users get the additional benefit of being warned in advance when their lxml code is likely to cause undesirable runtime behavior or an outright exception.

  • #64 covers one example where such warnings are warranted.
  • Another example is the functions in the html.html5parser submodule, which raise an exception when str input and the guess_charset parameter are used together.

[!NOTE] This feature makes use of the @deprecated decorator introduced in Python 3.13. mypy disables such warnings by default, so they need to be turned on explicitly.

image showing deprecation warning
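
The general technique of flagging a problematic argument combination with @deprecated on one overload can be sketched as below. This is a minimal illustration, not the stubs' actual source: parse_markup is a hypothetical function, its signature is an assumption, and only the guess_charset parameter name is taken from the text above. The import fallback exists only so the sketch runs on Python versions without the decorator.

```python
from typing import Optional, overload

try:
    from warnings import deprecated  # Python 3.13+
except ImportError:
    try:
        from typing_extensions import deprecated  # backport for older Pythons
    except ImportError:
        def deprecated(message):  # last-resort no-op so the sketch still runs
            return lambda func: func


# The first overload matches only the problematic combination (str input
# together with guess_charset), so a type checker reports that call site.
@overload
@deprecated("guess_charset is meaningless for str input; pass bytes instead")
def parse_markup(data: str, guess_charset: bool) -> str: ...
@overload
def parse_markup(data: str, guess_charset: None = None) -> str: ...
@overload
def parse_markup(data: bytes, guess_charset: Optional[bool] = None) -> str: ...
def parse_markup(data, guess_charset=None):
    """Hypothetical parser that rejects the combination flagged above."""
    if isinstance(data, str):
        if guess_charset is not None:
            raise TypeError("guess_charset is only valid for bytes input")
        return data
    return data.decode("utf-8")
```

With pyright, a call such as parse_markup("<p/>", True) is reported in the editor before the code ever runs. With mypy, the corresponding error code has to be enabled first, for example with --enable-error-code deprecated.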

Class inheritance change

The current annotations are geared towards programmers' convenience rather than absolute logical 'correctness'. The deviation in class inheritance for HtmlComment and friends is one prominent example.


Installation

The normal choice for most people is to fetch the package from PyPI:

uv pip install -U types-lxml  # using uv
pip install -U types-lxml  # using pip

In the unlikely case that PyPI is down, one can download the wheel directly from the latest release on GitHub and install it as a local file.

As a convenience, a type checker can be pulled in directly with extras:

uv pip install -U "types-lxml[pyright]"  # quotes keep shells like zsh from expanding the brackets
pip install -U "types-lxml[mypy]"

Choosing the build

Since the 2024.08.07 release, types-lxml is published in two builds. The first is the default; if it works without problems, there is no need to switch.

The second build, types-lxml-multi-subclass, is intended for a specific need, namely the creation of multiple lxml element subclasses. For example:

  graph TD;
      etree.ElementBase-->MyBaseElement;
      MyBaseElement-->MySubElement1;
      MyBaseElement-->MySubElement2;

If a parsed or constructed element tree consists of a single type of element node, it is safe to assume that the children or parent of a node are of the same type too. But this assumption does not hold for multiple subclasses. Using the diagram above as an example, calling the .iter() method on a MyBaseElement node may produce elements of any subclass, or even MyBaseElement itself. Therefore the output type should simply be MyBaseElement.

Such a scenario is already in effect for lxml.html. A <form> element (FormElement) is supposed to contain other form-related tags like <input>, <select> etc. But we can't possibly pinpoint a single subelement type, so <form> children can only be typed as HtmlElement. The multiple-subclass scenario is already hardcoded for HtmlElement and ObjectifiedElement within this annotation package, but users may choose to have their own overridden element subclasses (inheriting from ElementBase) too.
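
The reasoning about return types can be modelled with a small toy class hierarchy. This sketch deliberately does not use lxml: the classes below are plain Python stand-ins that only reuse the names from the diagram, so that the typing argument can be run anywhere.

```python
from typing import Iterator, List


class MyBaseElement:
    """Toy stand-in for a user-defined element base class (not lxml)."""

    def __init__(self, tag: str) -> None:
        self.tag = tag
        self.children: List["MyBaseElement"] = []

    def append(self, child: "MyBaseElement") -> None:
        self.children.append(child)

    # Annotated with the common base class rather than the caller's own type:
    # descendants may be instances of any subclass, so only the base class
    # is guaranteed for every yielded node.
    def iter(self) -> Iterator["MyBaseElement"]:
        yield self
        for child in self.children:
            yield from child.iter()


class MySubElement1(MyBaseElement):
    pass


class MySubElement2(MyBaseElement):
    pass


root = MySubElement1("root")
root.append(MySubElement2("a"))
root.append(MyBaseElement("b"))

# Iterating from a MySubElement1 node yields nodes of three different classes,
# so typing iter() as "returns the same class as the caller" would be wrong.
print([type(node).__name__ for node in root.iter()])
```

In a tree built from a single element class, the narrower "same class as the caller" annotation would be accurate and more convenient, which is why the two builds make different choices.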

The two paradigms can't coexist within a single type annotation package. See bug #51, which illustrates why multiple builds are necessary.

[!IMPORTANT] Users can install only one of the two builds, not both; pip would arbitrarily overwrite conflicting files from one with the other. If in doubt, remove the existing package first, then install the one you need.

Release file attestation

[!TIP] For those who haven't heard of it, this is somewhat like gnupg or minisign signatures, but backed by GitHub infrastructure.

Since 2024.11.08, users can download types-lxml release files and verify that they indeed originate from GitHub. After downloading a release wheel file (say with pip download types-lxml, or by browser access to PyPI directly), one can use the GitHub CLI to verify that it comes from this GitHub repository and has not been altered:

gh at verify types_lxml-2024.11.8-py3-none-any.whl --repo abelcheung/types-lxml

This should generate the following result:

Loaded digest sha256:4b4fa7f9e2f1d5f58b98ac9852a75927e4e0f69363249f9cebc78db095c046e0 for file://types_lxml-2024.11.8-py3-none-any.whl
Loaded 1 attestation from GitHub API
✓ Verification succeeded!

sha256:4b4fa7f9e2f1d5f58b98ac9852a75927e4e0f69363249f9cebc78db095c046e0 was attested by:
REPO                   PREDICATE_TYPE                  WORKFLOW
abelcheung/types-lxml  https://slsa.dev/provenance/v1  .github/workflows/release.yml@refs/tags/2024.11.08
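
As a complementary sanity check, the digest printed on the "Loaded digest" line can be compared against a locally computed hash of the downloaded file. The sketch below is an assumption-laden illustration using only the Python standard library; the function names are hypothetical, and it does not replace the attestation verification itself, which also proves where the file was built.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_digest(path: Path, reported: str) -> bool:
    """Compare a local file against a digest written as 'sha256:<hex>'."""
    algorithm, _, expected = reported.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return sha256_of_file(path) == expected.lower()
```

For example, matches_reported_digest(Path("types_lxml-2024.11.8-py3-none-any.whl"), "sha256:4b4fa7f9...") would be called with the full digest string copied from the gh output.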

History

Type annotations for lxml were initially included in typeshed, but as they were still incomplete at that time, the stubs were ripped out into a separate project. The code was then under the governance of the lxml project until 2022, when this fork set out to revamp lxml-stubs completely and became a separate project.

types-lxml is a fork of lxml-stubs that strives for the goals described above, so that most people would find it more useful.
