Skip to main content

Complete lxml external type annotation

Project description

PyPI version Supported Python Wheel

This repository contains external type annotations for lxml. It can be used by type-checking tools (currently supporting mypy and pyright) to check code that uses lxml, or used within IDEs like VSCode or PyCharm to facilitate development.

Goal ① : Completion

Now the coverage of lxml submodules is complete (unless intentionally rejected, see further below), thus no more considered as partial:

  • lxml.etree
  • lxml.html
    • lxml.html.builder
    • lxml.html.clean
    • lxml.html.diff
    • lxml.html.html5parser
    • lxml.html.soupparser
  • lxml.isoschematron
  • lxml.objectify
  • lxml.builder
  • lxml.cssselect
  • lxml.sax
  • lxml.ElementInclude

Following submodules will not be implemented due to irrelevance to type checking or other reasons:

  • lxml.etree.Schematron (obsolete and superseded by lxml.isoschematron)
  • lxml.usedoctest
  • lxml.html.usedoctest
  • lxml.html.formfill (shouldn't have existed, this would belong to HTTP libraries like requests or httpx)

Check out project page for future plans and progress.

Goal ② : Support multiple type checkers

Currently the annotations are validated for both mypy and pyright strict mode.

In the future, there is plan to bring even more type checker support.

Goal ③: Review and test suite

  • All prior lxml-stubs contributions are reviewed thoroughly, bringing coherency of annotation across the whole package
  • Much more extensive test cases
    • Mypy test suite already vastly expanded
    • Perform runtime check, and compare against static type checker result; this guarantees annotations are indeed working in real code, not just in some cooked up test suite
      • Proof of concept for incorporating pyright result under progress, currently just comparing reveal_type() results
      • Migrate static mypy tests to runtime pyright tests in future
  • Modernize package building infrastructure

Goal ④ : Support for IDEs

Despite having no official PEP, some IDEs support showing docstring from external annotations. This package tries to bring type annotation specific docstrings for some lxml classes and functions, explaining how they can be used. Following screenshots show what would look like in Visual Studio Code, behaving as if docstrings come from real python code:

Stub docstring in VSCode mouseover tooltip

Besides docstring, current annotations are geared towards convenience for code writers instead of absolute logical 'correctness'. The deviation of class inheritance for HtmlComment and friends is one prominent example.

Installation

The normal choice for most people is to fetch package from PyPI via pip:

pip install -U types-lxml

There are a few other alternatives though.

From downloaded wheel file

Head over to latest release in GitHub and download wheel file (with extension .whl), which can be installed in the same way as PyPI package:

pip install -U types-lxml*.whl

Bleeding edge from GitHub

pip install -U git+https://github.com/abelcheung/types-lxml.git

Special notes

Type checker support

Actually, pyright is the preferred type checker to use for lxml code. mypy can be either too restrictive or doesn't support some feature needed by lxml.

Here is one example: normalisation of element attributes.

It is employed by many other projects, so that users can supply common type of value while setting object attributes, and the code internally canonicalise/converts supplied argument to specific type. This is a convenience for library users, so they don't always need to do internal conversion by themselves. Consider the example below:

from typing_extensions import reveal_type
from lxml.etree import fromstring, QName

person = fromstring('<person><height>170</height></person>')
reveal_type(person[0].tag)
person[0].tag = QName('http://ns.prefix', person[0].tag)

Lxml supports stringify QNames when setting element tags. Of course, during runtime, everything work as expected:

>>> print(e.tostring(person, encoding=str))
<person><ns0:height xmlns:ns0="http://ns.prefix">170</ns0:height></person>

pyright correctly reports element tag type, and don't complain about assignment:

information: Type of "person[0].tag" is "str"

But mypy barks loudly about the feature:

error: Incompatible types in assignment (expression has type "QName", variable has type "str")  [assignment]

There are many, many more places in lxml that employs such normalisation.

ParserTarget

There is now only one stub-only classes that do not exist as concrete class in lxmllxml.etree.ParserTarget. However the support of custom parser target is shelved, so this virtual class is not very relevant for now.

History

Type annotations for lxml were initially included in typeshed, but as it was still incomplete at that time, the stubs are ripped out as a separate project. The code was extracted by Jelle Zijlstra and moved to lxml-stubs repository using git filter-branch.

types-lxml is a fork of lxml-stubs that strives for the goals described above, so that most people would find it more useful.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

types-lxml-2023.10.21.tar.gz (105.2 kB view hashes)

Uploaded source

Built Distribution

types_lxml-2023.10.21-py3-none-any.whl (72.4 kB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page