Complete lxml external type annotation

This repository contains external type annotations for lxml. It can be used by type-checking tools (currently supporting mypy and pyright) to check code that uses lxml, or used within IDEs like VSCode or PyCharm to facilitate development.

Goal ①: Completion

Coverage of the major lxml submodules is now complete, so these annotations are no longer considered partial:

  • lxml.etree: 100%
    • etree.Schematron is obsolete and superseded by lxml.isoschematron, so it will not be implemented
  • lxml.html proper: 100%
  • lxml.objectify: 100%
  • lxml.builder: 100%
  • lxml.cssselect: 100%
  • lxml.sax: 100%

The following list reflects the current situation for the less-used lxml and lxml.html submodules:

  • lxml.ElementInclude
  • lxml.isoschematron
  • lxml.usedoctest
  • lxml.html.builder
  • lxml.html.clean
  • lxml.html.diff
  • lxml.html.formfill (may not implement)
  • lxml.html.html5parser
  • lxml.html.soupparser
  • lxml.html.usedoctest

Check out the project page for future plans and progress.

Goal ②: Support multiple type checkers

Currently the annotations are validated against both mypy and pyright in strict mode.

There are plans to support more type checkers in the future.

Goal ③: Review and test suite

  • All prior lxml-stubs contributions are thoroughly reviewed, making annotations coherent across the whole package
  • Much more extensive test cases
    • The mypy test suite currently covers only about half of the package
    • There is little prior art for performing pyright checks within pytest
    • The plan is to perform runtime checks and compare them against type checker results
  • Modernized package building infrastructure

Goal ④: Support for IDEs

Although no official PEP covers it, some IDEs can show docstrings taken from external annotations. This package tries to bring over more and more of the original lxml class and function docstrings, since the majority of lxml is written in Cython, and IDEs sometimes won't show Cython docstrings during code development. The following screenshots show what this looks like; the IDE behaves as if the docstrings came from real Python code:

Stub docstring in PyCharm Documentation Tool

Stub docstring in VSCode mouseover tooltip

Besides docstrings, the current annotations are geared towards convenience for code writers rather than absolute logical 'correctness'. The deviation in class inheritance for HtmlComment and friends is one prominent example.

Installation

The normal choice for most people is to fetch the package from PyPI via pip:

pip install -U types-lxml

There are a few other alternatives though.

From downloaded wheel file

Head over to the latest release on GitHub and download the wheel file (with extension .whl), which can be installed in the same way as the PyPI package:

pip install -U types-lxml*.whl

Bleeding edge from GitHub

pip install -U git+https://github.com/abelcheung/types-lxml.git

Special notes

Type checker support

pyright is the preferred type checker for lxml code. mypy can be too restrictive, or it lacks support for some features needed by lxml.

Here is one example: normalisation of element attributes.

Many other projects employ it as well: users can supply a value of any common type when setting an object attribute, and the code internally canonicalises (converts) the supplied argument to a specific type. This is a convenience for library users, who don't need to do the conversion themselves every time. Consider the example below:

from typing_extensions import reveal_type
from lxml.etree import fromstring, QName

person = fromstring('<person><height>170</height></person>')
reveal_type(person[0].tag)
person[0].tag = QName('http://ns.prefix', person[0].tag)

lxml stringifies QNames when they are assigned to element tags. At runtime, everything works as expected:

>>> from lxml.etree import tostring
>>> print(tostring(person, encoding=str))
<person><ns0:height xmlns:ns0="http://ns.prefix">170</ns0:height></person>

pyright correctly reports the element tag type, and doesn't complain about the assignment:

information: Type of "person[0].tag" is "str"

But mypy barks loudly about the feature:

error: Incompatible types in assignment (expression has type "QName", variable has type "str")  [assignment]

There are many, many more places in lxml that employ such normalisation.
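
As an illustration of the pattern described above, here is a minimal sketch of a property whose setter accepts a wider type than its getter returns. It does not use lxml and is not necessarily how these stubs are written: `QualifiedName` and `Node` are hypothetical stand-ins for `QName` and an element class, and the `{namespace}localname` string mimics the Clark notation lxml uses for namespaced tags.

```python
from typing import Union


class QualifiedName:
    """Hypothetical stand-in for lxml.etree.QName (illustration only)."""

    def __init__(self, namespace: str, localname: str) -> None:
        # Store the name in Clark notation: {namespace}localname
        self.text = f"{{{namespace}}}{localname}"


class Node:
    """Hypothetical element-like class; not part of lxml."""

    def __init__(self, tag: str) -> None:
        self._tag = tag

    @property
    def tag(self) -> str:
        # Reading always yields a plain string.
        return self._tag

    @tag.setter
    def tag(self, value: Union[str, QualifiedName]) -> None:
        # Writing accepts either form and normalises it to a string.
        self._tag = value.text if isinstance(value, QualifiedName) else value


node = Node("height")
node.tag = QualifiedName("http://ns.prefix", node.tag)
print(node.tag)  # {http://ns.prefix}height
```

The getter and setter deliberately declare different types. A type checker that only allows the declared attribute type on assignment reports an error like the mypy message shown earlier, while one that honours the setter's own parameter type accepts the assignment.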

ParserTarget

There is now only one stub-only class that does not exist as a concrete class in lxml: lxml.etree.ParserTarget. However, support for custom parser targets is shelved, so this virtual class is not very relevant for now.
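
To show what a class that exists only for type checking can look like, here is a rough sketch using `typing.Protocol`. It is an assumption that a protocol is the right model for such a class; `TargetLike`, `TagCollector` and `collect_tags` are hypothetical names, and the example uses the standard library's `xml.etree.ElementTree.XMLParser` rather than lxml, since it accepts a similar target object.

```python
from typing import Dict, List, Protocol
from xml.etree.ElementTree import XMLParser


class TargetLike(Protocol):
    """Hypothetical protocol: describes a shape, is never subclassed at runtime."""

    def start(self, tag: str, attrib: Dict[str, str]) -> None: ...
    def end(self, tag: str) -> None: ...
    def data(self, data: str) -> None: ...
    def close(self) -> List[str]: ...


class TagCollector:
    """Ordinary class that happens to match TargetLike structurally."""

    def __init__(self) -> None:
        self.tags: List[str] = []

    def start(self, tag: str, attrib: Dict[str, str]) -> None:
        self.tags.append(tag)  # record each opening tag

    def end(self, tag: str) -> None:
        pass

    def data(self, data: str) -> None:
        pass

    def close(self) -> List[str]:
        return self.tags


def collect_tags(xml: str, target: TargetLike) -> List[str]:
    # The parser forwards events to the target; close() returns target.close().
    parser = XMLParser(target=target)
    parser.feed(xml)
    return parser.close()


tags = collect_tags("<person><height>170</height></person>", TagCollector())
print(tags)  # ['person', 'height']
```

`TagCollector` never inherits from `TargetLike`; the type checker matches the two by their method signatures alone, which is why such a class needs no runtime counterpart.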

History

Type annotations for lxml were initially included in typeshed, but as they were still incomplete at that time, the stubs were split out into a separate project. The code was extracted by Jelle Zijlstra and moved to the lxml-stubs repository using git filter-branch.

types-lxml is a fork of lxml-stubs that strives for the goals described above, so that most people would find it more useful.
