Parse HTML table as Python list or dict

HTML Table Parse

A lightweight HTML table parser that converts tables to Python data structures without pandas.

Installation

pip install html-table-parse

Usage

from html_table_parse import to_list, to_dict, to_dicts

html = """
<table>
    <tr><th>Name</th><th>Age</th><th>City</th></tr>
    <tr><td>Alice</td><td>30</td><td>NYC</td></tr>
    <tr><td>Bob</td><td>25</td><td>LA</td></tr>
</table>
"""

# List of lists
to_list(html)
# [['Name', 'Age', 'City'], ['Alice', '30', 'NYC'], ['Bob', '25', 'LA']]

# Dictionary of columns
to_dict(html)
# {'Name': ['Alice', 'Bob'], 'Age': ['30', '25'], 'City': ['NYC', 'LA']}

# List of dictionaries
to_dicts(html)
# [{'Name': 'Alice', 'Age': '30', 'City': 'NYC'}, 
#  {'Name': 'Bob', 'Age': '25', 'City': 'LA'}]
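
The three output shapes carry the same data. As a rough illustration in plain Python (this is not the package's code, and the variable names are made up for the example), the column dictionary and the list of dictionaries can both be derived from the list of rows:

```python
# Rows in the shape shown for to_list() above: header row first, then data rows.
rows = [
    ["Name", "Age", "City"],
    ["Alice", "30", "NYC"],
    ["Bob", "25", "LA"],
]

header, *body = rows

# Dictionary of columns: transpose the data rows and pair each column with its header.
columns = {name: list(col) for name, col in zip(header, zip(*body))}

# List of dictionaries: pair every data row with the header.
records = [dict(zip(header, row)) for row in body]

print(columns)
print(records)
```

Pick the shape that matches how the data is consumed: columns for per-field operations, records for row-by-row processing.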

Features

  • No pandas required - lightweight alternative to pandas.read_html()
  • Supports colspan and rowspan attributes
  • Handles duplicate headers (auto-numbered)
  • Multiple output formats: lists, dict of columns, or list of dicts
  • Automatic whitespace normalization
  • Fast parsing with lxml
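
The README does not say how spanned cells are filled in, so the following is only a sketch of one common approach, under the assumption that a spanned cell's text is repeated in every position it covers. The functions `normalize` and `expand_spans`, and the `(text, colspan, rowspan)` tuple input, are hypothetical and not part of the package.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace (including newlines) to single spaces."""
    return " ".join(text.split())


def expand_spans(rows):
    """Expand (text, colspan, rowspan) cells into a grid of plain strings.

    Assumption: a spanned cell's text is repeated in every position it covers.
    """
    grid = []
    pending = {}  # (row_index, col_index) -> text carried down by a rowspan
    for r, row in enumerate(rows):
        out = []
        c = 0
        for text, colspan, rowspan in row:
            # Fill positions claimed by rowspans from earlier rows first.
            while (r, c) in pending:
                out.append(pending.pop((r, c)))
                c += 1
            value = normalize(text)
            for dc in range(colspan):
                out.append(value)
                # Reserve the same column in the following rows.
                for dr in range(1, rowspan):
                    pending[(r + dr, c + dc)] = value
            c += colspan
        # Trailing positions reserved by earlier rowspans.
        while (r, c) in pending:
            out.append(pending.pop((r, c)))
            c += 1
        grid.append(out)
    return grid


cells = [
    [("Region", 1, 2), ("Sales", 2, 1)],
    [("Q1", 1, 1), ("Q2", 1, 1)],
    [("  North\n America ", 1, 1), ("10", 1, 1), ("12", 1, 1)],
]
grid = expand_spans(cells)
# [['Region', 'Sales', 'Sales'], ['Region', 'Q1', 'Q2'], ['North America', '10', '12']]
```

Check the package's actual output on a spanned table before relying on this behavior; it may leave covered positions empty instead.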

API

to_list(html: str, index: int = 0) -> list[list]

Parse table as list of rows.
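
The `index` parameter is not described further here. A plausible reading, stated as an assumption, is that it is the zero-based position of the table in document order. The sketch below shows that idea using only the standard library's `html.parser`; `TableCollector` and `nth_table` are hypothetical names, the package itself is described as using lxml, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect the cell text of every <table> in a document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse whitespace.
            text = " ".join("".join(self._cell).split())
            self.tables[-1][-1].append(text)
            self._cell = None


def nth_table(html: str, index: int = 0) -> list[list[str]]:
    """Return the rows of the index-th table (assumed zero-based, document order)."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    return parser.tables[index]


html_doc = """
<table><tr><th>A</th></tr><tr><td>1</td></tr></table>
<table><tr><th>B</th></tr><tr><td>2</td></tr></table>
"""
first = nth_table(html_doc)       # [['A'], ['1']]
second = nth_table(html_doc, 1)   # [['B'], ['2']]
```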

to_dict(html: str, index: int = 0) -> dict[str, list]

Parse table as dictionary of columns (first row = headers).
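
Because headers become dictionary keys, repeated header names would otherwise overwrite each other. The feature list mentions auto-numbering but not the naming scheme, so the `_2`, `_3` suffixes below are an assumption, and `dedupe_headers` is a hypothetical helper rather than part of the package.

```python
def dedupe_headers(headers):
    """Make header names unique by suffixing repeats (assumed naming scheme)."""
    seen = {}
    unique = []
    for name in headers:
        count = seen.get(name, 0) + 1
        seen[name] = count
        unique.append(name if count == 1 else f"{name}_{count}")
    return unique


rows = [["Name", "Score", "Score"], ["Alice", "90", "85"]]

# Without renaming, the second "Score" silently replaces the first.
naive = dict(zip(rows[0], rows[1]))
# {'Name': 'Alice', 'Score': '85'}

# With renaming, both columns survive.
safe = dict(zip(dedupe_headers(rows[0]), rows[1]))
# {'Name': 'Alice', 'Score': '90', 'Score_2': '85'}
```

This simple scheme can still collide if a table already contains a literal `Score_2` header.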

to_dicts(html: str, index: int = 0) -> list[dict]

Parse table as list of dictionaries (first row = headers).
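
In the usage example, every cell value comes back as a string (for instance `'30'`). If numeric values are needed, one option is to convert selected columns afterwards. The sketch below works on records in the shape shown for `to_dicts`; `convert` is a hypothetical helper, not part of the package.

```python
records = [
    {"Name": "Alice", "Age": "30", "City": "NYC"},
    {"Name": "Bob", "Age": "25", "City": "LA"},
]


def convert(record, converters):
    """Return a copy of record with the listed columns converted; others stay strings."""
    return {key: converters.get(key, str)(value) for key, value in record.items()}


typed = [convert(r, {"Age": int}) for r in records]
# [{'Name': 'Alice', 'Age': 30, 'City': 'NYC'}, {'Name': 'Bob', 'Age': 25, 'City': 'LA'}]
```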

Download files

Source Distribution

html_table_parse-0.2.1.tar.gz (3.2 kB)

Built Distribution

html_table_parse-0.2.1-py3-none-any.whl (3.5 kB, Python 3)
