Parse HTML table as Python list or dict

HTML Table Parse

A lightweight HTML table parser that converts tables to Python data structures without pandas.

Installation

pip install html-table-parse

Usage

from html_table_parse import to_list, to_dict, to_dicts

html = """
<table>
    <tr><th>Name</th><th>Age</th><th>City</th></tr>
    <tr><td>Alice</td><td>30</td><td>NYC</td></tr>
    <tr><td>Bob</td><td>25</td><td>LA</td></tr>
</table>
"""

# List of lists
to_list(html)
# [['Name', 'Age', 'City'], ['Alice', '30', 'NYC'], ['Bob', '25', 'LA']]

# Dictionary of columns
to_dict(html)
# {'Name': ['Alice', 'Bob'], 'Age': ['30', '25'], 'City': ['NYC', 'LA']}

# List of dictionaries
to_dicts(html)
# [{'Name': 'Alice', 'Age': '30', 'City': 'NYC'}, 
#  {'Name': 'Bob', 'Age': '25', 'City': 'LA'}]
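
Every cell in the outputs above is a string, including the ages. A minimal sketch of converting one column after parsing, assuming the rows look like the `to_dicts` result shown above (the `rows` literal stands in for that result, and `typed` is a name chosen for illustration):

```python
# Stand-in for the result of to_dicts(html) from the usage example.
rows = [
    {'Name': 'Alice', 'Age': '30', 'City': 'NYC'},
    {'Name': 'Bob', 'Age': '25', 'City': 'LA'},
]

# Copy each row and replace the 'Age' string with an int.
typed = [{**row, 'Age': int(row['Age'])} for row in rows]
```

The original rows are left unchanged, so the string values remain available if a conversion needs to be retried with different rules.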

Features

  • No pandas required - lightweight alternative to pandas.read_html()
  • Supports colspan and rowspan attributes
  • Handles duplicate headers (auto-numbered)
  • Multiple output formats: lists, dict of columns, or list of dicts
  • Automatic whitespace normalization
  • Fast parsing with lxml
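
To illustrate what colspan and rowspan support usually involves, here is a rough standalone sketch that expands spanned cells into a rectangular grid. It is not the library's code: the `expand_spans` helper, its `(text, colspan, rowspan)` input format, and the choice to repeat a cell's text in every position it covers are all assumptions made for this example.

```python
def expand_spans(rows):
    """Expand rows of (text, colspan, rowspan) cells into a rectangular grid.

    Assumption: a spanned cell's text is repeated in every position it covers.
    """
    grid = []
    pending = {}  # (row_index, col_index) -> text carried down by a rowspan
    for r, cells in enumerate(rows):
        out = []
        c = 0
        for text, colspan, rowspan in cells:
            # Fill columns already claimed by a rowspan from an earlier row.
            while (r, c) in pending:
                out.append(pending.pop((r, c)))
                c += 1
            for _ in range(colspan):
                out.append(text)
                # Reserve the same column in the rows this cell spans into.
                for dr in range(1, rowspan):
                    pending[(r + dr, c)] = text
                c += 1
        # Fill trailing columns claimed by earlier rowspans.
        while (r, c) in pending:
            out.append(pending.pop((r, c)))
            c += 1
        grid.append(out)
    return grid


# "Contact" spans two columns; "Alice" spans two rows.
cells = [
    [("Name", 1, 1), ("Contact", 2, 1)],
    [("Alice", 1, 2), ("a@example.org", 1, 1), ("555-0100", 1, 1)],
    [("b@example.org", 1, 1), ("555-0101", 1, 1)],
]
grid = expand_spans(cells)
```

With this scheme every row in `grid` has the same length, which is what makes the dict-of-columns and list-of-dicts formats well defined for tables with merged cells.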

API

to_list(html: str, index: int = 0) -> list[list]

Parse a table into a list of rows. The header row is included as the first row, as in the usage example.

to_dict(html: str, index: int = 0) -> dict[str, list]

Parse a table into a dictionary that maps each header (taken from the first row) to the list of values in that column.

to_dicts(html: str, index: int = 0) -> list[dict]

Parse a table into a list of dictionaries, one per data row, keyed by the headers taken from the first row.
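
The three return shapes carry the same data. The sketch below shows how they relate, using only the values from the usage example and plain Python. It is an illustration of the shapes and not the library's implementation; `columns` and `records` are names chosen for this example.

```python
# Stand-in for the result of to_list(html) from the usage example.
rows = [
    ['Name', 'Age', 'City'],
    ['Alice', '30', 'NYC'],
    ['Bob', '25', 'LA'],
]

# The first row holds the headers; the rest are data rows.
headers, *body = rows

# Same shape as the to_dict example: header -> list of column values.
columns = {h: [row[i] for row in body] for i, h in enumerate(headers)}

# Same shape as the to_dicts example: one dict per data row.
records = [dict(zip(headers, row)) for row in body]
```

Note that this simple derivation assumes unique headers. With duplicate headers, later columns would overwrite earlier ones in both dictionaries, which is the situation the auto-numbering feature addresses.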

Download files

Source Distribution

html_table_parse-0.2.2.tar.gz (3.2 kB)

Uploaded Source

Built Distribution

html_table_parse-0.2.2-py3-none-any.whl (3.5 kB)

Uploaded Python 3

File details

Details for the file html_table_parse-0.2.2.tar.gz.

File metadata

  • Download URL: html_table_parse-0.2.2.tar.gz
  • Size: 3.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.25

File hashes

Hashes for html_table_parse-0.2.2.tar.gz
Algorithm Hash digest
SHA256 4c57fcfc440f54bb476902e14775d06007e06718ea4f04b077bd315cff058ffc
MD5 78c2aa60a611ae653943949421bced76
BLAKE2b-256 46ff9a6a96081f387e16a6e1f2cd0fb2d546ee11b411ced57f37d371402e4c27

File details

Details for the file html_table_parse-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: html_table_parse-0.2.2-py3-none-any.whl
  • Size: 3.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.25

File hashes

Hashes for html_table_parse-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e8e202c8f193ed20fba0d49a7f0b526461cc4e4b4cfb46b0372b4d8a915f28f1
MD5 82100a5d4bdee4361a571548904b947a
BLAKE2b-256 79f88be73d481d99f68767fb8e65ffea968992c0712fd033d7a8244281a00d55
