Parse HTML table as Python list or dict
HTML Table Parse
A lightweight HTML table parser that converts tables to Python data structures without pandas.
Installation
pip install html-table-parse
Usage
from html_table_parse import to_list, to_dict, to_dicts
html = """
<table>
<tr><th>Name</th><th>Age</th><th>City</th></tr>
<tr><td>Alice</td><td>30</td><td>NYC</td></tr>
<tr><td>Bob</td><td>25</td><td>LA</td></tr>
</table>
"""
# List of lists
to_list(html)
# [['Name', 'Age', 'City'], ['Alice', '30', 'NYC'], ['Bob', '25', 'LA']]
# Dictionary of columns
to_dict(html)
# {'Name': ['Alice', 'Bob'], 'Age': ['30', '25'], 'City': ['NYC', 'LA']}
# List of dictionaries
to_dicts(html)
# [{'Name': 'Alice', 'Age': '30', 'City': 'NYC'},
# {'Name': 'Bob', 'Age': '25', 'City': 'LA'}]
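The three output shapes carry the same data, and it can help to see how they relate. The sketch below does not call the library. It starts from the list-of-lists result shown above and derives the other two shapes in plain Python, as an illustration rather than a description of how the package works internally. It also shows that cell values come back as strings, so numeric columns need an explicit conversion.

```python
# Rows in the same shape as the to_list() output shown above.
rows = [
    ["Name", "Age", "City"],
    ["Alice", "30", "NYC"],
    ["Bob", "25", "LA"],
]

# First row is the header, the rest are data rows.
header, *body = rows

# Dict of columns, matching the to_dict() shape.
columns = {name: [row[i] for row in body] for i, name in enumerate(header)}

# List of dicts, matching the to_dicts() shape.
records = [dict(zip(header, row)) for row in body]

# Cell values are strings, so convert numeric columns explicitly.
ages = [int(age) for age in columns["Age"]]
```

Choosing a shape is mostly a matter of access pattern: columns for per-field operations such as the age conversion, records for per-row processing.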
Features
- No pandas required: a lightweight alternative to pandas.read_html()
- Supports colspan and rowspan attributes
- Handles duplicate headers (auto-numbered)
- Multiple output formats: lists, dict of columns, or list of dicts
- Automatic whitespace normalization
- Fast parsing with lxml
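To make the colspan/rowspan and whitespace features concrete, here is a minimal sketch of one common way to expand spanned cells into a rectangular grid. It is not taken from the package source. The helper names `normalize` and `expand_spans`, the `(text, colspan, rowspan)` input tuples, and the choice to repeat a spanned value in every cell it covers are all assumptions made for illustration.

```python
def normalize(text):
    """Collapse runs of whitespace (spaces, tabs, newlines) to single spaces."""
    return " ".join(text.split())


def expand_spans(rows):
    """Expand rows of (text, colspan, rowspan) cells into a rectangular grid.

    Hypothetical helper: a spanned cell's text is repeated in every grid
    position it covers. The real library may behave differently.
    """
    filled = {}  # (row, col) -> text
    for r, row in enumerate(rows):
        c = 0
        for text, colspan, rowspan in row:
            # Skip columns already claimed by a rowspan from an earlier row.
            while (r, c) in filled:
                c += 1
            value = normalize(text)
            for dr in range(rowspan):
                for dc in range(colspan):
                    filled[(r + dr, c + dc)] = value
            c += colspan
    if not filled:
        return []
    n_rows = max(r for r, _ in filled) + 1
    n_cols = max(c for _, c in filled) + 1
    # Positions never covered by any cell become empty strings.
    return [[filled.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


cells = [
    [("Name", 1, 1), ("Contact", 2, 1)],                      # "Contact" spans 2 columns
    [("Alice", 1, 2), ("a@example.com", 1, 1), ("555-0100", 1, 1)],  # "Alice" spans 2 rows
    [("  a2@example.com\n", 1, 1), ("555-0101", 1, 1)],
]
grid = expand_spans(cells)
```

With this input, `grid` has three rows of three columns each, with "Contact" repeated across the header and "Alice" repeated down the first column.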
API
to_list(html: str, index: int = 0) -> list[list]
Parse table as list of rows.
to_dict(html: str, index: int = 0) -> dict[str, list]
Parse table as dictionary of columns (first row = headers).
to_dicts(html: str, index: int = 0) -> list[dict]
Parse table as list of dictionaries (first row = headers).
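Because to_dict and to_dicts use the first row as keys, repeated header names have to be disambiguated or later columns would overwrite earlier ones. The feature list says duplicates are auto-numbered, but the naming scheme is not documented here. The sketch below therefore uses an assumed `_2`, `_3` suffix format and a hypothetical `dedupe_headers` helper. It shows the idea only, and the names the package actually produces may differ.

```python
def dedupe_headers(headers):
    """Return headers with repeats made unique by a numeric suffix.

    Assumed scheme: the first occurrence is kept as-is, later ones become
    "name_2", "name_3", and so on. This ignores the corner case where a
    suffixed name collides with a header already present in the table.
    """
    seen = {}
    result = []
    for name in headers:
        count = seen.get(name, 0) + 1
        seen[name] = count
        result.append(name if count == 1 else f"{name}_{count}")
    return result


header = dedupe_headers(["Name", "Score", "Score", "Name"])
row = ["Alice", "90", "85", "A. Smith"]

# With unique keys, no column is lost when building the row dictionary.
record = dict(zip(header, row))
```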
Download files
File details
Details for the file html_table_parse-0.2.2.tar.gz.
File metadata
- Download URL: html_table_parse-0.2.2.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4c57fcfc440f54bb476902e14775d06007e06718ea4f04b077bd315cff058ffc |
| MD5 | 78c2aa60a611ae653943949421bced76 |
| BLAKE2b-256 | 46ff9a6a96081f387e16a6e1f2cd0fb2d546ee11b411ced57f37d371402e4c27 |
File details
Details for the file html_table_parse-0.2.2-py3-none-any.whl.
File metadata
- Download URL: html_table_parse-0.2.2-py3-none-any.whl
- Upload date:
- Size: 3.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e8e202c8f193ed20fba0d49a7f0b526461cc4e4b4cfb46b0372b4d8a915f28f1 |
| MD5 | 82100a5d4bdee4361a571548904b947a |
| BLAKE2b-256 | 79f88be73d481d99f68767fb8e65ffea968992c0712fd033d7a8244281a00d55 |