Visualize HTML DOM structure as a depth-limited, colorized ASCII tree
htmltree-view
Visualize HTML DOM structure as a depth-limited, colorized ASCII tree, like the `tree` command but for HTML files.
<html> lang="en" [ L0 2ch ]
├── <head> [ L0 4ch ]
│   ├── <meta> charset="utf-8" [ L1 empty ]
│   ├── <meta> name="viewport" content="width=device-width" [ L1 empty ]
│   ├── <title> [ L1 empty ]
│   │   └── "My Page"
│   └── <link> rel="stylesheet" href="style.css" [ L1 empty ]
└── <body> [ L1 3ch ]
    ├── <header> [ L2 2ch ]
    │   └── … (2 children hidden)
    ├── <main> id="main-content" [ L2 2ch ]
    │   └── … (2 children hidden)
    └── <footer> [ L2 2ch ]
        └── … (2 children hidden)
────────────────────────────────────────────────────
Tags: 8 Text nodes: 1 Max depth: 2 (capped at 2)
Top tags: meta×2, html×1, head×1, title×1, link×1
Features
- Depth limiting: `-d N` stops at level N; truncated sub-trees show a `… (X children hidden)` hint
- CSS selector zoom: `-s "#app"` or `-s "body > main"` focuses any sub-tree
- Semantic tag colors: headings in amber, structural in blue, forms in pink, links in cyan, etc.
- Depth-cycling pipe colors: guide lines change shade per nesting level
- `[L3 5ch]` badges: depth level + direct child-tag count on every node
- Text nodes: quoted inline, with `--text-limit` truncation and whitespace collapsing
- Attribute filtering: `--attrs id class href` shows only what you care about; `--attrs` hides all
- Attribute value truncation: `--attr-limit 80` prevents base64/data-URI blowout
- HTML comments: hidden by default, shown with `--show-comments`
- URL fetching: `htmltree https://example.com -d 3`
- stdin pipe: `curl ... | htmltree -` or `echo '<div/>' | htmltree -`
- Output to file: `-o tree.txt` (auto-disables color)
- Auto color detection: ANSI disabled when stdout is not a TTY; respects `NO_COLOR` / `FORCE_COLOR` env vars
- Streaming output: `iter_lines()` yields one line at a time; never builds the full string unless you ask
- No recursion: iterative DFS walk; handles arbitrarily deep HTML without `RecursionError`
- Stats summary: total tags, text nodes, comments, max depth seen, top-5 tag frequencies
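The depth-limiting, streaming, and no-recursion features listed above can be illustrated together with a short sketch. This is not htmltree-view's actual implementation: the `Node` class and the `iter_tree_lines` function are hypothetical names, and the indentation widths are assumed. It walks a tree with an explicit stack and yields one rendered line at a time.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parsed element (not the library's own type)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def iter_tree_lines(root: Node, max_depth: Optional[int] = None) -> Iterator[str]:
    """Yield tree lines in document order using an explicit stack, not recursion."""
    # Each entry: (node, depth, prefix inherited from ancestors, is last sibling)
    stack = [(root, 0, "", True)]
    while stack:
        node, depth, prefix, is_last = stack.pop()
        if depth == 0:
            yield f"<{node.name}>"
            child_prefix = ""
        else:
            branch = "└── " if is_last else "├── "
            yield f"{prefix}{branch}<{node.name}>"
            child_prefix = prefix + ("    " if is_last else "│   ")

        if not node.children:
            continue

        # At the depth cap, replace the sub-tree with a single hint line.
        if max_depth is not None and depth >= max_depth:
            n = len(node.children)
            noun = "child" if n == 1 else "children"
            yield f"{child_prefix}└── … ({n} {noun} hidden)"
            continue

        # Push in reverse so the first child is popped (and rendered) first.
        last_index = len(node.children) - 1
        for i in range(last_index, -1, -1):
            stack.append((node.children[i], depth + 1, child_prefix, i == last_index))


demo = Node("html", [
    Node("head", [Node("title")]),
    Node("body", [Node("main", [Node("p"), Node("p")])]),
])
for line in iter_tree_lines(demo, max_depth=2):
    print(line)
```

Because the walk keeps its own stack, nesting depth is bounded by available memory rather than by the interpreter's recursion limit, and a caller can stop consuming the generator early without rendering the rest of the tree.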
Install
pip install htmltree-view
# With faster lxml parser:
pip install "htmltree-view[lxml]"
# With html5lib (most spec-accurate):
pip install "htmltree-view[html5lib]"
CLI
# Full tree
htmltree index.html
# Limit depth to 3 levels
htmltree index.html -d 3
# Focus on a CSS-selected sub-tree
htmltree index.html -s "body > main"
htmltree index.html -s "#app"
htmltree index.html -s ".container"
# Fetch from URL
htmltree https://example.com -d 4
# Read from stdin
curl https://example.com | htmltree -
echo '<div><p>hi</p></div>' | htmltree -
# Show only id and class attributes
htmltree index.html --attrs id class
# Hide all attributes
htmltree index.html --attrs
# Hide text nodes (structure only)
htmltree index.html --no-text
# Show HTML comments
htmltree index.html --show-comments
# Truncate text/attr at 40 chars
htmltree index.html --text-limit 40 --attr-limit 40
# Save to file (color auto-disabled)
htmltree index.html -o structure.txt
# Pipe to less with color preserved
htmltree index.html --force-color | less -R
# Use lxml backend (faster)
htmltree index.html --parser lxml
# Plain output (no ANSI)
htmltree index.html --no-color
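The commands above pass three kinds of `SOURCE`: a file path, an http/https URL, or `-` for stdin. A minimal sketch of how such an argument could be told apart is shown here; `classify_source` is a hypothetical helper for illustration, not part of the htmltree-view API, and the tool's real dispatch logic may differ.

```python
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return "stdin", "url", or "file" for a SOURCE argument (illustrative only)."""
    if source == "-":
        return "stdin"
    # Only http and https are treated as URLs; anything else is assumed to be a path.
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    return "file"


for arg in ("-", "https://example.com", "index.html"):
    print(arg, "->", classify_source(arg))
```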
Python API
from htmltree import HtmlTree
with open("index.html", encoding="utf-8") as f:
    html = f.read()
# Basic usage
tree = HtmlTree(html)
tree.print()
# Limit depth, filter attributes
tree = HtmlTree(html, max_depth=3, show_attrs=["id", "class"])
tree.print()
# Zoom into a sub-tree
tree = HtmlTree(html, max_depth=5, show_text=False)
tree.print(root_selector="body > main")
# Render to string
tree = HtmlTree(html, max_depth=2, force_color=False)
output = tree.render(root_selector="body")
print(output)
# Stream line by line (memory-efficient for large pages)
tree = HtmlTree(html, max_depth=4)
for line in tree.iter_lines(root_selector="#content"):
    print(line)
# Access stats after render
tree.render()
print(tree.stats.total_tags)
print(tree.stats.tag_counts) # dict: tag name → count
print(tree.stats.max_depth_seen)
print(tree.stats.total_text_nodes)
print(tree.stats.total_comments)
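Since `tree.stats.tag_counts` is described as a dict of tag name to count, a "Top tags" line like the one in the sample footer can be derived from it. The sketch below uses a plain dict as a stand-in for `tag_counts` so that it runs without the package installed; `format_top_tags` is a hypothetical helper, and the library's own footer formatting may differ.

```python
from collections import Counter
from typing import Mapping


def format_top_tags(tag_counts: Mapping[str, int], n: int = 5) -> str:
    """Format the n most frequent tags as "name×count" pairs."""
    # most_common() keeps first-seen order among tags with equal counts.
    top = Counter(tag_counts).most_common(n)
    return ", ".join(f"{tag}×{count}" for tag, count in top)


# Stand-in for tree.stats.tag_counts (values chosen for illustration).
sample_counts = {"html": 1, "head": 1, "meta": 2, "title": 1, "link": 1, "body": 1}
print("Top tags:", format_top_tags(sample_counts))
```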
CLI reference
| Flag | Default | Description |
|---|---|---|
| `SOURCE` | n/a | HTML file path, http/https URL, or `-` for stdin |
| `-d N` / `--depth N` | unlimited | Max depth; negatives clamped to 0 |
| `-s CSS` / `--selector CSS` | `<html>` | CSS selector for tree root |
| `--attrs [NAME …]` | all | Attributes to show; no names = hide all |
| `--no-text` | off | Hide text nodes |
| `--show-comments` | off | Show HTML comment nodes |
| `--text-limit N` | 60 | Max chars per text node |
| `--attr-limit N` | 80 | Max chars per attribute value |
| `--no-color` | off | Disable ANSI colors |
| `--force-color` | off | Force colors even when piped |
| `--no-summary` | off | Suppress stats footer |
| `-o FILE` / `--output FILE` | stdout | Write to file |
| `--parser BACKEND` | `html.parser` | `html.parser`, `lxml`, `html5lib` |
| `--version` | n/a | Print version and exit |
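The `--text-limit` and `--attr-limit` rows describe per-value character caps, and the feature list mentions whitespace collapsing for text nodes. A minimal sketch of that behavior follows. The function name `collapse_and_truncate` is hypothetical, and the choice to append `…` after the cut is an assumption; the tool's actual truncation marker is not documented here.

```python
import re


def collapse_and_truncate(text: str, limit: int = 60) -> str:
    """Collapse whitespace runs to single spaces, then cap the length at `limit`."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    if len(collapsed) <= limit:
        return collapsed
    # Assumed marker: an ellipsis appended after the first `limit` characters.
    return collapsed[:limit] + "…"


print(collapse_and_truncate("  My\n   Page  "))
print(collapse_and_truncate("data:image/png;base64," + "A" * 200, limit=40))
```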
Tree legend
| Symbol | Meaning |
|---|---|
| `[L3]` | Node is at depth 3 |
| `[5ch]` | 5 direct tag children |
| `[empty]` | No children |
| `"text"` | Text node content (may be truncated) |
| `<!-- … -->` | HTML comment (with `--show-comments`) |
| `… (N children hidden)` | Sub-tree cut at depth limit |
Environment variables
| Variable | Effect |
|---|---|
| `NO_COLOR` | Any non-empty value disables ANSI colors (https://no-color.org/) |
| `FORCE_COLOR` | Any non-empty value forces ANSI colors even when piped |
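The interaction between these variables and TTY detection can be sketched as a single decision function. This is an illustration rather than the package's code: `should_use_color` is a hypothetical name, and giving `NO_COLOR` precedence when both variables are set is an assumption that the table above does not settle.

```python
import io
import os
import sys
from typing import Mapping, Optional, TextIO


def should_use_color(stream: Optional[TextIO] = None,
                     env: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether to emit ANSI colors for the given output stream."""
    stream = sys.stdout if stream is None else stream
    env = os.environ if env is None else env
    if env.get("NO_COLOR"):       # any non-empty value disables color
        return False              # assumed to win over FORCE_COLOR
    if env.get("FORCE_COLOR"):    # any non-empty value forces color
        return True
    # Otherwise fall back to TTY detection.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


# A StringIO is never a TTY, so color is off unless forced.
print(should_use_color(io.StringIO(), env={}))
print(should_use_color(io.StringIO(), env={"FORCE_COLOR": "1"}))
```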
Requirements
- Python ≥ 3.8
- `beautifulsoup4` ≥ 4.12
- Optional: `lxml`, `html5lib`
License
👤 Author