htmltree-view

Visualize HTML DOM structure as a depth-limited, colorized ASCII tree — like the tree command, but for HTML files.

<html> lang="en"  [ L0 2ch ]
├── <head>  [ L1 4ch ]
│   ├── <meta> charset="utf-8"  [ L2 empty ]
│   ├── <meta> name="viewport" content="width=device-width"  [ L2 empty ]
│   ├── <title>  [ L2 empty ]
│   │   └── "My Page"
│   └── <link> rel="stylesheet" href="style.css"  [ L2 empty ]
└── <body>  [ L1 3ch ]
    ├── <header>  [ L2 2ch ]
    │   └── … (2 children hidden)
    ├── <main> id="main-content"  [ L2 2ch ]
    │   └── … (2 children hidden)
    └── <footer>  [ L2 2ch ]
        └── … (2 children hidden)

────────────────────────────────────────────────────
  Tags: 8  Text nodes: 1  Max depth: 2 (capped at 2)
  Top tags: meta×2, html×1, head×1, title×1, link×1

Features

  • Depth limiting: -d N stops at level N; truncated sub-trees show a "… (X children hidden)" hint
  • CSS selector zoom: -s "#app" or -s "body > main" focuses on any sub-tree
  • Semantic tag colors: headings in amber, structural tags in blue, forms in pink, links in cyan, etc.
  • Depth-cycling pipe colors: guide lines change shade at each nesting level
  • [L3 5ch] badges: depth level plus direct child-tag count on every node
  • Text nodes: quoted inline, with --text-limit truncation and whitespace collapsing
  • Attribute filtering: --attrs id class shows only the attributes you care about; --attrs with no names hides all
  • Attribute value truncation: --attr-limit 80 prevents base64/data-URI blowout
  • HTML comments: hidden by default, shown with --show-comments
  • URL fetching: htmltree https://example.com -d 3
  • stdin pipe: curl ... | htmltree - or echo '<div/>' | htmltree -
  • Output to file: -o tree.txt (auto-disables color)
  • Auto color detection: ANSI disabled when stdout is not a TTY; respects the NO_COLOR / FORCE_COLOR env vars
  • Streaming output: iter_lines() yields one line at a time and never builds the full string unless you ask
  • No recursion: iterative DFS walk; handles arbitrarily deep HTML without RecursionError
  • Stats summary: total tags, text nodes, comments, max depth seen, top-5 tag frequencies
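
The "No recursion", "Streaming output" and "Depth limiting" features can be illustrated with a small standalone sketch. This is not htmltree-view's actual implementation: Node, TreeBuilder and iter_tree_lines are hypothetical names, the parser is the standard library's html.parser rather than BeautifulSoup, and attributes, text nodes, badges and colors are left out.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """Hypothetical minimal element node: a tag name plus child elements."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a Node tree with the standard-library parser (tags only)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self._stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i].tag == tag:
                del self._stack[i:]
                break


def iter_tree_lines(root, max_depth=None):
    """Yield one line per node, walking with an explicit stack (no recursion)."""
    # Stack entries: (node, depth, prefix of its own line, prefix for its children).
    stack = [(root, 0, "", "")]
    while stack:
        node, depth, line_prefix, child_prefix = stack.pop()
        yield f"{line_prefix}<{node.tag}>"
        kids = node.children
        if not kids:
            continue
        if max_depth is not None and depth >= max_depth:
            # Depth limit reached: summarize the hidden sub-tree instead.
            yield f"{child_prefix}└── … ({len(kids)} children hidden)"
            continue
        # Push in reverse so the first child is popped (and printed) first.
        for i in range(len(kids) - 1, -1, -1):
            last = i == len(kids) - 1
            stack.append((
                kids[i],
                depth + 1,
                child_prefix + ("└── " if last else "├── "),
                child_prefix + ("    " if last else "│   "),
            ))


if __name__ == "__main__":
    builder = TreeBuilder()
    builder.feed("<html><head><title>t</title></head>"
                 "<body><main><p>a</p><p>b</p></main></body></html>")
    builder.close()
    for line in iter_tree_lines(builder.root.children[0], max_depth=2):
        print(line)
```

Because the walk is a generator over an explicit stack, nesting depth is bounded by available memory rather than by Python's recursion limit, and a caller can stop consuming lines at any point.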

Install

pip install htmltree-view

# With faster lxml parser:
pip install "htmltree-view[lxml]"

# With html5lib (most spec-accurate):
pip install "htmltree-view[html5lib]"

CLI

# Full tree
htmltree index.html

# Limit depth to 3 levels
htmltree index.html -d 3

# Focus on a CSS-selected sub-tree
htmltree index.html -s "body > main"
htmltree index.html -s "#app"
htmltree index.html -s ".container"

# Fetch from URL
htmltree https://example.com -d 4

# Read from stdin
curl https://example.com | htmltree -
echo '<div><p>hi</p></div>' | htmltree -

# Show only id and class attributes
htmltree index.html --attrs id class

# Hide all attributes
htmltree index.html --attrs

# Hide text nodes (structure only)
htmltree index.html --no-text

# Show HTML comments
htmltree index.html --show-comments

# Truncate text/attr at 40 chars
htmltree index.html --text-limit 40 --attr-limit 40

# Save to file (color auto-disabled)
htmltree index.html -o structure.txt

# Pipe to less with color preserved
htmltree index.html --force-color | less -R

# Use lxml backend (faster)
htmltree index.html --parser lxml

# Plain output (no ANSI)
htmltree index.html --no-color
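
The --attrs and --attr-limit behavior shown above can be sketched in a few lines of Python. This is an illustration only, not the tool's code: format_attrs is a hypothetical helper, and the "…" truncation marker is an assumption (the README does not say how truncated values are marked).

```python
def format_attrs(attrs, show=None, value_limit=80):
    """Render attributes as name="value" pairs.

    show=None keeps every attribute, show=[] hides all of them, and any
    other list keeps only the named attributes (mirroring --attrs).
    """
    if show is None:
        items = list(attrs.items())
    else:
        wanted = set(show)
        items = [(name, value) for name, value in attrs.items() if name in wanted]

    parts = []
    for name, value in items:
        if value is None:
            # Valueless attribute such as "disabled".
            parts.append(name)
            continue
        if isinstance(value, (list, tuple)):
            # Some parsers return multi-valued attributes (e.g. class) as lists.
            value = " ".join(value)
        if len(value) > value_limit:
            # Assumed marker for a truncated value.
            value = value[:value_limit] + "…"
        parts.append(f'{name}="{value}"')
    return " ".join(parts)


if __name__ == "__main__":
    attrs = {"id": "app", "class": ["a", "b"], "href": "/x"}
    print(format_attrs(attrs, show=["id", "class"]))
    print(format_attrs({"src": "x" * 100}, value_limit=5))
```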

Python API

from htmltree import HtmlTree

# Read the file and close it promptly
with open("index.html", encoding="utf-8") as f:
    html = f.read()

# Basic usage
tree = HtmlTree(html)
tree.print()

# Limit depth, filter attributes
tree = HtmlTree(html, max_depth=3, show_attrs=["id", "class"])
tree.print()

# Zoom into a sub-tree
tree = HtmlTree(html, max_depth=5, show_text=False)
tree.print(root_selector="body > main")

# Render to string
tree = HtmlTree(html, max_depth=2, force_color=False)
output = tree.render(root_selector="body")
print(output)

# Stream line by line (memory-efficient for large pages)
tree = HtmlTree(html, max_depth=4)
for line in tree.iter_lines(root_selector="#content"):
    print(line)

# Access stats after render
tree.render()
print(tree.stats.total_tags)
print(tree.stats.tag_counts)      # dict: tag name → count
print(tree.stats.max_depth_seen)
print(tree.stats.total_text_nodes)
print(tree.stats.total_comments)
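
Since stats.tag_counts is described as a dict of tag name to count, a "Top tags" line like the one in the summary footer can be derived from it. The helper below is a hypothetical sketch that works on any such dict; how the library itself breaks ties between equal counts is an assumption here.

```python
def top_tags(tag_counts, n=5):
    """Format the n most frequent tags, e.g. 'meta×2, html×1'."""
    # sorted() is stable, so tags with equal counts keep their dict order.
    ranked = sorted(tag_counts.items(), key=lambda item: item[1], reverse=True)
    return ", ".join(f"{tag}×{count}" for tag, count in ranked[:n])


if __name__ == "__main__":
    counts = {"html": 1, "head": 1, "meta": 2, "title": 1, "link": 1, "body": 1}
    print(top_tags(counts))
```

With a rendered tree, the call would presumably be top_tags(tree.stats.tag_counts).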

CLI reference

| Flag | Default | Description |
|---|---|---|
| SOURCE | | HTML file path, http/https URL, or - for stdin |
| -d N / --depth N | unlimited | Max depth; negative values are clamped to 0 |
| -s CSS / --selector CSS | <html> | CSS selector for tree root |
| --attrs [NAME …] | all | Attributes to show; no names = hide all |
| --no-text | off | Hide text nodes |
| --show-comments | off | Show HTML comment nodes |
| --text-limit N | 60 | Max chars per text node |
| --attr-limit N | 80 | Max chars per attribute value |
| --no-color | off | Disable ANSI colors |
| --force-color | off | Force colors even when piped |
| --no-summary | off | Suppress stats footer |
| -o FILE / --output FILE | stdout | Write to file |
| --parser BACKEND | html.parser | html.parser, lxml, html5lib |
| --version | | Print version and exit |
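
The --text-limit option combines whitespace collapsing with truncation. A minimal sketch of that idea follows, assuming a hypothetical helper name (collapse_and_truncate), an assumed "…" marker, and the assumption that the limit counts characters before the marker is appended; the tool may differ on all three.

```python
def collapse_and_truncate(text, limit=60, marker="…"):
    """Collapse runs of whitespace to single spaces, then cap the length."""
    # str.split() with no arguments splits on any whitespace run and drops
    # leading/trailing whitespace, so joining with " " collapses everything.
    collapsed = " ".join(text.split())
    if limit is None or len(collapsed) <= limit:
        return collapsed
    return collapsed[:limit] + marker


if __name__ == "__main__":
    print(collapse_and_truncate("  Hello \n   world  "))
    print(collapse_and_truncate("abcdefghij", limit=4))
```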

Tree legend

| Symbol | Meaning |
|---|---|
| [L3] | Node is at depth 3 |
| [5ch] | 5 direct tag children |
| [empty] | No children |
| "text" | Text node content (may be truncated) |
| <!-- … --> | HTML comment (with --show-comments) |
| … (N children hidden) | Sub-tree cut at depth limit |
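
The badge notation in the legend combines a depth and a child count. As a rough sketch, assuming the spaced "[ L0 2ch ]" form seen in the sample output and a hypothetical helper name (format_badge):

```python
def format_badge(depth, child_tag_count):
    """Build a badge such as '[ L3 5ch ]' or '[ L1 empty ]'."""
    # "empty" replaces the count when the node has no direct tag children.
    count = "empty" if child_tag_count == 0 else f"{child_tag_count}ch"
    return f"[ L{depth} {count} ]"


if __name__ == "__main__":
    print(format_badge(3, 5))
    print(format_badge(1, 0))
```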

Environment variables

| Variable | Effect |
|---|---|
| NO_COLOR | Any non-empty value disables ANSI colors (https://no-color.org/) |
| FORCE_COLOR | Any non-empty value forces ANSI colors even when piped |
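
The color decision described here (explicit flags, environment variables, then a TTY check) could be sketched as below. The function name should_use_color and the precedence order (explicit argument, then NO_COLOR, then FORCE_COLOR, then isatty) are assumptions; the README does not state which variable wins when both are set.

```python
import os
import sys


def should_use_color(stream=None, env=None, force_color=None):
    """Decide whether ANSI colors should be emitted on the given stream."""
    stream = sys.stdout if stream is None else stream
    env = os.environ if env is None else env

    # An explicit True/False (e.g. from --force-color / --no-color) wins.
    if force_color is not None:
        return force_color
    # Only non-empty values count, as described in the table above.
    if env.get("NO_COLOR"):
        return False
    if env.get("FORCE_COLOR"):
        return True
    # Fall back to TTY detection; objects without isatty() get no color.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


if __name__ == "__main__":
    print(should_use_color())
```

Passing env and stream explicitly keeps the function easy to test without touching the real environment.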

Requirements

  • Python ≥ 3.8
  • beautifulsoup4 ≥ 4.12
  • Optional: lxml, html5lib

License

MIT

👤 Author

Hadi Cahyadi

