Analyze HAR files and identify page-load bottlenecks

Project description

dvm-haranalyzer

A command-line tool that parses .har files and identifies page-load bottlenecks — slow requests, large assets, missing cache headers, ad/tracker overload, and more.

Installation

Run without installing (uvx)

uvx dvm-haranalyzer metrics/hars/mypage.har

Install as a persistent tool

uv tool install dvm-haranalyzer
dvm-haranalyzer metrics/hars/mypage.har

Install from source (editable)

git clone https://github.com/divyavanmahajan/dvm-haranalyzer
cd dvm-haranalyzer
uv tool install --editable .

Quick Start

# Analyze a HAR file and print the report
uvx dvm-haranalyzer metrics/hars/mypage.har

# Save the report to metrics/reports/ as well
uvx dvm-haranalyzer metrics/hars/mypage.har --output metrics/reports/

How to Capture a HAR File

Chrome / Edge

1. Open DevTools (F12 or Cmd+Option+I)
2. Go to the Network tab
3. Check Preserve log and Disable cache
4. Navigate to the page you want to analyze
5. Right-click the request list → Save all as HAR with content
6. Save the file into metrics/hars/

Firefox

1. Open DevTools → Network tab
2. Navigate to the page
3. Click the gear icon → Save All As HAR

Safari

1. Open the Develop menu → Show Web Inspector
2. Go to the Network tab
3. Navigate to the page
4. Click Export (floppy disk icon) to save the HAR
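
Whichever browser produced the export, it can help to confirm that the file is a well-formed HAR before going further. The sketch below is a minimal check, assuming the HAR 1.2 layout (a top-level `log` object holding an `entries` list). The function names `summarize_har` and `summarize_har_file` are hypothetical and are not part of dvm-haranalyzer.

```python
import json
from pathlib import Path


def summarize_har(har: dict) -> dict:
    """Return basic counts from a parsed HAR document (HAR 1.2 layout assumed)."""
    log = har.get("log")
    if not isinstance(log, dict) or not isinstance(log.get("entries"), list):
        raise ValueError("not a HAR file: missing log.entries")
    return {
        "version": log.get("version", "unknown"),
        "pages": len(log.get("pages", [])),
        "entries": len(log["entries"]),
    }


def summarize_har_file(path: str) -> dict:
    """Load a .har file from disk and summarize it."""
    # utf-8-sig tolerates an optional byte-order mark at the start of the file.
    text = Path(path).read_text(encoding="utf-8-sig")
    return summarize_har(json.loads(text))


# In-memory example; a real export would go through summarize_har_file(...).
sample = {"log": {"version": "1.2", "pages": [{}], "entries": [{}, {}]}}
print(summarize_har(sample))
```

A capture that reports zero entries usually means the page was loaded before the Network tab started recording.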

Sanitizing HAR Files Before Analysis

Important: HAR files captured from a browser contain session cookies, auth tokens, API keys, and personal data. Sanitize them before sharing, committing, or storing.

har-capture handles sanitization. It requires no permanent installation — run it with uvx:

uvx "har-capture[cli]" <command>

Recommended workflow

capture in browser  →  validate  →  sanitize  →  analyze

1. Validate — check what's sensitive before touching it

# Check a single file
uvx "har-capture[cli]" validate metrics/hars/mypage.har

# Scan the whole hars/ folder (recursive)
uvx "har-capture[cli]" validate --dir metrics/hars/ --recursive

# Treat any warning as an error (useful in CI)
uvx "har-capture[cli]" validate metrics/hars/mypage.har --strict

The validator scans for passwords, tokens, API keys, MAC addresses, IP addresses, and other PII and exits non-zero if any are found.
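
Because the validator signals findings through its exit status, a script can use it as a gate. The following is a minimal sketch, assuming `uvx` is on the PATH; it uses only the `validate` subcommand and the `--strict` flag shown above. The helper names `build_validate_cmd` and `is_clean` are hypothetical.

```python
import subprocess


def build_validate_cmd(har_path: str, strict: bool = True) -> list[str]:
    """Build the har-capture validate command shown above as an argv list."""
    cmd = ["uvx", "har-capture[cli]", "validate", har_path]
    if strict:
        cmd.append("--strict")
    return cmd


def is_clean(har_path: str) -> bool:
    """Run the validator; True only when it exits with status 0."""
    result = subprocess.run(build_validate_cmd(har_path), check=False)
    return result.returncode == 0


# Printing the command keeps this example free of side effects.
print(build_validate_cmd("metrics/hars/mypage.har"))
```

Passing the command as a list rather than a shell string avoids quoting problems with the `[cli]` extra.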

2. Sanitize — redact PII and produce a clean file

# Basic — writes mypage.sanitized.har alongside the original
uvx "har-capture[cli]" sanitize metrics/hars/mypage.har

# Write to a specific path
uvx "har-capture[cli]" sanitize metrics/hars/mypage.har --output metrics/hars/mypage.clean.har

# Also produce a compressed .har.gz (useful for large captures)
uvx "har-capture[cli]" sanitize metrics/hars/mypage.har --compress

# Write a JSON report of everything that was redacted
uvx "har-capture[cli]" sanitize metrics/hars/mypage.har --report metrics/reports/redaction.json

# Skip the interactive review step (good for scripting)
uvx "har-capture[cli]" sanitize metrics/hars/mypage.har --no-interactive

How redaction works:

By default each sensitive value is replaced with a salted hash. The same value always maps to the same hash within a session, so cross-request correlation is preserved while the actual value is hidden. Pass --no-salt to use static [REDACTED] placeholders instead.
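
The idea of a per-session salted hash can be illustrated in a few lines. This is a sketch of the general technique only, not har-capture's actual implementation: the `Redactor` class, the SHA-256 choice, the 12-character truncation, and the `[REDACTED:...]` label format are all assumptions made for the example.

```python
import hashlib
import secrets
from typing import Optional


class Redactor:
    """Replace sensitive values with short salted hashes (illustrative only)."""

    def __init__(self, salt: Optional[bytes] = None):
        # One random salt per session: hashes are stable within the session
        # but cannot be compared across sessions that use different salts.
        self.salt = salt if salt is not None else secrets.token_bytes(16)

    def redact(self, value: str) -> str:
        digest = hashlib.sha256(self.salt + value.encode("utf-8")).hexdigest()
        return f"[REDACTED:{digest[:12]}]"


session = Redactor()
a = session.redact("sessionid=abc123")
b = session.redact("sessionid=abc123")
c = session.redact("sessionid=xyz789")
# Equal inputs give equal placeholders; different inputs give different ones.
print(a == b, a == c)
```

The stable mapping is what lets a reader see that the same cookie was sent on many requests without learning its value.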

3. Capture directly from a URL (auto-sanitizes)

har-capture get launches a browser (Chromium by default), records the traffic, and sanitizes the output in one step:

# Capture and auto-sanitize (writes <hostname>.har + <hostname>.har.gz)
uvx "har-capture[cli]" get https://example.com

# Save to a specific file
uvx "har-capture[cli]" get https://example.com --output metrics/hars/example.har

# Keep the raw (unsanitized) file alongside the sanitized one
uvx "har-capture[cli]" get https://example.com --keep-raw

# Include images and fonts in the capture (excluded by default)
uvx "har-capture[cli]" get https://example.com --include-images --include-fonts

# Use Firefox instead of the default Chromium
uvx "har-capture[cli]" get https://example.com --browser firefox

# Skip sanitization (not recommended for sharing)
uvx "har-capture[cli]" get https://example.com --no-sanitize

Full workflow example

# 1. Capture from URL into the metrics/hars folder
uv run --with "har-capture[cli]" --python python3 \
  har-capture get https://www.example.com \
    --output metrics/hars/example.har \
    --include-images

# 2. Validate the sanitized file
uvx "har-capture[cli]" validate metrics/hars/example.har --strict

# 3. Analyze
uvx dvm-haranalyzer metrics/hars/example.har --output metrics/reports/

Or, for a HAR captured manually in the browser:

# 1. Sanitize the browser export
uvx "har-capture[cli]" sanitize metrics/hars/raw.har \
    --output metrics/hars/raw.clean.har \
    --report metrics/reports/redaction.json

# 2. Validate the result
uvx "har-capture[cli]" validate metrics/hars/raw.clean.har --strict

# 3. Analyze
uvx dvm-haranalyzer metrics/hars/raw.clean.har --output metrics/reports/
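
The three manual steps above can also be chained from a script that stops at the first failure. This is a minimal sketch, assuming `uvx` is on the PATH; it reuses only the subcommands and flags already shown in this section, and the function names `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess

HAR_CAPTURE = ["uvx", "har-capture[cli]"]


def pipeline_commands(raw_har: str, clean_har: str, report_dir: str) -> list[list[str]]:
    """Return the sanitize -> validate -> analyze commands in order."""
    return [
        HAR_CAPTURE + ["sanitize", raw_har, "--output", clean_har, "--no-interactive"],
        HAR_CAPTURE + ["validate", clean_har, "--strict"],
        ["uvx", "dvm-haranalyzer", clean_har, "--output", report_dir],
    ]


def run_pipeline(raw_har: str, clean_har: str, report_dir: str) -> None:
    """Run each step, stopping at the first non-zero exit status."""
    for cmd in pipeline_commands(raw_har, clean_har, report_dir):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


# Dry run: print the commands instead of executing them.
for cmd in pipeline_commands(
    "metrics/hars/raw.har", "metrics/hars/raw.clean.har", "metrics/reports/"
):
    print(" ".join(cmd))
```

With `check=True`, a failed validation prevents the analysis step from running on a file that still contains sensitive values.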

Output

The tool prints a report to stdout containing:

Section	What it shows
Overview	DOMContentLoaded, onLoad, request count, total transfer size
Bottleneck Summary	Ranked list of CRITICAL / WARNING findings with fix recommendations
Top Slowest Requests	Time, TTFB, SSL, status, KB for the 15 slowest requests
Large Resources	Resources over 50 KB with type and cache headers
Content Type Breakdown	Total KB per MIME type
Top Domains	Request count, KB, and average time per origin
Slow TTFB	Requests with >300ms wait time
Slow TLS	Cold TLS handshakes >100ms
Slow DNS	DNS lookups >50ms
Poorly Cached Resources	Large resources missing Cache-Control
Redirects	All 3xx chains
HTTP Version Breakdown	HTTP/1.1 vs HTTP/2 usage
Concurrency	Peak concurrent requests in the first 5 seconds
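
The Slow TTFB, Slow TLS, and Slow DNS sections map naturally onto the per-request `timings` object defined by HAR 1.2 (`wait`, `ssl`, and `dns`, in milliseconds). The sketch below shows one way such a filter could look, using the default thresholds from the table; it is an illustration under those assumptions, not dvm-haranalyzer's own code, and `slow_phases` is a hypothetical name.

```python
# Default thresholds taken from the table above (ms).
THRESHOLDS_MS = {"wait": 300, "ssl": 100, "dns": 50}


def slow_phases(entries: list[dict], thresholds: dict = THRESHOLDS_MS) -> dict:
    """Group (url, ms) pairs by timing phase for values above the threshold.

    HAR 1.2 stores per-request phases under entry["timings"]; a value of -1
    means the phase did not apply (for example, on a reused connection).
    """
    slow = {phase: [] for phase in thresholds}
    for entry in entries:
        timings = entry.get("timings", {})
        url = entry.get("request", {}).get("url", "")
        for phase, limit in thresholds.items():
            ms = timings.get(phase, -1)
            if ms is not None and ms > limit:
                slow[phase].append((url, ms))
    # Worst offenders first within each phase.
    for hits in slow.values():
        hits.sort(key=lambda hit: hit[1], reverse=True)
    return slow


entries = [
    {"request": {"url": "https://a.example/app.js"},
     "timings": {"wait": 717, "ssl": 120, "dns": 10}},
    {"request": {"url": "https://b.example/x.png"},
     "timings": {"wait": 40, "ssl": -1, "dns": 80}},
]
print(slow_phases(entries))
```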

When --output is given, the report is also written to a timestamped file:

metrics/reports/<stem>_YYYYMMDD_HHMMSS.txt
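
A file name of that shape can be produced with `strftime`. This is a small sketch of the naming scheme only; whether the tool stamps local time or UTC is not stated here, so local time is an assumption, and `report_path` is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def report_path(har_path: str, output_dir: str, now: Optional[datetime] = None) -> Path:
    """Build <output_dir>/<stem>_YYYYMMDD_HHMMSS.txt for a given HAR file."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return Path(output_dir) / f"{Path(har_path).stem}_{stamp}.txt"


# A fixed timestamp makes the example reproducible.
print(report_path("metrics/hars/mypage.har", "metrics/reports/",
                  datetime(2026, 3, 9, 16, 58, 46)).name)
```

Because the timestamp has one-second resolution, two runs on the same HAR within the same second would target the same file.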

All Options

dvm-haranalyzer <har> [options]

Positional:
  har                  Path to the .har file

Options:
  --output, -o DIR     Directory to write the text report
  --large-kb N         Threshold (KB) for "large resource" section (default: 50)
  --ttfb-ms N          Slow TTFB threshold in ms (default: 300)
  --ssl-ms N           Slow TLS threshold in ms (default: 100)
  --dns-ms N           Slow DNS threshold in ms (default: 50)
  --top-n N            Number of slowest requests to list (default: 15)
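
For readers who want to wrap or mirror this interface, the option list corresponds to an `argparse` definition along the following lines. This is a reconstruction from the help text above, not the tool's source; in particular, treating every threshold as an integer is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented dvm-haranalyzer options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="dvm-haranalyzer")
    parser.add_argument("har", help="Path to the .har file")
    parser.add_argument("--output", "-o", metavar="DIR",
                        help="Directory to write the text report")
    parser.add_argument("--large-kb", type=int, default=50, metavar="N")
    parser.add_argument("--ttfb-ms", type=int, default=300, metavar="N")
    parser.add_argument("--ssl-ms", type=int, default=100, metavar="N")
    parser.add_argument("--dns-ms", type=int, default=50, metavar="N")
    parser.add_argument("--top-n", type=int, default=15, metavar="N")
    return parser


args = build_parser().parse_args(["metrics/hars/checkout.har", "--large-kb", "200"])
print(args.large_kb, args.ttfb_ms, args.top_n)
```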

Folder Structure

dvm-haranalyzer/
├── main.py            # main script
├── README.md          # this file
├── pyproject.toml     # package configuration
└── metrics/
    ├── hars/          # drop your .har files here
    └── reports/       # generated reports land here

Examples

# Higher threshold — only flag resources over 200 KB
uvx dvm-haranalyzer metrics/hars/checkout.har --large-kb 200

# Show top 30 slowest requests
uvx dvm-haranalyzer metrics/hars/homepage.har --top-n 30

# Stricter TTFB — flag anything over 100ms
uvx dvm-haranalyzer metrics/hars/api-heavy.har --ttfb-ms 100 --output metrics/reports/

Demo — MSN.com Performance Analysis

This demo walks through the full workflow for capturing and analyzing a page's network performance:

Capture — har-capture get opens a Chromium browser, records all traffic as you interact with the page, and auto-sanitizes the result
Validate — confirm no PII leaked into the HAR before analysis
Sanitize — strip any remaining cookies, tokens, and personal data
Analyze — dvm-haranalyzer surfaces bottlenecks ranked by severity

Step 1 — Capture www.msn.com

har-capture get opens a real Chromium browser window pointed at the target URL. You interact with the page normally (scroll, click, wait for ads to load), then close the browser tab. The tool records all traffic, auto-sanitizes PII, and writes the result to the output path.

# Run this yourself — it opens a Chromium window. Browse the page, then close the tab.
uv run --with "har-capture[cli]" --python python3 \
  har-capture get https://www.msn.com \
    --output metrics/hars/msn_live.har \
    --include-images

Step 2 — Validate the HAR for PII

Before sanitizing, scan the file to see what sensitive data is present. validate exits non-zero if anything is found, making it safe to use in CI.

uv run --with "har-capture[cli]" --python python3 har-capture validate metrics/hars/msn.har 2>&1 | head -40

metrics/hars/msn.har:
  [ERROR] [Entry 0: https://www.msn.com/sv-se (request)]
     Cookie: MSFPC=GUID=17a4a9247e5241f5bee9b29a4f...
     Reason: Sensitive header 'cookie' with non-redacted value
  [ERROR] [Entry 0: https://www.msn.com/sv-se (response)]
     Set-Cookie: _C_ETH=1; domain=.msn.com; path=/; se...
     Reason: Sensitive header 'cookie' with non-redacted value
  [WARN] [Entry 0: https://www.msn.com/sv-se (content)]
     content: 165.85.67.0
     Reason: Potential public IP address
  ...

Validation found cookies, session tokens, and IP addresses in the raw capture — exactly what we need to strip.

Step 3 — Sanitize

The sanitize command redacts sensitive values using salted hashes. The same value maps to the same hash throughout the file, preserving cross-request correlation while hiding the actual data.

uv run --with "har-capture[cli]" --python python3 har-capture sanitize metrics/hars/msn.har \
    --output metrics/hars/msn_clean.har \
    --report metrics/reports/msn_redaction.json \
    --no-interactive 2>&1

Sanitizing metrics/hars/msn.har...

  Auto-redacted    3219
    cookie         3114
    email          8
    field          55
    password       14
    public_ip      16
    serial_number  1
    token          11

  Output           metrics/hars/msn_clean.har
  Report: metrics/reports/msn_redaction.json

3,219 values automatically redacted. The sanitized HAR is safe to share and commit.

Step 4 — Analyze for Performance Bottlenecks

uvx dvm-haranalyzer metrics/hars/msn_clean.har --output metrics/reports/

========================================================================
  HAR ANALYSIS REPORT
========================================================================
  File    : metrics/hars/msn_clean.har
  Page    : https://www.msn.com/sv-se
  Captured: 2026-03-09T16:58:46.636Z

--- Overview ----------------------------------------------------------
  DOMContentLoaded : 4,900 ms
  onLoad           : 11,405 ms
  Requests         : 418
  Transferred      : 4158 KB

--- Bottleneck Summary (ranked by severity) ---------------------------
  [CRITICAL]  1. onLoad = 11.4s (>10s)
              Page takes over 10 seconds to fully load. Users will abandon.

  [CRITICAL]  2. DOMContentLoaded = 4.9s (>4s)
              Render-blocking resources or slow TTFB is delaying first parse.

  [CRITICAL]  3. 418 total requests
              Extremely high request count. Consolidate assets and defer third-party scripts.

  [CRITICAL]  4. JavaScript = 853 KB
              Excessive JS payload. Apply code splitting, tree-shaking, and defer non-critical bundles.

  [CRITICAL]  5. Images = 2558 KB, <20% modern format
              Most images are JPEG/PNG. Convert to WebP or AVIF to save 40–60% image weight.

  [CRITICAL]  6. 88 ad/tracker requests across 22 domains
              Ad/tracker network requests dominate load time. Load them after onLoad or use async facades.

  [WARNING ]  7. 20 requests with TTFB >300ms (worst: 717ms)
              Slow server response on acdn.adnxs.com. Check server-side rendering, CDN, or DB latency.

  [WARNING ]  8. 10 cold TLS handshakes >100ms
              Add <link rel='preconnect'> for top third-party origins to amortize TLS cost.

  [WARNING ]  9. 12 resources (396 KB) with no/short cache
              Add long-lived Cache-Control headers (use content-hash filenames for JS/CSS).

  [WARNING ]  10. 12 requests on HTTP/1.1
              Upgrade origins to HTTP/2 to enable multiplexing and reduce head-of-line blocking.

  [WARNING ]  11. Peak 35 concurrent requests in first 5s
              Browser connection pool is saturated. Defer non-critical requests.
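
The last finding in the report above, peak concurrent requests, can be approximated from each entry's `startedDateTime` and total `time` (both HAR 1.2 fields) with a sweep over start and end events. How dvm-haranalyzer computes it is not documented here, so this is one plausible sketch; measuring the 5-second window from the earliest request start is an assumption, and `peak_concurrency` is a hypothetical name.

```python
from datetime import datetime, timedelta


def parse_iso(ts: str) -> datetime:
    """Parse a HAR startedDateTime such as 2026-03-09T16:58:46.636Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def peak_concurrency(entries: list[dict], window_s: float = 5.0) -> int:
    """Peak number of in-flight requests started within window_s of the first."""
    if not entries:
        return 0
    spans = []
    for entry in entries:
        start = parse_iso(entry["startedDateTime"])
        end = start + timedelta(milliseconds=max(entry.get("time", 0), 0))
        spans.append((start, end))
    cutoff = min(start for start, _ in spans) + timedelta(seconds=window_s)
    # Sweep line: +1 at each start, -1 at each end. Tuples sort ends (-1)
    # before starts (+1) at the same instant, so back-to-back requests
    # are not counted as overlapping.
    events = []
    for start, end in spans:
        if start < cutoff:
            events.append((start, 1))
            events.append((end, -1))
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


sample = [
    {"startedDateTime": "2026-03-09T16:58:46.000Z", "time": 1000},
    {"startedDateTime": "2026-03-09T16:58:46.200Z", "time": 500},
    {"startedDateTime": "2026-03-09T16:58:46.600Z", "time": 100},
    {"startedDateTime": "2026-03-09T16:58:52.000Z", "time": 100},  # outside window
]
print(peak_concurrency(sample))
```

In the sample, the first three requests overlap at 0.6 s, and the fourth starts after the window closes.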

Results Summary

Metric	Value
onLoad	11.4 s — CRITICAL
DOMContentLoaded	4.9 s — CRITICAL
Total requests	418
Transferred	4.16 MB

#	Severity	Finding
1	CRITICAL	onLoad >10s — users will abandon
2	CRITICAL	DOMContentLoaded >4s — render-blocking resources
3	CRITICAL	418 requests — consolidate and defer
4	CRITICAL	853 KB JavaScript — split and tree-shake bundles
5	CRITICAL	2.5 MB images, <20% WebP/AVIF — convert to modern formats
6	CRITICAL	88 ad/tracker requests across 22 domains — defer past onLoad
7	WARNING	20 requests with TTFB >300ms (worst: 717ms on adnxs.com)
8	WARNING	10 cold TLS handshakes >100ms — add `preconnect` hints
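
The first two findings come from page-level timings, which HAR 1.2 records under `pages[].pageTimings` as `onContentLoad` and `onLoad` (milliseconds). The sketch below applies the two CRITICAL limits visible in the report (10 s and 4 s); any WARNING-level limits are not shown in this document, so none are included. `critical_page_timings` is a hypothetical name, not the tool's API.

```python
# CRITICAL limits as they appear in the report above (ms).
CRITICAL_MS = {"onLoad": 10_000, "onContentLoad": 4_000}


def critical_page_timings(har: dict) -> list[str]:
    """Return findings for page timings above the CRITICAL limits."""
    findings = []
    for page in har.get("log", {}).get("pages", []):
        timings = page.get("pageTimings", {})
        for key, limit in CRITICAL_MS.items():
            ms = timings.get(key)
            # HAR 1.2 uses -1 (or omits the field) when a timing is unavailable.
            if ms is not None and ms > limit:
                findings.append(
                    f"[CRITICAL] {key} = {ms / 1000:.1f}s (>{limit // 1000}s)"
                )
    return findings


har = {"log": {"pages": [{"pageTimings": {"onContentLoad": 4900, "onLoad": 11405}}]}}
print("\n".join(critical_page_timings(har)))
```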

Project details

Release history

0.2.0 (this version): Mar 9, 2026

0.1.0: Mar 9, 2026

Download files

Source Distribution

dvm_haranalyzer-0.2.0.tar.gz (11.8 kB)

Uploaded Mar 9, 2026 Source

Built Distribution

dvm_haranalyzer-0.2.0-py3-none-any.whl (12.5 kB)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file dvm_haranalyzer-0.2.0.tar.gz.

File metadata

Download URL: dvm_haranalyzer-0.2.0.tar.gz
Upload date: Mar 9, 2026
Size: 11.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2

File hashes

Hashes for dvm_haranalyzer-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`7373be420c5de903075bcecf0c86f5e798a4f211e92044539df84ab7a9db0022`
MD5	`2e1bd295f8ec1164a04c859eb4a57270`
BLAKE2b-256	`c03de727fb19407c9bfd70bfc13bce3708b7dee9533c41327499268c44092c53`

File details

Details for the file dvm_haranalyzer-0.2.0-py3-none-any.whl.

File metadata

Download URL: dvm_haranalyzer-0.2.0-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 12.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2

File hashes

Hashes for dvm_haranalyzer-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0ca24a81a1323c58783967bb1d3869797b0c9abd10e7590e56339f88975af4e3`
MD5	`10c3af16f214936c52912ac36b430b81`
BLAKE2b-256	`921cfb740ac8e3ee0bba6a74d3d7382b461033ef50020146631979cd0cf56855`
