Crawl a website with a headless browser and generate a draft Content-Security-Policy (CSP).

cspresso

Crawl up to N pages of a site using a headless Chromium (via Playwright), observe what assets are loaded, and emit a draft Content Security Policy (CSP).

This is meant as a starting point. Review and tighten the resulting policy before enforcing it.

Why "draft"?

  • A crawl rarely covers all user flows (auth-only pages, A/B tests, conditional loads, etc.).
  • Inline script/style handling is tricky:
    • If your pages use nonces, you must generate a new nonce per HTML response and insert it both in the CSP header and in the HTML tags.
    • Hashes work only if the inline content is stable byte-for-byte.
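The two inline-content options above can be sketched in a few lines of Python. This is only an illustration of the general CSP mechanics, not code from cspresso; the function names `new_nonce`, `csp_hash` and `render` are hypothetical.

```python
import base64
import hashlib
import secrets


def new_nonce() -> str:
    """Return a fresh, unpredictable nonce. Call this once per HTML response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def csp_hash(inline_content: str) -> str:
    """Return a CSP hash source for the exact bytes of an inline block."""
    digest = hashlib.sha256(inline_content.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


def render(body_script: str) -> tuple[str, str]:
    """Build a policy value and a script tag that share one nonce."""
    nonce = new_nonce()
    header = f"script-src 'self' 'nonce-{nonce}'"
    html = f'<script nonce="{nonce}">{body_script}</script>'
    return header, html
```

Because the hash covers every byte, even a single added space in the inline content produces a different value, which is why hashes only suit content that never changes between responses.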

Requirements

  • Python 3.10+
  • Playwright's Chromium browser binaries (auto-installed by this tool if missing)

Install

If you use my artifacts from the Releases page, you may wish to verify their GPG signatures against my key.

The key can be found at https://mig5.net/static/mig5.asc . Its fingerprint is 00AE817C24A10C2540461A9C1D7CDE0234DB458D.

Poetry

poetry install

pip/pipx

pip install cspresso

AppImage

Download the CSPresso.AppImage from the releases page, make it executable with chmod +x, and run it.

Run

cspresso https://example.com --max-pages 10

The tool will:

  1. attempt to launch Chromium headless
  2. run python -m playwright install chromium if Chromium isn't installed
  3. crawl same-origin links up to the page limit
  4. print the visited URLs and a CSP header

Avoiding an existing enforcing CSP header during analysis

NOTE: If your site already sends a CSP header, the browser enforces it during the crawl, which can block assets and prevent cspresso from seeing everything the page tries to load. Consider adding --bypass-csp to strip the existing CSP. Be aware that if your site is compromised, doing so could put your machine at risk, because the browser may then evaluate malicious JavaScript, CSS, etc. that the policy would otherwise have blocked.

See also the --evaluate option below.

Where Playwright installs browsers

By default, this project installs Playwright browsers into a local folder: ./.pw-browsers. This makes installs deterministic and easy to cache in CI.

You can override with --browsers-path or by setting PLAYWRIGHT_BROWSERS_PATH yourself.

Linux notes

If Chromium fails to start due to missing system libraries, try:

cspresso https://example.com --with-deps

That runs python -m playwright install --with-deps chromium (may require sudo depending on your environment).

Output

Default output is a single CSP header line.

For JSON:

cspresso https://example.com --json
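The JSON form is convenient for post-processing. The sketch below assumes the field names `csp`, `visited` and `notes` that appear in the sample output in the "Evaluate a proposed CSP" section; the exact schema is not documented here, so treat the keys as assumptions. `summarize` is a hypothetical helper.

```python
import json


def summarize(report_json: str) -> dict:
    """Turn cspresso's JSON output into a header line plus a few counts."""
    report = json.loads(report_json)
    return {
        # The "csp" field holds only the policy value, so prepend the name.
        "header": "Content-Security-Policy: " + report["csp"],
        "pages": len(report.get("visited", [])),
        "notes": report.get("notes", []),
    }
```

The report text could come from a file or from the captured standard output of a cspresso run.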

Evaluate a proposed CSP without installing it

You can use cspresso to evaluate a proposed CSP against a site. When you do this, cspresso rewrites the site's HTML responses to inject a Content-Security-Policy-Report-Only header carrying the CSP you supplied to --evaluate. If it detects any violations, it reports them and exits with code 1, which may be useful in CI.

NOTE: It is highly recommended to use --bypass-csp in addition to --evaluate, so that your results are not influenced by any existing CSP's enforcement.

Example:

 poetry run cspresso https://mig5.net --evaluate "default-src 'none'" --bypass-csp --json
{
  "csp": "base-uri 'self'; default-src 'self'; form-action 'self'; frame-ancestors 'self'; object-src 'none'; style-src 'self' 'sha256-4Su6mBWzEIFnH4pAGMOuaeBrstwJN4Z3pq/s1Kn4/KQ=' 'unsafe-hashes'; style-src-attr 'sha256-4Su6mBWzEIFnH4pAGMOuaeBrstwJN4Z3pq/s1Kn4/KQ=' 'unsafe-hashes';",
  "directives": {},
  "evaluated_policy": "default-src 'none'",
  "nonce_detected": false,
  "notes": [
    "Detected inline attribute code (style=\"...\" and/or on*=\"...\"). Hashes for these require 'unsafe-hashes' (and modern browsers may use style-src-attr/script-src-attr)."
  ],
  "violations": [
    {
      "console": true,
      "disposition": "report",
      "documentURI": "https://mig5.net/",
      "text": "Loading the stylesheet 'https://mig5.net/style.css' violates the following Content Security Policy directive: \"default-src 'none'\". Note that 'style-src-elem' was not explicitly set, so 'default-src' is used as a fallback. The policy is report-only, so the violation has been logged but no further action has been taken.",
      "type": "info"
    },
    {
      "console": true,
      "disposition": "report",
      "documentURI": "https://mig5.net/static/mig5.asc",
      "text": "Applying inline style violates the following Content Security Policy directive 'default-src 'none''. Either the 'unsafe-inline' keyword, a hash ('sha256-4Su6mBWzEIFnH4pAGMOuaeBrstwJN4Z3pq/s1Kn4/KQ='), or a nonce ('nonce-...') is required to enable inline execution. Note that hashes do not apply to event handlers, style attributes and javascript: navigations unless the 'unsafe-hashes' keyword is present. Note also that 'style-src' was not explicitly set, so 'default-src' is used as a fallback. The policy is report-only, so the violation has been logged but no further action has been taken.",
      "type": "info"
    }
  ],
  "visited": [
    "https://mig5.net",
    "https://mig5.net/",
    "https://mig5.net/static/mig5.asc"
  ]
}

❯ echo $?
1
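Because violations produce exit code 1, the evaluation can act as a pass/fail gate in a pipeline. The wrapper below is a minimal sketch of that idea; `csp_gate` is a hypothetical helper, and it treats any non-zero exit (including failures unrelated to CSP) as a failed gate.

```python
import subprocess
import sys


def csp_gate(command: list[str]) -> bool:
    """Run an evaluation command; return True only when it exits with 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the report so the CI log shows which violations occurred.
        sys.stderr.write(completed.stdout)
    return completed.returncode == 0


# Example call (requires cspresso on PATH; the policy string is illustrative):
# csp_gate(["cspresso", "https://example.com",
#           "--evaluate", "default-src 'self'", "--bypass-csp", "--json"])
```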

Full usage info

usage: cspresso [-h] [--max-pages MAX_PAGES] [--timeout-ms TIMEOUT_MS] [--settle-ms SETTLE_MS] [--headed] [--no-install] [--with-deps] [--browsers-path BROWSERS_PATH] [--allow-blob] [--unsafe-eval]
                [--upgrade-insecure-requests] [--include-sourcemaps] [--bypass-csp] [--evaluate CSP] [--ignore-non-html] [--json]
                url

Crawl up to N pages (same-origin) with Playwright and generate a draft CSP.

positional arguments:
  url                   Start URL (e.g. https://example.com)

options:
  -h, --help            show this help message and exit
  --max-pages MAX_PAGES
                        Maximum number of pages to visit (default: 10)
  --timeout-ms TIMEOUT_MS
                        Navigation timeout in ms (default: 20000)
  --settle-ms SETTLE_MS
                        Extra time after networkidle to allow hydration/delayed requests (default: 1500)
  --headed              Run with a visible browser window (not headless)
  --no-install          Do not auto-install Chromium if missing
  --with-deps           When installing, include Playwright OS deps (Linux). May require elevated privileges.
  --browsers-path BROWSERS_PATH
                        Directory to install/playwright browsers (default: ./.pw-browsers).
  --allow-blob          Include blob: in common directives (drafty)
  --unsafe-eval         Include 'unsafe-eval' in script-src (not recommended)
  --upgrade-insecure-requests
                        Add upgrade-insecure-requests directive
  --include-sourcemaps  Analyze JS/CSS for sourceMappingURL and add map origins to connect-src
  --bypass-csp          Strip any existing CSP/CSP-Report-Only response headers from HTML documents (useful for discovery or evaluation).
  --evaluate CSP        Inject the provided CSP string as Content-Security-Policy-Report-Only on HTML documents and exit 1 if any Report-Only violations are detected. Quote the value.
  --ignore-non-html     Ignore non-HTML pages that get crawled (which might trigger Chromium's word-wrap hash: https://stackoverflow.com/a/69838710)
  --json                Output JSON instead of a header line
