Skip to main content

Wappalyzer-based tech stack detection library

Project description

Wappalyzer Next

This project is a command line tool and python library that uses the Wappalyzer browser extension and its fingerprints to detect technologies. Other projects that emerged after the discontinuation of the official open-source project are using outdated fingerprints and lack accuracy on dynamic web apps. This project bypasses those limitations by running the extension in Chromium through Playwright.

demo

Installation

After installing the Python package, install Playwright's Chromium browser:

python -m playwright install chromium

In minimal Linux containers, install Chromium's system dependencies as well:

python -m playwright install-deps chromium

Install as a command-line tool

pipx install wappalyzer
pipx run --spec playwright playwright install chromium

Install as a library

To use it as a library, install it with pip inside an isolated container e.g. venv or docker. You may also --break-system-packages to do a 'regular' install but it is not recommended.

pip install wappalyzer
python -m playwright install chromium

Install with docker

Steps
  1. Clone the repository:
git clone https://github.com/s0md3v/wappalyzer-next.git
cd wappalyzer-next
  1. Build and run with Docker Compose:
docker compose build
  1. To scan URLs using the Docker container:
  • Scan a single URL:
docker compose run --rm wappalyzer -i https://example.com
  • Scan multiple URLs from a file:
docker compose run --rm wappalyzer -i urls.txt -w 3 -oJ output.json

For Users

Some common usage examples are given below, refer to list of all options for more information.

  • Scan a single URL: wappalyzer -i https://example.com
  • Scan multiple URLs from a file: wappalyzer -i urls.txt -w 3
  • Set page-load timeout for full scans: wappalyzer -i urls.txt -t 15
  • Scan with authentication: wappalyzer -i https://example.com -c "sessionid=abc123; token=xyz789"
  • Export results to JSON: wappalyzer -i https://example.com -oJ results.json
  • Export JSON to stdout: wappalyzer -i https://example.com -oJ

When an output flag is used without a file, the report is written to stdout. Status lines, banner text, and errors are written to stderr.

Options

Note: For accuracy use 'full' scan type (default). 'fast' and 'balanced' do not use browser emulation.

  • -i: Input URL or file containing URLs (one per line)
  • --scan-type: Scan type (default: 'full')
    • fast: Quick HTTP-based scan (sends 1 request)
    • balanced: HTTP-based scan with more requests
    • full: Complete scan using wappalyzer extension
  • -w, --workers: Number of concurrent workers (default: 5; full scans are capped at 3)
  • -t, --timeout: Maximum seconds to wait for a page load in full scans (default: 30)
  • -oJ [file]: JSON output file path, or stdout when the file is omitted or set to -
  • -oC [file]: CSV output file path, or stdout when the file is omitted or set to -
  • -oH [file]: HTML output file path, or stdout when the file is omitted or set to -
  • -c, --cookie: Cookie header string for authenticated scans

For Developers

The python library is available on pypi as wappalyzer and can be imported with the same name.

Using the Library

Use Wappalyzer when scanning more than one URL. The browser is started once, reused, and closed when the with block exits.

from wappalyzer import Wappalyzer

with Wappalyzer(workers=3, timeout=30) as scanner:
    results = scanner.analyze_many([
        'https://example.com',
        'https://github.com',
        'https://python.org',
    ])

for url, technologies in results.items():
    print(url)
    for name, data in technologies.items():
        version = f" {data['version']}" if data['version'] else ""
        print(f"  {name}{version}")

The same scanner can also scan one URL at a time without reopening Chromium:

from wappalyzer import Wappalyzer

with Wappalyzer(workers=3, timeout=30) as scanner:
    github = scanner.analyze('https://github.com')
    python = scanner.analyze('https://python.org')

For a single URL, analyze() is shorter. It creates its own scanner, runs one scan, and closes it.

from wappalyzer import analyze

results = analyze(
    url='https://example.com',
    scan_type='full',  # 'fast', 'balanced', or 'full'
    cookie='sessionid=abc123',
    timeout=30
)

Do not call the top-level analyze() function in a loop for large jobs. Use Wappalyzer.analyze_many() or Wappalyzer.analyze() on a reused scanner so Chromium and the Wappalyzer extension are not reloaded for every URL.

analyze() Function Parameters

  • url (str): The URL to analyze
  • scan_type (str, optional): Type of scan to perform
    • 'fast': Quick HTTP-based scan
    • 'balanced': HTTP-based scan with more requests
    • 'full': Complete scan including JavaScript execution (default)
  • workers (int, optional): Number of browser workers to create for full scans (default: 1)
  • cookie (str, optional): Cookie header string for authenticated scans
  • timeout (int, optional): Maximum seconds to wait for a page load in full scans (default: 30)

Return Value

Returns a dictionary with the URL as key and detected technologies as value:

{
  "https://github.com": {
    "Amazon S3": {
      "version": "",
      "confidence": 100,
      "categories": ["CDN"],
      "groups": ["Servers"]
    },
    "React Router": {
      "version": "6",
      "confidence": 100,
      "categories": ["JavaScript frameworks"],
      "groups": ["Web development"]
    }
  },
  "https://example.com": {}
}

FAQ

Why Chromium and Playwright?

The full scanner runs the Wappalyzer extension in Chromium through Playwright. Chromium extension support in Playwright is direct and does not require geckodriver or Selenium.

What is the difference between 'fast', 'balanced', and 'full' scan types?

  • fast: Sends a single HTTP request to the URL. Doesn't use the extension.
  • balanced: Sends additional HTTP requests to .js files, /robots.txt and does DNS queries. Doesn't use the extension.
  • full: Uses the official Wappalyzer extension to scan the URL in a headless browser.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wappalyzer-2.0.1.tar.gz (34.7 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

wappalyzer-2.0.1-py3-none-any.whl (34.7 MB view details)

Uploaded Python 3

File details

Details for the file wappalyzer-2.0.1.tar.gz.

File metadata

  • Download URL: wappalyzer-2.0.1.tar.gz
  • Upload date:
  • Size: 34.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wappalyzer-2.0.1.tar.gz
Algorithm Hash digest
SHA256 b4a6e806b73b6b7e7d649095de2f7f254203a80e3c2cf208c69c0f931aabb906
MD5 22bca07035b28158e8afc3b6fd3d40a3
BLAKE2b-256 0bb09ff9f47d76d20383268c866cb4f82ada4e622f7db2002c4b2aa12e3d0cf2

See more details on using hashes here.

Provenance

The following attestation bundles were made for wappalyzer-2.0.1.tar.gz:

Publisher: pypi.yml on s0md3v/wappalyzer-next

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file wappalyzer-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: wappalyzer-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 34.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wappalyzer-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 3e37208ed155c3acfc55ced22cc404fe7136fe624b6bb23261a94b4c8773cc2f
MD5 b0007479f6ebacc727ce45f9b948390d
BLAKE2b-256 8ba7a6bb7235efdcb5c848278be05d782908ac36ac7f79248b8eeb9b3e1095a1

See more details on using hashes here.

Provenance

The following attestation bundles were made for wappalyzer-2.0.1-py3-none-any.whl:

Publisher: pypi.yml on s0md3v/wappalyzer-next

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page