SCuWl, Simple custom wordlist generator.

💀 Scuwl 💀

Simple custom wordlist generator

Scuwl (pronounced "skull") is a Python CLI program that quickly generates a wordlist from a webpage. It was inspired by the program CeWL. Scuwl defaults to a crawling depth of zero, at which most webpages yield a wordlist in less than a second. A crawling depth of one generally takes a few minutes.

Scuwl is fast because it scrapes websites recursively and asynchronously. It keeps its memory footprint small by processing HTML as it goes and accumulating the wordlist in memory as a set. By default, Scuwl keeps unique words of three or more characters and removes all punctuation.
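
The default filtering described above can be sketched in a few lines. This is a minimal illustration, not Scuwl's actual code: the function name `extract_words` and its parameters are hypothetical, and it assumes that "punctuation" means the ASCII characters in `string.punctuation` and that words are split on whitespace. Letter case is left untouched here.

```python
import string


def extract_words(text, min_length=3, keep_punctuation=False):
    """Return the set of unique words in text that meet the length cutoff.

    Hypothetical helper for illustration only.
    """
    if not keep_punctuation:
        # Delete every ASCII punctuation character before splitting.
        text = text.translate(str.maketrans("", "", string.punctuation))
    # A set comprehension drops duplicates as the words are collected.
    return {word for word in text.split() if len(word) >= min_length}


# Accumulate words chunk by chunk into one set, so no chunk has to be
# kept around after it has been processed.
wordlist = set()
for chunk in ["Wait? Write the docs.", "DevOps: write it, ship it!"]:
    wordlist.update(extract_words(chunk))

print(sorted(wordlist))
```

Because the wordlist is a set, a word seen on many pages is stored only once, which is one plausible reason memory use stays low during a crawl.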

Note: Crawling depths greater than one remain untested.

Features

  • Fast recursive asynchronous web requests using aiohttp
  • CLI options give you control over the generated wordlist
  • Simple Python codebase (< 150 lines)
  • Low memory usage (~80MB)
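
The recursive asynchronous crawl listed above can be sketched as follows. This is a hedged illustration rather than Scuwl's implementation: to stay self-contained it replaces real HTTP requests with a hypothetical in-memory `PAGES` table and a stand-in `fetch` coroutine (a real client such as aiohttp would await the network there), and it extracts links and text with simple regular expressions. The names `crawl` and `run` are also assumptions.

```python
import asyncio
import re

# Hypothetical in-memory "site" standing in for real HTTP responses.
PAGES = {
    "/": '<a href="/a">a</a> <a href="/b">b</a> home page',
    "/a": '<a href="/c">c</a> alpha words',
    "/b": "beta words",
    "/c": "gamma words",
}


async def fetch(url):
    """Stand-in for an HTTP GET; a real client would await the network here."""
    await asyncio.sleep(0)
    return PAGES.get(url, "")


async def crawl(url, depth, words, seen):
    """Scrape url, then follow its links concurrently while depth remains."""
    if url in seen:
        return
    seen.add(url)
    html = await fetch(url)
    links = re.findall(r'href="([^"]+)"', html)
    # Strip tags, then keep words of three or more characters.
    text = re.sub(r"<[^>]+>", " ", html)
    words.update(word for word in text.split() if len(word) >= 3)
    if depth > 0:
        # Linked pages are scraped concurrently, one level deeper.
        await asyncio.gather(
            *(crawl(link, depth - 1, words, seen) for link in links)
        )


def run(depth):
    """Crawl from the root page and return (words, visited urls)."""
    words, seen = set(), set()
    asyncio.run(crawl("/", depth, words, seen))
    return words, seen


words, visited = run(1)
print(sorted(words), sorted(visited))
```

With a depth of zero only the starting page is scraped; with a depth of one, every page it links to is scraped as well, which is why the wordlist grows so quickly between the two settings.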

Installation

  python -m pip install scuwl

Usage

$ scuwl -h
usage: scuwl.py [-h] [-d DEPTH] [-H HEADERS] [-m MIN_LENGTH] [-o OUTFILE]
                [-P PROXY] [-p] [-u USER_AGENT] [-v]
                url

💀SCuWl💀, Simple custom wordlist generator.

positional arguments:
  url                   url to scrape

options:
  -h, --help            show this help message and exit
  -d DEPTH, --depth DEPTH
                        depth of search
  -H HEADERS, --headers HEADERS
                        json headers for client
  -m MIN_LENGTH, --min-length MIN_LENGTH
                        minimum length of words to keep
  -o OUTFILE, --outfile OUTFILE
                        outfile for wordlist
  -P PROXY, --proxy PROXY
                        proxy address for client
  -p, --punctuation     keep punctutation
  -u USER_AGENT, --user-agent USER_AGENT
                        user-agent string for client
  -v, --version         show program's version number and exit
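
Quoting a JSON string for `-H` by hand is error-prone, so one option is to build the command from Python. The sketch below assumes that `-H` accepts a JSON object mapping header names to values, which the help text suggests but does not spell out; the header shown is only an example. It uses only flags that appear in the help output above.

```python
import json
import shlex

# Example header; the JSON-object format is an assumption based on the help text.
headers = {"Accept-Language": "en-US"}

command = [
    "scuwl",
    "-H", json.dumps(headers),   # serialize the headers as a JSON string
    "-m", "5",                   # keep words of five or more characters
    "-o", "wordlist.txt",
    "https://github.com/petebuffon/scuwl",
]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The same list can be passed directly to `subprocess.run(command)`, which avoids shell quoting altogether.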

Examples

Generate wordlist and send to stdout

$ scuwl https://github.com/petebuffon/scuwl
topics
out
scuwl
2022
track
...

Generate wordlist and save as wordlist.txt

$ scuwl -o wordlist.txt https://github.com/petebuffon/scuwl
$ wc -l wordlist.txt
122 wordlist.txt

Keep punctuation

$ scuwl -p -o wordlist.txt https://github.com/petebuffon/scuwl
$ head wordlist.txt
customer
wait?
write
devops
user

Use a crawl depth of one (scrapes every page linked from the input webpage)

$ scuwl -d 1 -o wordlist.txt https://github.com/petebuffon/scuwl
$ wc -l wordlist.txt
6326 wordlist.txt

Download files

Source Distribution

scuwl-1.0.tar.gz (5.0 kB)

Uploaded Source

Built Distribution

scuwl-1.0-py3-none-any.whl (5.5 kB)

Uploaded Python 3
