
penterepTools

PTWEBDISCOVER - Web Source Discovery Tool

Installation

pip install ptwebdiscover

Adding to PATH

If you're unable to invoke the script from your terminal, it is most likely not on your PATH. You can fix this by running the following commands, depending on the shell you use:

For Bash Users

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc

For ZSH Users

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.zshrc
source ~/.zshrc
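To sanity-check the result, you can print the user-level scripts directory that the export above prepends to PATH and confirm the command resolves (a quick sketch, assuming `python3` is available; the fallback message is illustrative):

```shell
# Print the user-level scripts directory that the export above prepends to PATH.
# pip places console scripts (such as ptwebdiscover) here for per-user installs.
echo "$(python3 -m site --user-base)/bin"

# Verify the command resolves once the shell has reloaded its config.
command -v ptwebdiscover || echo "ptwebdiscover not found on PATH yet"
```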

Usage examples

ptwebdiscover -u https://www.example.com
ptwebdiscover -u https://www.example.com -ch lowercase,numbers,123abcdEFG*
ptwebdiscover -u https://www.example.com -lx 4
ptwebdiscover -u https://www.example.com -w
ptwebdiscover -u https://www.example.com -w wordlist.txt
ptwebdiscover -u https://www.example.com -w wordlist.txt --begin-with admin
ptwebdiscover -u https://*.example.com -w
ptwebdiscover -u https://www.example.com/exam*.txt
ptwebdiscover -u https://www.example.com -e "" bak old php~ php.bak
ptwebdiscover -u https://www.example.com -E extensions.txt
ptwebdiscover -u https://www.example.com -w -sn "Page Not Found"
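Wordlists passed via -w are plain text files, assuming the common one-candidate-per-line format. A minimal sketch for generating one (the filename wordlist.txt and the entries are illustrative, not shipped with the tool):

```python
# Generate a small wordlist file, one candidate path per line, for use with:
#   ptwebdiscover -u https://www.example.com -w wordlist.txt
candidates = ["admin", "backup", "login", "robots.txt", "uploads"]

with open("wordlist.txt", "w") as f:
    f.write("\n".join(candidates) + "\n")

# Show what was written
print(open("wordlist.txt").read(), end="")
```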

Options

   -u    --url                     <url>           URL to test (a star character can be used as an anchor)
   -ch   --charsets                <charsets>      Specify charset for brute force (example: lowercase,uppercase,numbers,[custom_chars])
                                                   Modify wordlist (lowercase,uppercase,capitalize)
   -src  --source                  <source>        Check for presence of only the specified <source> (e.g. -src robots.txt)
   -lm   --length-min              <length-min>    Minimum length of brute-forced string (default: 1)
   -lx   --length-max              <length-max>    Maximum length of brute-forced string (default: 6 for brute force / 99 for wordlist)
   -w    --wordlist                <filename>      Use specified wordlist(s)
   -pf   --prefix                  <string>        Use prefix before tested string
   -sf   --suffix                  <string>        Use suffix after tested string
   -bw   --begin-with              <string>        Use only words from wordlist that begin with the specified string
   -ci   --case-insensitive                        Treat wordlist items case-insensitively
   -e    --extensions              <extensions>    Add extensions behind a tested string ("" for empty extension)
   -E    --extension-file          <filename>      Add extensions from a default or specified file behind a tested string
   -r    --recurse                                 Recursive browsing of found directories
   -md   --max_depth               <integer>       Maximum depth during recursive browsing (default: 20)
   -b    --backups                                 Find backups for db, all app and every discovered content
   -bo   --backups-only                            Find backup of complete website only
   -P    --parse                                   Parse HTML response for URLs discovery
   -Po   --parse-only                              Disable brute force; only crawl starting from the specified URL
   -D    --directory                               Also append a slash to tested strings
   -nd   --not-directories         <directories>   Exclude listed directories during recursive browsing
   -sy   --string-in-response      <string>        Print findings only if the string appears in the response (GET method is used)
   -sn   --string-not-in-response  <string>        Print findings only if the string does not appear in the response (GET method is used)
   -sc   --status-codes            <status-codes>  Ignore responses with these status codes (default: 404)
   -d    --delay                   <milliseconds>  Delay before each request in milliseconds
   -T    --timeout                 <milliseconds>  Set timeout manually (default: 10000)
   -cl   --content-length          <kilobytes>     Max content length to download and parse (default: 1000KB)
   -m    --method                  <method>        Use the specified HTTP method (default: HEAD)
   -se   --scheme                  <scheme>        Use scheme when missing (default: http)
   -p    --proxy                   <proxy>         Use proxy (e.g. http://127.0.0.1:8080)
   -H    --headers                 <headers>       Use custom headers
   -a    --user-agent              <agent>         Use custom value of User-Agent header
   -c    --cookie                  <cookies>       Use cookie (-c "PHPSESSID=abc; any=123")
   -A    --auth                    <name:pass>     Use HTTP authentication
   -rc   --refuse-cookies                          Do not use cookies set by application
   -t    --threads                 <threads>       Number of threads (default 20)
   -wd   --without-domain                          Output discovered sources without the domain
   -wh   --with-headers                            Output discovered sources with response headers
   -ip   --include-parameters                      Include GET parameters and anchors in output
   -tr   --tree                                    Output as tree
   -o    --output                  <filename>      Output to file
   -S    --save                    <directory>     Save content locally
   -wdc  --without_dns_cache                       Do not use DNS cache (e.g. to honor /etc/hosts records)
   -tg   --target                  <ip or host>    Use this target when * is in domain
   -nr   --not-redirect                            Do not follow redirects
   -s    --silent                                  Do not show statistics in real time
   -C    --cache                                   Cache each request response to a temp file
   -ne   --non-exist                               Check whether non-existent pages return status code 200
   -er   --errors                                  Show all errors
   -v    --version                                 Show script version
   -h    --help                                    Show this help message
   -j    --json                                    Output in JSON format
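The brute-force options interact: -ch fixes the alphabet while -lm and -lx bound the candidate length, so the number of candidates grows as the sum of |charset|^n over the allowed lengths. A quick way to estimate the workload before running (the numbers below are illustrative; this helper is not part of the tool):

```python
def search_space(charset_size: int, min_len: int, max_len: int) -> int:
    """Number of strings of length min_len..max_len over a charset of the given size."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))

# e.g. -ch lowercase (26 characters) with -lm 1 -lx 4, as in the usage examples:
print(search_space(26, 1, 4))
```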

Dependencies

ptlibs
bs4
treelib

License

Copyright (c) 2024 Penterep Security s.r.o.

ptwebdiscover is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

ptwebdiscover is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with ptwebdiscover. If not, see https://www.gnu.org/licenses/.

Warning

You may only run this tool against websites you have been given permission to pentest. We accept no responsibility for any damage or harm this application causes to your computer or network. Penterep is not responsible for any illegal or malicious use of this code. Be ethical!
