
penterepTools

PTWEBDISCOVER

Web Source Discovery Tool

ptwebdiscover is a tool for discovering web sources using brute-force, dictionary, and web-crawling methods. It can also be used for intelligent backup discovery.

Installation

pip install ptwebdiscover

Add to PATH

If you cannot invoke the script in your terminal, it's probably because its installation directory is not in your PATH. Fix it by running the commands below.

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc
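To check whether the user-base bin directory is already on your PATH before editing ~/.bashrc, a quick sketch (this assumes a pip --user style install; only `python3 -m site --user-base` is taken from the commands above):

```shell
# Locate the per-user bin directory that pip installs console scripts into,
# then check whether it already appears in PATH.
USER_BIN="$(python3 -m site --user-base)/bin"
case ":$PATH:" in
  *":$USER_BIN:"*) echo "$USER_BIN is on PATH" ;;
  *)               echo "$USER_BIN is missing from PATH" ;;
esac
```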

Usage examples

ptwebdiscover -u https://www.example.com
ptwebdiscover -u https://www.example.com -ch lowercase,numbers,[123abcdEFG*]
ptwebdiscover -u https://www.example.com -lx 4
ptwebdiscover -u https://www.example.com -w
ptwebdiscover -u https://www.example.com -w wordlist.txt
ptwebdiscover -u https://www.example.com -w wordlist.txt --begin_with admin
ptwebdiscover -u https://*.example.com -w
ptwebdiscover -u https://www.example.com/exam*.txt
ptwebdiscover -u https://www.example.com -e "" bak old php~ php.bak
ptwebdiscover -u https://www.example.com -ef extensions.txt
ptwebdiscover -u https://www.example.com -w -sn "Page Not Found"
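The -ef example above reads extensions from a file. A minimal sketch of such a file, assuming one extension per line (the format is an assumption; the values mirror the -e example, and the scan command itself is shown only as a comment):

```shell
# Build a small backup-extension list for -ef (assumed format:
# one extension per line, matching the values shown with -e above).
cat > extensions.txt <<'EOF'
bak
old
php~
php.bak
EOF

# Not run here -- requires a target you are authorized to test:
# ptwebdiscover -u https://www.example.com -ef extensions.txt
```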

Specials

Use the '*' character in <url> to anchor the tested location
Use a special wordlist with lines in the "location::technology" format to identify the technologies in use
For proxy authorization use -p http://username:password@address:port
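A sketch of the "location::technology" wordlist format mentioned above. The paths and technology names are illustrative, not taken from any shipped wordlist:

```shell
# Hypothetical fingerprint wordlist: the left side of "::" is the
# location to probe, the right side is the technology reported on a hit.
cat > tech-wordlist.txt <<'EOF'
wp-login.php::WordPress
administrator/index.php::Joomla
typo3/index.php::TYPO3
EOF

# Extract just the locations (everything before the first colon):
cut -d: -f1 tech-wordlist.txt
```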

Options

-u    --url                     <url>           URL to test (a star character may be used as an anchor)
-ch   --charsets                <charsets>      Specify charset for brute force (example: lowercase,uppercase,numbers,[custom_chars]), modify wordlist (lowercase,uppercase,capitalize)
-lm   --length-min              <length-min>    Minimal length of brute-force tested string (default: 1)
-lx   --length-max              <length-max>    Maximal length of brute-force tested string (default: 6 for brute force / 99 for wordlist)
-w    --wordlist                <filename>      Use specified wordlist(s)
-pf   --prefix                  <string>        Use prefix before tested string
-sf   --suffix                  <string>        Use suffix after tested string
-bw   --begin-with              <string>        Use only words from wordlist that begin with the specified string
-ci   --case-insensitive                        Treat wordlist items as case-insensitive
-e    --extensions              <extensions>    Add extensions behind a tested string ("" for empty extension)
-E    --extension-file          <filename>      Add extensions from default or specified file behind a tested string.
-r    --recurse                                 Recursive browsing of found directories
-md   --max_depth               <integer>       Maximum depth during recursive browsing (default: 20)
-b    --backups                                 Find backups for db, all app and every discovered content
-bo   --backups-only                            Find backup of complete website only
-P    --parse                                   Parse HTML responses to discover URLs
-Po   --parse-only                              Disable brute force and start crawling at the specified URL
-D    --directory                               Also test strings with a trailing slash
-nd   --not-directories         <directories>   Exclude listed directories during recursive browsing
-sy   --string-in-response      <string>        Print findings only if string in response (GET method is used)
-sn   --string-not-in-response  <string>        Print findings only if string not in response (GET method is used)
-sc   --status-codes            <status codes>  Ignore responses with these status codes (default: 404)
-m    --method                  <method>        Use the specified HTTP method (default: HEAD)
-se   --scheme                  <scheme>        Use scheme when missing (default: http)
-d    --delay                   <milliseconds>  Delay before each request
-p    --proxy                   <proxy>         Use proxy (e.g. http://127.0.0.1:8080)
-T    --timeout                 <milliseconds>  Manually set timeout (default: 10000)
-cl   --content_length          <kilobytes>     Max content length to download and parse (default: 1000KB)
-H    --headers                 <headers>       Use custom headers
-ua   --user-agent              <agent>         Use custom value of User-Agent header
-c    --cookie                  <cookies>       Use cookie (-c "PHPSESSID=abc; any=123")
-rc   --refuse-cookies                          Do not use cookies set by the application
-a    --auth                    <name:pass>     Use HTTP authentication
-t    --threads                 <threads>       Number of threads (default 20)
-j    --json                                    Output in JSON format
-wd   --without-domain                          Output of discovered sources without domain
-wh   --with-headers                            Output of discovered sources with headers
-ip   --include-parameters                      Include GET parameters and anchors to output
-tr   --tree                                    Output as tree
-o    --output                  <filename>      Output to file
-S    --save                    <directory>     Save content locally
-wdc  --without_dns_cache                       Do not use DNS cache (e.g. for /etc/hosts records)
-wa   --without_availability                    Do not use target availability check
-tg   --target                  <ip or host>    Use this target when * is in domain
-nr   --not-redirect                            Do not follow redirects
-er   --errors                                  Show all errors
-s    --silent                                  Do not show statistics in real time
-v    --version                                 Show script version
-h    --help                                    Show this help message

Dependencies

- requests
- treelib
- tldextract
- idna
- filelock
- appdirs
- ptlibs
- ptthreads

DNS caching is handled by the cache file from URLExtract.

Version History

* 0.0.5
    - Updated prints for new ptlibs
* 0.0.4
    - Disabled availability check when a star is in the URL
* 0.0.3
    - Fixed case sensitivity in wordlists
    - Added .7z and .tgz extensions to backups search
* 0.0.1 - 0.0.2
    - Alpha releases

Licence

Copyright (c) 2020 HACKER Consulting s.r.o.

ptwebdiscover is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

ptwebdiscover is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with ptwebdiscover. If not, see https://www.gnu.org/licenses/.

Warning

You are only allowed to run this tool against websites you have been given permission to pentest. We do not accept any responsibility for any damage or harm this application causes to your computer or network. Penterep is not responsible for any illegal or malicious use of this code. Be ethical!


