Web sources information extractor

penterepTools

PTINSEARCHER

ptinsearcher is a tool for extracting information from web sources. This tool allows dumping of HTML comments, e-mail addresses, phone numbers, IP addresses, subdomains, HTML forms, links and metadata of documents.

Installation

pip install ptinsearcher

Add to PATH

If you cannot invoke the script in your terminal, it's probably because it's not in your PATH. Fix it by running the commands below.

echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc
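
Before editing ~/.bashrc, it may be worth confirming that the directory is really missing from PATH. The following is a minimal sketch and not part of ptinsearcher: the helper name `user_bin_on_path` is hypothetical, and it assumes a POSIX layout in which pip places user scripts in `<user-base>/bin`.

```python
import os
import site


def user_bin_on_path(path_env: str, user_base: str) -> bool:
    """Return True if <user_base>/bin is one of the entries in path_env."""
    user_bin = os.path.normpath(os.path.join(user_base, "bin"))
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return user_bin in entries


if __name__ == "__main__":
    # site.getuserbase() reports the same directory as `python3 -m site --user-base`.
    base = site.getuserbase()
    print(user_bin_on_path(os.environ.get("PATH", ""), base))
```

If the script prints False, the export line above should make the ptinsearcher command reachable in new shells.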

Usage examples

ptinsearcher -u https://www.example.com/            # Dump information from URL
ptinsearcher -u https://www.example.com/ -e C       # Extract comments from URL
ptinsearcher -u https://www.example.com/ -e CSE     # Extract comments, subdomains, emails from URL
ptinsearcher -f urlList.txt                         # Load list of sources to grab from file
ptinsearcher -f urlList.txt -gc -e E                # Group findings of all sources
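
When the tool is driven from another script, the same invocations can be assembled programmatically. The sketch below is an illustration only: `build_command` and `run_json` are hypothetical helper names, it uses only the `-u`, `-e`, `-p` and `-j` flags from the Options section, and it assumes that with `-j` the standard output contains nothing but JSON. The structure of that JSON is not described in this README, so the result is returned unparsed beyond `json.loads`.

```python
import json
import subprocess
from typing import List, Optional


def build_command(url: str, extract: str = "A", as_json: bool = False,
                  proxy: Optional[str] = None) -> List[str]:
    """Assemble an argv list from flags listed in the Options section."""
    cmd = ["ptinsearcher", "-u", url, "-e", extract]
    if proxy:
        cmd += ["-p", proxy]
    if as_json:
        cmd.append("-j")
    return cmd


def run_json(url: str, extract: str = "A"):
    """Run ptinsearcher and parse stdout as JSON (assumes JSON-only output)."""
    result = subprocess.run(
        build_command(url, extract, as_json=True),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text; works on Python 3.6
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.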

Options

   -u   --url                 <url>           Test URL
   -f   --file                <file>          Load URL list from file
   -d   --domain              <domain>        Domain - Merge domain with filepath. Use when wordlist contains filepaths (e.g. /index.php)
   -e   --extract             <extract>       Specify data to extract [A, E, S, C, F, I, P, U, Q, X, M, T] (default A)
   -o   --output              <output>        Save output to file
   -op  --output-parts                        Save each extract type to a separate file
   -gp  --group-parameters                    Group URL parameters
   -wp  --without-parameters                  Without URL parameters
   -g   --grouping                            One output table for all sites
   -gc  --grouping-complete                   Merge all results into one group
   -r   --redirect                            Follow redirects (default False)
   -c   --cookie              <cookie=value>  Set cookie(s)
   -H   --headers             <header:value>  Set custom headers
   -p   --proxy               <proxy>         Set proxy (e.g. http://127.0.0.1:8080)
   -ua  --user-agent          <user-agent>    Set User-Agent (default Penterep Tools)
   -j   --json                                Output in JSON format
   -v   --version                             Show script version and exit
   -h   --help                                Show this help message and exit
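
To illustrate what merging a domain with a filepath (`-d`) and dropping URL parameters (`-wp`) could mean in practice, here is a small sketch built on the standard library. It shows one plausible reading of those options and is not the tool's implementation; the function names `merge_domain` and `strip_parameters` are hypothetical.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def merge_domain(domain: str, filepath: str) -> str:
    """Join a base such as https://www.example.com/ with a path such as /index.php."""
    return urljoin(domain, filepath)


def strip_parameters(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(merge_domain("https://www.example.com/", "/index.php"))
    print(strip_parameters("https://www.example.com/a.php?id=1&x=2#top"))
```

With a wordlist of filepaths, each line would be passed through something like `merge_domain` before being fetched.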

Extract arguments

Specify which data to extract from source

A - Grab all (default)
E - Emails
S - Subdomains
C - Comments
F - Forms
I - IP addresses
U - Internal URLs
Q - Internal URLs with parameters
X - External URLs
P - Phone numbers
M - Metadata
T - Metadata-Tags (author, robots, generator)
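
For a feel of what the C, E and T extract types return, the following sketch pulls HTML comments, e-mail addresses and the author/robots/generator meta tags out of a page using only the standard library. It is a simplified illustration, not how ptinsearcher works internally (the tool depends on bs4, per the Dependencies section); the class `PageScanner`, the function `scan` and the e-mail pattern are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser

# Deliberately simple pattern; real-world e-mail matching is more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
META_NAMES = {"author", "robots", "generator"}


class PageScanner(HTMLParser):
    """Collect HTML comments and selected <meta name=...> tags."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.meta_tags = {}

    def handle_comment(self, data):
        self.comments.append(data.strip())

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in META_NAMES:
            self.meta_tags[name] = attr.get("content") or ""


def scan(html: str) -> dict:
    """Return findings keyed by the extract letters C, E and T."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    emails = sorted(set(EMAIL_RE.findall(html)))
    return {"C": scanner.comments, "E": emails, "T": scanner.meta_tags}
```

Calling `scan(page_source)` on the text of a fetched page yields a dictionary whose keys mirror the letters passed to `-e`.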

Dependencies

  • requests
  • bs4
  • pyexiftool
  • tldextract
  • ptlibs

We use ExifTool to extract metadata. Python 3.6+ is required.
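
A quick environment check can save a failed run. The sketch below assumes that the ExifTool executable is named `exiftool` and has to be reachable on PATH, which this README does not state explicitly; `find_exiftool` and `python_ok` are hypothetical helper names.

```python
import shutil
import sys
from typing import Optional


def find_exiftool() -> Optional[str]:
    """Return the full path of the exiftool executable, or None if not found."""
    return shutil.which("exiftool")


def python_ok(version=sys.version_info) -> bool:
    """Check the interpreter against the documented minimum of Python 3.6."""
    return tuple(version[:2]) >= (3, 6)


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("exiftool:", find_exiftool() or "not found on PATH")
```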

Version History

  • 0.0.6 - 0.0.7
    • Fixed spacing when printing forms & internal URLs with parameters
    • Fixed JSON output for internal URLs with parameters
    • Added 'T' to extract parameters - dumps content of Author, Robots and Generator meta tags.
  • 0.0.5
    • Improved stability
    • Updated help message
    • Replaced extract parameter for comment extraction from 'H' to 'C'
    • Fixed grouping
  • 0.0.1 - 0.0.4
    • Alpha releases

License

Copyright (c) 2020 HACKER Consulting s.r.o.

ptinsearcher is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

ptinsearcher is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with ptinsearcher. If not, see https://www.gnu.org/licenses/.

Warning

You are only allowed to run the tool against websites that you have been given permission to pentest. We do not accept any responsibility for any damage or harm that this application causes to your computer or your network. Penterep is not responsible for any illegal or malicious use of this code. Be Ethical!

Download files

Source Distribution

ptinsearcher-0.0.7.tar.gz (4.9 MB view details)

Uploaded Source

Built Distribution

ptinsearcher-0.0.7-py3-none-any.whl (5.6 MB view details)

Uploaded Python 3

File details

Details for the file ptinsearcher-0.0.7.tar.gz.

File metadata

  • Download URL: ptinsearcher-0.0.7.tar.gz
  • Upload date:
  • Size: 4.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.6.4 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.9.9

File hashes

Hashes for ptinsearcher-0.0.7.tar.gz
Algorithm     Hash digest
SHA256        58a42aed3fe43ad280bf34c9d72d0a2e17153e661558e52651713231dc243052
MD5           f0519e5facc3485f649e9d846a949656
BLAKE2b-256   ffd2021cc67b0b230cb617b601dce02868575237afdc89f1341a02c68f0b818c

File details

Details for the file ptinsearcher-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: ptinsearcher-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 5.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.6.4 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.9.9

File hashes

Hashes for ptinsearcher-0.0.7-py3-none-any.whl
Algorithm     Hash digest
SHA256        fd59f58100ef7faa407e21282d0744c9954ad7b56ef0127ac207d1bd0d466fc5
MD5           49bba340bbe9e8dff4a11f03758e05c3
BLAKE2b-256   28e8814e4c7a46b105f6a61a39769a3c004667bd411848f9d5a6b7af2cdd9571
