Web sources information extractor
PTINSEARCHER
ptinsearcher is a tool for extracting information from web sources. It can dump HTML comments, e-mail addresses, phone numbers, IP addresses, subdomains, HTML forms, links and document metadata.
Installation
pip install ptinsearcher
Add to PATH
If you cannot invoke the script in your terminal, it's probably because it's not in your PATH. Fix it by running the commands below.
echo "export PATH=\"`python3 -m site --user-base`/bin:\$PATH\"" >> ~/.bashrc
source ~/.bashrc
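To check whether the step above is needed at all, a minimal Python sketch along these lines can report whether the user-level script directory is already on PATH. It assumes a POSIX layout in which pip places user scripts under `<user-base>/bin`; the helper name `user_bin_on_path` is illustrative and not part of ptinsearcher.

```python
import os
import site


def user_bin_on_path(path_env, user_base):
    """Return True if <user_base>/bin is one of the entries in path_env."""
    user_bin = os.path.normpath(os.path.join(user_base, "bin"))
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return user_bin in entries


if __name__ == "__main__":
    # site.getuserbase() returns the same directory as `python3 -m site --user-base`.
    base = site.getuserbase()
    found = user_bin_on_path(os.environ.get("PATH", ""), base)
    print(f"{os.path.join(base, 'bin')} on PATH: {found}")
```

If the script reports `False`, the two shell commands above should add the directory for future bash sessions.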
Usage examples
ptinsearcher -u https://www.example.com/ # Dump information from URL
ptinsearcher -u https://www.example.com/ -e C # Extract comments from URL
ptinsearcher -u https://www.example.com/ -e CSE # Extract comments, subdomains, emails from URL
ptinsearcher -f urlList.txt # Load list of sources to grab from file
ptinsearcher -f urlList.txt -gc -e E # Group findings of all sources
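For scripted use, the same invocations can be assembled from Python. The sketch below is only a thin wrapper around the command line shown above and uses nothing beyond the documented `-u`, `-e` and `-j` options. The function names `build_command` and `run` are hypothetical, and normalising the extract codes to upper case is an assumption made for the example.

```python
import subprocess

# Extract codes as listed in the tool's help text.
VALID_CODES = set("AESCFIPUQXMT")


def build_command(url, extract="A", json_output=False):
    """Build an argument list for ptinsearcher using documented options only."""
    codes = extract.upper()
    if not codes or set(codes) - VALID_CODES:
        raise ValueError(f"unsupported extract codes: {extract!r}")
    cmd = ["ptinsearcher", "-u", url, "-e", codes]
    if json_output:
        cmd.append("-j")
    return cmd


def run(url, extract="A"):
    """Run the tool and return its raw standard output as text."""
    # The output format is not described here, so it is returned unparsed.
    result = subprocess.run(build_command(url, extract),
                            stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when a URL contains characters such as `&` or `?`.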
Options
-u --url <url> Test URL
-f --file <file> Load URL list from file
-d --domain <domain> Domain - Merge domain with filepath. Use when wordlist contains filepaths (e.g. /index.php)
-e --extract <extract> Specify data to extract [A, E, S, C, F, I, P, U, Q, X, M, T] (default A)
-o --output <output> Save output to file
-op --output-parts Save each extract type to a separate file
-gp --group-parameters Group URL parameters
-wp --without-parameters Without URL parameters
-g --grouping One output table for all sites
-gc --grouping-complete Merge all results into one group
-r --redirect Follow redirects (default False)
-c --cookie <cookie=value> Set cookie(s)
-H --headers <header:value> Set custom headers
-p --proxy <proxy> Set proxy (e.g. http://127.0.0.1:8080)
-ua --user-agent <user-agent> Set User-Agent (default Penterep Tools)
-j --json Output in JSON format
-v --version Show script version and exit
-h --help Show this help message and exit
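The `-d` option is described as merging a domain with file paths from a wordlist. How ptinsearcher does this internally is not shown here; the sketch below illustrates one plausible reading using the standard library's `urljoin`. The function name `merge_domain` and the default `https` scheme are assumptions.

```python
from urllib.parse import urljoin


def merge_domain(domain, paths, scheme="https"):
    """Combine a domain with wordlist paths such as /index.php into full URLs."""
    # Add a scheme if the domain was given without one (assumed default: https).
    base = domain if "://" in domain else f"{scheme}://{domain}"
    base = base.rstrip("/") + "/"
    urls = []
    for raw in paths:
        path = raw.strip()
        if not path:
            continue  # skip blank wordlist lines
        # Strip the leading slash so the path is resolved below the base URL.
        urls.append(urljoin(base, path.lstrip("/")))
    return urls


print(merge_domain("www.example.com", ["/index.php", "admin/login.php\n", ""]))
```

Each line of the wordlist becomes one full URL, and blank lines are skipped.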
Extract arguments
Specify which data to extract from source
A - grab all (default)
E - Emails
S - Subdomains
C - Comments
F - Forms
I - IP addresses
U - Internal URLs
Q - Internal URLs with parameters
X - External URLs
P - Phone numbers
M - Metadata
T - Metadata-Tags (author, robots, generator)
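To make the categories above concrete, here is a rough standard-library sketch of how comments (C), e-mails (E) and the three URL classes (U, Q, X) could be separated. ptinsearcher itself depends on bs4 and tldextract, so its actual logic is likely different; the regular expressions and the same-host test used here are simplifying assumptions, and the function name `extract` is illustrative.

```python
import re
from urllib.parse import urljoin, urlsplit

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract(html, page_url):
    """Return a dict keyed by extract code with sorted, de-duplicated findings."""
    host = urlsplit(page_url).hostname
    found = {"C": set(), "E": set(), "U": set(), "Q": set(), "X": set()}
    found["C"].update(c.strip() for c in COMMENT_RE.findall(html))
    found["E"].update(EMAIL_RE.findall(html))
    for href in HREF_RE.findall(html):
        url = urljoin(page_url, href)  # resolve relative links
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue  # ignore mailto:, javascript:, etc.
        if parts.hostname != host:
            found["X"].add(url)        # external URL
        elif parts.query:
            found["Q"].add(url)        # internal URL with parameters
        else:
            found["U"].add(url)        # internal URL without parameters
    return {code: sorted(values) for code, values in found.items()}
```

In this simplified version a link counts as internal only when its host name matches the page's host exactly, so subdomains are reported as external.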
Dependencies
- requests
- bs4
- pyexiftool
- tldextract
- ptlibs
We use ExifTool to extract metadata. Python 3.6+ is required.
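If metadata extraction fails after installation, a quick environment check along the following lines may help narrow down what is missing. This is a sketch, not part of ptinsearcher: the function name `check_environment` is hypothetical, the import names are assumed from the package list above (pyexiftool is imported as `exiftool`), and the ExifTool program is assumed to be installed under the name `exiftool`.

```python
import importlib.util
import shutil
import sys

# Import names assumed for the listed packages (pyexiftool is imported as "exiftool").
MODULES = ["requests", "bs4", "exiftool", "tldextract", "ptlibs"]


def check_environment():
    """Report which prerequisites appear to be available."""
    report = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in MODULES:
        report[name] = importlib.util.find_spec(name) is not None
    # pyexiftool wraps the external exiftool program, which must be on PATH.
    report["exiftool binary"] = shutil.which("exiftool") is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item:16} {'ok' if ok else 'MISSING'}")
```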
Version History
- 0.0.6
- Fixed spacing when printing forms & internal URLs with parameters
- Fixed JSON output for internal URLs with parameters
- Added 'T' to extract parameters - dumps content of Author, Robots and Generator meta tags.
- 0.0.5
- Improved stability
- Updated help message
- Replaced extract parameter for comment extraction from 'H' to 'C'
- Fixed grouping
- 0.0.1 - 0.0.4
- Alpha releases
License
Copyright (c) 2020 HACKER Consulting s.r.o.
ptinsearcher is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
ptinsearcher is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with ptinsearcher. If not, see https://www.gnu.org/licenses/.
Warning
You are only allowed to run the tool against websites that you have been given permission to pentest. We do not accept any responsibility for any damage or harm that this application causes to your computer or your network. Penterep is not responsible for any illegal or malicious use of this code. Be Ethical!