Fetch Known Urls

Unja

What's Unja?

Unja is a fast, lightweight tool for fetching known URLs from the Wayback Machine, Common Crawl, VirusTotal, UrlScan.io, and AlienVault's OTX. It runs each provider in a separate thread for speed, uses the Wayback resumption key to split a large scan into multiple parts, and applies filters directly on the provider APIs so that only filtered data comes back and your system does less work.
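
The thread-per-provider idea can be sketched as follows. This is a minimal illustration, not Unja's actual code: `fetch_wayback`, `fetch_commoncrawl`, `PROVIDERS`, and `fetch_all` are hypothetical names, and the fetchers return canned data instead of calling any API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for provider clients; real ones would call each API.
def fetch_wayback(domain):
    return [f"http://{domain}/from-wayback"]

def fetch_commoncrawl(domain):
    return [f"http://{domain}/from-commoncrawl"]

PROVIDERS = {"wayback": fetch_wayback, "commoncrawl": fetch_commoncrawl}

def fetch_all(domain, providers=PROVIDERS):
    """Run every provider in its own thread and merge the de-duplicated URLs."""
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        # One submitted task per provider, so slow providers do not block fast ones.
        futures = [pool.submit(fn, domain) for fn in providers.values()]
        urls = set()
        for future in futures:
            urls.update(future.result())
    return sorted(urls)

print(fetch_all("example.com"))
```

Because each provider spends most of its time waiting on the network, threads are a reasonable fit here even under Python's global interpreter lock.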

Why Unja?

  • Supports Wayback/Common-Crawl/Virus-Total/Otx/UrlScan.io
  • Automatically handles rate limits and timeouts
  • Export results: plain text, or detailed JSON output with status, MIME type, and length
  • Multithreading: a separate thread for each provider to fetch data simultaneously
  • Filters: apply filters directly on the provider to avoid unnecessary data
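
As a rough picture of what handling rate limits and timeouts can look like, here is a small retry helper. It is a sketch under assumptions: `RateLimited` and `with_retries` are hypothetical names, and the exponential backoff policy is chosen for illustration; the source does not say which policy Unja uses.

```python
import time

class RateLimited(Exception):
    """Hypothetical error raised when a provider signals a rate limit."""

def with_retries(request, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call request(); on a rate limit or timeout, wait and try again.

    Gives up and re-raises after `retries` failed retries.
    """
    for attempt in range(retries + 1):
        try:
            return request()
        except (RateLimited, TimeoutError):
            if attempt == retries:
                raise
            # Assumed policy: exponential backoff (1s, 2s, 4s, ...).
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the waiting can be replaced in tests.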

Installing Unja

You can install Unja with pip as follows:

pip3 install unja

Or download this repository and run:

python3 setup.py install

Updating Unja

You can update Unja with pip as follows:

pip3 install unja -U

Usage

unja -h

This will display help for the tool.

| Flag | Description | Example |
|------|-------------|---------|
| -d | Domain | unja -d ninjhacks.com |
| -f | File with a list of domains, separated by new lines | unja -f domains.txt |
| --sub | Include subdomains | unja --sub |
| -p | Providers (wayback,commoncrawl,otx,virustotal,urlscan) | unja -p wayback |
| --wbf | Wayback filters (default: statuscode:200 ~mimetype:html) | unja --wbf statuscode:200 |
| --ccf | Common Crawl filters (default: =status:200 ~mime:.*html) | unja --ccf =status:200 |
| --wbl | Wayback results per request (default: 10000) | unja --wbl 1000 |
| --otxl | Otx results per request (default: 500) | unja --otxl 500 |
| -r | Number of retries for the HTTP client (default: 3) | unja -r 3 |
| -v | Enable verbose mode to show errors | unja -v |
| -j | Enable JSON mode for detailed output in JSON format | unja -j |
| -s | Silent mode: don't print the header | unja -s |
| --ucci | Update the CommonCrawl index | unja --ucci |
| --vtkey | Change the VirusTotal API key in the config | unja --vtkey |
| --uskey | Change the UrlScan API key in the config | unja --uskey |
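
The per-request limit (--wbl) works together with the Wayback resumption key: each request returns at most that many results plus a key that tells the provider where to continue. The loop below is a sketch of that pattern only. `fetch_in_parts` and `fake_page` are hypothetical, and the in-memory "provider" stands in for a real paged API call.

```python
def fetch_in_parts(fetch_page, limit=10000):
    """Yield URLs page by page until the provider stops returning a resume key."""
    resume_key = None
    while True:
        urls, resume_key = fetch_page(limit=limit, resume_key=resume_key)
        yield from urls
        if not resume_key:
            break

# Hypothetical in-memory "provider" standing in for a real paged API call.
DATA = [f"http://example.com/{i}" for i in range(5)]

def fake_page(limit, resume_key):
    start = int(resume_key or 0)
    end = start + limit
    # Return a key only while more data remains.
    next_key = str(end) if end < len(DATA) else None
    return DATA[start:end], next_key

print(list(fetch_in_parts(fake_page, limit=2)))
```

Splitting a scan this way keeps each response small, so a timeout costs one page instead of the whole scan.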

Output Methods

text (default): outputs URLs only.

json (-j): outputs url, status, mime, and length in JSON format, which helps when you later filter results based on those fields.
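
Filtering the JSON output afterwards could look like the sketch below. The exact shape of Unja's JSON output is not specified here, so this assumes one JSON object per line with the keys `url`, `status`, `mime`, and `length`; `filter_records` and the sample records are hypothetical.

```python
import json

def filter_records(lines, status=None, mime_contains=None):
    """Yield the url of every record that matches the given status and MIME substring."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Compare as strings so both "200" and 200 are accepted.
        if status is not None and str(record.get("status")) != str(status):
            continue
        if mime_contains and mime_contains not in str(record.get("mime", "")):
            continue
        yield record["url"]

# Assumed record layout, for illustration only.
sample = [
    '{"url": "http://example.com/", "status": "200", "mime": "text/html", "length": "512"}',
    '{"url": "http://example.com/a.png", "status": "200", "mime": "image/png", "length": "2048"}',
    '{"url": "http://example.com/old", "status": "404", "mime": "text/html", "length": "128"}',
]
print(list(filter_records(sample, status=200, mime_contains="html")))
```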

Filters

Filters are applied directly on the providers, so only useful, filtered data is returned.

| Wayback | Commoncrawl | Description |
|---------|-------------|-------------|
| statuscode:200 | =status:200 | Return only URLs whose status code is 200 |
| !statuscode:200 | !=status:200 | Return only URLs whose status code is not 200 |
| mimetype:text/html | mime:text/html | Return only URLs whose response type is text/html |
| !mimetype:text/html | !=mime:text/html | Return only URLs whose response type is not text/html |
| ~mimetype:html | ~mime:.*html | Return all URLs that have the word html in the response type |
| ~original:unja | ~url:.*unja | Return all URLs that have the word unja in the URL |
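
To show what applying a filter on the provider means in practice, the sketch below turns Wayback-style filters from the table into query parameters for the Wayback CDX server, whose `filter` parameter takes `[!]field:regex`. The mapping of the `~` prefix to a `.*word.*` regex is an assumption about how such a translation could work, not a description of Unja's internals; `to_cdx_filter` and `build_query` are hypothetical names.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

def to_cdx_filter(expr):
    """Translate one filter from the table into a CDX `filter` value.

    Assumed mapping: a `~` prefix means "contains" and becomes `.*word.*`;
    a `!` prefix is kept as the CDX negation marker.
    """
    body = expr.lstrip("~!")
    prefix = expr[: len(expr) - len(body)]
    field, _, value = body.partition(":")
    if "~" in prefix:
        value = f".*{value}.*"
    return f"{'!' if '!' in prefix else ''}{field}:{value}"

def build_query(domain, filters, limit=10000):
    """Build a CDX query URL with one `filter` parameter per expression."""
    params = [
        ("url", domain),
        ("matchType", "domain"),  # host plus its subdomains
        ("fl", "original"),       # return only the original URL column
        ("limit", str(limit)),
    ]
    params += [("filter", to_cdx_filter(f)) for f in filters.split()]
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

print(build_query("target.com", "statuscode:200 ~mimetype:html"))
```

Since the provider does the matching, rows that fail the filter are never sent over the network.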

Oneliners

Get only URLs with parameters and status code 200

unja -s -d target.com --sub -p wayback,commoncrawl --wbf 'statuscode:200 ~original:=' --ccf '=status:200 ~url:.*=' | anew | tee output
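
The filter above matches any URL containing `=`, which can include URLs where `=` appears in the path and not in a query string. A local second pass can tighten that. The sketch below is a hypothetical post-processing helper (`urls_with_params` is not part of Unja) that keeps only URLs with real query parameters and drops duplicates that differ only in parameter values.

```python
from urllib.parse import urlsplit, parse_qsl

def urls_with_params(urls):
    """Yield URLs that have query parameters, once per host/path/parameter-name set."""
    seen = set()
    for url in urls:
        url = url.strip()
        parts = urlsplit(url)
        names = tuple(sorted({k for k, _ in parse_qsl(parts.query, keep_blank_values=True)}))
        if not names:
            continue  # no query parameters at all
        key = (parts.netloc, parts.path, names)
        if key not in seen:
            seen.add(key)
            yield url

sample = [
    "http://target.com/a?id=1",
    "http://target.com/a?id=2",   # same path and parameter name: dropped
    "http://target.com/b",        # no parameters: dropped
    "http://target.com/c=d",      # "=" in the path only: dropped
]
print(list(urls_with_params(sample)))
```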

Looking for open redirects

unja -s -d target.com --sub -p wayback,commoncrawl --wbf '~statuscode:30 ~original:=http' --ccf '~status:30 ~url:.*=http' | anew | tee output

Clean results (exclude images, CSS, JavaScript, WOFF, and 404s)

unja -s -d target.com --sub -p wayback,commoncrawl --wbf '!statuscode:404 ~!mimetype:image ~!mimetype:javascript ~!mimetype:css ~!mimetype:woff' --ccf '!=status:404 !~mime:.*image !~mime:.*javascript !~mime:.*css !~mime:.*woff' | anew | tee output

Let me know if you have any other good one-liners.
