
A Python tool to declutter URL lists for crawling/pentesting

Project description

uro

Using a URL list for security testing can be painful as there are a lot of URLs that have uninteresting/duplicate content; uro aims to solve that.

It doesn't make any HTTP requests to the URLs and removes the following (see the example after this list):

  • incremental URLs e.g. /page/1/ and /page/2/
  • blog posts and similar human-written content e.g. /posts/a-brief-history-of-time
  • URLs with the same path but different parameter values e.g. /page.php?id=1 and /page.php?id=2
  • images, JS, CSS and other "useless" files
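
For illustration, given a hypothetical urls.txt such as:

http://example.com/page/1/
http://example.com/page/2/
http://example.com/page.php?id=1
http://example.com/page.php?id=2
http://example.com/posts/a-brief-history-of-time
http://example.com/assets/style.css

running cat urls.txt | uro would print something along the lines of:

http://example.com/page/1/
http://example.com/page.php?id=1

with the incremental, duplicate-parameter, blog-post and static-file URLs filtered out (the exact URLs kept may differ).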


Installation

The recommended way to install uro is as follows:

pipx install uro

Note: If you are using an older version of Python, use pip instead of pipx.

Basic Usage

The quickest way to include uro in your workflow is to feed it URLs through stdin and print the result to your terminal.

cat urls.txt | uro
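
Because uro reads from stdin and writes to stdout, the result can be redirected or piped onward like any other command-line tool; urls.txt and clean.txt are placeholder file names:

cat urls.txt | uro > clean.txt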

Advanced usage

Reading URLs from a file (-i/--input)

uro -i input.txt

Writing URLs to a file (-o/--output)

If the file already exists, uro will not overwrite the contents. Otherwise, it will create a new file.

uro -i input.txt -o output.txt
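
A minimal sketch of that behavior, assuming output.txt does not exist before the first run:

uro -i input.txt -o output.txt   # first run: creates output.txt
uro -i input.txt -o output.txt   # second run: output.txt already exists, so its contents are left untouched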

Whitelist (-w/--whitelist)

uro will ignore all extensions except the ones provided.

uro -w php asp html

Note: Extensionless pages e.g. /books/1 will still be included. To remove them too, use --filter hasext.
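
For example, given a hypothetical input.txt:

http://example.com/login.php
http://example.com/old.asp
http://example.com/logo.png
http://example.com/books/1

uro -i input.txt -w php asp

would be expected to keep login.php, old.asp and the extensionless /books/1 while dropping logo.png; adding -f hasext would drop /books/1 as well.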

Blacklist (-b/--blacklist)

uro will ignore the given extensions.

uro -b jpg png js pdf

Note: uro has a default list of "useless" extensions which it removes; that list is overridden by whatever extensions you provide through the blacklist option. Extensionless pages e.g. /books/1 will still be included. To remove them too, use --filter hasext.
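
For example, because the blacklist replaces the default list, a run like the following (input.txt is a placeholder) would drop jpg, png and pdf files but would now keep js and css files unless they are blacklisted too:

uro -i input.txt -b jpg png pdf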

Filters (-f/--filters)

For granular control, uro supports the following filters:

  1. hasparams: only output URLs that have query parameters e.g. http://example.com/page.php?id=
  2. noparams: only output URLs that have no query parameters e.g. http://example.com/page.php
  3. hasext: only output URLs that have extensions e.g. http://example.com/page.php
  4. noext: only output URLs that have no extensions e.g. http://example.com/page
  5. allexts: don't remove any page based on extension e.g. keep .jpg which would be removed otherwise
  6. keepcontent: keep human-written content e.g. blogs
  7. keepslash: don't remove the trailing slash from URLs e.g. http://example.com/page/
  8. vuln: only output URLs with parameters that are known to be vulnerable

Example: uro --filters hasext hasparams
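
Filters can also be combined with the other options; for instance, to keep only parameterized URLs from a piped list, or to keep every extension while preserving trailing slashes (urls.txt is a placeholder):

cat urls.txt | uro -f hasparams
uro -i urls.txt -f allexts keepslash -o output.txt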



Download files

Download the file for your platform.

Source Distribution

uro-1.0.2.tar.gz (10.4 kB)


File details

Details for the file uro-1.0.2.tar.gz.

File metadata

  • Download URL: uro-1.0.2.tar.gz
  • Upload date:
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.13.1

File hashes

Hashes for uro-1.0.2.tar.gz:

  • SHA256: f94e23f89c14265de5e868f116f07c0c9b510fa191a6e0084fbcfa8c3f431c64
  • MD5: 3251f8170d58196f642da81736895e39
  • BLAKE2b-256: ffaa0c0e01facf06b02fc76b6ee323b420368ca5e06de38489218249ea47173b

