UDdup - URLs Deduplication Tool

The tool takes a list of URLs and removes "duplicate" pages, meaning URLs whose patterns are probably repetitive and point to the same web template.

For example:

https://www.example.com/product/123
https://www.example.com/product/456
https://www.example.com/product/123?is_prod=false
https://www.example.com/product/222?is_debug=true

All of the above probably point to the same product "template", so it should be enough for our various scanners to scan only some of these URLs.

The result of the above after UDdup should be:

https://www.example.com/product/123?is_prod=false
https://www.example.com/product/222?is_debug=true
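
The idea behind the example above can be sketched in a few lines of Python. This is only an illustration, not UDdup's actual implementation: the helper names `path_pattern` and `dedup` are hypothetical, and the rules (numeric path segments count as interchangeable IDs, one URL is kept per distinct set of query parameter names, and parameter-less URLs are dropped when a richer variant exists) are assumptions chosen to reproduce the result shown.

```python
from urllib.parse import urlsplit, parse_qs


def path_pattern(url):
    """Return a grouping key in which numeric path segments are replaced by a placeholder."""
    parts = urlsplit(url)
    segments = ["{id}" if seg.isdigit() else seg for seg in parts.path.split("/")]
    return parts.scheme, parts.netloc, "/".join(segments)


def dedup(urls):
    """Keep one URL per (path pattern, set of query parameter names)."""
    groups = {}
    for url in urls:
        key = path_pattern(url)
        names = frozenset(parse_qs(urlsplit(url).query, keep_blank_values=True))
        # The first URL seen for each parameter-name set wins.
        groups.setdefault(key, {}).setdefault(names, url)

    result = []
    for by_params in groups.values():
        has_params = any(names for names in by_params)
        for names, url in by_params.items():
            # Assumption: skip the bare URL when a variant with parameters exists.
            if has_params and not names:
                continue
            result.append(url)
    return result


urls = [
    "https://www.example.com/product/123",
    "https://www.example.com/product/456",
    "https://www.example.com/product/123?is_prod=false",
    "https://www.example.com/product/222?is_debug=true",
]
for url in dedup(urls):
    print(url)
```

With the four example URLs, this sketch prints the same two URLs listed above. The real tool may apply different or additional rules.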

Why do I need it?

Mostly for a better (automated) reconnaissance process with less noise, for both the tester and the target.

Examples

Take a look at demo.txt, the raw URLs file, and at demo-results.txt, the output it produces.


Installation

# Clone the repository.
git clone https://github.com/rotemreiss/uddup.git

# Install the Python requirements.
cd uddup
pip install -r requirements.txt

Usage

python uddup/main.py -u demo.txt -o ./demo-result.txt

More Usage Options

Short Form   Long Form   Description
-h           --help      Show this help message and exit
-u           --urls      File with a list of urls
-o           --output    Save results to a file
-s           --silent    Print only the result URLs
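
When the tool is one step in a larger automated pipeline, the options above can be combined from a script. The following is a minimal sketch, assuming the command is run from the repository root as in the Usage section and that silent mode writes one result URL per line to standard output (the second point is an assumption, not something stated here). The helper names `build_command` and `run_uddup` are hypothetical.

```python
import subprocess
import sys


def build_command(urls_file, output_file=None, silent=False):
    """Assemble the command line using only the -u, -o and -s options listed above."""
    cmd = [sys.executable, "uddup/main.py", "-u", urls_file]
    if output_file:
        cmd += ["-o", output_file]
    if silent:
        cmd.append("-s")
    return cmd


def run_uddup(urls_file):
    """Run the tool in silent mode and return the non-empty lines it prints."""
    completed = subprocess.run(
        build_command(urls_file, silent=True),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: each output line is one result URL.
    return [line for line in completed.stdout.splitlines() if line.strip()]
```

Calling `run_uddup("demo.txt")` would then return the deduplicated URLs as a Python list, ready to hand to the next scanner.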

Contributing

Feel free to fork the repository and submit pull requests.


Support

Create a new GitHub issue.

Want to say thanks? :) Message me on LinkedIn.


License

Download files

Source Distribution

uddup-0.9.1.tar.gz (4.4 kB)

Uploaded Source

Built Distribution

uddup-0.9.1-py3-none-any.whl (5.4 kB)

Uploaded Python 3
