A tool for downloading the files linked from a user-provided URL whose filenames match a user-specified regex pattern

Project description

DoRePy - (Do)wnload (Re)gex (Py)thon

DoRePy (pronounced like doe-ray-pee) is your go-to script for automating the download of files from a webpage that match a specific regex pattern. Fed up with manually sifting through pages to download files? DoRePy has got your back!

Features

  • Regex Pattern Matching: Use the power of regular expressions to target exactly the files you need.
  • Retry Logic: Network hiccup? No problem. DoRePy retries failed downloads, respecting rate limits like a well-mannered netizen.
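The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not DoRePy's actual code: the function name `retry`, its parameters, and the exponential backoff schedule are all assumptions made for the example. It catches `OSError` because the built-in `ConnectionError` and the exceptions raised by Requests both derive from it.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Hypothetical helper for illustration; DoRePy's real retry logic may differ.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts:
                raise
            # Wait longer after each failure: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With Requests installed, a download could be wrapped as `retry(lambda: requests.get(url, timeout=30))`. Passing `sleep` as a parameter keeps the helper easy to test without real waiting.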

Getting Started

Prerequisites

  • Python 3
  • Requests: pip install requests
  • BeautifulSoup: pip install beautifulsoup4

Installation

Clone this repository or simply download dorepy.py to your local machine:

git clone https://github.com/CillySu/DoRePy/dorepy.git

Usage

Navigate to the directory containing dorepy.py and run:

python dorepy.py [URL] [PATTERN]

Where:

  • [URL] is the URL of the webpage from which you want to download files.
  • [PATTERN] is the regex pattern that matches the names of the files you want to download.

Example:

python dorepy.py "http://example.com" "\.pdf$"

This command downloads every PDF file linked from http://example.com.
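To see how a pattern such as `\.pdf$` selects links, here is a rough sketch of the matching step. It is an illustration under stated assumptions, not DoRePy's source: `LinkCollector` and `matching_links` are hypothetical names, the standard-library `html.parser` stands in for BeautifulSoup so the snippet runs without extra packages, and matching against the last path segment of each link (rather than the full URL) is a guess at what "file name" means here.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def matching_links(html, base_url, pattern):
    """Return absolute URLs of links whose file name matches the pattern."""
    collector = LinkCollector()
    collector.feed(html)
    regex = re.compile(pattern)
    matches = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL.
        absolute = urljoin(base_url, href)
        # Assumption: the file name is the last segment of the URL path.
        filename = posixpath.basename(urlparse(absolute).path)
        if filename and regex.search(filename):
            matches.append(absolute)
    return matches
```

Note that regex matching is case-sensitive by default, so in this sketch `\.pdf$` would skip a link ending in `.PDF`; a pattern like `(?i)\.pdf$` would catch both.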

Contributing

Feel like DoRePy missed a beat? Fork the repo, add your spin, and submit a pull request. All contributions are welcome!

License

Distributed under the MIT License. See LICENSE for more information.

A Note on Responsible Use

Please use DoRePy wisely and respect website terms of service and your local laws as applicable.

Download files

Source Distribution

DoRePy-0.1.1.tar.gz (3.4 kB)

Uploaded Source

Built Distribution

DoRePy-0.1.1-py3-none-any.whl (3.9 kB)

Uploaded Python 3
