Small library to download files with date- and time-based filenames or folder structures, in parallel using wget.

Project description

Recursive wget can be slow and can result in cumbersome local folder structures. This library instead downloads exact filenames based on exact dates or a range of dates. Remote and local filenames and paths are built using the Python strftime and strptime format specification.
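As a minimal sketch of what building names from the strftime and strptime format specification means, the snippet below expands a pattern for one date and parses the date back out of the resulting name. It uses only the Python standard library, not datedown itself; the pattern and the helper names name_for and date_from are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical filename pattern written with strftime directives.
PATTERN = "file_%Y_%m_%d.txt"


def name_for(dt, pattern=PATTERN):
    """Expand the strftime directives in `pattern` for one timestamp."""
    return dt.strftime(pattern)


def date_from(name, pattern=PATTERN):
    """Recover the timestamp encoded in a filename using strptime."""
    return datetime.strptime(name, pattern)


name = name_for(datetime(2000, 1, 1))
print(name)             # file_2000_01_01.txt
print(date_from(name))  # 2000-01-01 00:00:00
```

Because the same pattern drives both directions, a name generated for a date can always be mapped back to that date, as long as the pattern encodes every field that matters.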

The library uses the Python multiprocessing module to start multiple wget instances, which can speed up downloads. At the end of the download process, it verifies that all files were downloaded. Checksums are not supported at the moment.
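The general pattern described here (one wget process per file, spread over a pool of workers, followed by a check for missing files) might look roughly like the sketch below. This is not datedown's actual implementation; the function names, the wget options chosen, and the "file exists and is non-empty" check are all assumptions for illustration.

```python
import os
import subprocess
from multiprocessing import Pool


def wget_command(url, local_path):
    """Build the argument list for one wget call (-q: quiet, -O: output file)."""
    return ["wget", "-q", "-O", local_path, url]


def fetch(job):
    """Worker: download one (url, local_path) pair and return wget's exit code."""
    url, local_path = job
    os.makedirs(os.path.dirname(local_path) or ".", exist_ok=True)
    return subprocess.call(wget_command(url, local_path))


def missing_files(local_paths):
    """Return paths that are absent or empty (wget -O can leave an empty file on failure)."""
    return [p for p in local_paths
            if not (os.path.isfile(p) and os.path.getsize(p) > 0)]


def download_all(jobs, n_proc=4):
    """Run one wget per job across n_proc worker processes, then verify the results."""
    with Pool(n_proc) as pool:
        pool.map(fetch, jobs)
    return missing_files([path for _, path in jobs])
```

Calling download_all with a list of (url, local_path) pairs returns the local paths that are still missing afterwards, so an empty list indicates that every file arrived. It requires wget on the PATH, and on platforms that spawn rather than fork worker processes the call should sit under an `if __name__ == "__main__":` guard.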

Installation

  • Install wget if it is not already on your system.

  • pip install datedown

Usage

The program can be used either as a library called from other Python programs or as a standalone command line program.

Use as a command line program

After installation, the datedown program should be available in your shell. For detailed usage instructions, run datedown -h.

If the exact filename on the server cannot be known in advance, a recursive version of the script is also available under the name datedown_rec.

Example

datedown 2000-01-01 2000-01-02 http://localhost:8888 file_%Y_%m_%d.txt /home/cpa/ --urlsubdirs test_data year_month_subfolders %Y %m

This would download the files for 2000-01-01 and 2000-01-02 to

  • /home/cpa/test_data/year_month_subfolders/2000/01/file_2000_01_01.txt

  • /home/cpa/test_data/year_month_subfolders/2000/01/file_2000_01_02.txt
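The local paths listed above can be reproduced with a short sketch that expands the filename pattern and the subfolder patterns for each day in the range. It does not call datedown; the helpers daily_range and local_path are hypothetical, and the one-day step between dates is an assumption made for this example.

```python
import posixpath
from datetime import datetime, timedelta


def daily_range(start, end):
    """Yield one datetime per day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def local_path(dt, root, subdirs, fname_pattern):
    """Join root, strftime-expanded subfolders, and the filename for one date."""
    parts = [dt.strftime(s) for s in subdirs]
    return posixpath.join(root, *parts, dt.strftime(fname_pattern))


# Values taken from the command line example above.
subdirs = ["test_data", "year_month_subfolders", "%Y", "%m"]
paths = [
    local_path(dt, "/home/cpa/", subdirs, "file_%Y_%m_%d.txt")
    for dt in daily_range(datetime(2000, 1, 1), datetime(2000, 1, 2))
]
for p in paths:
    print(p)
```

Subfolder names without strftime directives, such as test_data, pass through unchanged, while %Y and %m are replaced per date.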

Use as a library

For use as a library, the most important functions are datedown.interface.download_by_dt and datedown.down.download. The first takes functions that produce URLs from Python datetime objects, whereas the second takes lists of URLs and local filenames. Please see the API Documentation for more details about these functions.
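To illustrate the two input styles, the sketch below defines callables that map a datetime to a URL and to a local filename, then derives the equivalent plain lists from them. It deliberately does not call datedown, since the exact signatures of download_by_dt and download should be taken from the API documentation; the server address, the URL layout, and the names url_for and fname_for are assumptions.

```python
from datetime import datetime

BASE_URL = "http://localhost:8888"  # hypothetical server


def url_for(dt):
    """Callable style: build the remote URL for one datetime."""
    return BASE_URL + dt.strftime("/%Y/%m/file_%Y_%m_%d.txt")


def fname_for(dt):
    """Callable style: build the local filename for one datetime."""
    return dt.strftime("file_%Y_%m_%d.txt")


# List style: the same information, precomputed for a fixed set of dates.
dts = [datetime(2000, 1, 1), datetime(2000, 1, 2)]
urls = [url_for(dt) for dt in dts]
fnames = [fname_for(dt) for dt in dts]
```

The callable style keeps the naming rule in one place and works for any date, while the list style is convenient when the URLs come from somewhere other than a date pattern.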

Note

This project has been set up using PyScaffold 3.3. For details and usage information on PyScaffold, see http://pyscaffold.readthedocs.org/.

Download files

Source Distribution

datedown-0.4.tar.gz (20.6 kB)

Uploaded Source

File details

Details for the file datedown-0.4.tar.gz.

File metadata

  • Download URL: datedown-0.4.tar.gz
  • Upload date:
  • Size: 20.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for datedown-0.4.tar.gz
  • SHA256: 601bb1153c9e6413bb3517f4bf5bf0bd4640e4e641206475615c04d1bb55d548
  • MD5: 94aaf73516aae4b2aa2abcdf6e7e1585
  • BLAKE2b-256: 76ff23e9dee538a2646928ae6985d31bf5b29c726538a5a74102d5952076ecfa
