A custom CMCC library to list and download data from the Marine Data Store
Marine Data Store ToolBox
This Python script provides a command-line interface (CLI) for downloading datasets using the copernicusmarine toolbox or boto3.
How to Install it
Create the conda environment:
mamba env create -f environment.yml
mamba activate mdsenv
pip install .
Uninstall
To uninstall it:
mamba activate mdsenv
pip uninstall mds-toolbox
Usage
The script provides several commands for different download operations:
Usage: mds [OPTIONS] COMMAND [ARGS]...
Options:
-h, --help Show this message and exit.
Commands:
etag Get the etag of a give S3 file
file-list Wrapper to copernicus marine toolbox file list
get Wrapper to copernicusmarine get
s3-get Download files with direct access to MDS using S3
s3-list Listing file on MDS using S3
subset Wrapper to copernicusmarine subset
S3 direct access
Because the copernicusmarine tool adds a heavy overhead to S3 requests, two functions have been developed to:
- make very fast S3 requests
- provide thread-safe access to the S3 client
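One common way to give each worker thread its own client is a thread-local cache combined with a thread pool. The sketch below is illustrative only and is not taken from the toolbox source: `get_client`, `download_all`, `factory` and `fetch` are hypothetical names, and the client is whatever object `factory` returns.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One storage slot per thread, so no client object is shared between threads.
_local = threading.local()


def get_client(factory):
    """Return this thread's client, creating it on first use."""
    client = getattr(_local, "client", None)
    if client is None:
        client = factory()
        _local.client = client
    return client


def download_all(keys, factory, fetch, threads=4):
    """Run fetch(client, key) for every key on a pool of worker threads."""
    def worker(key):
        return fetch(get_client(factory), key)

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map keeps the results in the same order as the input keys.
        return list(pool.map(worker, keys))
```

With boto3, `factory` could be something like `lambda: boto3.client("s3")` and `fetch` could call `client.download_file(bucket, key, filename)`; any extra client configuration needed for MDS is left out here.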
s3-get
Usage: mds s3-get [OPTIONS]
Options:
-b, --bucket TEXT Bucket name [required]
-f, --filter TEXT Filter on the online files [required]
-o, --output-directory TEXT Output directory [required]
-p, --product TEXT The product name [required]
-i, --dataset-id TEXT Dataset Id [required]
-g, --dataset-version TEXT Dataset version or tag
-r, --recursive List recursive all s3 files
--threads INTEGER Downloading file using threads
-s, --subdir TEXT Dataset directory on mds (i.e. {year}/{month})
- If present boost the connection
--overwrite Force overwrite of the file
--keep-timestamps After the download, set the correct timestamp
to the file
--sync-time Update the file if it changes on the server
using last update information
--sync-etag Update the file if it changes on the server
using etag information
--help Show this message and exit.
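How `--sync-etag` decides that a file changed is not described here. As a minimal sketch, assuming the server reports a plain MD5 ETag (which S3 does for objects uploaded in a single part), a local file could be compared against it as follows. `local_etag` and `is_stale` are hypothetical helpers, not part of the toolbox.

```python
import hashlib


def local_etag(path, chunk_size=1024 * 1024):
    """MD5 hex digest of a local file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_stale(path, remote_etag):
    """Return True when the local copy does not match the remote ETag."""
    # S3 returns the ETag wrapped in double quotes.
    remote = remote_etag.strip('"')
    if "-" in remote:
        # A multipart ETag ("<hash>-<parts>") is not a plain MD5 of the content.
        raise ValueError("multipart ETag cannot be compared with a plain MD5")
    return local_etag(path) != remote
```

An alternative that avoids hashing is to store the ETag seen at download time and compare it with the current remote value.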
Example
mds s3-get -i cmems_obs-ins_med_phybgcwav_mynrt_na_irr -b mdl-native-03 -g 202311 -p INSITU_MED_PHYBGCWAV_DISCRETE_MYNRT_013_035 -o "/work/antonio/20240320" -s latest/$(date -du +"%Y%m%d") --keep-timestamps --sync-etag -f $(date -du +"%Y%m%d")
Example using threads
mds s3-get --threads 10 -i cmems_obs-ins_med_phybgcwav_mynrt_na_irr -b mdl-native-03 -g 202311 -p INSITU_MED_PHYBGCWAV_DISCRETE_MYNRT_013_035 -o "." -s latest/$(date -du +"%Y%m%d") --keep-timestamps --sync-etag -f $(date -du +"%Y%m%d")
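The same call can be prepared from Python, for example inside a scheduled job. This sketch assumes that `$(date -du +"%Y%m%d")` is meant to produce today's date in UTC as `YYYYMMDD`; `s3_get_args` is a hypothetical helper that only builds the argument list shown in the examples above.

```python
from datetime import datetime, timezone


def s3_get_args(output_dir, day=None, threads=None):
    """Build the argument list for the `mds s3-get` examples above."""
    day = day or datetime.now(timezone.utc)
    stamp = day.strftime("%Y%m%d")
    args = [
        "mds", "s3-get",
        "-i", "cmems_obs-ins_med_phybgcwav_mynrt_na_irr",
        "-b", "mdl-native-03",
        "-g", "202311",
        "-p", "INSITU_MED_PHYBGCWAV_DISCRETE_MYNRT_013_035",
        "-o", output_dir,
        "-s", f"latest/{stamp}",
        "--keep-timestamps", "--sync-etag",
        "-f", stamp,
    ]
    if threads is not None:
        args += ["--threads", str(threads)]
    return args
```

The list can then be passed to `subprocess.run(args, check=True)`, which avoids shell quoting issues altogether.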
s3-list
Usage: mds.py s3-list [OPTIONS]
Options:
-b, --bucket TEXT Filter on the online files [required]
-f, --filter TEXT Filter on the online files [required]
-p, --product TEXT The product name [required]
-i, --dataset-id TEXT Dataset Id
-g, --dataset-version TEXT Dataset version or tag
-s, --subdir TEXT Dataset directory on mds (i.e. {year}/{month}) -
If present boost the connection
-r, --recursive List recursive all s3 files
--help Show this message and exit.
Example
mds s3-list -b mdl-native-01 -p INSITU_GLO_PHYBGCWAV_DISCRETE_MYNRT_013_030 -i cmems_obs-ins_glo_phybgcwav_mynrt_na_irr -g 202311 -s "monthly/BO/202401" -f "*" | tr " " "\n"
Example recursive
mds s3-list -b mdl-native-12 -p MEDSEA_ANALYSISFORECAST_PHY_006_013 -f '*' -r | tr " " "\n"
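The `tr " " "\n"` step in the examples suggests that the listing is printed as space-separated names. Assuming that, and assuming the filter behaves like a shell-style wildcard pattern, the output could be post-processed in Python as sketched below. `split_listing` and `filter_names` are hypothetical helpers.

```python
from fnmatch import fnmatchcase


def split_listing(output):
    """Split whitespace-separated CLI output into one entry per item."""
    return output.split()


def filter_names(names, pattern):
    """Keep the names matching a shell-style pattern such as '*b20250225*'."""
    return [name for name in names if fnmatchcase(name, pattern)]
```

`fnmatchcase` is case-sensitive on every platform, which matches how S3 object keys are compared.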
Wrapper for copernicusmarine
The following functions rely on the copernicusmarine implementation, so the final result depends strictly on the installed version.
Subset
Usage: mds.py subset [OPTIONS]
Options:
-o, --output-directory TEXT Output directory [required]
-f, --output-filename TEXT Output filename [required]
-i, --dataset-id TEXT Dataset Id [required]
-v, --variables TEXT Variables to download. Can be used multiple times
-x, --minimum-longitude FLOAT Minimum longitude for the subset.
-X, --maximum-longitude FLOAT Maximum longitude for the subset.
-y, --minimum-latitude FLOAT Minimum latitude for the subset. Requires a
float within this range: [-90<=x<=90]
-Y, --maximum-latitude FLOAT Maximum latitude for the subset. Requires a
float within this range: [-90<=x<=90]
-z, --minimum-depth FLOAT Minimum depth for the subset. Requires a
float within this range: [x>=0]
-Z, --maximum-depth FLOAT Maximum depth for the subset. Requires a
float within this range: [x>=0]
-t, --start-datetime TEXT Start datetime as:
%Y|%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d
%H:%M:%S|%Y-%m-%dT%H:%M:%S.%fZ
-T, --end-datetime TEXT End datetime as:
%Y|%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d
%H:%M:%S|%Y-%m-%dT%H:%M:%S.%fZ
-r, --dry-run Dry run
-g, --dataset-version TEXT Dataset version or tag
-n, --username TEXT Username
-w, --password TEXT Password
--help Show this message and exit.
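The datetime formats listed for `-t` and `-T` can be checked before calling the command. This is a minimal sketch that tries each listed format in turn; `parse_datetime` is a hypothetical helper, and the real parsing is done by copernicusmarine and may differ.

```python
from datetime import datetime

# Formats copied from the help text above.
FORMATS = (
    "%Y",
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S.%fZ",
)


def parse_datetime(text):
    """Return a datetime for the first listed format that matches text."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported datetime: {text!r}")
```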
Example
mds subset -f output.nc -o . -i cmems_mod_glo_phy-thetao_anfc_0.083deg_P1D-m -x -18.16667 -X 1.0 -y 30.16 -Y 46.0 -z 0.493 -Z 5727.918000000001 -t 2025-01-01 -T 2025-01-01 -v thetao
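The latitude and depth ranges stated in the help text can be validated up front, so that an invalid request fails before any network call. `check_subset_bounds` is a hypothetical helper written for illustration; the minimum/maximum ordering check is an extra assumption that the help text does not state.

```python
def check_subset_bounds(min_lat, max_lat, min_depth, max_depth):
    """Raise ValueError when a value falls outside the ranges in the help text."""
    for name, value in (("minimum-latitude", min_lat), ("maximum-latitude", max_lat)):
        if not -90 <= value <= 90:
            raise ValueError(f"{name} must be within [-90, 90], got {value}")
    for name, value in (("minimum-depth", min_depth), ("maximum-depth", max_depth)):
        if value < 0:
            raise ValueError(f"{name} must be >= 0, got {value}")
    # Extra sanity check: an assumption, not stated in the help text.
    if min_lat > max_lat or min_depth > max_depth:
        raise ValueError("minimum values must not exceed maximum values")


# Values taken from the example above; this call passes silently.
check_subset_bounds(30.16, 46.0, 0.493, 5727.918000000001)
```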
Get
Command:
Usage: mds.py get [OPTIONS]
Options:
-f, --filter TEXT Filter on the online files
-o, --output-directory TEXT Output directory [required]
-i, --dataset-id TEXT Dataset Id [required]
-g, --dataset-version TEXT Dataset version or tag
-s, --service TEXT Force download through one of the available
services using the service name among
['original-files', 'ftp'] or its short name
among ['files', 'ftp'].
-d, --dry-run Dry run
-u, --update If the file not exists, download it, otherwise
update it it changed on mds
-v, --dataset-version TEXT Dry run
-nd, --no-directories TEXT Option to not recreate folder hierarchy in
output directory
--disable-progress-bar TEXT Flag to hide progress bar
-n, --username TEXT Username
-w, --password TEXT Password
--help Show this message and exit.
Example
mds get -f '20250210*_d-CMCC--TEMP-MFSeas9-MEDATL-b20250225_an-sv10.00.nc' -o . -i cmems_mod_med_phy-tem_anfc_4.2km_P1D-m
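The `-u, --update` option downloads a missing file and refreshes one that changed on MDS. How the change is detected is not shown here; the sketch below assumes a comparison of modification times, with the remote time given as a POSIX timestamp. `needs_download` and `keep_timestamp` are hypothetical helpers.

```python
import os


def needs_download(path, remote_mtime):
    """Return True if the file is missing or older than the remote copy."""
    if not os.path.exists(path):
        return True
    return os.path.getmtime(path) < remote_mtime


def keep_timestamp(path, remote_mtime):
    """Set the local access and modification times to the remote time."""
    os.utime(path, (remote_mtime, remote_mtime))
```

Calling `keep_timestamp` after each download is what makes the later time comparison meaningful, since the local time then mirrors the server's.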
File List
To retrieve a list of files, use:
Usage: mds.py file-list [OPTIONS] DATASET_ID MDS_FILTER
Options:
-g, --dataset-version TEXT Dataset version or tag
--help Show this message and exit.
Example
mds file-list cmems_mod_med_phy-cur_anfc_4.2km_PT15M-i '*b20250225*' -g 202411
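The filter contains shell wildcards, so the shell may expand it against local file names before `mds` sees it unless it is quoted. When the command line is generated from Python, `shlex.join` adds the quotes where needed. `file_list_command` is a hypothetical helper used only for illustration.

```python
import shlex


def file_list_command(dataset_id, mds_filter, version=None):
    """Return a shell-safe `mds file-list` command line."""
    args = ["mds", "file-list", dataset_id, mds_filter]
    if version is not None:
        args += ["-g", version]
    # shlex.join quotes any argument containing characters special to the shell.
    return shlex.join(args)


print(file_list_command("cmems_mod_med_phy-cur_anfc_4.2km_PT15M-i", "*b20250225*", "202411"))
```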
Etag
Usage: mds.py etag [OPTIONS]
Options:
-e, --s3_file TEXT Path to a specific s3 file - if present, other
parameters are ignored.
-p, --product TEXT The product name
-d, --dataset_id TEXT The datasetID
-v, --version TEXT Force the selection of a specific dataset version
-s, --subdir TEXT Subdir structure on mds (i.e. {year}/{month})
-f, --mds_filter TEXT Pattern to filter data (no regex)
--help Show this message and exit.
Example
With a specific file:
mds etag -e s3://mdl-native-12/native/MEDSEA_ANALYSISFORECAST_PHY_006_013/cmems_mod_med_phy-cur_anfc_4.2km_PT15M-i_202411/2025/05/20250501_qm-CMCC--RFVL-MFSeas9-MEDATL-b20250513_an-sv10.00.nc
Or:
mds etag -p MEDSEA_ANALYSISFORECAST_PHY_006_013 -d cmems_mod_med_phy-cur_anfc_4.2km_PT15M-i -v 202411 -f '*' -s 2025/05
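The two forms are related: the S3 path in the first example appears to be assembled from the bucket, product, dataset ID, version and subdirectory. The sketch below splits such a URI and rebuilds the prefix. The `native/{product}/{dataset_id}_{version}/{subdir}` layout is inferred from that single example and is an assumption; `split_s3_uri` and `build_s3_prefix` are hypothetical helpers.

```python
from urllib.parse import urlsplit


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parts = urlsplit(uri)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parts.netloc, parts.path.lstrip("/")


def build_s3_prefix(bucket, product, dataset_id, version, subdir=""):
    """Rebuild the prefix, assuming the layout seen in the example above."""
    key = f"native/{product}/{dataset_id}_{version}"
    if subdir:
        key = f"{key}/{subdir.strip('/')}"
    return f"s3://{bucket}/{key}/"
```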
Authors
- Antonio Mariani - antonio.mariani@cmcc.it