Archive (safe)-Removal Rolling Toolbox

arcX - Archive, Rm & Clean toolboX

ArcX is a versatile toolbox designed to streamline data management by automating key tasks. Whether you're handling experiment outputs or maintaining clean directories, ArcX simplifies the process with a range of powerful tools.

  • Archiving: Seamlessly archive Oceanic experiment outputs using a flexible YAML configuration file. ArcX takes care of organizing and storing your data without manual intervention.
  • Safe File Removal: Efficiently remove files that already have a local copy. ArcX ensures that only unnecessary files are deleted, safeguarding important data.
  • Comprehensive Directory Cleaning: Clean multiple directories with a single command. ArcX offers various cleaning options and leverages a YAML configuration file to specify exactly what to delete and how to do it.

Installation

Via pip

pip install arcx

Via conda/mamba

mamba install arcx

Usage

Usage: arcx [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  rolling
  saferm
  update-md5

Rolling

To keep a directory under rolling, use the command:

Usage: cli.py rolling [OPTIONS]

Options:
  -c, --config PATH  [required]
  --dry-run          Dry run mode.
  -d, --debug        Enable debug mode.
  --help             Show this message and exit.

Rolling Configuration File

Here is a template of a rolling configuration file:

- !CleanPath
    path: $PATH_UNDER_ROLLING
    fmt: ????????   # YYYYMMDD
    safe:
        to_keep: X
        reference_paths:
            - <REF1>
            - <REF2>
    conditional:
        to_keep: Y
        expected_files:
            - file.exe
            - tmp.nc
    force:
        to_keep: Z

Let's go through each section.

CleanPath object

The file declares a list of !CleanPath objects, each with two mandatory attributes:

- !CleanPath
    path: $PATH_UNDER_ROLLING
    fmt: ????????   # YYYYMMDD
  • path: the path to keep under rolling
  • fmt: a bash-style pattern describing the format of the files/directories to delete. It can contain the wildcard characters ? and *
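
To make the fmt pattern concrete, here is a minimal sketch of how such a pattern selects entries. It uses Python's standard fnmatch module as a stand-in for shell globbing; whether ArcX uses it internally is not stated, and the helper name and directory entries are hypothetical.

```python
from fnmatch import fnmatchcase

def matches_fmt(name: str, fmt: str) -> bool:
    """Return True if `name` matches the glob-style pattern `fmt`."""
    return fnmatchcase(name, fmt)

# Hypothetical entries found under the rolling path.
entries = ["20240131", "20240201", "latest", "2024013", "notes.txt"]

# "????????" matches any name of exactly eight characters.
selected = [e for e in entries if matches_fmt(e, "????????")]
print(selected)  # ['20240131', '20240201']
```

Note that ? matches any single character, not only digits, so ???????? also matches an eight-character name such as abcdefgh.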

Safe

  safe:
      to_keep: X
      reference_paths:
          - <REF1>
          - <REF2>
  • safe: enables safe rolling, which deletes a file only if an identical local copy already exists
  • to_keep: the number of files/directories to exclude from the rolling
  • reference_paths: a list of paths to search for an existing local copy

Note that safe mode does not remove the directory under rolling.
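
The safe rule can be sketched as follows. This is an illustration under stated assumptions, not ArcX's implementation: it assumes a copy is looked up by file name inside each reference path and that "identical" means equal MD5, computed on the fly. All function names are hypothetical.

```python
import hashlib
from pathlib import Path

def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def has_identical_copy(target: Path, reference_paths: list[Path]) -> bool:
    """True if a reference directory holds a same-named file with equal MD5."""
    target_md5 = md5_of(target)
    for ref in reference_paths:
        candidate = ref / target.name  # assumption: copies keep the same name
        if candidate.is_file() and md5_of(candidate) == target_md5:
            return True
    return False

def safe_remove(target: Path, reference_paths: list[Path], dry_run: bool = True) -> bool:
    """Delete `target` only if an identical copy exists; return True if it qualifies."""
    if not has_identical_copy(target, reference_paths):
        return False
    if not dry_run:
        target.unlink()
    return True
```

With dry_run left at True the function only reports whether the file would be removed, which mirrors the idea of the --dry-run option.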

Conditional

  conditional:
      to_keep: Y
      expected_files:
          - file.exe
          - tmp.nc
  • conditional: removes a directory only if certain conditions are met
  • to_keep: the number of files/directories to exclude from the rolling
  • expected_files: the exact list of files that must be found in the rolling path to trigger the removal. File names can contain the wildcard characters ? and *
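
One possible reading of the expected_files check is sketched below. It assumes that "exact list" means every file found must match one of the patterns and every pattern must be matched by at least one file; ArcX may apply a different rule. The function name and the tmp*.nc pattern are hypothetical.

```python
from fnmatch import fnmatchcase

def meets_condition(found_files: list[str], expected_files: list[str]) -> bool:
    """True if the found files and the expected patterns cover each other."""
    # Every file present must match at least one expected pattern.
    every_file_expected = all(
        any(fnmatchcase(name, pat) for pat in expected_files)
        for name in found_files
    )
    # Every expected pattern must be matched by at least one file.
    every_pattern_found = all(
        any(fnmatchcase(name, pat) for name in found_files)
        for pat in expected_files
    )
    return every_file_expected and every_pattern_found

expected = ["file.exe", "tmp*.nc"]
print(meets_condition(["file.exe", "tmp_001.nc"], expected))               # True
print(meets_condition(["file.exe", "tmp_001.nc", "result.nc"], expected))  # False
print(meets_condition(["file.exe"], expected))                             # False
```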

Force

    force: # optional
        to_keep: Z
  • force: removes paths without any check
  • to_keep: the number of files/directories to exclude from the rolling
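
The to_keep attribute appears in all three sections. The sketch below shows one way it could split matching entries into those kept and those eligible for removal. It assumes that the newest entries are the ones kept and that names in YYYYMMDD form sort chronologically; the description above does not confirm either point, and the function name is hypothetical.

```python
def split_rolling(entries: list[str], to_keep: int) -> tuple[list[str], list[str]]:
    """Return (kept, candidates), assuming the `to_keep` newest entries are kept."""
    ordered = sorted(entries)  # YYYYMMDD names sort in date order
    if to_keep <= 0:
        return [], ordered
    return ordered[-to_keep:], ordered[:-to_keep]

kept, candidates = split_rolling(
    ["20240103", "20240101", "20240102", "20240104"], to_keep=2
)
print(kept)        # ['20240103', '20240104']
print(candidates)  # ['20240101', '20240102']
```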

Update md5 dir

Safe cleaning requires computing the md5 of all files in the rolling path in advance, using the command:

update-md5 [OPTIONS]

Options:
  -p, --path PATH  [required]
  -d, --debug      Enable debug mode.
  --help           Show this message and exit.

The command creates a file called .dir_md5.txt in the directory, with the following structure:

md5hash filename1
md5hash filename2
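
For illustration, a file with this structure could be produced as sketched below. This is not the update-md5 implementation: it assumes a single space as separator, a non-recursive scan, and alphabetical ordering, none of which is specified above. The function name is hypothetical.

```python
import hashlib
from pathlib import Path

def write_dir_md5(directory: Path, index_name: str = ".dir_md5.txt") -> Path:
    """Write one '<md5> <filename>' line per regular file in `directory`."""
    index_path = directory / index_name
    lines = []
    for entry in sorted(directory.iterdir()):
        # Skip subdirectories and the index file itself.
        if entry.is_file() and entry.name != index_name:
            digest = hashlib.md5(entry.read_bytes()).hexdigest()
            lines.append(f"{digest} {entry.name}\n")
    index_path.write_text("".join(lines))
    return index_path
```

Reading each file fully into memory keeps the sketch short; for large experiment outputs a chunked read would be the safer choice.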

Safe Clean

The saferm command is the equivalent of the safe section of a rolling configuration: it removes files from the clean path only if an identical copy exists in the keep path.

Usage: cli.py saferm [OPTIONS]

Options:
  --keep PATH    [required]
  --clean PATH   [required]
  -f, --force    Delete files without confirm request
  -d, --dry-run  Disable file removal
  -d, --debug    Enable debug mode.
  --help         Show this message and exit.

Download files

Source Distribution: arcx-0.2.1.tar.gz (15.2 kB)

Built Distribution: arcx-0.2.1-py3-none-any.whl (18.2 kB, Python 3)
