Tools for using compute.rhg.com and compute.impactlab.org
Project description
Free software: MIT license
Documentation: https://rhg-compute-tools.readthedocs.io.
Installation
pip:
pip install rhg_compute_tools
Features
Kubernetes tools
Easily spin up a preconfigured cluster with get_cluster(), or a differently sized variant with get_micro_cluster(), get_standard_cluster(), get_big_cluster(), or get_giant_cluster().
>>> import rhg_compute_tools.kubernetes as rhgk
>>> cluster, client = rhgk.get_cluster()
Google cloud storage utilities
Utilities for managing Google Cloud Storage directories in parallel, from the command line or via a Python API.
>>> import rhg_compute_tools.gcs as gcs
>>> gcs.sync_gcs('my_data_dir', 'gs://my-bucket/my_data_dir')
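To give a sense of what a parallel directory sync involves, here is a minimal sketch that only builds a gsutil command line. It assumes sync_gcs delegates to `gsutil -m rsync -r`, which the changelog hints at but does not confirm; build_gsutil_rsync_command is a hypothetical helper, not part of rhg_compute_tools.

```python
def build_gsutil_rsync_command(src, dst, recursive=True, parallel=True):
    """Return the argument list for a gsutil rsync call.

    Hypothetical helper for illustration only; it is not part of
    rhg_compute_tools and nothing is executed here.
    """
    cmd = ["gsutil"]
    if parallel:
        cmd.append("-m")  # top-level gsutil flag: run operations in parallel
    cmd.append("rsync")
    if recursive:
        cmd.append("-r")  # recurse into subdirectories
    cmd.extend([src, dst])
    return cmd


cmd = build_gsutil_rsync_command("my_data_dir", "gs://my-bucket/my_data_dir")
print(" ".join(cmd))
```

Returning a list rather than a single string means the command can be handed to subprocess without shell quoting.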
History
v1.2.1
Bug fixes: raise an error when gsutil exits with a nonzero status in rhg_compute_tools.gcs.cp (PR #105)
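The v1.2.1 fix above follows a common pattern: check a subprocess's exit status and raise instead of failing silently. The sketch below shows that pattern under stated assumptions. run_checked is a hypothetical name, and a small Python subprocess stands in for gsutil so the example runs anywhere; it is not the library's actual implementation.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run cmd and return its stdout; raise if the exit status is nonzero.

    Hypothetical helper illustrating the pattern, not the code in
    rhg_compute_tools.gcs.cp.
    """
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    if proc.returncode != 0:
        raise RuntimeError(
            "command {!r} failed with status {}: {}".format(
                cmd, proc.returncode, proc.stderr.strip()
            )
        )
    return proc.stdout


# Stand-in for a failing gsutil call: a child process that exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
try:
    run_checked(failing)
except RuntimeError as err:
    print("caught:", err)
```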
v1.2
New features: add Google Storage directory marker utilities and the rctools gcs mkdirs command line app
v1.1.4
Add dask_kwargs to the rhg_compute_tools.xarray functions
v1.1.3
Add retry_with_timeout to rhg_compute_tools.utils.py
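The signature of retry_with_timeout is not shown in this changelog, so the following is only a sketch of the general retry-until-deadline pattern. The decorator name retry_until_deadline and its timeout, wait, and exceptions parameters are all assumptions made for illustration.

```python
import functools
import time


def retry_until_deadline(timeout=5.0, wait=0.1, exceptions=(Exception,)):
    """Retry the wrapped function until it succeeds or the deadline passes.

    Hypothetical decorator; the real retry_with_timeout may differ.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            deadline = time.monotonic() + timeout
            while True:
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Give up (re-raise) if another wait would pass the deadline.
                    if time.monotonic() + wait > deadline:
                        raise
                    time.sleep(wait)
        return wrapper
    return decorator


attempts = {"count": 0}


@retry_until_deadline(timeout=2.0, wait=0.01)
def flaky():
    """Fail twice with a transient error, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "done"


result = flaky()
print(result, attempts["count"])
```

Using time.monotonic() rather than time.time() keeps the deadline unaffected by system clock adjustments.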
v1.1.2
Drop matplotlib.font_manager._rebuild() call in design.__init__ - no longer supported
v1.1.1
Refactor datasets_from_delayed to speed up
v1.1
Add gcs.ls function
v1.0.1
Fix tag kwarg in get_cluster
v1.0.0
Make the gsutil API consistent, so that we have cp, sync and rm, each of which accept the same args and kwargs
Swap bumpversion for setuptools_scm to handle versioning
Cast coordinates to dict before gathering in rhg_compute_tools.xarray.dataarrays_from_delayed and rhg_compute_tools.xarray.datasets_from_delayed. This avoids a mysterious memory explosion on the local machine. Also add name in the metadata used by those functions so that the name of each dataarray or Variable is preserved.
Use dask-gateway when available when creating a cluster in rhg_compute_tools.kubernetes. Add some tests using a local gateway cluster. TODO: More tests.
Add tag kwarg to rhg_compute_tools.kubernetes.get_cluster function (PR #87)
v0.2.2
?
v0.2.1
Add remote scheduler deployment (part of dask_kubernetes 0.10)
Remove extraneous GCSFUSE_TOKENS env var no longer used in new worker images
Set library thread limits based on how many cpus are available for a single dask thread
Change formatting of the extra env_items passed to get_cluster to be a list rather than a list of dict-like name/value pairs
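The thread-limit item in v0.2.1 can be illustrated with a small sketch. It assumes the limits are applied through the usual OMP_NUM_THREADS, MKL_NUM_THREADS, and OPENBLAS_NUM_THREADS environment variables and that the per-thread share is the integer quotient of CPUs by dask threads. The changelog states neither, and thread_limit_env is a hypothetical helper.

```python
# Environment variables commonly honored by OpenMP, MKL, and OpenBLAS.
THREAD_LIMIT_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def thread_limit_env(cpus, dask_threads):
    """Return env var settings that cap library threads per dask thread.

    Hypothetical helper: each dask thread gets an equal integer share of
    the worker's CPUs, with a floor of one thread.
    """
    per_thread = max(1, int(cpus) // int(dask_threads))
    return {name: str(per_thread) for name in THREAD_LIMIT_VARS}


# A worker with 8 CPUs running 2 dask threads.
env = thread_limit_env(cpus=8, dask_threads=2)
print(env)
```

Capping these limits keeps numerical libraries from oversubscribing the CPUs when several dask threads run on one worker.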
v0.2.0
Add CLI tools. Run rctools gcs repdirstruc --help to get started
Add new function rhg_compute_tools.gcs.replicate_directory_structure_on_gcs to copy directory trees into GCS. Users can authenticate with cred_file or with default google credentials
Fixes to docstrings and metadata
Add new function rhg_compute_tools.gcs.rm to remove files/directories on GCS using the google.cloud.storage API
Store one additional environment variable when passing cred_path to rhg_compute_tools.kubernetes.get_cluster so that the google.cloud.storage API will be authenticated in addition to gsutil
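As a rough picture of what copying a directory tree into GCS involves, the sketch below walks a local tree and lists the directory-marker object paths that would need to exist under a gs:// prefix. It performs no uploads. The directory_marker_paths helper and the trailing-slash marker convention are assumptions for illustration, not the behavior of replicate_directory_structure_on_gcs.

```python
import os
import tempfile


def directory_marker_paths(local_root, gcs_prefix):
    """List one trailing-slash marker path per directory under local_root.

    Hypothetical helper; it only computes names and never contacts GCS.
    """
    local_root = os.path.abspath(local_root)
    base = os.path.basename(local_root)
    markers = []
    for dirpath, _dirnames, _filenames in os.walk(local_root):
        rel = os.path.relpath(dirpath, local_root)
        parts = [base] if rel == "." else [base] + rel.split(os.sep)
        markers.append(gcs_prefix.rstrip("/") + "/" + "/".join(parts) + "/")
    return sorted(markers)


# Build a small throwaway tree and show the marker paths it maps to.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "my_data_dir")
    os.makedirs(os.path.join(root, "a", "b"))
    os.makedirs(os.path.join(root, "c"))
    demo_markers = directory_marker_paths(root, "gs://my-bucket")

for path in demo_markers:
    print(path)
```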
v0.1.8
Deployment fixes
v0.1.7
Design tools: use RHG & CIL colors & styles
Plotting helpers: generate cmaps with consistent colors & norms, and apply a colorbar to geopandas plots with nonlinear norms
Autoscaling fix for kubecluster: switch to dask_kubernetes.KubeCluster to allow use of recent bug fixes
v0.1.6
Add rhg_compute_tools.gcs.cp_gcs and rhg_compute_tools.gcs.sync_gcs utilities
v0.1.5
need to figure out how to use this rever thing
v0.1.4
Bug fix again in rhg_compute_tools.kubernetes.get_worker
v0.1.3
Bug fix in rhg_compute_tools.kubernetes.get_worker
v0.1.2
Add xarray from delayed methods in rhg_compute_tools.xarray
rhg_compute_tools.gcs.cp_to_gcs now calls gsutil in a subprocess instead of google.storage operations. This dramatically improves performance when transferring large numbers of small files
Additional cluster creation helpers
v0.1.1
New google compute helpers (see rhg_compute_tools.gcs.cp_to_gcs, rhg_compute_tools.gcs.get_bucket)
New cluster creation helper (see rhg_compute_tools.kubernetes.get_worker)
Dask client.map helpers (see rhg_compute_tools.utils submodule)
v0.1.0
First release on PyPI.
File details
Details for the file rhg_compute_tools-1.2.1.tar.gz.
File metadata
- Download URL: rhg_compute_tools-1.2.1.tar.gz
- Upload date:
- Size: 43.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.2 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 720793b3ba0f8cc907447198ed1b521f0ba2d8b06ca29fd4fcea110f3cddb62e
MD5 | 3b665c44b0281cb02e09a0139223e38c
BLAKE2b-256 | 04ecc5ac6d145d55d55013014ba89f229be7c7822897bab3a920e7d609a211bf
File details
Details for the file rhg_compute_tools-1.2.1-py2.py3-none-any.whl.
File metadata
- Download URL: rhg_compute_tools-1.2.1-py2.py3-none-any.whl
- Upload date:
- Size: 29.1 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.63.0 importlib-metadata/4.11.2 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | f1132ae4d10f892ca4542cbac0277f1ebd83bae30d2c731293713a6fbab72891
MD5 | 7a085eca26c121bfffbd177102c63a7b
BLAKE2b-256 | ecd0354240fbc72c22a98f7137f9655c3a6351a85411582740436bca5e183058