Windowed multiprocessing wrapper for rasterio
Parallel processing wrapper for rasterio
Install
From pypi:
pip install rio-mucho --pre
From github (usually for a branch / dev):
pip install git+ssh://git@github.com/mapbox/rio-mucho.git@<branch>
Development:
git clone git@github.com:mapbox/rio-mucho.git
cd rio-mucho
pip install -e .
Usage
with riomucho.RioMucho([{inputs}], {output}, {run function}, windows={windows}, global_args={global arguments}, kwargs={kwargs to write}) as rios:
    rios.run({processes})
Arguments
inputs
A list of file paths to open and read.
output
The path of the file to write to.
run_function
A function to be applied to each window chunk. It should accept the following input arguments:
- A data input, which can be one of:
  - A list of numpy arrays of shape (x,y,z), one for each file specified in the input file list. mode="simple_read" [default]
  - A numpy array of shape ({n input files x n band count}, {window rows}, {window cols}). mode="array_read"
  - A list of open sources for reading. mode="manual_read"
- A rasterio window tuple
- A rasterio window index (ij)
- A global arguments object that you can use to pass in global arguments
This should return:
- An output array of ({count}, {window rows}, {window cols}) shape, and of the correct data type for writing
def basic_run({data}, {window}, {ij}, {global args}):
    ## do something
    return {out}
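As a hedged illustration of the contract above, the following sketch shows a run function for the default mode="simple_read", assuming each entry in the data list is a (bands, rows, cols) array, one per input file. The function name `mean_of_inputs` and the `scale` key in the global arguments are made up for this example. Only NumPy is used, so it can be tried without rasterio or rio-mucho installed.

```python
import numpy as np


def mean_of_inputs(data, window, ij, g_args):
    """Hypothetical run function: average the first band of every input.

    data   -- list of (bands, rows, cols) arrays, one per input file
    window -- the window being processed (unused here)
    ij     -- the window index (unused here)
    g_args -- dict of global arguments; 'scale' is an assumed key
    """
    # Stack the first band of each input into (n_inputs, rows, cols).
    first_bands = np.array([d[0] for d in data], dtype=np.float32)
    # Collapse the inputs, keeping a leading band axis of length 1 so the
    # result has the ({count}, {window rows}, {window cols}) shape.
    out = first_bands.mean(axis=0, keepdims=True) * g_args.get("scale", 1.0)
    return out.astype(np.float32)


# Stand-in data for two single-band inputs and one 2x3 window.
fake_data = [np.full((1, 2, 3), 2.0), np.full((1, 2, 3), 4.0)]
result = mean_of_inputs(fake_data, ((0, 2), (0, 3)), (0, 0), {"scale": 2.0})
print(result.shape)
```

Casting the result explicitly is one way to keep the returned array in a data type that matches what the output file expects.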
Keyword arguments
windows={windows}
A list of rasterio (window, ij) tuples to operate on. [Default = src[0].block_windows()]
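When custom windows are needed, the list can be built by hand. The sketch below is a pure-Python illustration, assuming the tuple form of a rasterio window, ((row_start, row_stop), (col_start, col_stop)). The helper name `make_windows` and the fixed block size are hypothetical and not part of rio-mucho.

```python
def make_windows(height, width, block_rows, block_cols):
    """Hypothetical helper: tile a raster into (window, ij) pairs.

    Each window is ((row_start, row_stop), (col_start, col_stop)) and
    ij is the (block row, block column) index of that window.
    """
    windows = []
    for i, row_start in enumerate(range(0, height, block_rows)):
        for j, col_start in enumerate(range(0, width, block_cols)):
            # Clip the last block in each direction to the raster edge.
            row_stop = min(row_start + block_rows, height)
            col_stop = min(col_start + block_cols, width)
            windows.append((((row_start, row_stop), (col_start, col_stop)), (i, j)))
    return windows


# A 5x4 raster cut into 2x2 blocks gives a 3x2 grid of windows.
windows = make_windows(5, 4, 2, 2)
print(len(windows))
print(windows[-1])
```

A list built this way could be passed as windows={windows} in place of the default block windows, provided the windows fit the output raster.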
global_args={global arguments}
Because the work runs in parallel, any other objects / values that the run_function needs access to must be passed in here. [Default = {}]
global_args = { 'divide_value': 2 }
kwargs={keyword args}
The kwargs to pass to the output. [Default = srcs[0].kwargs]
Example
import riomucho
import rasterio
import numpy as np

def basic_run(data, window, ij, g_args):
    ## do something: divide the first band of each input by a global value
    out = np.array(
        [d[0] / g_args['divide'] for d in data]
    )
    return out

# get windows from an input
with rasterio.open('/tmp/test_1.tif') as src:
    ## grabbing the windows as an example. Default behavior is identical.
    windows = [[window, ij] for ij, window in src.block_windows()]
    kwargs = src.meta
    # since we are only writing to 2 bands
    kwargs.update(count=2)

global_args = {
    'divide': 2
}

processes = 4

# run it
with riomucho.RioMucho(['input1.tif','input2.tif'], 'output.tif', basic_run,
    windows=windows,
    global_args=global_args,
    kwargs=kwargs) as rm:
    rm.run(processes)
Utility functions
riomucho.utils.array_stack([array, array, array,…])
Given a list of ({depth}, {rows}, {cols}) numpy arrays, stack into a single ({list length * each image depth}, {rows}, {cols}) array. This is useful for handling RGB inputs uniformly, whether they come from a single file or from a separate file for each band.
One RGB file
files = ['rgb.tif']
open_files = [rasterio.open(f) for f in files]
rgb = riomucho.utils.array_stack([src.read() for src in open_files])
Separate RGB files
files = ['r.tif', 'g.tif', 'b.tif']
open_files = [rasterio.open(f) for f in files]
rgb = riomucho.utils.array_stack([src.read() for src in open_files])
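To make the shape arithmetic concrete, here is a NumPy-only sketch of the stacking behavior described above. It is not the library's implementation: `stack_like_array_stack` is a hypothetical stand-in that assumes concatenating along the first axis reproduces the described output shape.

```python
import numpy as np


def stack_like_array_stack(arrays):
    """Hypothetical stand-in: join (depth, rows, cols) arrays along depth."""
    # All inputs must share the same rows and cols for the stack to be valid.
    if len({a.shape[1:] for a in arrays}) != 1:
        raise ValueError("all arrays must have the same rows and cols")
    return np.concatenate(arrays, axis=0)


# One 3-band array, as read from a single RGB file.
one_file = [np.zeros((3, 4, 5))]
# Three 1-band arrays, as read from separate R, G and B files.
three_files = [np.zeros((1, 4, 5)) for _ in range(3)]

print(stack_like_array_stack(one_file).shape)
print(stack_like_array_stack(three_files).shape)
```

Both calls give an array with the same shape, which is the point of the utility: downstream code can treat the result identically in either case.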