Functions to easily copy a netcdf file, replacing and transforming variables and dimensions.
A small Python library providing functions to easily copy a netcdf file while replacing and transforming variables and dimensions.
About ncio
ncio is a Python library that provides functions to copy dimensions, variables, attributes, etc. from one netcdf file to a netcdf output file. The copied entities can be replaced or transformed along the way: for example, names can be changed or variables can be replaced with other data. It is a thin wrapper around the netCDF4 Python package that adds no new functionality but rather provides convenience functions for working with netcdf files.
Documentation
The complete documentation for ncio is available at Github pages:
copy_file
ncio provides the functions ncio.copy_dimensions, ncio.copy_global_attributes, and ncio.copy_variables to copy dimensions, attributes, and variables from one netcdf file to another. The functions have the keywords renamedim, removedim, changedim, adddim, renamevar, removevar, replacevar, and replaceatt, whose purpose should be rather mnemonic. There is also the function ncio.copy_file that combines the three individual routines in a single function.
Imagine you have a netcdf file that contains the variable gpp, among others. You want an output file that is the same as the input file except that gpp is doubled. This could be done, for example, as:
import ncio
ifile = 'basic_gpp.nc'
ofile = 'gppx2.nc'
# read current 'gpp' array
gpp = ncio.ncread(ifile, 'gpp')
# copy file by replacing gpp by double its value
ncio.copy_file(ifile, ofile, replacevar={'gpp': {'gpp': 2. * gpp}})
Here the convenience function ncio.ncread was used, which simply reads the given variable from a netcdf file.
There is also the convenience function ncio.ncinfo, with which one can inquire about the dimensions and variables in a file, their long_names and units, or all attributes. One could, for example, assert that the variable gpp is present before reading it:
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
    f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
If you set the keyword noclose=True in ncio.copy_file, then it does not close the file and returns the file handle. You can then still manipulate the contents of the file. The doubling of gpp could hence also be done as:
import ncio
ifile = 'basic_gpp.nc'
ofile = 'gppx2.nc'
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
    f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
# copy file, then directly access gpp variable in output file
fo = ncio.copy_file(ifile, ofile, noclose=True)
ovar = fo.variables['gpp']
ovar[:] = ovar[:] * 2.
fo.close()
A slightly extended example could be to produce a suite of scenarios with respiration (resp) having global annual values between say 120 and 170 and the spatial pattern of gpp. If the input file has monthly global gpp fields for several years, this would be:
import ncio
ifile = 'basic_gpp.nc'
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
    f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
isglobalgpp = gpp.mean() * 12.
for g in [140, 150, 160, 170]:
    ofile = f'resp{g}.nc'
    hist = (f'Modified {ifile} to produce resp with the distribution'
            f' of gpp but with an annual mean of {g} PgC/a.')
    ncio.copy_file(ifile, ofile,
                   replacevar={'gpp': {'resp': gpp * (g / isglobalgpp)}},
                   replaceatt={'resp':
                               {'long_name':
                                'Terrestrial Ecosystem Respiration',
                                'units': 'kg gridcell-1 s-1'}},
                   addglobalatt={'history': hist})
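The scaling used in this example can be checked without any netcdf file. The following is a minimal sketch that uses a short, made-up list of monthly values in place of the gpp array returned by ncio.ncread (the numbers and the name annual are assumptions for illustration only). It shows that multiplying by g / isglobalgpp gives a field whose mean times 12 equals the target g, while the relative pattern of the values is unchanged:

```python
from statistics import fmean

# Made-up monthly values standing in for the gpp field (illustration only)
gpp = [0.5, 1.5, 2.0, 1.0, 0.8, 1.2]

# Same definition as in the example above: mean monthly value times 12
isglobalgpp = fmean(gpp) * 12.

annual = {}
for g in [140, 150, 160, 170]:
    factor = g / isglobalgpp
    # Plain lists need an explicit loop; with an array this is gpp * factor
    resp = [v * factor for v in gpp]
    # Annual value of the scaled field, computed the same way as isglobalgpp
    annual[g] = fmean(resp) * 12.

for g, value in annual.items():
    print(f'target {g}: scaled annual value {value:.6f}')
```

Because every value is multiplied by the same factor, ratios between grid cells or months stay as they were in gpp; only the overall level changes.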
Individual copy routines
Using the individual functions gives more flexibility, of course. Imagine you have an input file that has monthly fields on a global grid on Earth, i.e. having dimensions (time, lon, lat). The land grid cells of the vegetation variables have the extra dimension patch for fractions of different plant functional types in the same grid cell, i.e. having dimensions (time, lon, lat, patch) or (time, patch, lon, lat). There is an associated variable patchfrac that gives the fixed relative sizes of each patch. To get the average of the grid cell, the vegetation variables have to be multiplied by patchfrac and summed over the patch dimension. We will first set all metadata in the output file, then copy the variables that do not have the unlimited dimension time, and then the variables that have a time dimension. The latter will be copied time step by time step to avoid Python swapping variables to disk or running out of memory:
import sys
import time
import netCDF4 as nc
import numpy as np
import ncio
ifile = 'patch.nc'
ofile = 'nopatch.nc'
# check input file
assert 'patchfrac' in ncio.ncinfo(ifile, variables=True), (
    f'No patchfrac variable in input file {ifile}')
assert 'patch' in ncio.ncinfo(ifile, var='patchfrac', dims=True)
# open input and output files
fi = nc.Dataset(ifile, 'r')
if 'file_format' in dir(fi):
    fo = nc.Dataset(ofile, 'w', format=fi.file_format)
else:
    fo = nc.Dataset(ofile, 'w', format='NETCDF3_64BIT_OFFSET')
# copy global attributes, adding script
ncio.copy_global_attributes(fi, fo,
                            add={'history': (
                                time.asctime() + ': ' +
                                ' '.join(sys.argv))})
# copy dimensions
ncio.copy_dimensions(fi, fo, removedim=['patch'])
# create static variables (independent of time)
ncio.create_variables(fi, fo, time=False, removedim=['patch'])
# create dynamic variables (time dependent)
ncio.create_variables(fi, fo, time=True, removedim=['patch'])
# get patchfrac
patchfrac = fi.variables['patchfrac']
# copy static variables
for ivar in fi.variables.values():
    if 'time' not in ivar.dimensions:
        ovar = fo.variables[ivar.name]
        invar = ivar[:]
        if 'patch' in ivar.dimensions:
            idx = ivar.dimensions.index('patch')
            # use patchfrac at first time step for non-time
            # dependent variables
            out = np.sum(invar * patchfrac, axis=idx)
        else:
            out = invar
        ovar[:] = out
# copy dynamic variables
ntime = fi.dimensions['time'].size
for tt in range(ntime):
    for ivar in fi.variables.values():
        if 'time' in ivar.dimensions:
            ovar = fo.variables[ivar.name]
            invar = ivar[tt, ...]
            if 'patch' in ivar.dimensions:
                # -1 because of specific timestep
                idx = ivar.dimensions.index('patch') - 1
                out = np.sum(invar * patchfrac, axis=idx)
            else:
                out = invar
            ovar[tt, ...] = out
# finish
fi.close()
fo.close()
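The patch averaging and the axis shift in the time loop can be tried on synthetic data. The sketch below is an illustration under stated assumptions, not part of ncio: it uses a random NumPy array with dimensions (time, lon, lat, patch), a patchfrac array with dimensions (lon, lat, patch), and made-up sizes and fractions. It compares the average computed on the whole array with the one computed time step by time step, where the patch axis moves one position to the left once a single time step has been selected:

```python
import numpy as np

# Assumed dimension order and sizes (illustration only)
dims = ('time', 'lon', 'lat', 'patch')
ntime, nlon, nlat, npatch = 3, 4, 2, 2

rng = np.random.default_rng(0)
var = rng.random((ntime, nlon, nlat, npatch))

# Fixed patch fractions that sum to 1 in every grid cell
patchfrac = np.empty((nlon, nlat, npatch))
patchfrac[..., 0] = 0.25
patchfrac[..., 1] = 0.75

# Whole array at once: patchfrac broadcasts along the leading time axis
idx = dims.index('patch')
whole = np.sum(var * patchfrac, axis=idx)

# Time step by time step: selecting one time step drops the time axis,
# so the patch axis is found at idx - 1
stepwise = np.empty((ntime, nlon, nlat))
for tt in range(ntime):
    invar = var[tt, ...]
    stepwise[tt, ...] = np.sum(invar * patchfrac, axis=idx - 1)

print(whole.shape, np.allclose(whole, stepwise))
```

The multiplication relies on NumPy broadcasting, so it only works directly when the trailing dimensions of the variable match those of patchfrac, as assumed here. For a layout such as (time, patch, lon, lat), patchfrac would have to be stored in the matching order (patch, lon, lat) or be transposed first.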
Installation
The easiest way to install is via pip:
pip install ncio
or via conda:
conda install -c conda-forge ncio
Requirements
License
ncio is distributed under the MIT License. See the LICENSE file for details.
Copyright (c) 2020- Matthias Cuntz
File details
Details for the file ncio-1.0.1.tar.gz.
File metadata
- Download URL: ncio-1.0.1.tar.gz
- Upload date:
- Size: 313.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b336a4fe3f14571f1e16cc6f75a7dfe1b1fdb079603ef48a85b4220690338e2e
MD5 | 9708046f256a6ed99b20dcdd8a047eaf
BLAKE2b-256 | 624596ef808bcbffae4b1057cc43140ac9554fcba70db816fdba317db33d9702
File details
Details for the file ncio-1.0.1-py3-none-any.whl.
File metadata
- Download URL: ncio-1.0.1-py3-none-any.whl
- Upload date:
- Size: 18.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7ba444b9ec1f1bf8515ac0c49e11a9eadf0667050a56019cd349776a26e9fafd
MD5 | 601e9b9ac9d7baa229bd57c139800b07
BLAKE2b-256 | a9aeed906c9c5b5d6a1ee65d08c39cab8d469d12181d75cc7deb1d9d822031a7