I/O functions for Python and LQCD file formats

Lyncs IO offers two high-level functions, load and save (dump is an alias of save).

The main features of this module are

  • Seamless IO, reading and writing made simple. In most cases, after saving with save(obj, filename), loading with obj=load(filename) returns the original Python object. Formats like pickle already guarantee this; we try to ensure it as far as possible for the other formats too.

  • Many formats supported. The file format can be specified either via the filename's extension or with the option format passed to load/save. The structure of the package is flexible enough to easily accommodate new or customized file formats as they arise. See [Adding a file format] for guidelines.

  • Support for archives. For archives, e.g. HDF5, zip etc., the content can be accessed directly by specifying it in the path. For instance, in directory/file.h5/content, directory/file.h5 is the file path and the remainder, content, is the entry that will be searched for inside the file.

  • Support for Parallel IO. Where possible, the option chunks can be used for enabling parallel IO via Dask.

  • Omission of extension. When saving, if the extension is omitted, the optimal file format is deduced from the data type and the extension is added to the filename. When loading, any extension is considered, i.e. filename.*, and if only one match is available, the file is loaded.
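Two of the path conventions listed above, content inside archives and omitted extensions, can be pictured with a short standard-library sketch. The helpers split_archive_path and resolve_filename and the ARCHIVE_EXTENSIONS tuple are hypothetical names introduced here for illustration; they mimic the documented behaviour and are not the lyncs_io implementation.

```python
import glob
import os
import tempfile

# Assumed set of archive extensions, for illustration only.
ARCHIVE_EXTENSIONS = (".h5", ".hdf5", ".zip", ".tar")


def split_archive_path(path):
    """Split a path into (file path, content inside the archive)."""
    parts = path.split("/")
    for i, part in enumerate(parts):
        # The first component that looks like an archive ends the file path.
        if part.lower().endswith(ARCHIVE_EXTENSIONS):
            content = "/".join(parts[i + 1:])
            return "/".join(parts[: i + 1]), content or None
    return path, None


def resolve_filename(filename):
    """Return `filename` if it exists, else the unique match of `filename.*`."""
    if os.path.exists(filename):
        return filename
    matches = sorted(glob.glob(glob.escape(filename) + ".*"))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one match for {filename}.*, found {len(matches)}"
        )
    return matches[0]


print(split_archive_path("directory/file.h5/content"))
# ('directory/file.h5', 'content')

with tempfile.TemporaryDirectory() as tmp:
    # Create an empty data.npy, then look it up without the extension.
    open(os.path.join(tmp, "data.npy"), "wb").close()
    print(os.path.basename(resolve_filename(os.path.join(tmp, "data"))))
    # data.npy
```

A real implementation could also check which path component is an existing file instead of relying on a fixed list of extensions.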

Installation

The package can be installed via pip:

pip install [--user] lyncs_io

NOTE: parallel IO requires a working MPI installation. On Debian-based systems this can be installed via apt-get:

sudo apt-get install libopenmpi-dev openmpi-bin

OR using conda:

conda install -c anaconda mpi4py

Parallel IO can then be enabled via

pip install [--user] lyncs_io[mpi]
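After installing, a quick way to confirm that the packages are visible to the current interpreter is to query the import system. This is a minimal sketch: has_module is a hypothetical helper, and checking for mpi4py assumes that the mpi extra relies on it, as the import in the MPI example suggests.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


for name in ("lyncs_io", "mpi4py"):
    status = "available" if has_module(name) else "missing"
    print(f"{name}: {status}")
```

The check only locates the modules; it does not verify that the underlying MPI library works.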

Documentation

The high-level load and save functions (dump is an alias of save) provided by Lyncs IO can be used as follows:

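import numpy as np
import lyncs_io as io

# Save a random array into the "random" entry of data.h5
arr1 = np.random.rand(10, 10, 10)
io.save(arr1, "data.h5/random")

# Save an array of zeros into the "zeros" entry of the same file
arr2 = np.zeros_like(arr1)
io.save(arr2, "data.h5/zeros")

# Loading the whole file returns both entries
arrs = io.load("data.h5")
assert (arr1 == arrs["random"]).all()
assert (arr2 == arrs["zeros"]).all()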

NOTE: save takes its arguments in the order data, filename. This is the opposite of what is done in numpy, but it is consistent with pickle's dump. This order is preferred because the function can then be used directly as a method of a class: self, i.e. the data, is passed as the first argument of save.
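The method-binding argument can be seen in a small self-contained sketch. The save and load functions below are JSON-based stand-ins that only share the (data, filename) argument order with lyncs_io, so that the example runs without the package, and the Samples class is hypothetical.

```python
import json
import os
import tempfile


def save(data, filename):
    """Stand-in for lyncs_io.save: same (data, filename) argument order."""
    with open(filename, "w") as fp:
        json.dump(list(data), fp)


def load(filename):
    """Stand-in for lyncs_io.load."""
    with open(filename) as fp:
        return json.load(fp)


class Samples(list):
    """Hypothetical container class, used only for illustration."""

    # Because `data` comes first, the function works unchanged as a method:
    # obj.save(filename) is the same call as save(obj, filename).
    save = save


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.json")
    obj = Samples([1.0, 2.0, 3.0])
    obj.save(path)  # self is passed as `data`
    assert load(path) == [1.0, 2.0, 3.0]
```

With the numpy-style order (filename, data), the same assignment would bind self to the filename argument and a wrapper method would be needed.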

IO with MPI

import numpy as np
import lyncs_io as io
from mpi4py import MPI

# Assume 2D cartesian topology
comm = MPI.COMM_WORLD
dims = MPI.Compute_dims(comm.size, 2)
cartesian2d = comm.Create_cart(dims=dims)

oarr = np.random.rand(6, 4, 2, 2)
io.save(oarr, "pario.npy", comm=cartesian2d)
iarr = io.load("pario.npy", comm=cartesian2d)

assert (iarr == oarr).all()

NOTE: Parallel IO is enabled once a valid Cartesian communicator is passed to the load or save routines; otherwise serial IO is performed. Currently only the numpy format supports this functionality.

File formats

Adding a file format
