
Python package for reading and writing files in HDF5 (serial and parallel), PETSc and raw binary

Project description

This repository contains utility functions written in Python for reading and writing files in different formats. They provide high-level convenience wrappers for low-level functions and libraries. Currently supported formats are direct-access binary, PETSc binary and HDF5. The HDF5 functions support both serial and parallel I/O and allow a wide variety of native Python datatypes, as well as NumPy arrays, to be saved and loaded conveniently and efficiently.

IMPORTANT: Please do NOT post this code on your own GitHub or other website. See LICENSE.txt for licensing information.

Feel free to email if you have any questions: samkat6@gmail.com


Installation

pip install pyioutils

For reading/writing HDF5 files in parallel, you will additionally need to install my pympiutils package:

pip install pympiutils

Note that parallelization requires an MPI-enabled HDF5 installation and an h5py module built against it (instructions coming soon ...).

Usage

For more usage examples, see here. For parallel HDF5 I/O see here.

# Example 1: raw binary I/O
import numpy as np
from pyioutils.binaryio import write_binary, read_binary

# Write array as big endian 64 bit float
x = np.linspace(0., 7., num=8)
write_binary('x.bin', x, prec='>f8')

# Read the data back in (the default is big endian 64 bit float)
xr = read_binary('x.bin', prec='f8', machineformat='>')

# Check
np.allclose(x, xr)

# Write a multi-dimensional array to a big endian binary file;
# data are written in Fortran (column major) order
y = np.reshape(x, (4, 2), order='F')
write_binary('y.bin', y, prec='>f8')

# Read the entire file and reshape using Fortran (column major) order
yr = read_binary('y.bin', dims=(4, 2))

# Check
np.allclose(y, yr)

# Assume the data are in blocks of size recSize and read the second record
# (iRec=1); the data are assumed to be in the default big endian 64 bit
# float format
y2 = read_binary('y.bin', iRec=1, recSize=4)

# Write an array to a big endian binary file with a header
z = np.array([280., 281., 282., 283.])
# Header as 32 bit int
write_binary('z.bin', len(z), prec='>i4')
# Append data as 64 bit float
write_binary('z.bin', z, prec='>f8', append=True)

# Read the data back in, skipping the header
zr = read_binary('z.bin', offsetBytes=4)
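The record read above can be reproduced with plain NumPy, which also documents the byte layout: record iRec of recSize elements starts at byte offset iRec * recSize * itemsize (8 for float64). This is a sketch of the presumed behavior, not read_binary's actual implementation (the file name y_demo.bin is just for illustration):

```python
import numpy as np

# Recreate the file from the example above: 8 big endian float64 values,
# flattened in Fortran (column major) order
y = np.reshape(np.linspace(0., 7., num=8), (4, 2), order='F')
y.flatten(order='F').astype('>f8').tofile('y_demo.bin')

# Record iRec of length recSize starts at iRec * recSize * 8 bytes
iRec, recSize = 1, 4
y2 = np.fromfile('y_demo.bin', dtype='>f8', count=recSize,
                 offset=iRec * recSize * 8)
print(y2)  # [4. 5. 6. 7.]
```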

# Example 2: PETSc I/O
import numpy as np
from pyioutils.petscio import readPetscBinVec, writePetscBinVec, getPetscBinVecFileStats

# Write an array to a PETSc Vec file
x = np.linspace(0., 7., num=8)
writePetscBinVec("xy.petsc", x)

# Append another Vec
y = np.linspace(5., 10., num=8)
writePetscBinVec("xy.petsc", y, append=True)

# Read both back in (multiple Vecs are returned as a list)
xy = readPetscBinVec("xy.petsc", nRec=-1)

# Check
np.allclose(x, xy[0])
np.allclose(y, xy[1])

# Read back the second Vec in the file
yr = readPetscBinVec("xy.petsc", nRec=1, startRec=1)

# Check
np.allclose(y, yr)

# Multiple arrays can be written by passing them as a list
writePetscBinVec("xy2.petsc", [x, y])

# Query the contents of a PETSc Vec file
vecLength, numVecs, numBytes = getPetscBinVecFileStats("xy.petsc")
print(f"{vecLength}, {numVecs}, {numBytes}")
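PETSc's standard binary Vec layout, which these helpers presumably target, is: a big-endian int32 class id (1211214 for a Vec), a big-endian int32 length, then the values as big-endian float64. A minimal round trip with plain NumPy, sketched under that assumption (write_vec/read_vec are illustrative names, not part of pyioutils):

```python
import numpy as np

VEC_CLASSID = 1211214  # PETSc's VEC_FILE_CLASSID

def write_vec(fname, v):
    # Write one Vec in PETSc binary format (big endian throughout):
    # int32 class id, int32 length, then float64 values
    with open(fname, 'wb') as f:
        np.array([VEC_CLASSID, len(v)], dtype='>i4').tofile(f)
        np.asarray(v, dtype='>f8').tofile(f)

def read_vec(fname):
    # Read one Vec back, checking the class id first
    with open(fname, 'rb') as f:
        classid, n = np.fromfile(f, dtype='>i4', count=2)
        assert classid == VEC_CLASSID
        return np.fromfile(f, dtype='>f8', count=n)

write_vec('x_demo.petsc', np.linspace(0., 7., num=8))
xr = read_vec('x_demo.petsc')
```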

# Example 3: HDF5 I/O
# For parallel I/O see the examples at:
# https://github.com/samarkhatiwala/pympiutils/examples
import numpy as np
from pyioutils.hdfio import load, save

# Create some data
# NumPy array
x = np.arange(100, dtype=np.float64)
# List
y = [np.arange(n, dtype=np.float64) for n in range(1, 5)]
# String
z = 'This is a test'
# Dictionary
d = {'v1': [1, 2, 3], 'v2': 'some text', 'v3': np.ones(4)}

# Save the first three variables to an HDF5 file, passing them as a dictionary
save({'x': x, 'y': y, 'z': z}, 'data.h5')

# Add (append) the fourth one to the same file
save({'d': d}, "data.h5", append=True)

# Read it all back in
s = load("data.h5")

# Inspect
s.keys()
s.d.keys()
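Since save produces a standard HDF5 file, it can also be inspected with h5py directly. The sketch below is independent of pyioutils (whose exact on-disk layout may differ): it builds a similar file by hand, with datasets for arrays and a group for the nested dictionary, then walks its contents:

```python
import numpy as np
import h5py

# Build a small HDF5 file directly with h5py
with h5py.File('demo.h5', 'w') as f:
    f['x'] = np.arange(100, dtype=np.float64)
    g = f.create_group('d')   # nested dict -> HDF5 group
    g['v1'] = [1, 2, 3]
    g['v3'] = np.ones(4)

# Recursively visit every group/dataset and collect its path
with h5py.File('demo.h5', 'r') as f:
    names = []
    f.visit(names.append)
    print(sorted(names))  # ['d', 'd/v1', 'd/v3', 'x']
```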
