W-Data format for superfluid dynamics and the W-SLDA Toolkit.
This project contains tools for working with and manipulating the W-data format used for analyzing superfluid data generated by the W-SLDA Toolkit. The format was originally derived from the W-SLDA project led by Gabriel Wlazlowski. Here we augment this format slightly to facilitate working with Python.
The original format required a `.wtxt` file with lots of relevant information. Here we generalize the format to allow this information to be specified in the data files, which we allow to be in the NPY format.
Install with pip:

```shell
python3 -m pip install wdata
```
The W-data format stores various arrays representing physical quantities such as the density (real), pairing field (complex), currents (3-component real vectors), etc. on a regular lattice of shape `Nxyz = (Nx, Ny, Nz)` at a sequence of `Nt` times.
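For concreteness, the array-shape conventions can be sketched with plain NumPy (the `(Nt, 3, Nx, Ny, Nz)` layout for vectors matches the `current` example later in this document):

```python
import numpy as np

Nt = 10            # number of stored time slices
Nxyz = (4, 8, 16)  # lattice shape (Nx, Ny, Nz)

# Scalar quantities (real or complex) have one value per lattice site
# per time: shape (Nt, Nx, Ny, Nz).
density = np.zeros((Nt,) + Nxyz)         # real scalar
delta = np.zeros((Nt,) + Nxyz, complex)  # complex scalar

# Vector quantities carry an extra component axis after the time axis:
# shape (Nt, 3, Nx, Ny, Nz).
current = np.zeros((Nt, 3) + Nxyz)

print(density.shape)  # (10, 4, 8, 16)
print(current.shape)  # (10, 3, 4, 8, 16)
```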
The data is represented by two classes:
`Var`: These are the data variables such as density, currents, etc., with additional metadata (see the `wdata.io.IVar` interface for details):

- `Var.name`: Name of the variable as it will appear in VisIt, for example.
- `Var.data`: The actual data as a NumPy array.
- `Var.filename`: The file where the data is stored on disk.
- `Var.unit`: Unit (mainly for use in VisIt; does not affect the data).

Additionally, the following can be provided, but can also be inferred from the data:

- `Var.descr`: NumPy data descriptor.
- `Var.shape`: Shape of the array.
`WData`: This represents a complete dataset. Some relevant attributes are:

- `WData.infofile`: Location of the infofile (see below). This is where the metadata will be stored or loaded from.
- `WData.variables`: List of `Var` variables.
- `WData.xyz`: Abscissa `(x, y, z)`, shaped so that they can be used with broadcasting, i.e. `r = np.sqrt(x**2+y**2+z**2)`.
- `WData.t`: Array of times.
- `WData.dim`: Dimension of the dataset, i.e. `dim == 1` for 1D simulations, `dim == 3` for 3D simulations.
- `WData.aliases`: Dictionary of aliases, a convenience for providing alternative data access in VisIt.
- `WData.constants`: Dictionary of constants.
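The broadcasting of the sparse abscissa `(x, y, z)` mentioned above works because each array keeps singleton dimensions along the other axes. A small NumPy sketch (the construction mirrors the `np.meshgrid(..., sparse=True, indexing='ij')` call used in the examples below):

```python
import numpy as np

Nxyz = (4, 8, 16)
dxyz = (0.3, 0.2, 0.1)

# Sparse abscissa: x has shape (4, 1, 1), y has shape (1, 8, 1),
# and z has shape (1, 1, 16).
x, y, z = np.meshgrid(*[(np.arange(_N) - _N/2)*_dx
                        for _N, _dx in zip(Nxyz, dxyz)],
                      sparse=True, indexing='ij')

# Broadcasting combines them into a full (4, 8, 16) array without
# ever materializing full coordinate arrays:
r = np.sqrt(x**2 + y**2 + z**2)
print(x.shape, y.shape, z.shape)  # (4, 1, 1) (1, 8, 1) (1, 1, 16)
print(r.shape)                    # (4, 8, 16)
```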
The `WData` constructor will check that the data exists. If you are missing data, you can suppress this check by passing `check_data=False`.
Here is a minimal set of data:
```python
import numpy as np
from wdata.io import WData, Var

np.random.seed(3)

Nt = 10
Nxyz = (4, 8, 16)
dxyz = (0.3, 0.2, 0.1)
dt = 0.1
Ntxyz = (Nt,) + Nxyz
density = np.random.random(Ntxyz)
data = WData(prefix='dataset', data_dir='_example_wdata',
             Nxyz=Nxyz, dxyz=dxyz,
             variables=[Var(density=density)],
             Nt=Nt)
data.save(force=True)
```
This will make a directory `_example_wdata` with the infofile `dataset.wtxt`:
```shell
$ tree _example_wdata
_example_wdata
|-- dataset.wtxt
`-- dataset_density.wdat

0 directories, 2 files

$ cat _example_wdata/dataset.wtxt
# Generated by wdata.io: [2020-12-18 06:41:29 UTC+0000 = 2020-12-17 22:41:29 PST-0800]
NX        4    # Lattice size in X direction
NY        8    #  ...            Y ...
NZ       16    #  ...            Z ...
DX      0.3    # Spacing in X direction
DY      0.2    #  ...       Y ...
DZ      0.1    #  ...       Z ...
prefix  dataset   # datafile prefix: <prefix>_<var>.<format>
datadim 3      # Block size: 1:NX, 2:NX*NY, 3:NX*NY*NZ
cycles  10     # Number Nt of frames/cycles per dataset
t0      0      # Time value of first frame
dt      1      # Time interval between frames

# variables
# tag   name     type   unit   format   # description
var     density  real   none   wdat     # density
```
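The infofile above is plain text: one `key value` pair per line, with `#` starting a comment and `var` lines declaring the data variables. To illustrate the layout (and only as an illustration; `WData.load` is the supported way to read these files), here is a minimal hand-rolled parser sketch:

```python
def parse_wtxt(text):
    """Parse `key value  # comment` lines into a dict of strings.

    `var name type unit format` lines are collected separately.
    Illustrative sketch only, not the full W-data specification.
    """
    params, variables = {}, []
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()   # drop comments
        if not line:
            continue
        tokens = line.split()
        if tokens[0].lower() == 'var':
            variables.append(dict(zip(('name', 'type', 'unit', 'format'),
                                      tokens[1:])))
        else:
            params[tokens[0]] = tokens[1]
    return params, variables

infofile = """
NX 4         # Lattice size in X direction
DX 0.3       # Spacing in X direction
prefix dataset
cycles 10
var density real none wdat   # density
"""
params, variables = parse_wtxt(infofile)
print(params['NX'], variables[0]['name'])  # 4 density
```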
The data can be loaded by specifying the infofile:
```python
from wdata.io import WData

data = WData.load('_example_wdata/dataset.wtxt')
```
The data could be plotted using PyVista for example (the random data will not look so good...):
```python
import numpy as np
import pyvista as pv
from wdata.io import WData

data = WData.load('_example_wdata/dataset.wtxt')
n = data.density[0]        # first frame
grid = pv.StructuredGrid(*np.meshgrid(*data.xyz))
grid["vol"] = n.flatten(order="F")
contours = grid.contour(np.linspace(n.min(), n.max(), 5))
p = pv.Plotter()
p.add_mesh(contours, scalars=contours.points[:, 2])
p.show()
```
The recommended way to save data is to create variables for the data, times, and abscissa, then store this:
```python
import numpy as np
from wdata.io import WData, Var

np.random.seed(3)

Nt = 10
Nxyz = (32, 32, 32)
dxyz = (10.0/32, 10.0/32, 10.0/32)
dt = 0.1

# Abscissa.  Not strictly needed, but if you have them, then use them
# instead.
t = np.arange(Nt)*dt
xyz = np.meshgrid(*[(np.arange(_N) - _N/2)*_dx
                    for _N, _dx in zip(Nxyz, dxyz)],
                  sparse=True, indexing='ij')

# Now make the WData object and save the data.
Ntxyz = (Nt,) + Nxyz
w = np.pi/t.max()
ws = [1.0 + 0.5*np.cos(w*t), 1.0 + 0.5*np.sin(w*t), 1.0 + 0*t]
density = np.exp(-sum((_x[None, ...].T*_w).T**2/2
                      for _x, _w in zip(xyz, ws)))
delta = np.random.random(Ntxyz) + np.random.random(Ntxyz)*1j - 0.5 - 0.5j
current = np.random.random((Nt, 3,) + Nxyz) - 0.5
variables = [Var(density=density),
             Var(delta=delta),
             Var(current=current)]
data = WData(prefix='dataset2', data_dir='_example_wdata/',
             xyz=xyz, t=t,
             variables=variables)
data.save()
```
Now load and plot the data:
```python
import numpy as np
import pyvista as pv
from wdata.io import WData

data = WData.load(infofile='_example_wdata/dataset2.wtxt')
n = data.density[0]        # first frame
grid = pv.StructuredGrid(*np.meshgrid(*data.xyz))
grid["vol"] = n.flatten(order="F")
contours = grid.contour(np.linspace(n.min(), n.max(), 5))
p = pv.Plotter()
p.add_mesh(contours, scalars=contours.points[:, 2])
p.show()
```
Note: the actual data is loaded into Python using memory-mapped arrays. This allows you to refer to very large datasets without loading the entire data into memory; loading is delayed until a copy of the array is made. For example:
```python
import numpy as np
from wdata.io import WData

data = WData.load(infofile='_example_wdata/dataset2.wtxt')

# At this point, the data has not been fully loaded.  You can
# work with subsets efficiently.  For example, the following will
# only load the first frame of data:
n = data.density[0]

# Beware: if you make a copy of the data, explicitly *or implicitly*,
# then it will get loaded.  The following will load the full array
# into memory so that np.cos can do its computations:
sum_cos_n = np.sum(np.cos(data.density))

# If this is too big, you may want to process each slice independently.
# The previous example could be computed more efficiently with a loop:
sum_cos_n = sum(np.cos(_n).sum() for _n in data.density)

# The Dask package may be useful for such processing in more
# complicated settings.
```
For documentation, we use Sphinx. To build the documentation, run:

```shell
poetry install               # Install all of the developer dependencies
poetry run make -C docs html
```
Note about `__init__()`: the default behavior of autodoc is to merge the documentation of `__init__` methods with the class docstring, since the user never directly calls `__init__()`. Keep this in mind when writing the docstrings.
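This behavior is controlled by Sphinx's `autoclass_content` option; a hypothetical `docs/conf.py` fragment (the option values are from the Sphinx autodoc documentation, not from this project's actual configuration):

```python
# docs/conf.py (fragment)
extensions = ['sphinx.ext.autodoc']

# 'class': only the class docstring is shown;
# 'init':  only the __init__ docstring is shown;
# 'both':  the two docstrings are concatenated.
autoclass_content = 'class'
```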
- Resolve issue #3: Document that `WData(..., check_data=False)` allows one to skip the check of data. (Also added better support for saving `WData` objects with partial data.)
- Resolve issue #10: Provide working abscissa. This allows the user to provide abscissa `x` that are not equally spaced. These will be stored as data.
- Resolve issue #14: More flexible loading, providing defaults for missing optional values, and allowing for extra new but unused values (particularly, units provided for consts).
- Update to the new W-Data format, which specifies that all special parameters (`t0`, etc.) should be case insensitive.
- Changed a default value to `0` so that we can load and test empty datasets.
- Update and include
- Resolve issue #13: `WData` can now load read-only files.
- Resolve issue #8: Vectors can have `Nv <= dim`. Also, keep the `Nxyz` info even if `dim < 3`: this is how plane-wave approximations are sometimes used.
- Fixed many small bugs discovered by 100% coverage testing.
- Move `io.WData.load()` etc. to the constructor.
- Add a `check_data` flag to optionally disable testing of data.
- Remove item access. Use attribute access instead.
- Address issue #4 for loading large datasets: we now use memory-mapped files.
- Started adding Sphinx documentation. Not complete (`sphinxcontrib.zopeext` needs updating; something is wrong).
- Fixed issue #2: `datadim < 3` now works properly.
- Started working on documentation (incomplete).