A tool to compare and/or summarize netCDF files
Project description
Guide to ncreview
ncreview is a tool which allows users to produce interactive web-based comparisons between datastreams or summaries of a single datastream, providing information on netCDF data and metadata. The metadata part of the review is produced in a non-lossy way which preserves all metadata information present throughout each datastream. Numerical data is summarized with statistics like min, max, mean, n_missing, etc. for a summary interval which can be specified by the user at the command line.
Setup
To use ncreview normally on the ARM servers, there are a few modifications you need to make to your enviornment. Set the following enviornment variables in your profile:
PATHto/apps/ds/bin:$PATHPYTHONPATHto/apps/ds/lib
Installation
git clone https://code.arm.gov/tpowell/ncreview-wind-data-hub.git
cd ncreview-wind-data-hub
./build.sh --prefix=[install directory]
Command Line Interface
The reports are created through the ncreview command line interface, and deposited at a URL to be opened in a browser. Usage help can be found by typing ncreview --help.
Example call, 1 datastream:
ncreview -n qcradSGP_C1_all -t 01-00-00 .
In this example the output URL will have the name "qcradSGP_C1_all" appended at the end of the link. The time averaging will be 1 hour, and it will plot all the data in the working directory since there are no start and end date arguments (In this case pwd is /data/archive/sgp/sgpqcrad1longC1.c2).
Web Report
The web-based report is laid out in a hierarchical structure of nested expandable elements, which, for comparisons, are color-coded to indicate the difference in the data they contain. As described in the expandable legend in the upper-right hand corner, throughout the report blue is used to indicate that data has changed, red is used to indicate that data was in the old report but is not in the new (removed), and green means that data is in the new report but not the old one (added).
At the top level, there are five main sections into which the summary information is divided:
File Timeline
The file timeline provides a visual summary of what time periods files in the datastream cover. The timeline is interactive, and users can zoom and pan around it using the mouse. Hovering over one of the grey file rectangles will bring up its begin and end times in the table below.
Attributes
The attributes section provides summary information on global attributes throughout the datastream(s). If an attribute's value remained constant over the collection of files scanned, it will be displayed as a static value next to its name. If the attribute's value varied, or was different in even one file, that value will be displayed on a timeline. This timeline works very similarly to the file timeline: it can be zoomed and panned, and hovering over some section of the timeline reveals the attribute's value at that time.
Dimensions
The dimensions section works very similarly to the attributes section, but instead of displaying attribute names and values, it displays dimension names and lengths.
Variables
Each variable section contains a summary of its data, a list of its dimensions, and a variable attributes section which behaves precisely like the global attributes section. If viewing a summary of 1 datastream, and the variable has companion variables such as QC data, these will be stored in a companions section in the variable structure; otherwise, for datastream comparisons, companion variables are listed separately in their own variable sections.
Summary
Each variable, both actual and qc data, are further summarized, showing changes between old and new files. This is useful for viewing the total number of missing values, infinity values, fill values, and extremely large and small values in a simple-to-read, color-coded chart quickly and efficiently, without expanding all variable drop-downs and panning for bad data.
Data
A variable's data can be displayed in several formats, depending on its dimensionality:
-
Dimensionless
A dimensionless variable's data is displayed as either a static value or as a timeline, just like dimension lengths or attribute values.
-
Dimensioned by
timeData dimensioned by
timeis displayed in an interactive plot which plots one of a number of summary statistics. To change which summary statistic is being displayed, click the name of that statistic in the table below the plot.For data with multiple dimensions, the summary statistics are calculated across all dimensions, and displayed as a single 2D plot the same as a variable dimensioned only by time.
When making a comparison, a background color appears behind the plot lines when the data differs, according to the color scheme in the legend.
The "variable data plot" button below the interactive plot uses ACT (Atmospheric data Community Toolkit) to generate unsummarized plots of the data for a time range specified by the bounds currently selected within the interactive plot. This can be useful when zooming in to produce higher detail plots than are available interactively, or to leverage any of the extra functionality provided by ACT. Please note that ACT plots can take a while to appear for large sets of data.
-
Dimensioned, but not by
timeIn this case, each file's data is summarized into a few values like min, max and mean, and these values are displayed in a table. If values vary from file to file, then they are displayed in an interactive plot, which works very similarly to the timed plot.
Running on non-ARM servers
By default, the webpage is loaded by engineering.arm.gov, which expects to find the data on the ARM servers. If you wish to run ncreview on a non-ARM server, and then view the data created, you will need to use ncrserver to create a local server.
ncrserver usage
ssh [user]@[server] -L [local_port]:localhost:[server_port]
ncrserver -p [server_port]
So, for example, say you have ncreview data with id "my_ncr_data". If your machine's port 8000 is open, and the ncreview output is on a server with port 8080 open, you can set up ncrserver like so:
ssh [user]@[server] -L 8000:localhost:8080
ncrserver -p 8080
and then enter this URL into your browser window:
http://localhost:8000/ncreview/?id=my_ncr_data
(Of course, you should have been given this URL when you ran ncreview; just edit the port number if necessary.)
Note the http - https will not work!
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ncreview_plus-0.0.post1.dev14.tar.gz.
File metadata
- Download URL: ncreview_plus-0.0.post1.dev14.tar.gz
- Upload date:
- Size: 3.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
627b5fdce1455f0d236f25e2997c7c16603100353ee13c5c651cc33d57f3ae57
|
|
| MD5 |
76e02fd8a8b927f19b851c1273ce7b1e
|
|
| BLAKE2b-256 |
35244114115f3bd1598f0f3fd8b16494491f0ffce25df3d00db55c77b082cbe8
|
File details
Details for the file ncreview_plus-0.0.post1.dev14-py3-none-any.whl.
File metadata
- Download URL: ncreview_plus-0.0.post1.dev14-py3-none-any.whl
- Upload date:
- Size: 3.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d83bc09779f83cdaa9d097e6cdb8ba565798af0843f2184291b3b7a2f54735fa
|
|
| MD5 |
b01c582aa9adea1d30a697985d661298
|
|
| BLAKE2b-256 |
497a617f1028a2c90a85807a26de5a2996ea63d587eaee43c21551ee0d25181f
|