Tools to support interoperability and the adoption of standards for permafrost data files.

Introduction

The Permafrost File Interoperability Toolkit (PFIT) is designed to promote interoperability and the adoption of standards for permafrost data files. This package currently supports the NTGS ground temperature standard. It includes tools to check and manipulate ground temperature data.

File Data Checker

The FileDataChecker checks column names and values in CSV, XLS, and XLSX files. It logs an issue whenever a file does not conform to the NTGS standard.

The File Data Checker can be run by passing arguments through the command line, but it can also be imported for use as a module.

The following functions are available from the class:

@static pathExists(path: str)

Checks for the existence of a path and raises an exception if it does not exist.

@static createPathIfExists(path: str)

Creates a path if it does not exist and returns the path initially passed in otherwise.
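The behaviour of these two static helpers can be approximated with the standard library. The following is a minimal sketch, not the toolkit's own code: the function names, the FileNotFoundError exception type, and the decision to treat the path as a directory are all assumptions, since the description above only says that an exception is raised and that a path is created.

```python
import os


def path_exists(path: str) -> str:
    """Return the path unchanged, or raise if nothing exists there."""
    if not os.path.exists(path):
        # Assumption: the exception type is not named in the description.
        raise FileNotFoundError(f"Path does not exist: {path}")
    return path


def create_path_if_missing(path: str) -> str:
    """Create the directory if needed and return the path that was passed in."""
    # Assumption: the path is treated as a directory, not a file.
    os.makedirs(path, exist_ok=True)
    return path
```

Returning the path from both helpers makes them usable as validators, for example as the type callable of a command line argument.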

checkPath(pathLoc: str, isVerbose: bool, logPath: str)

  • pathLoc - A string of a file path leading to the file to be checked. This can also be a zip file, which will be unzipped.

  • isVerbose - A boolean. If true, verbose log messages are also written to the console.

  • logPath - A string containing the path where the log file is created. It can point to either a directory or a specific file.

    This parameter can be None or an empty string, but a value must still be passed to the function.

Sets the logging level, creates the given paths if they do not exist, unzips the input if it is a ZIP file, and calls checkFile.
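The steps performed before the file checks can be sketched with the standard library. This is an illustration under stated assumptions, not the toolkit's implementation: the function name collect_files, the logger name, and the use of a temporary directory for extraction are hypothetical.

```python
import logging
import os
import tempfile
import zipfile


def collect_files(path_loc: str, is_verbose: bool = False) -> list:
    """Return the files to check, extracting a ZIP archive first if needed."""
    # Verbose mode lowers the threshold so informational messages are shown.
    level = logging.INFO if is_verbose else logging.WARNING
    logging.getLogger("pfit_sketch").setLevel(level)

    # XLSX files are ZIP containers internally, so the extension is checked
    # as well as the file signature to avoid unpacking a spreadsheet.
    is_zip = path_loc.lower().endswith(".zip") and zipfile.is_zipfile(path_loc)
    if not is_zip:
        return [path_loc]

    # Assumption: the archive is extracted into a fresh temporary directory.
    target = tempfile.mkdtemp(prefix="pfit_unzipped_")
    with zipfile.ZipFile(path_loc) as archive:
        archive.extractall(target)

    found = []
    for root, _dirs, names in os.walk(target):
        found.extend(os.path.join(root, name) for name in sorted(names))
    return found
```

Each returned path would then be handed to the per-file check.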

checkFile(fileName: str)

Opens the file with pandas and applies the checks described below.

The following errors may be reported:

  • Invalid Time - Time values should be valid times formatted as HH:MM:SS.
  • Invalid Date - Date values should be formatted as YYYY-MM-DD.
  • Unexpected Column - One of the first 6 column names is not in the expected list of column names, or is not in the correct position. If this warning occurs, the column names and order must be corrected first; no other checks are run until they are.
    • The first 6 columns of a data file must have exactly the following names, in this order: project_name, site_id, latitude, longitude, date_YYYY-MM-DD, time_HH:MM:SS
  • Unexpected Metre - Every metre column after the first 6 columns should be formatted as "_m" only.
  • No Measurements - No measurement columns are detected in the file.
  • File Type - The file read in is not supported.
  • Coordinate - A latitude or longitude value contains something that is not valid.
  • Latitude - A latitude value is out of range (less than -90 or greater than 90).
  • Longitude - A longitude value is out of range (less than -180 or greater than 180).
  • Temperature - A temperature value is found that is not a valid temperature.
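Several of the checks listed above can be illustrated without pandas. The sketch below is not the toolkit's code: the function names are hypothetical, rows are assumed to be dictionaries of strings, and a metre column is assumed to be recognisable by a "_m" suffix. Note that strptime accepts values without zero padding (such as 2020-1-5), so a stricter check may be needed in practice.

```python
from datetime import datetime

EXPECTED_COLUMNS = [
    "project_name", "site_id", "latitude", "longitude",
    "date_YYYY-MM-DD", "time_HH:MM:SS",
]


def check_header(columns):
    """Return a list of issue names found in the header row."""
    if list(columns[:6]) != EXPECTED_COLUMNS:
        # Nothing else is checked until names and order are fixed.
        return ["Unexpected Column"]
    depth_columns = columns[6:]
    if not depth_columns:
        return ["No Measurements"]
    # Assumption: a metre column is recognised by its "_m" suffix.
    return ["Unexpected Metre" for name in depth_columns
            if not name.endswith("_m")]


def _parses(value, fmt):
    """True if the string can be parsed with the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def check_row(row):
    """Return issue names for one record given as a dict of strings."""
    issues = []
    if not _parses(row["date_YYYY-MM-DD"], "%Y-%m-%d"):
        issues.append("Invalid Date")
    if not _parses(row["time_HH:MM:SS"], "%H:%M:%S"):
        issues.append("Invalid Time")
    for name, limit in (("latitude", 90.0), ("longitude", 180.0)):
        try:
            value = float(row[name])
        except ValueError:
            # Not a number at all.
            issues.append("Coordinate")
            continue
        if not -limit <= value <= limit:
            # Numeric, but outside the valid range.
            issues.append(name.capitalize())
    return issues
```

Returning issue names instead of raising lets a caller log every problem in a file in one pass.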

XLS and XLSX files are not recommended as they can be problematic when parsing date/time values. Please consider saving data in CSV format.

If you do decide to use XLS(X) files, ensure that the data is located in the first sheet, as this is the only sheet that is checked.

CSV Column Melter

The CSVColMelter accepts existing ground temperature data files in the wide format and converts them to the long CSV format by transposing the depth columns. Files must conform to the NTGS-style ground temperature file format. This can be verified with the FileDataChecker.

The CSV Column Melter can be run by passing arguments through the command line, but it can also be imported for use as a module.

The following functions are available from the class:

@static timezone_check(tz: str)

Converts the timezone value to a float and checks that it is within a reasonable range. Used when parsing command line arguments.
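A validator of this kind is typically passed to argparse as a type callable. The sketch below shows the pattern only: the accepted range of -12 to +14 hours, the --timezone option name, and the error messages are assumptions, as the description above does not state them.

```python
import argparse


def timezone_check(tz: str) -> float:
    """Parse a UTC offset in hours and reject values outside the range."""
    try:
        offset = float(tz)
    except ValueError:
        raise argparse.ArgumentTypeError(f"Timezone must be a number, got {tz!r}")
    # Assumption: "reasonable range" is taken to be UTC-12 to UTC+14.
    if not -12.0 <= offset <= 14.0:
        raise argparse.ArgumentTypeError(f"Timezone offset {offset} is out of range")
    return offset


parser = argparse.ArgumentParser(description="Sketch of a melter-style CLI")
# The option name is hypothetical.
parser.add_argument("--timezone", type=timezone_check, default=0.0)
```

When the callable raises ArgumentTypeError, argparse prints the message as a usage error and exits, so invalid offsets never reach the melting step.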

@static pathExists(path: str)

Checks for the existence of a path and raises an exception if it does not exist.

getISOFormat(date: str or datetime.datetime, time: str or datetime.time)

Used when interpreting values read by pandas. Parses a date (a YYYY-MM-DD string or a datetime.datetime object) and a time (an HH:MM:SS string or a datetime.time object) and returns a datetime.datetime object in ISO format.
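Combining the two inputs can be sketched with the datetime module. This is an approximation of the described behaviour, not the toolkit's code; the function name get_iso_format is a hypothetical snake_case stand-in.

```python
from datetime import datetime, time
from typing import Union


def get_iso_format(date_value: Union[str, datetime],
                   time_value: Union[str, time]) -> datetime:
    """Combine a date and a time, each given as a string or an object."""
    if isinstance(date_value, str):
        date_value = datetime.strptime(date_value, "%Y-%m-%d")
    if isinstance(date_value, datetime):
        # Keep only the calendar date; the time comes from the other input.
        date_value = date_value.date()
    if isinstance(time_value, str):
        time_value = datetime.strptime(time_value, "%H:%M:%S").time()
    return datetime.combine(date_value, time_value)
```

Calling isoformat() on the result yields a string such as 2020-10-01T13:05:00.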

meltFile(filename: str, outLoc: str)

Opens the file, melts its dataframe, and writes the result to the specified output location (outLoc).

meltDataFrame(df: pandas.DataFrame)

Converts the dataframe read from the file from wide to long format.
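The wide-to-long conversion can be illustrated with pandas' own melt method. The sketch below is not the toolkit's implementation: the function name, the output column names depth_m and temperature, and the assumption that depth headers look like 0.5_m are all hypothetical.

```python
import pandas as pd

ID_COLUMNS = [
    "project_name", "site_id", "latitude", "longitude",
    "date_YYYY-MM-DD", "time_HH:MM:SS",
]


def melt_wide(df: pd.DataFrame) -> pd.DataFrame:
    """Turn one column per depth into one row per (record, depth) pair."""
    long_df = df.melt(
        id_vars=ID_COLUMNS,
        var_name="depth_m",        # hypothetical output column name
        value_name="temperature",  # hypothetical output column name
    )
    # Assumption: depth headers look like "0.5_m"; strip the suffix and
    # convert the remainder to a numeric depth.
    long_df["depth_m"] = (
        long_df["depth_m"].str.replace("_m", "", regex=False).astype(float)
    )
    return long_df
```

A wide frame with two records and two depth columns becomes four rows, ordered depth column by depth column, with the six identifying columns repeated on each row.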

Conversion to NetCDF

NTGS_to_NetCDF converts NTGS-style CSV files into NetCDF (.nc). Currently a work in progress.
