Weave insitu data together

insitupy

Manage reading and analyzing raw files of insitu data. The goal is to get raw insitu data files into manageable classes with helpful functions and provide access to the data as GeoPandas Arrays.

The first application of this will be for SnowEx pit data.

THIS IS A WORK IN PROGRESS (use at your own risk as outlined in the MIT license). We fully welcome any contribution and ideas. Snow science works better together!

  • Free software: MIT license

Features

  • Parsing of raw insitu data files with a variable number of header lines

  • Reading CSV data files into a pandas DataFrame, and parsing metadata into a usable format

  • Flexible user-defined variables

  • Reading both pit and point data

  • Parsing of date and location data

Definitions

Types of variables

Insitupy uses two types of variable definitions to parse CSV information:

1. primary variables - These are the data expected to be found in the data columns. Think of them as the column headers that describe the data. In the example file below, they appear in the last header row that starts with ‘#’.

2. metadata variables - These are the data expected to be found in the header lines, which are assumed to start with a ‘#’ sign by default. Every line above the last row starting with ‘#’ is assumed to be additional information that describes the data.

Variable definitions

Both variable types are defined the same way, in a separate YAML file. A standard single variable definition looks like this:

TOTAL_DEPTH:  # <- YAML root key
  code: total_depth
  description: Total depth of measurement
  map_from:
  - total_snow_depth
  - hs
  match_on_code: true
  auto_remap: true

  • code: The string that will be used to reference this variable within the insitupy code

  • description: A description of the variable

  • map_from: A list of strings that will be used to match the entry as primary or metadata variables

  • match_on_code: If true, the variable is also matched when its code value is found in the data, not just the map_from values

  • auto_remap: If true, the variable will be remapped to the code value
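To make the roles of these fields concrete, here is a minimal sketch of how a raw column or header name could be resolved against one definition. It is not insitupy's implementation: the `VariableDef` class, the `normalize` and `resolve` helpers, and the normalization rules (lowercasing, dropping a unit in parentheses) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class VariableDef:
    """Hypothetical stand-in for one YAML variable definition."""
    code: str
    description: str = ""
    map_from: list = field(default_factory=list)
    match_on_code: bool = True
    auto_remap: bool = True


def normalize(name):
    """Assumed rule: drop a '(unit)' suffix, lowercase, join words with '_'."""
    name = re.sub(r"\(.*?\)", "", name).strip().lower()
    return re.sub(r"\s+", "_", name)


def resolve(raw_name, definition):
    """Return the name to use for raw_name, or None when it does not match."""
    key = normalize(raw_name)
    candidates = {normalize(m) for m in definition.map_from}
    if definition.match_on_code:
        # The code itself also counts as a valid match.
        candidates.add(normalize(definition.code))
    if key not in candidates:
        return None
    # With auto_remap the matched name is replaced by the code.
    return definition.code if definition.auto_remap else key


total_depth = VariableDef(
    code="total_depth",
    description="Total depth of measurement",
    map_from=["total_snow_depth", "hs"],
)

print(resolve("HS (cm)", total_depth))      # matched through map_from
print(resolve("Total Depth", total_depth))  # matched through the code
print(resolve("Density", total_depth))      # no match
```

In this sketch, setting `match_on_code` to false would leave only the `map_from` entries as candidates, and setting `auto_remap` to false would keep the normalized original name instead of the code.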

Overriding variables

You can override a variable by defining it with the same code as one in the default files supplied with insitupy. User definitions always take precedence over the internal ones.
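The precedence rule can be pictured as a dictionary merge keyed by `code`. The sketch below is only an illustration under that assumption; `merge_definitions` is a hypothetical helper, not part of insitupy, and the definitions are plain dicts shaped like the parsed YAML.

```python
def merge_definitions(defaults, user):
    """Combine definitions by code; a user entry replaces a default one."""
    merged = {d["code"]: d for d in defaults.values()}
    # Applied second, so a user definition wins when the codes collide.
    merged.update({d["code"]: d for d in user.values()})
    return merged


# Illustrative defaults, not the definitions shipped with insitupy.
defaults = {
    "DENSITY": {"code": "density", "map_from": ["density"]},
    "DEPTH": {"code": "depth", "map_from": ["depth", "top"]},
}
# The root key differs, but the code is the same, so this overrides DENSITY.
user = {
    "MY_DENSITY": {"code": "density", "map_from": ["density", "rho"]},
}

merged = merge_definitions(defaults, user)
print(merged["density"]["map_from"])
print(sorted(merged))
```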

Example

I want to read in a file that looks like this:

# Location,East River
# Date/Local Standard Time,2020-04-27T08:45
# Latitude,38.92524
# Longitude,-106.97112
# Flags,random flag
# Top (cm),Bottom (cm),Density (kg/m3)
95.0,85.0,401.0
85.0,75.0,449.0
75.0,65.0,472.0
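To show how this file divides into metadata variables and primary variables, here is a small standard-library sketch that splits the ‘#’ lines from the data rows. It is not how insitupy parses files internally; `split_header` is a hypothetical helper that assumes every header line starts with ‘#’ and that the last one names the data columns.

```python
import csv

RAW = """\
# Location,East River
# Date/Local Standard Time,2020-04-27T08:45
# Latitude,38.92524
# Longitude,-106.97112
# Flags,random flag
# Top (cm),Bottom (cm),Density (kg/m3)
95.0,85.0,401.0
85.0,75.0,449.0
75.0,65.0,472.0
"""


def split_header(text, marker="#"):
    """Return (metadata dict, column names, numeric rows) for a '#'-headed CSV."""
    lines = text.splitlines()
    header = [ln[len(marker):].strip() for ln in lines if ln.startswith(marker)]
    body = [ln for ln in lines if ln.strip() and not ln.startswith(marker)]
    # The last header line names the data columns (primary variables).
    columns = next(csv.reader([header[-1]]))
    # Every earlier header line is a "key,value" pair (metadata variables).
    metadata = {}
    for ln in header[:-1]:
        key, _, value = ln.partition(",")
        metadata[key.strip()] = value.strip()
    rows = [[float(v) for v in row] for row in csv.reader(body)]
    return metadata, columns, rows


metadata, columns, rows = split_header(RAW)
print(metadata)
print(columns)
print(rows[0])
```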

Parsing data

The metadata, variable, and data parsers in insitupy come configured with defaults for working with SnowEx data, so the simplest approach is:

from insitupy.campaigns.snowex import SnowExProfileDataCollection
my_data = SnowExProfileDataCollection.from_csv(
    "./some_data.csv",
    # Don't fail when there are unknown variables in the header
    allow_map_failure=True
)
# Inspect the data
print(my_data.profiles[0].df)
# Look at the parsed metadata
print(my_data.profiles[0].metadata)

Defining your own variables

If you want to try your hand at defining variables yourself, you can do as follows.

A user custom metadata YAML file:

LATITUDE:
  auto_remap: true
  code: latitude
  description: Latitude
  map_from:
  - lat
  - latitude
  match_on_code: true
LONGITUDE:
  auto_remap: true
  code: longitude
  description: Longitude
  map_from:
  - long
  - lon
  - longitude
DATETIME:
  auto_remap: true
  code: datetime
  description: Combined date and time
  map_from:
  - Date/Local Standard Time
  - date/local_standard_time
  - datetime
  - "date&time"
  - date/time
  - date/local_time
  match_on_code: true
SITE_NAME:
  auto_remap: true
  code: site_name
  description: Name of campaign site
  map_from:
  - location
  match_on_code: true

and a separate primary variable YAML file:

BOTTOM_DEPTH:
  auto_remap: true
  code: bottom_depth
  description: Lower edge of measurement
  map_from:
  - bottom
  - bottom_depth
  match_on_code: true
DENSITY:
  auto_remap: true
  code: density
  description: measured snow density
  map_from:
  - density
  - density_mean
  match_on_code: true
DEPTH:
  auto_remap: true
  code: depth
  description: top or center depth of measurement
  map_from:
  - depth
  - top
  match_on_code: true
LAYER_THICKNESS:
  auto_remap: true
  code: layer_thickness
  description: thickness of layer
  map_from: null
  match_on_code: true

Save the two files to your local drive. Their paths are passed as arguments to the Python code in the next step.

Then use the new definitions and read in the file:

from insitupy.campaigns.snowex import SnowExProfileDataCollection
my_data = SnowExProfileDataCollection.from_csv(
    "./some_data.csv",
    # Don't fail when there are unknown variables in the header
    allow_map_failure=True,
    # Use the files YOU defined here
    primary_variable_files="/path/to/saved/primaryvariables.yaml",
    metadata_variable_files="/path/to/saved/metadatavariables.yaml",
)
print(my_data.profiles[0].df)
print(my_data.profiles[0].metadata)

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2024-03-27)

  • First release on PyPI.
