Time series data processing
diive is a Python library for time series processing, in particular of ecosystem data. It was originally developed
by the ETH Grassland Sciences group for Swiss FluxNet.
Recent updates: CHANGELOG
Recent releases: Releases
Example notebooks can be found in the folder notebooks. More notebooks are added regularly.
Current Features
Analyses
- Calculate z-aggregates in quantiles (classes) of x and y (notebook example)
- Daily correlation (notebook example)
- Decoupling: Sorting bins method (notebook example)
- Find data gaps (notebook example)
- Histogram (notebook example)
- Optimum range
- Percentiles (notebook example)
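To illustrate what finding data gaps involves, here is a minimal sketch in plain pandas. It is not diive's API: the function name find_gaps and the output columns are hypothetical, and the sketch assumes a series on a regular timestamp where missing records are stored as NaN.

```python
import numpy as np
import pandas as pd


def find_gaps(series: pd.Series) -> pd.DataFrame:
    """Return one row per run of consecutive missing values (hypothetical helper)."""
    is_missing = series.isna()
    # A new run starts whenever the missing/non-missing state changes
    run_id = (is_missing != is_missing.shift()).cumsum()
    timestamps = series.index.to_series()[is_missing]
    gaps = timestamps.groupby(run_id[is_missing]).agg(
        start="min", end="max", n_missing="count"
    )
    return gaps.reset_index(drop=True)


# Example data: half-hourly records with two gaps
idx = pd.date_range("2024-01-01", periods=8, freq="30min")
values = pd.Series([1.0, np.nan, np.nan, 4.0, 5.0, np.nan, 7.0, 8.0], index=idx)
gaps = find_gaps(values)
print(gaps)
```

Each row of the result describes one gap by its first and last missing timestamp and the number of missing records.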
Corrections
- Offset correction
- Set to threshold
- Wind direction offset detection and correction (notebook example)
Create variable
- Calculate time since last occurrence, e.g. since last precipitation (notebook example)
- Calculate daytime flag, nighttime flag and potential radiation from latitude and longitude (notebook example)
- Day/night flag from sun angle
- VPD from air temperature and RH (notebook example)
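As an illustration of the VPD calculation, the sketch below derives vapor pressure deficit from air temperature and relative humidity. It assumes the Tetens-type saturation vapor pressure formula as used in FAO-56; diive may use a different formulation, and the function names here are hypothetical.

```python
import numpy as np


def saturation_vapor_pressure_kpa(ta_degc):
    """Saturation vapor pressure (kPa) from air temperature (deg C), Tetens-type formula."""
    ta = np.asarray(ta_degc, dtype=float)
    return 0.6108 * np.exp(17.27 * ta / (ta + 237.3))


def vpd_kpa(ta_degc, rh_percent):
    """Vapor pressure deficit (kPa) from air temperature (deg C) and relative humidity (%)."""
    rh = np.asarray(rh_percent, dtype=float)
    return saturation_vapor_pressure_kpa(ta_degc) * (1.0 - rh / 100.0)


# Example: 25 deg C at 50 % and at 100 % relative humidity
vpd = vpd_kpa([25.0, 25.0], [50.0, 100.0])
print(vpd)
```

At saturation (RH = 100 %) the deficit is zero by construction, which makes a convenient sanity check.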
Eddy covariance high-resolution
- Flux detection limit from high-resolution data
- Find maximum covariance between turbulent wind and scalar
- Wind rotation to calculate turbulent departures of wind components and scalar (e.g. CO2)
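The search for the maximum covariance between turbulent wind and a scalar can be sketched as a scan over time lags. The following is a generic NumPy illustration, not diive's implementation: the function name max_covariance_lag and its sign convention are assumptions, and departures are taken from the full-record mean rather than from a rotated coordinate system.

```python
import numpy as np


def max_covariance_lag(w, c, max_lag):
    """Return (lag, covariance) at the lag with the largest absolute covariance.

    A positive lag means the scalar signal arrives `lag` records after the wind signal.
    """
    w = np.asarray(w, dtype=float) - np.mean(w)  # turbulent departures of wind
    c = np.asarray(c, dtype=float) - np.mean(c)  # turbulent departures of scalar
    n = len(w)
    lags = np.arange(-max_lag, max_lag + 1)
    covs = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            covs[i] = np.mean(w[: n - lag] * c[lag:])
        else:
            covs[i] = np.mean(w[-lag:] * c[: n + lag])
    best = int(np.argmax(np.abs(covs)))
    return int(lags[best]), float(covs[best])


# Synthetic example: the scalar is the wind signal delayed by 5 records plus noise
rng = np.random.default_rng(0)
w = rng.normal(size=2000)
c = np.roll(w, 5) + 0.1 * rng.normal(size=2000)
lag, cov = max_covariance_lag(w, c, max_lag=20)
print(lag, cov)
```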
Files
- Detect expected and unexpected (irregular) files in a list of files
- Split multiple files into smaller parts and export them as (compressed) CSV files
- Read single data file with parameters (notebook example)
- Read single data file with pre-defined filetype (notebook example)
- Read multiple data files with pre-defined filetype (notebook example)
Fits
- Bin fitter
Flux
- Critical heat days for NEP, based on air temperature and VPD
- CO2 penalty
- USTAR threshold scenarios
Flux processing chain
For info about the Swiss FluxNet flux levels, see here.
- Flux processing chain (notebook example)
  - The notebook example shows the application of:
    - Level-2 quality flags
    - Level-3.1 storage correction
    - Level-3.2 outlier removal
- Quick flux processing chain (notebook example)
Formats
Format data to specific formats
- Convert EddyPro fluxnet output files for upload to FLUXNET database (notebook example)
- Load and save parquet files (notebook example)
Gap-filling
Fill gaps in time series with various methods
- XGBoostTS (notebook example (minimal), notebook example (more extensive))
- RandomForestTS (notebook example)
- Linear interpolation (notebook example)
- Quick random forest gap-filling (notebook example)
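The simplest of these methods, linear interpolation, can be illustrated with plain pandas. This sketch does not use diive's own interface; it only shows the underlying idea with pandas' Series.interpolate, restricted to gaps that are enclosed by valid values.

```python
import numpy as np
import pandas as pd

# Half-hourly example series with missing values at the edges and in the middle
idx = pd.date_range("2024-06-01", periods=6, freq="30min")
series = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan], index=idx)

# Interpolate linearly in time; limit_area="inside" leaves leading and
# trailing NaNs untouched instead of extrapolating
filled = series.interpolate(method="time", limit_area="inside")
print(filled)
```

Leaving the edges unfilled is a deliberate choice in this sketch, because there is no second valid value to interpolate towards.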
Outlier Detection
Single outlier tests create a flag where 0=OK and 2=outlier.
Multiple tests combined
- Step-wise outlier detection
Single tests
- Absolute limits (notebook example)
- Absolute limits, separately defined for daytime and nighttime data (notebook example)
- Incremental z-score: Identify outliers based on the z-score of increments
- Local standard deviation: Identify outliers based on the local standard deviation from a running median
- Local outlier factor: Identify outliers based on local outlier factor, across all data
- Local outlier factor: Identify outliers based on local outlier factor, daytime nighttime separately
- Manual removal: Remove time periods (from-to) or single records from time series
- Missing values: Create a flag that indicates available and missing data in a time series
- z-score: Identify outliers based on the z-score across all time series data
- z-score: Identify outliers based on the z-score, separately for daytime and nighttime
- z-score: Identify outliers based on max z-scores in the interquartile range data
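To make the flag convention concrete, here is a minimal sketch of a z-score test across all data that returns 0 for OK and 2 for outlier. It is written in plain pandas and is not diive's implementation; the function name zscore_flag and the default threshold are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd


def zscore_flag(series: pd.Series, threshold: float = 4.0) -> pd.Series:
    """Flag records whose absolute z-score exceeds the threshold (0=OK, 2=outlier)."""
    z = (series - series.mean()) / series.std()
    flag = pd.Series(0, index=series.index, dtype=int)
    # Missing values yield NaN z-scores, compare as False and keep flag 0 here
    flag[z.abs() > threshold] = 2
    return flag


# Example: values close to 10 with one obvious spike
idx = pd.date_range("2024-01-01", periods=50, freq="30min")
data = pd.Series(np.tile([9.9, 10.1], 25), index=idx)
data.iloc[20] = 100.0
flag = zscore_flag(data)
print(flag[flag == 2])
```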
Plotting
- Heatmap showing values (z) of time series as date (y) vs time (x) (notebook example)
- Heatmap showing values (z) of time series as year (y) vs month (x) (notebook example)
- Long-term anomalies per year (notebook example)
- Simple (interactive) time series plot (notebook example)
- ScatterXY plot (notebook example)
- Various classes to generate heatmaps, bar plots, time series plots and scatter plots, among others
Quality control
- Stepwise MeteoScreening from database (notebook example)
Stats
- Time series stats (notebook example)
Timestamps
- Create continuous timestamp based on number of records in the file and the file duration
- Detect time resolution from data (notebook example)
- Insert additional timestamps in various formats
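One common way to detect the time resolution from data is to take the most frequent difference between consecutive timestamps. The sketch below shows this idea in plain pandas; it is an assumption about the general approach, not a description of diive's detection logic, and detect_resolution is a hypothetical name.

```python
import pandas as pd


def detect_resolution(index: pd.DatetimeIndex) -> pd.Timedelta:
    """Return the most frequent time step between consecutive timestamps."""
    deltas = index.to_series().diff().dropna()
    return deltas.value_counts().idxmax()


# Example: half-hourly timestamps with one record missing
full_index = pd.date_range("2024-01-01", periods=10, freq="30min")
index_with_gap = full_index.delete(4)
resolution = detect_resolution(index_with_gap)
print(resolution)
```

Using the most frequent step rather than the smallest or the mean keeps the result stable when a few records are missing.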
Installation
diive can be installed from source code, e.g. using poetry for dependencies.
diive is currently developed under Python 3.9.7, but newer (and many older) versions should also work.
diive can be installed using conda with: conda install -c conda-forge diive
One way to install and use diive with a specific Python version on a local machine:
- Install miniconda
- Start the miniconda prompt
- Create an environment named diive-env that contains Python 3.9.7: conda create --name diive-env python=3.9.7
- Activate the new environment: conda activate diive-env
- Install a diive version directly from source code: pip install https://github.com/holukas/diive/archive/refs/tags/v0.63.1.tar.gz (select the .tar.gz file of the desired version)
- If you want to use diive in Jupyter notebooks, you can install JupyterLab. In this example, JupyterLab is installed from the conda distribution channel conda-forge: conda install -c conda-forge jupyterlab
- If used in Jupyter notebooks, diive can generate dynamic plots. This requires the installation of: conda install -c bokeh jupyter_bokeh
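After installation, one quick way to confirm that the package is available in the active environment is to query the installed version through Python's standard library. This sketch uses importlib.metadata and makes no assumption about diive's own API; the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "diive"):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("diive")
if found is None:
    print("diive is not installed in this environment")
else:
    print(f"diive {found} is installed")
```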