Skip to main content

Predict soil hydraulic parameters from basic soil properties.

Project description

This package provides an implementation of Rosetta, a group of neural network model for predicting unasturated soil hydraulic parameters from basic soil characterization data.

For many use cases it is not necessary to install this package – there is a web browser interface and an api to rosetta-soil that is available at https://www.handbook60.org/rosetta

Installation

pip install rosetta-soil

Quickstart

>>> from rosetta import rosetta

# required ordering for data records:
# [sa (%), si (%), cl (%), bd (g/cm3), th33, th1500]
# sa, si, and cl are required; others optional

>>> data = [
        [30,30,40,1.5,0.3,0.1],
        [20,60,20],
        [55,25,20,1.1]
    ]

>>> mean, stdev, codes = rosetta(3, data)

>>> with np.printoptions(precision=4, suppress=True):
...     print(mean)

[[ 0.0687  0.3839  0.0037  1.5123 10.4522  1.0203  0.2242]
 [ 0.0899  0.4301  0.0038  1.4993 15.8995  0.8909  0.1726]
 [ 0.0913  0.485   0.0097  1.4172 84.7834  2.9015 -0.3463]]

>>> with np.printoptions(precision=4, suppress=True):
...  print(stdev)

[[ 0.0136  0.015   0.0012  0.1176  4.816   0.4968  1.4017]
 [ 0.0067  0.0088  0.0007  0.0454  3.1789  0.449   1.6119]
 [ 0.0128  0.0131  0.0022  0.0577 27.0357  1.8203  1.2248]]

>>> print(codes)

[5 2 3]

Background

The Rosetta pedotransfer functions predict seven parameters for the van Genuchten model of unsaturated soil hydraulic properties

  • theta_r : residual volumetric water content

  • theta_s : saturated volumetric water content

  • alpha : retention shape parameter (1/cm)

  • n : retention shape parameter

  • ksat : saturated hydraulic conductivity (cm/d)

  • k0 : unsaturated conductivity matching point (cm/d)

  • L : unsaturated conductivity exponent

Rosetta provides four models for predicting the seven parameters from soil characterization data. The models differ in the required input data

Model Code

Input Data

2

sa, si, cl (SSC)

3

SSC, bulk density (BD)

4

SSC, BD, th33

5

SSC, BD, th33, th1500

where

  • sa, si, cl are percentages of sand, silt and clay

  • BD is soil bulk density (g/cm3)

  • th33 is the soil volumetric water content at 33 kPa

  • th1500 is the soil volumetric water content at 1500 kPa

Three versions of Rosetta are available for predicting theta_r, theta_s alpha, n, and ksat. The versions effectively represent three alternative calibrations or trainings of the four Rosetta models. The references that should be cited when using Rosetta versions 1, 2, and 3 are, respectively:

[1] Schaap, M.G., Leij, F.J., and Van Genuchten, M.T. 2001. ROSETTA: a computer program for estimating soil hydraulic parameters with hierarchical pedotransfer functions. Journal of Hydrology 251(3-4): 163-176. doi: 10.1016/S0022-1694(01)00466-8

[2] Schaap, M.G., A. Nemes, and M.T. van Genuchten. 2004. Comparison of Models for Indirect Estimation of Water Retention and Available Water in Surface Soils. Vadose Zone Journal 3(4): 1455-1463. doi: 10.2136/vzj2004.1455

[3] Zhang, Y. and Schaap, M.G. 2017. Weighted recalibration of the Rosetta pedotransfer model with improved estimates of hydraulic parameter distributions and summary statistics (Rosetta3). Journal of Hydrology 547: 39-53. doi: 10.1016/j.jhydrol.2017.01.004

The parameters k0 and L are predicted with a neural network model from [1] that uses as inputs values for the retention parameters theta_r, theta_s, alpha, and n. In the current implementation, the same model is to predict k0 and L regardless of the version of Rosetta used to estimate the retention paramters.

Usage

Soil data should be orgainzed as a collection of data records,

>>> data = [
...     [30, 30, 40, 1.5, 0.3, 0.1],
...     [20, 60, 20],
...     [55, 25, 20, 1.1]
... ]

where each element contains soil data in this order:

[%sand, %silt, %clay, bulk density, th33, th1500]

Sand, silt, and clay are required in each element; the others are optional.

Two functions, rosetta and rosesoil, are provided for predicting soil hydraulic parameters. The two functions have the same arguments and return the same results, but differ in the format of the returned results. The function rosetta returns the predicted parameters in numpy arrays, whereas rosesoil returns RosettaResults object.

rosetta and rosesoil take as arguments the Rosetta version to be used (1, 2, or 3) and the array-like soil data. A third optional argument, estimate_type, specifies the parameter estimate to be returned for alpha, npar, Ksat, and K0:

estimate_type: Literal['linear', 'log', 'geo']
   'linear' (default):
       Estimate is arithmetic mean.
   'log':
       Estimate is mean of log10 transformed parameter value.
   'geo':
       Estimate is geometric mean.

The function rosetta returns a 3-tuple of numpy arrays

>>> from rosetta import rosetta

>>> data = [
...     [30,30,40,1.5,0.3,0.1],
...     [20,60,20],
...     [55,25,20,1.1]
... ]

>>> mean, stdev, codes = rosetta(3, data)

>>> with np.printoptions(precision=4, suppress=True):
...     print(mean)

[[ 0.0687  0.3839  0.0037  1.5123 10.4522  1.0203  0.2242]
 [ 0.0899  0.4301  0.0038  1.4993 15.8995  0.8909  0.1726]
 [ 0.0913  0.485   0.0097  1.4172 84.7834  2.9015 -0.3463]]

>>> with np.printoptions(precision=4, suppress=True):
...  print(stdev)

[[ 0.0136  0.015   0.0012  0.1176  4.816   0.4968  1.4017]
 [ 0.0067  0.0088  0.0007  0.0454  3.1789  0.449   1.6119]
 [ 0.0128  0.0131  0.0022  0.0577 27.0357  1.8203  1.2248]]

>>> print(codes)

[5 2 3]

In the above, mean is a 2D array with shape (nsamples, 7). The ith row holds predicted soil hydraulic parameters for ith entry in data. The columns are:

Column

Parameter

0

theta_r, residual water content

1

theta_s, saturated water content

2

alpha or log10(alpha), ‘alpha’ shape parameter, 1/cm

3

npar or log10(npar), ‘n’ shape parameter

4

Ksat or log10(Ksat), saturated conductivity, cm/day

5

K0 or log10(K0), conductivity matching point, cm/day

6

lpar, unsaturated conductivity exponent, cm/day

stdev is 2D array holding the corresponding parameter standard deviations. Note these are standard deviations for the bootstrap resamples, not the standard error of the mean estimate.

codes is a 1D array with the ith entry indicating the Rosetta model and input data used to predict the ith row of mean and stdev.

Code

Data used

2

sand, silt, clay (SSC)

3

SSC + bulk density (BD)

4

SSC + BD + field capacity water content (TH33)

5

SSC + BD + TH33 + wilting point water content (TH1500)

-1

no result returned, inadequate or erroneous data

The function rosesoil works the same way:

>>> from rosetta import rosesoil


>>> data = [
...     [30,30,40,1.5,0.3,0.1],
...     [20,60,20],
...     [55,25,20,1.1]
... ]

>>> result = rosesoil(3, data)

>>> print(result)

RosettaResults([RosettaResult(sand=30.0000, silt=30.0000, clay=40.0000,
rhob=1.5000, th33=0.3000, th1500=0.1000, version=3, estimate_type='linear',
code=5, thr=0.0687, ths=0.3839, alpha=0.0037, npar=1.5123, ksat=10.4522,
k0=1.0203, lpar=0.2242, thr_std=0.0136, ths_std=0.0150, alpha_std=0.0012,
npar_std=0.1176, ksat_std=4.8160, k0_std=0.4968, lpar_std=1.4017),
RosettaResult(sand=20.0000, silt=60.0000, clay=20.0000, rhob=None,
th33=None, th1500=None, version=3, estimate_type='linear', code=2,
thr=0.0899, ths=0.4301, alpha=0.0038, npar=1.4993, ksat=15.8995, k0=0.8909,
lpar=0.1726, thr_std=0.0067, ths_std=0.0088, alpha_std=0.0007,
npar_std=0.0454, ksat_std=3.1789, k0_std=0.4490, lpar_std=1.6119),
RosettaResult(sand=55.0000, silt=25.0000, clay=20.0000, rhob=1.1000,
th33=None, th1500=None, version=3, estimate_type='linear', code=3,
thr=0.0913, ths=0.4850, alpha=0.0097, npar=1.4172, ksat=84.7834, k0=2.9015,
lpar=-0.3463, thr_std=0.0128, ths_std=0.0131, alpha_std=0.0022,
npar_std=0.0577, ksat_std=27.0357, k0_std=1.8203, lpar_std=1.2248)])


To get the results as a list of dictionaries, use `.asdicts()`

>>> print(result.asdicts())

[{'sand': 30.0, 'silt': 30.0, 'clay': 40.0, 'rhob': 1.5, 'th33': 0.3,
'th1500': 0.1, 'version': 3, 'estimate_type': 'linear', 'code': 5, 'thr':
0.06872133198419333, 'ths': 0.3839050853475144, 'alpha':
0.0036914540698334396, 'npar': 1.5122960137476906, 'ksat':
10.45220672557941, 'k0': 1.020283781688175, 'lpar': 0.22415802649553881,
'thr_std': 0.013635805076704855, 'ths_std': 0.014976665455951114,
'alpha_std': 0.0012009965476429596, 'npar_std': 0.11756697611413282,
'ksat_std': 4.816027639984864, 'k0_std': 0.4967836368490979, 'lpar_std':
1.4016987918707104}, {'sand': 20.0, 'silt': 60.0, 'clay': 20.0, 'rhob':
None, 'th33': None, 'th1500': None, 'version': 3, 'estimate_type':
'linear', 'code': 2, 'thr': 0.08994502219206939, 'ths': 0.4301366480210401,
'alpha': 0.0038032951212636345, 'npar': 1.4992978643076722, 'ksat':
15.899549552218094, 'k0': 0.8908630387641998, 'lpar': 0.17264425930013833,
'thr_std': 0.006710949117601514, 'ths_std': 0.008790220586404331,
'alpha_std': 0.0006666891009750212, 'npar_std': 0.04539463666512193,
'ksat_std': 3.178862086739814, 'k0_std': 0.4489771681065423, 'lpar_std':
1.6118830509323745}, {'sand': 55.0, 'silt': 25.0, 'clay': 20.0, 'rhob':
1.1, 'th33': None, 'th1500': None, 'version': 3, 'estimate_type': 'linear',
'code': 3, 'thr': 0.09130753033144609, 'ths': 0.485031958049669, 'alpha':
0.009748718589139088, 'npar': 1.4171964842691995, 'ksat': 84.7833726853498,
'k0': 2.9015059553301388, 'lpar': -0.3463378623447919, 'thr_std':
0.012777797583517441, 'ths_std': 0.013068706563916467, 'alpha_std':
0.0022148332082555515, 'npar_std': 0.05767077407189314, 'ksat_std':
27.035739850886362, 'k0_std': 1.8202787951521058, 'lpar_std':
1.2248193041204392}]

Alternative usage

Predictions can also be made using the Rosetta and UnsaturatedK classes. These classes return the raw bootstrap ensembles of the predicted parameters, which may be of interest to researchers who want to compute alternative summary statistics or who want to use the resamples for other purposes.

>>> from rosetta import Rosetta

The Rosetta class is instantiated for a particular Rosetta version and model,

>>> rose33 = Rosetta(rosetta_version=3, model_code=3)

The .predict() method returns a 2-tuple of numpy arrays. The first has shape (nboot, nsample, nouput=4), where the 4 outputs are: [theta_r, theta_s, log10(alpha), log10(npar)]. The second array has shape (nboot, nsample, nouput=1), where the 1 ouput is [log10(Ksat)].

Note that whereas rosetta() and rosesoil() allow data elements of differing lengths, .predict() requires that data be compatible with a full 2D array that has the correct number of columns for the instantiated model (ncols = data.shape[1] = model_code + 1).

>>> from rosetta import Rosetta
>>> rose33 = Rosetta(rosetta_version=3, model_code=3)
>>> data =[
...     # sa, si, cl, bd
...     [30, 30, 40, 1.5],
...     [55, 25, 20, 1.1]
... ]
>>> retc_boot, ksat_boot = rose33.predict(data)
>>> print(retc_boot.shape)
(1000, 2, 4)
>>> print(ksat_boot.shape)
(1000, 2, 1)

Statistics for the bootstrap resamples are computed along the zero axis, e.g.

>>> import numpy as np
>>> print(np.mean(retc_boot, axis=0))

[[ 0.11535773  0.41791199 -2.06713851  0.11201021]
[ 0.09130753  0.48503196 -2.02238809  0.15107161]]

The UnsaturatedK class is used to predict K0 and L from the retention parameters theta_r, theta_s, alpha, and npar. The input retention parameters must be linear values, not log10 transformed. The .predict() method returns a 3D array with shape (nboot, nsamples, nouput=2), where the two outputs are [log10(K0), Lpar].

>>> from rosetta import UnsaturatedK
>>> unsatk = UnsaturatedK()
>>> retc_params = [
...     # theta_r, theta_s, alpha, npar
...     [0.12, 0.42, 0.008, 1.29],
...     [0.09, 0.49, 0.009, 1.41]
... ]
>>> unsat_boot = unsatk.predict(retc_params)
>>> print(unsat_boot.shape)
(100, 2, 2)
>>> print(np.mean(unsat_boot, axis=0))
[[-0.04057941, -1.03302704],
[ 0.35788353, -0.25048901]]

Acknowledgments

The values for the neural network weights and biases were obtained from Marcel Schaap and Yonggen Zhang at the University of Arizona.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rosetta_soil-0.3.tar.gz (2.5 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rosetta_soil-0.3-py3-none-any.whl (2.5 MB view details)

Uploaded Python 3

File details

Details for the file rosetta_soil-0.3.tar.gz.

File metadata

  • Download URL: rosetta_soil-0.3.tar.gz
  • Upload date:
  • Size: 2.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Hatch/1.16.3 cpython/3.13.11 HTTPX/0.28.1

File hashes

Hashes for rosetta_soil-0.3.tar.gz
Algorithm Hash digest
SHA256 c12b6c85dee559e80b51bbe31d58cf6623137dd78c0295c803ea916a8f94af5a
MD5 5622e0957241a5983b9eaf4cce855d70
BLAKE2b-256 2d8cbfa9107d6de00d24653b90e4c99b759434dffe0625e111a637053fadc3b5

See more details on using hashes here.

File details

Details for the file rosetta_soil-0.3-py3-none-any.whl.

File metadata

  • Download URL: rosetta_soil-0.3-py3-none-any.whl
  • Upload date:
  • Size: 2.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Hatch/1.16.3 cpython/3.13.11 HTTPX/0.28.1

File hashes

Hashes for rosetta_soil-0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 1358ad61e6f89bb33aef578e38755a89dc6f533a85aa9b39ca79be1c602d1418
MD5 0d29e31280cdf999b6b0a0c0f3fc7c82
BLAKE2b-256 b34eb823a8a708ebe2458b5609e7ddc7debde211297833260e13348da10a8a2f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page