A 'Python' implementation of the 'SoilProfileCollection' object from the 'R' package 'aqp'

These details have not been verified by PyPI

Project description

Python SoilProfileCollection Module

Author: Andrew Gene Brown

A Python implementation of the SoilProfileCollection object from the R package ‘aqp’.

Installation

For now you can install soilprofilecollection directly from GitHub.

For instance, add it to an existing project using poetry:

poetry add git+https://github.com/brownag/soilprofilecollection.git

Examples

The soilprofilecollection module provides the SoilProfileCollection class.

from soilprofilecollection import SoilProfileCollection

Data in a SoilProfileCollection object are instantiated from Pandas DataFrame objects.

import pandas as pd

# --- Sample Site Data (mimicking aqp::sp4 site data) ---
# Represents profile-level information.
# The 'id' column links to the 'id' column in the horizon data.

site_data_dict = {
    'id': ['P001', 'P002', 'P003', 'P004'],  # Profile IDs (unique identifier)
    'group': ['A', 'B', 'B', 'A'],           # Example site grouping variable
    'elev': [1154, 1158, 1156, 1150],        # Elevation (example numeric site variable)
    'slope_field': [4, 3, 5, 6],             # Slope (example numeric site variable)
    'aspect_field': [330, 290, 40, 90]       # Aspect (example numeric site variable)
    
    # Add other relevant site-level columns here (e.g., coordinates, classification)
}
site_data = pd.DataFrame(site_data_dict)
print(site_data)
#>      id group  elev  slope_field  aspect_field
#> 0  P001     A  1154            4           330
#> 1  P002     B  1158            3           290
#> 2  P003     B  1156            5            40
#> 3  P004     A  1150            6            90

# --- Sample Horizon Data (mimicking aqp::sp4 horizon data) ---
# Represents horizon-level information.
# - 'hzid': Unique identifier for each horizon row.
# - 'id': Profile ID, linking this horizon to a profile in the site data.
# - 'top', 'bottom': Depth columns defining the horizon boundaries.
# - 'hzname': Horizon designation (e.g., Ap, Bt1).
# - 'genhz': Master horizon designation group (e.g. A, B, C)
# - Other columns ('clay', 'sand', 'phfield'): Example horizon numeric properties.

hz_data_dict = {
    'hzid': [133, 134, 135, 136, 137,   # Unique horizon IDs (must be unique across all horizons)
             138, 139, 140, 141,        # Usually integers or unique strings
             142, 143, 144,
             145, 146, 147],
    'id': ['P001', 'P001', 'P001', 'P001', 'P001',  # Profile IDs (link to site data)
           'P002', 'P002', 'P002', 'P002',
           'P003', 'P003', 'P003',
           'P004', 'P004', 'P004'],
    'hzname': ['Ap', 'A', 'ABt', 'Bt1', 'Bt2',  # Horizon designations (often strings)
               'Ap', 'A', 'Bt1', 'Bt2',
               'Ap', 'AB', 'Bt',
               'A', 'Bw', 'C'],
    'genhz': ['A','A','A','B','B',
              'A','A','B','B',
              'A','A','B',
              'A','B','C'],
    'top': [0, 18, 30, 46, 61,  # Top depths (numeric)
            0, 15, 38, 56,
            0, 20, 41,
            0, 10, 35],
    'bottom': [18, 30, 46, 61, 91,  # Bottom depths (numeric)
               15, 38, 56, 84,
               20, 41, 76,
               10, 35, 80],
    'clay': [21, 20, 24, 26, 27,  # Clay content (%) (example numeric property)
             18, 19, 28, 25,
             22, 26, 29,
             15, 18, 12],
    'sand': [54, 53, 50, 48, 47,  # Sand content (%) (example numeric property)
             58, 55, 45, 48,
             51, 49, 42,
             60, 55, 65],
    'phfield': [6.2, 6.0, 5.9, 5.9, 6.1,  # pH (field measure) (example numeric property)
                6.5, 6.3, 6.0, 6.2,
                6.1, 5.8, 5.9,
                6.8, 6.5, 7.0]
    # Add other relevant horizon-level columns here (e.g., color, structure, roots)
}
hz_data = pd.DataFrame(hz_data_dict)
print(hz_data)
#>     hzid    id hzname genhz  top  bottom  clay  sand  phfield
#> 0    133  P001     Ap     A    0      18    21    54      6.2
#> 1    134  P001      A     A   18      30    20    53      6.0
#> 2    135  P001    ABt     A   30      46    24    50      5.9
#> 3    136  P001    Bt1     B   46      61    26    48      5.9
#> 4    137  P001    Bt2     B   61      91    27    47      6.1
#> 5    138  P002     Ap     A    0      15    18    58      6.5
#> 6    139  P002      A     A   15      38    19    55      6.3
#> 7    140  P002    Bt1     B   38      56    28    45      6.0
#> 8    141  P002    Bt2     B   56      84    25    48      6.2
#> 9    142  P003     Ap     A    0      20    22    51      6.1
#> 10   143  P003     AB     A   20      41    26    49      5.8
#> 11   144  P003     Bt     B   41      76    29    42      5.9
#> 12   145  P004      A     A    0      10    15    60      6.8
#> 13   146  P004     Bw     B   10      35    18    55      6.5
#> 14   147  P004      C     C   35      80    12    65      7.0

Once we have DataFrames containing site and horizon data, we instantiate the SoilProfileCollection object using the constructor:

# Instantiate the collection using the column names defined above
spc = SoilProfileCollection(
  horizons=hz_data,
  site=site_data,
  idname='id',           # Column name for profile IDs
  hzidname='hzid',       # Column name for unique horizon IDs
  depthcols=('top','bottom'), # Tuple of (top_col, bottom_col)
  hzdesgncol='hzname'    # Column name for horizon designations (optional)
)

Alternatively, if all your data is in a single DataFrame, you can use the from_dataframe class method. This is useful when your source data is not already split into site and horizon tables.

You provide a schema_template to map your column names to the standard SoilProfileCollection names.

import pandas as pd
from soilprofilecollection import SoilProfileCollection

# --- Sample Combined Data with different column names ---
# This is to demonstrate from_dataframe()
combined_data_dict = {
    'profile_id': ['P001', 'P001', 'P002', 'P002'],
    'horizon_id': [1, 2, 3, 4],
    'top_depth': [0, 10, 0, 20],
    'bottom_depth': [10, 30, 20, 40],
    'horizon_name': ['A', 'B', 'A', 'C'],
    'site_group': ['A', 'A', 'B', 'B'],
    'elevation': [100, 100, 120, 120]
}
combined_data = pd.DataFrame(combined_data_dict)

# Define the schema template to map original names to SPC standard names
schema = {
    'profile_id': 'id',
    'horizon_id': 'hzid',
    'top_depth': 'top',
    'bottom_depth': 'bottom',
    'horizon_name': 'hzname'
}

# Create a SoilProfileCollection from the single dataframe
spc_from_df = SoilProfileCollection.from_dataframe(
    data=combined_data,
    schema_template=schema,
    idname='id',
    hzidname='hzid',
    depthcols=('top', 'bottom'),
    hzdesgncol='hzname'
)

print(spc_from_df)
#> <SoilProfileCollection> (2 profiles, 4 horizons)
#>   Profile ID:   id
#>   Horizon ID:   hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 0.0, mean: 0.0, max: 0.0]
#>   Profile Bottom Depths: [min: 30.0, mean: 35.0, max: 40.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:     (0 total)
#>   Horizon Vars: id, hzid, top, bottom, hzname... (7 total)

The SoilProfileCollection class has several properties and methods:

print(spc)
#> <SoilProfileCollection> (4 profiles, 15 horizons)
#>   Profile ID:   id
#>   Horizon ID:   hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 0.0, mean: 0.0, max: 0.0]
#>   Profile Bottom Depths: [min: 76.0, mean: 82.8, max: 91.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: hzid, id, hzname, genhz, top... (9 total)
print(len(spc))
#> 4
print(spc.site)
#>           id group  elev  slope_field  aspect_field
#> id_idx                                             
#> P001    P001     A  1154            4           330
#> P002    P002     B  1158            3           290
#> P003    P003     B  1156            5            40
#> P004    P004     A  1150            6            90
print(spc.depths())
#>             id  hzid  top  bottom
#> hzid_idx                         
#> 133       P001   133    0      18
#> 134       P001   134   18      30
#> 135       P001   135   30      46
#> 136       P001   136   46      61
#> 137       P001   137   61      91
#> 138       P002   138    0      15
#> 139       P002   139   15      38
#> 140       P002   140   38      56
#> 141       P002   141   56      84
#> 142       P003   142    0      20
#> 143       P003   143   20      41
#> 144       P003   144   41      76
#> 145       P004   145    0      10
#> 146       P004   146   10      35
#> 147       P004   147   35      80
print(spc.depths(how="minmax"))
#>      id  min_depth  max_depth
#> 0  P001          0         91
#> 1  P002          0         84
#> 2  P003          0         76
#> 3  P004          0         80

subset = spc[0:3] # Get first 3 profiles
print(len(subset))
#> 3

subset2 = spc[0:3,0:2]
#> /home/andrew/workspace/soilmcp/upstream/soilprofilecollection/soilprofilecollection/soil_profile_collection.py:654: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
#>   ).apply(
print(len(subset))
#> 3

standard_intervals = [25, 50]
x = subset.glom(intervals = standard_intervals)
y = subset.glom(intervals = standard_intervals, truncate = True)
z = subset.glom(intervals = standard_intervals, agg_fun = "dominant")

You can also create sketches of the data stored in the object using the .plot() method:

ap = subset.plot(color="clay", label_hz=True) # Color by clay content
import matplotlib.pyplot as plt
plt.suptitle("Sample SPC Plot (Colored by Clay %)")
plt.show()

print(subset)
#> <SoilProfileCollection> (3 profiles, 12 horizons)
#>   Profile ID:   id
#>   Horizon ID:   hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 0.0, mean: 0.0, max: 0.0]
#>   Profile Bottom Depths: [min: 76.0, mean: 83.7, max: 91.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: hzid, id, hzname, genhz, top... (9 total)

ap2 = subset2.plot(color="clay", label_hz=True) # Color by clay content
import matplotlib.pyplot as plt
plt.suptitle("Sample SPC Plot (Colored by Clay %)")
plt.show()

print(subset2)
#> <SoilProfileCollection> (3 profiles, 6 horizons)
#>   Profile ID:   id
#>   Horizon ID:   hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 0.0, mean: 0.0, max: 0.0]
#>   Profile Bottom Depths: [min: 30.0, mean: 36.3, max: 41.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: hzid, id, hzname, genhz, top... (9 total)

ax = x.plot(color="clay", label_hz=True) # Color by clay content
import matplotlib.pyplot as plt
plt.suptitle("Sample SPC Plot (Colored by Clay %)")
plt.show()

print(x)
#> <SoilProfileCollection> (3 profiles, 7 horizons)
#>   Profile ID:   id
#>   Horizon ID:   slice_hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 15.0, mean: 17.7, max: 20.0]
#>   Profile Bottom Depths: [min: 56.0, mean: 64.3, max: 76.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: hzid, id, hzname, genhz, top... (10 total)

ay = y.plot(color="clay", label_hz=True) # Color by clay content
import matplotlib.pyplot as plt
plt.suptitle("Sample SPC Plot (Colored by Clay %)")
plt.show()

print(y)
#> <SoilProfileCollection> (3 profiles, 7 horizons)
#>   Profile ID:   id
#>   Horizon ID:   slice_hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 25.0, mean: 25.0, max: 25.0]
#>   Profile Bottom Depths: [min: 50.0, mean: 50.0, max: 50.0]
#>   Hz Desgn Col: hzname
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: hzid, id, hzname, genhz, top... (10 total)

az = z.plot(color="clay", label_hz=True) # Color by clay content
import matplotlib.pyplot as plt
plt.suptitle("Sample SPC Plot (Colored by Clay %)")
plt.show()

print(z)
#> <SoilProfileCollection> (3 profiles, 3 horizons)
#>   Profile ID:   id
#>   Horizon ID:   agg_hzid
#>   Depth Cols:   top (top), bottom (bottom)
#>   Profile Top Depths:    [min: 25.0, mean: 25.0, max: 25.0]
#>   Profile Bottom Depths: [min: 50.0, mean: 50.0, max: 50.0]
#>   Site Vars:    id, group, elev, slope_field, aspect_field (5 total)
#>   Horizon Vars: id, top, bottom, hzname, genhz... (9 total)

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.3.1

May 1, 2026

0.3.0

Dec 27, 2025

This version

0.2.1

Oct 13, 2025

0.2.0

Oct 13, 2025

0.1.0

Mar 31, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

soilprofilecollection-0.2.1.tar.gz (35.3 kB view details)

Uploaded Oct 13, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

soilprofilecollection-0.2.1-py3-none-any.whl (33.1 kB view details)

Uploaded Oct 13, 2025 Python 3

File details

Details for the file soilprofilecollection-0.2.1.tar.gz.

File metadata

Download URL: soilprofilecollection-0.2.1.tar.gz
Upload date: Oct 13, 2025
Size: 35.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for soilprofilecollection-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`9a0f255eda25b04957ee5d511adb4fc544a022b6266fd54e32c205b42ca3bf38`
MD5	`428a7b95c5671c66a3086a14b14d77e1`
BLAKE2b-256	`63a3674c15adbc7606f160df83b55f77b3718da3118191a084e463c341f69fce`

See more details on using hashes here.

Provenance

The following attestation bundles were made for soilprofilecollection-0.2.1.tar.gz:

Publisher: pypi-release.yml on brownag/soilprofilecollection

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: soilprofilecollection-0.2.1.tar.gz
- Subject digest: 9a0f255eda25b04957ee5d511adb4fc544a022b6266fd54e32c205b42ca3bf38
- Sigstore transparency entry: 601504872
- Sigstore integration time: Oct 13, 2025
Source repository:
- Permalink: brownag/soilprofilecollection@1974e2bf113534dae5185b7e6503a3466186ccbc
- Branch / Tag: refs/tags/0.2.1
- Owner: https://github.com/brownag
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@1974e2bf113534dae5185b7e6503a3466186ccbc
- Trigger Event: release

File details

Details for the file soilprofilecollection-0.2.1-py3-none-any.whl.

File metadata

Download URL: soilprofilecollection-0.2.1-py3-none-any.whl
Upload date: Oct 13, 2025
Size: 33.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for soilprofilecollection-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0a9b35d3650057f08c05324e2990dd8af79cb920522549cdf86280f56ff9286e`
MD5	`050fc1c487f7f7bd090403511b94c20d`
BLAKE2b-256	`f915df8e47bada2f275ceeeb49bd8b52a519e65d2f5751bee9ec3afcbf0f0589`

See more details on using hashes here.

Provenance

The following attestation bundles were made for soilprofilecollection-0.2.1-py3-none-any.whl:

Publisher: pypi-release.yml on brownag/soilprofilecollection

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: soilprofilecollection-0.2.1-py3-none-any.whl
- Subject digest: 0a9b35d3650057f08c05324e2990dd8af79cb920522549cdf86280f56ff9286e
- Sigstore transparency entry: 601504874
- Sigstore integration time: Oct 13, 2025
Source repository:
- Permalink: brownag/soilprofilecollection@1974e2bf113534dae5185b7e6503a3466186ccbc
- Branch / Tag: refs/tags/0.2.1
- Owner: https://github.com/brownag
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-release.yml@1974e2bf113534dae5185b7e6503a3466186ccbc
- Trigger Event: release

soilprofilecollection 0.2.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

Python SoilProfileCollection Module

Author: Andrew Gene Brown

Installation

Examples

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance