A Python client for accessing USDA Soil Data Access (SDA) web service

soildb

Python client for the USDA-NRCS Soil Data Access (SDA) web service, NRCS monitoring networks (SCAN, SNOTEL), and other National Cooperative Soil Survey data sources.

Overview

soildb provides Python access to:

  • Soil Data: USDA Soil Data Access (SDA) web service for soil survey data
  • Weather Data: NRCS Air and Water Database (AWDB) for soil and weather monitoring
  • Integration: Tools for combining soil and weather data for comprehensive analysis

Query soil survey and environmental monitoring data, run spatial queries, and export results to pandas or polars DataFrames.

Note: The AWDB module provides complementary environmental data (soil moisture, temperature, precipitation). See docs/awdb.qmd for guidance on using AWDB with soil data.

Installation

pip install soildb

For spatial functionality:

pip install "soildb[spatial]"

For all optional features:

pip install "soildb[all]"

Features

Soil Data (SDA)

  • Query soil survey data from NRCS Soil Data Access
  • Export to pandas and polars DataFrames
  • Build custom SQL queries with a fluent interface
  • Spatial queries with points, bounding boxes, and polygons
  • Bulk data fetching with automatic pagination
  • Full pedon laboratory characterization data

Environmental Data (AWDB)

  • Access soil moisture and temperature monitoring from SCAN stations
  • Retrieve precipitation, temperature, and weather data from SNOTEL and NWCC networks
  • Find nearest monitoring stations by location
  • Query historical weather patterns for climate analysis

Integration Features

  • Combine soil properties with weather patterns for suitability analysis
  • Correlate soil characteristics with environmental responses
  • Validate soil survey data against field observations
  • Async I/O for high performance and concurrency

Quick Start

Query Builder

Build and execute custom SQL queries with the fluent interface:

from soildb import Query

query = (Query()
        .select("mukey", "muname", "musym")
        .from_("mapunit")
        .inner_join("legend", "mapunit.lkey = legend.lkey")
        .where("areasymbol = 'IA109'")
        .limit(5))

# Inspect the generated SQL
print(query.to_sql())

# Execute and get results
from soildb import SDAClient
result = SDAClient().execute.sync(query)
df = result.to_pandas()
print(df.head())
SELECT TOP 5 mukey, muname, musym FROM mapunit INNER JOIN legend ON mapunit.lkey = legend.lkey WHERE areasymbol = 'IA109'
    mukey                                             muname  musym
0  408337  Colo silty clay loam, channeled, 0 to 2 percen...   1133
1  408339        Colo silty clay loam, 0 to 2 percent slopes    133
2  408340        Colo silty clay loam, 2 to 4 percent slopes   133B
3  408345  Clarion loam, 9 to 14 percent slopes, moderate...  138D2
4  408348          Harpster silt loam, 0 to 2 percent slopes   1595
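
To illustrate how a fluent interface can accumulate clauses and render them as SQL Server-style text (note that the generated SQL above uses TOP rather than LIMIT), here is a minimal standalone sketch. MiniQuery is a hypothetical class written only for this example. It is not soildb's Query implementation, and it supports only the clauses used above.

```python
class MiniQuery:
    """Toy fluent SQL builder (hypothetical; not soildb's Query class)."""

    def __init__(self):
        self._columns = []
        self._table = None
        self._joins = []
        self._conditions = []
        self._limit = None

    def select(self, *columns):
        self._columns.extend(columns)
        return self  # returning self is what enables method chaining

    def from_(self, table):
        self._table = table
        return self

    def inner_join(self, table, on):
        self._joins.append(f"INNER JOIN {table} ON {on}")
        return self

    def where(self, condition):
        self._conditions.append(condition)
        return self

    def limit(self, n):
        self._limit = int(n)
        return self

    def to_sql(self):
        if self._table is None:
            raise ValueError("from_() must be called before to_sql()")
        # SQL Server syntax: the row limit goes right after SELECT as TOP n
        top = f"TOP {self._limit} " if self._limit is not None else ""
        columns = ", ".join(self._columns) or "*"
        parts = [f"SELECT {top}{columns}", f"FROM {self._table}", *self._joins]
        if self._conditions:
            parts.append("WHERE " + " AND ".join(self._conditions))
        return " ".join(parts)


sql = (MiniQuery()
       .select("mukey", "muname", "musym")
       .from_("mapunit")
       .inner_join("legend", "mapunit.lkey = legend.lkey")
       .where("areasymbol = 'IA109'")
       .limit(5)
       .to_sql())
print(sql)
```

Each builder method returns the same object, so calls can be chained in any order and the SQL text is only assembled when to_sql() is called.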

Async vs Synchronous Usage

All soildb functions have both async and synchronous versions. For most use cases, the synchronous .sync() version is simpler and easier to use.
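
One way a library can expose both forms from a single definition is to attach a blocking wrapper to each coroutine function. The sketch below shows that general pattern only; with_sync and fetch_greeting are hypothetical names, and soildb's actual mechanism may differ.

```python
import asyncio
import functools


def with_sync(async_func):
    """Attach a blocking .sync() variant to a coroutine function (illustrative only)."""

    @functools.wraps(async_func)
    def sync(*args, **kwargs):
        # asyncio.run() creates an event loop, runs the coroutine to completion,
        # and closes the loop. It raises RuntimeError if an event loop is already
        # running in this thread (for example, inside another coroutine).
        return asyncio.run(async_func(*args, **kwargs))

    async_func.sync = sync
    return async_func


@with_sync
async def fetch_greeting(name):
    # Stand-in for a network call.
    await asyncio.sleep(0)
    return f"hello, {name}"


print(fetch_greeting.sync("IA109"))          # blocking call, no await needed
print(asyncio.run(fetch_greeting("IA113")))  # the original coroutine still works
```

A wrapper built this way is convenient in scripts, while code that already runs inside an event loop should await the coroutine directly.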

Synchronous Usage

For simple scripts and interactive use, soildb provides synchronous versions of all async functions:

from soildb import get_mapunit_by_areasymbol

# Synchronous usage - no async/await needed!
mapunits = get_mapunit_by_areasymbol.sync("IA109")
df = mapunits.to_pandas()
print(f"Found {len(df)} map units")
df.head()
Found 80 map units
    mukey  musym                                             muname        mukind  muacres  areasymbol              areaname
0  408333   1032      Spicer silty clay loam, 0 to 2 percent slopes  Consociation     1834       IA109  Kossuth County, Iowa
1  408334    107           Webster clay loam, 0 to 2 percent slopes  Consociation    46882       IA109  Kossuth County, Iowa
2  408335    108                 Wadena loam, 0 to 2 percent slopes  Consociation      807       IA109  Kossuth County, Iowa
3  408336   108B                 Wadena loam, 2 to 6 percent slopes  Consociation     1103       IA109  Kossuth County, Iowa
4  408337   1133  Colo silty clay loam, channeled, 0 to 2 percen...  Consociation     1403       IA109  Kossuth County, Iowa

The .sync methods automatically manage SDA client connections for you. For multiple calls, consider reusing a client:

from soildb import SDAClient, get_mapunit_by_areasymbol

client = SDAClient()
mapunits1 = get_mapunit_by_areasymbol.sync("IA109", client=client)
mapunits2 = get_mapunit_by_areasymbol.sync("IA113", client=client)
client.close()

Convenience Functions

soildb provides high-level functions for common tasks:

from soildb import get_mapunit_by_areasymbol

mapunits = get_mapunit_by_areasymbol.sync("IA109")
df = mapunits.to_pandas()
print(f"Found {len(df)} map units")
df.head()
Found 80 map units
    mukey  musym                                             muname        mukind  muacres  areasymbol              areaname
0  408333   1032      Spicer silty clay loam, 0 to 2 percent slopes  Consociation     1834       IA109  Kossuth County, Iowa
1  408334    107           Webster clay loam, 0 to 2 percent slopes  Consociation    46882       IA109  Kossuth County, Iowa
2  408335    108                 Wadena loam, 0 to 2 percent slopes  Consociation      807       IA109  Kossuth County, Iowa
3  408336   108B                 Wadena loam, 2 to 6 percent slopes  Consociation     1103       IA109  Kossuth County, Iowa
4  408337   1133  Colo silty clay loam, channeled, 0 to 2 percen...  Consociation     1403       IA109  Kossuth County, Iowa

If you have suggestions for new convenience functions, please file a feature request on GitHub.

Spatial Queries

Query soil data by location with points, bounding boxes, or polygons:

from soildb import spatial_query

# Point query
response = spatial_query.sync(
    geometry="POINT(-93.6 42.0)",
    table="mupolygon"
)
df = response.to_pandas()
print(f"Point query found {len(df)} results")
Point query found 1 results
    mukey  areasymbol  musym  nationalmusym                                             muname   mukind
0  411278       IA169   1314           fsz1  Hanlon-Spillville complex, channeled, 0 to 2 p...  Complex
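
The point above is written as Well-Known Text (WKT), which puts x (longitude) before y (latitude). The helpers below are a small sketch for building such strings for a point or a bounding box. point_wkt and bbox_wkt are hypothetical names, not part of soildb, and passing a POLYGON string as the geometry argument is an assumption based on the point example.

```python
def point_wkt(lon, lat):
    """Return a WKT POINT; WKT order is x (longitude) then y (latitude)."""
    return f"POINT({lon} {lat})"


def bbox_wkt(xmin, ymin, xmax, ymax):
    """Return a WKT POLYGON tracing a bounding box, closed on its first vertex."""
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("expected xmin < xmax and ymin < ymax")
    # Counter-clockwise exterior ring; the first vertex is repeated to close it
    ring = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


print(point_wkt(-93.6, 42.0))
# POINT(-93.6 42.0)
print(bbox_wkt(-93.7, 41.9, -93.5, 42.1))
# POLYGON((-93.7 41.9, -93.5 41.9, -93.5 42.1, -93.7 42.1, -93.7 41.9))
```

Swapping latitude and longitude is a common mistake with WKT, so it helps to build the string in one place.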

Bulk Data Fetching

Retrieve large datasets efficiently with automatic pagination and chunking:

from soildb import fetch_by_keys, get_mukey_by_areasymbol

# Get mukeys for survey areas
areas = ["IA109", "IA113", "IA117"]
all_mukeys = get_mukey_by_areasymbol.sync(areas)

print(f"Found {len(all_mukeys)} mukeys across {len(areas)} areas")

# Fetch components in chunks automatically
response = fetch_by_keys.sync(
    all_mukeys, 
    "component", 
    key_column="mukey", 
    chunk_size=100,
    columns=["mukey", "cokey", "compname", "localphase", "comppct_r"]
)
df = response.to_pandas()
print(f"Fetched {len(df)} component records")
Found 410 mukeys across 3 areas
Fetching 410 keys in 5 chunks of 100
Fetched 1067 component records
    mukey     cokey  compname  localphase  comppct_r
0  408333  25562547  Kingston        <NA>          2
1  408333  25562548   Okoboji        <NA>          5
2  408333  25562549    Spicer        <NA>         90
3  408333  25562550   Madelia        <NA>          3
4  408334  25562837   Okoboji        <NA>          5
5  408334  25562838   Glencoe        <NA>          3
6  408334  25562839  Canisteo        <NA>          2
7  408334  25562840   Webster        <NA>         85
8  408334  25562841  Nicollet        <NA>          5
9  408335  25562135    Biscay        <NA>          1

The component table has a hierarchical relationship:

  • mukey (map unit key) is the parent
  • cokey (component key) is the child

So when fetching components, you typically want to filter by mukey to get all components for specific map units.
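
As a small illustration of the parent/child relationship, the sketch below groups component rows under their map unit using only the standard library. The rows are copied from the first nine records of the component output shown above. The tenth record is left out because the remaining components of its map unit are not shown.

```python
from collections import defaultdict

# (mukey, cokey, compname, comppct_r) copied from the component output above
rows = [
    (408333, 25562547, "Kingston", 2),
    (408333, 25562548, "Okoboji", 5),
    (408333, 25562549, "Spicer", 90),
    (408333, 25562550, "Madelia", 3),
    (408334, 25562837, "Okoboji", 5),
    (408334, 25562838, "Glencoe", 3),
    (408334, 25562839, "Canisteo", 2),
    (408334, 25562840, "Webster", 85),
    (408334, 25562841, "Nicollet", 5),
]

# One parent key (mukey) maps to many child records (components)
components_by_mukey = defaultdict(list)
for mukey, cokey, compname, comppct_r in rows:
    components_by_mukey[mukey].append((compname, comppct_r))

for mukey, components in components_by_mukey.items():
    dominant = max(components, key=lambda c: c[1])
    total = sum(pct for _, pct in components)
    print(mukey, "dominant:", dominant[0], "total percent:", total)
# 408333 dominant: Spicer total percent: 100
# 408334 dominant: Webster total percent: 100
```

With a pandas DataFrame the same grouping would normally be done with groupby on the mukey column.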

Use fetch_by_keys() with "mukey" as the key_column to do this. It paginates automatically over chunks of keys, 100 per chunk in the example above, or whatever chunk_size you specify.
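
The chunk count reported in the output above follows from simple slicing: 410 keys at 100 per chunk gives four full chunks and one chunk of 10. The sketch below shows that arithmetic with a hypothetical chunked helper. It is not soildb's internal code, and the keys are placeholders.

```python
def chunked(keys, chunk_size=100):
    """Yield successive lists of at most chunk_size keys."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(keys), chunk_size):
        yield keys[start:start + chunk_size]


mukeys = list(range(410))  # placeholder keys, same count as the example above
chunks = list(chunked(mukeys, chunk_size=100))
print(len(chunks), [len(c) for c in chunks])
# 5 [100, 100, 100, 100, 10]
```

A smaller chunk_size means more requests with shorter key lists in each query, while a larger one means fewer, bigger requests.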

Async Usage

For performance-critical applications, use async functions directly with concurrent requests:

import asyncio
from soildb import fetch_by_keys, get_mukey_by_areasymbol

async def concurrent_example():
    # Get mukeys for multiple areas concurrently
    areas = ["IA109", "IA113", "IA117"]
    all_mukeys = await get_mukey_by_areasymbol(areas)
    
    # Fetch components concurrently with automatic pagination
    response = await fetch_by_keys(
        all_mukeys,
        "component",
        key_column="mukey",
        chunk_size=100,
        columns=["mukey", "cokey", "compname", "comppct_r"]
    )
    return response.to_pandas()

# Run async function
df = asyncio.run(concurrent_example())
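
When several independent requests are needed, asyncio.gather can run them at the same time instead of awaiting each in turn. The sketch below shows the pattern with a stand-in coroutine that sleeps rather than calling SDA. fake_fetch and fetch_all are hypothetical names, and in real code each one would be replaced by an awaited soildb call.

```python
import asyncio


async def fake_fetch(areasymbol, delay=0.05):
    """Stand-in for a network request; sleeps instead of calling SDA."""
    await asyncio.sleep(delay)
    return f"{areasymbol}: done"


async def fetch_all(areas):
    # gather() schedules all coroutines at once and returns results in input order
    return await asyncio.gather(*(fake_fetch(a) for a in areas))


results = asyncio.run(fetch_all(["IA109", "IA113", "IA117"]))
print(results)
# ['IA109: done', 'IA113: done', 'IA117: done']
```

Because the three sleeps overlap, the total wait is close to one delay rather than three.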

For more async patterns, see the Async Programming Guide.

Examples

See the examples/ directory and documentation for detailed usage patterns.

License

This project is licensed under the MIT License. See the LICENSE file for details.
