GeoSSM: geospatial state-space modeling tools

Geossm 🌍

Geostatistics with State Space Models

geossm is a Python package for applying state space models to spatial and spatiotemporal data. It is tailored for modern geostatistical workflows and natively operates on GeoDataFrame objects from the geopandas library.

The package is designed with scalability and modularity in mind, making it suitable for large spatial and spatiotemporal datasets across environmental, climate, and geospatial applications.

Overview

State space models (SSMs) are powerful statistical tools for modeling dynamic systems. This package extends their application to geospatial and spatiotemporal contexts, enabling:

  • Efficient filtering and smoothing of spatial processes
  • Low-rank approximations for scalability
  • Seamless integration with geospatial data workflows
  • Support for complex environmental and climate datasets
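To make the filtering step concrete, the sketch below runs a scalar Kalman filter on a local-level model (a random-walk state observed with noise). It is a generic textbook illustration in plain Python, not geossm's implementation; the function name kalman_filter_local_level and all of its parameters are assumptions introduced here.

```python
def kalman_filter_local_level(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e6):
    """Filter a local-level model.

    State:       x_t = x_{t-1} + eta_t,  eta_t ~ N(0, sigma2_eta)
    Observation: y_t = x_t + eps_t,      eps_t ~ N(0, sigma2_eps)

    Missing observations are passed as None and skip the update step.
    Returns the filtered means and variances as two lists.
    """
    a, p = a0, p0
    means, variances = [], []
    for obs in y:
        # Prediction: the random walk keeps the mean and inflates the variance.
        p = p + sigma2_eta
        if obs is not None:
            # Update: blend prediction and observation through the Kalman gain.
            gain = p / (p + sigma2_eps)
            a = a + gain * (obs - a)
            p = (1.0 - gain) * p
        means.append(a)
        variances.append(p)
    return means, variances


# Illustrative data with one missing value.
observations = [1.0, 1.2, None, 0.9, 1.1]
means, variances = kalman_filter_local_level(
    observations, sigma2_eps=0.5, sigma2_eta=0.1
)
for t, (m_t, v_t) in enumerate(zip(means, variances)):
    print(f"t={t} mean={m_t:.3f} var={v_t:.3f}")
```

At the missing time point the filter only predicts, so the mean is carried forward and the variance grows by sigma2_eta.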

The package is built on the research presented in the PhD thesis: A State-Space Modelling Framework in Geostatistics with Application to Environmental Data by Jacopo Rodeschini.

Project folder

geossm/
├── pyproject.toml      <-- All package config
├── environment.yml     <-- For Conda users
├── README.md
├── LICENSE
├── src/                <-- The "Source" folder
│   └── geossm/         <-- The actual package folder
│       ├── __init__.py
│       └── core.py
└── tests/            

🔍 Key Features

  • Seamless GeoDataFrame Integration: Work directly with geopandas.GeoDataFrame objects
  • State Space Modeling: Tools for building, estimating, filtering, and smoothing spatial processes
  • Low-Rank Approximations: Efficient handling of large-scale spatial data via LRSSM
  • Modular Pipeline:
    • Data preprocessing and validation
    • Design matrix construction
    • Model specification and estimation
    • Prediction and simulation
  • Research-Oriented: Built for extensibility and experimental workflows
  • Multiple Model Types: Support for linear time-invariant and time-varying SSMs
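The low-rank idea can be sketched independently of the package: a field observed at n locations is written as a combination of r basis functions with r much smaller than n, so the latent state has dimension r rather than n. The Gaussian radial bases, the helper gaussian_basis, and every parameter value below are illustrative assumptions. The Quick Start passes a gmsh mesh to the model instead, and this example does not reproduce that construction.

```python
import numpy as np


def gaussian_basis(coords, centers, scale):
    """Evaluate r Gaussian radial basis functions at n locations.

    coords  : (n, 2) array of observation coordinates
    centers : (r, 2) array of basis centers
    Returns an (n, r) matrix Phi with
    Phi[i, j] = exp(-||s_i - c_j||^2 / (2 * scale^2)).
    """
    diff = coords[:, None, :] - centers[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    return np.exp(-sq_dist / (2.0 * scale**2))


rng = np.random.default_rng(0)
n, r = 500, 16

coords = rng.uniform(0.0, 1.0, size=(n, 2))               # n observation sites
grid = np.linspace(0.125, 0.875, 4)
centers = np.array([(x, y) for x in grid for y in grid])  # r = 16 centers

phi = gaussian_basis(coords, centers, scale=0.25)  # (n, r) loading matrix
z = rng.normal(size=r)                             # low-dimensional state
field = phi @ z                                    # field at all n sites

# Recover the r coefficients from noisy observations by least squares.
y = field + 0.05 * rng.normal(size=n)
z_hat, *_ = np.linalg.lstsq(phi, y, rcond=None)

print(phi.shape, z_hat.shape)
```

Only r coefficients are estimated or propagated through time, which is what keeps filtering and smoothing tractable when n is large.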

Requirements

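  • Python: 3.8 or higher (note that the minimum numpy and scipy versions listed below themselves require Python 3.10 or newer)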
  • Key Dependencies:
    • geopandas ≥ 1.1.2 (geospatial data handling)
    • pandas ≥ 2.2.2 (data manipulation)
    • numpy ≥ 2.2.6 (numerical computing)
    • scipy ≥ 1.15.3 (scientific computing)
    • jax ≥ 0.6.2 (automatic differentiation & optimization)
    • statsmodels ≥ 0.14.6 (statistical modeling)
    • matplotlib ≥ 3.9.1 (visualization)
    • Additional spatial & mesh packages: shapely, gmsh, meshio, pygmsh, pyproj

See pyproject.toml or environment.yml for the complete dependency list.
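A quick way to compare an existing environment against these minimums is to query installed versions with the standard library's importlib.metadata. The sketch below is illustrative only: the MINIMUMS table copies the versions listed above, while the helper names and the simplified version parser (which ignores pre-release and local suffixes) are assumptions, not part of geossm.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "geopandas": "1.1.2",
    "pandas": "2.2.2",
    "numpy": "2.2.6",
    "scipy": "1.15.3",
    "jax": "0.6.2",
    "statsmodels": "0.14.6",
    "matplotlib": "3.9.1",
}


def parse_version(text):
    """Turn '2.2.6' (or '2.2.6rc1', '1.15.3+cpu') into a tuple of three ints."""
    parts = []
    for chunk in text.split(".")[:3]:
        match = re.match(r"\d+", chunk)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '2.0' so tuples compare consistently.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_minimums(minimums):
    """Return {package: (installed_or_None, required, ok)} for each entry."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, required, False)
            continue
        ok = parse_version(installed) >= parse_version(required)
        report[name] = (installed, required, ok)
    return report


for name, (installed, required, ok) in check_minimums(MINIMUMS).items():
    status = "ok" if ok else ("missing" if installed is None else "too old")
    print(f"{name:12s} installed={installed} required>={required} {status}")
```

For strict PEP 440 comparisons the packaging library is the more robust choice; the tuple parser here only avoids an extra dependency.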

🚀 Installation

Option 1: From pip (Recommended)

pip install geossm

Option 2: From Source with Conda

  1. Clone or download the repository:
git clone https://github.com/yourusername/geossm.git
cd geossm
  2. Create the conda environment (named geossm) with all required packages. Before creating it, update Conda and switch to the libmamba solver, which resolves dependencies significantly faster.
  • Update Conda (recommended)
# Enable the faster libmamba solver
conda config --set solver libmamba

# Update conda in the base environment
conda update -n base -c defaults conda
  • Create the environment
conda env create -f environment.yml
  3. Activate the environment:
conda activate geossm
  4. Install the package in development mode:
pip install -e .

Verify Installation

import geossm
print(geossm.__version__)

Remove the package and the environment

Remove the package

pip uninstall geossm

Remove the entire environment

conda remove -n geossm --all

Quick Start

Loading Data

import geossm
import geossm.datasets as datasets

# List available datasets
print(datasets.list_datasets())

# Load the Agrimonia dataset
agrimonia_gdf, shapefile = datasets.load_dataset('agrimonia')
print(agrimonia_gdf.head())
print(agrimonia_gdf.columns)

Building a State Space Model

import matplotlib.pyplot as plt
from shapely.geometry import Point, Polygon
import numpy as np

import pygmsh
import gmsh
import geopandas as geodf
import geossm.datasets as df
from geossm.stmodel import LRStateSpaceModel as lrssm
from geossm.covmodel import FEMSolver


# %% Load the agrimonia dataset
agri, shape = df.load_dataset('agrimonia')


# %% From .csv to geopandas
ct = np.array([agri.Longitude.to_numpy(), agri.Latitude.to_numpy()]).T
agri['geometry'] = [Point(p[0], p[1]) for p in ct]  # (x, y) = (lon, lat)

agri = geodf.GeoDataFrame(agri, crs=4326)

domain = list(shape.geometry[0].geoms)[0].boundary
buffer = list(domain.buffer(0.3).boundary.geoms)[0]


# %% Build the model
model = lrssm(agri, ['AQ_pm10 ~ 1 + WE_temp_2m'], verbose=True, domain = [Polygon(buffer)])
print(model)


# %% [Utils] build mesh with gmsh
def buildMesh(poly, lc, points, lc_buffer=None, lc_points=1e22):
    with pygmsh.occ.Geometry() as geom:

        if lc_buffer is None:
            lc_buffer = lc

        coords = np.array(poly.buffer(
            lc_buffer).simplify(lc_buffer).exterior.coords[:-1])
        domain = geom.add_polygon(coords, mesh_size=lc_buffer*0.1)

        # Add a physical group for the domain surface (good practice)
        geom.add_physical(domain, label="surface_domain")

        # Add the input points to the geometry as OCC point entities
        embedded_tags = []
        for p in points:
            t = gmsh.model.occ.addPoint(p[0], p[1], 0, lc_points)
            embedded_tags.append(t)

        gmsh.model.occ.synchronize()  # Synchronize OCC entities before using them in fields

        # Optionally embed the points in the surface so they become mesh nodes
        # (disabled here)
        # gmsh.model.mesh.embed(
        #     0, embedded_tags, 2, domain._id)

        gmsh.option.setNumber("Mesh.Algorithm", 6)

        # CRITICAL: Tell Gmsh NOT to force density based on the internal points
        gmsh.option.setNumber("Mesh.MeshSizeFromPoints", 0)
        gmsh.option.setNumber("Mesh.MeshSizeExtendFromBoundary", 0)

        # Allow triangles to be very large
        gmsh.option.setNumber("Mesh.CharacteristicLengthMax", lc)
        # Only limit the absolute minimum to prevent crashes
        gmsh.option.setNumber("Mesh.CharacteristicLengthMin", lc * 0.1)

        # Generate the 2D mesh
        gmsh.model.mesh.generate(2)

        gmsh.model.mesh.optimize("Laplace2D")
        gmsh.option.setNumber("Mesh.Smoothing", 10)

        # Allow the optimizer to move nodes more freely
        gmsh.option.setNumber("Mesh.Optimize", 1)
        gmsh.option.setNumber("Mesh.OptimizeNetgen", 1)

        mesh = geom.generate_mesh()

    return mesh

# %% Build the mesh for the AQ_pm10 observed variable
points = model.points[0]
mesh_io = buildMesh(buffer, 0.35, points)
print(mesh_io)

# Build the FEM solver on the mesh and the domain
fem_solver = FEMSolver(mesh_io, [Polygon(buffer)])

# Plot the mesh using the fem_solver utilities
fig, ax = plt.subplots(figsize=(8, 8))
fem_solver.plot_mesh(ax=ax)

# %% Set up the lrssm model (univariate latent)

# Pass the mesh object and the domain where the latent process is defined;
# if the domain is None, it is assumed to be the same as the observation domain
model = model.setup([mesh_io])

# %% Estimate the Model (default estimation options)
results = model.fit()
print(results)  # results.summary()


# %% Plot the negative log-likelihood curve
fig, ax = plt.subplots()
ax.plot(-np.array(results.llf_path[1:]))
ax.set_yscale('log')
ax.set_xlabel('Iteration')
ax.set_ylabel('Negative Log Likelihood')
ax.set_title('Negative Log Likelihood Curve')
ax.grid()
plt.show()
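The plotted curve can also be checked numerically. The sketch below, in plain Python, flags convergence when the relative change in the log-likelihood between consecutive iterations falls below a tolerance. It assumes only that results.llf_path behaves like a sequence of floats, as the plotting code implies; the helper names and the tolerance are hypothetical choices, and geossm's own stopping rule may differ.

```python
def relative_changes(llf_path):
    """Relative change |l_k - l_{k-1}| / max(|l_{k-1}|, 1e-12) per iteration."""
    changes = []
    for prev, curr in zip(llf_path[:-1], llf_path[1:]):
        changes.append(abs(curr - prev) / max(abs(prev), 1e-12))
    return changes


def first_converged_iteration(llf_path, tol=1e-6):
    """Index of the first iteration whose relative change is below tol, or None."""
    for k, change in enumerate(relative_changes(llf_path), start=1):
        if change < tol:
            return k
    return None


# Made-up log-likelihood values standing in for results.llf_path.
example_path = [-1500.0, -1200.0, -1150.0, -1149.9, -1149.8999]
print(relative_changes(example_path))
print(first_converged_iteration(example_path, tol=1e-6))
```

A result of None means the tolerance was never reached within the recorded iterations, which is a hint to inspect the estimation options before trusting the fit.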

Examples

The examples/ directory contains worked examples of the package.

Run any example with:

cd examples
python example_datasets_load.py

Documentation

API documentation is available from the following sources:

  • Source Code: See the src/geossm/ directory
  • Module Reference:
    • geossm.ssm — Core state space modeling
    • geossm.stmodel — Spatiotemporal models (LRSSM, SSM variants)
    • geossm.datasets — Built-in datasets and data loaders
    • geossm.data_preparation — Data preprocessing utilities
    • geossm.covmodel — Covariance model specifications

For detailed information on specific functions and classes, use Python's built-in help:

import geossm.ssm  # import the submodule explicitly so the lookup below works
help(geossm.ssm.StateSpaceModel)
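When the exact class name is not known in advance, the standard library's inspect module can list what a module defines before calling help on it. The helper public_classes below is a hypothetical convenience, demonstrated on the standard-library json.decoder module so that it runs without geossm installed.

```python
import importlib
import inspect


def public_classes(module_name):
    """Return sorted names of public classes defined in the given module."""
    module = importlib.import_module(module_name)
    names = []
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Keep only classes defined in this module, not ones it imports.
        if obj.__module__ == module.__name__ and not name.startswith("_"):
            names.append(name)
    return sorted(names)


print(public_classes("json.decoder"))
```

For geossm, the equivalent call would be public_classes("geossm.ssm"), assuming that submodule imports cleanly in your environment.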

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -am 'Add your feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

For questions or bug reports, please open an Issue.

License

This project is licensed under the MIT License — see the LICENSE file for details.

Citation

If you use GEOSSM in your research, please cite:

@phdthesis{rodeschini2025,
  author = {Rodeschini, Jacopo},
  title = {A State-Space Modelling Framework in Geostatistics with Application to Environmental Data},
  school = {University of Bergamo},
  year = {2025}
}

Contact

Author: Jacopo Rodeschini
Email: jacopo.rodeschini@unibg.it


Made with ❤️ for geospatial data science
