Machine Learning Tools for Geotechnical Earthquake Engineering.

kashima

Interactive Seismic Event Mapping and Catalog Management

Last updated: February 26, 2026

kashima is a Python library for seismic event visualization and catalog processing that produces interactive Folium-based web maps from global earthquake catalogs and auxiliary datasets.

What is it?

kashima focuses on the mapping workflow for engineering seismology: given one or more sites of interest, it builds reproducible web maps that combine:

  • Global earthquake catalogs (USGS ComCat, Global CMT NDK, ISC Bulletin)
  • Auxiliary fault databases (GEM, USGS Quaternary, EFSM20) and user GeoJSON faults
  • Global ISC station layer (from a packaged CSV, filtered to the map window)
  • Optional user event catalogs (e.g. blasts) in CSV or Parquet form

The heavy lifting (data download, caching, clipping and styling) is encapsulated in a small public API under kashima.mapper.

Who is this for? (TL;DR)

kashima is aimed at engineering seismology, seismic hazard and mining/energy projects where you need reproducible, shareable web maps of earthquakes, faults and stations around one or more sites.

The typical workflow is:

  1. Install the library: pip install kashima.
  2. Pre-populate the global cache once: downloadAllCatalogs(include_faults=True).
  3. Build a map for your site with buildMap(...) and open the generated maps/index.html in a browser.

Features

  • Multi-catalog support: USGS, Global CMT (NDK method), ISC, and custom blasts
  • GMDB maintenance helpers: update provider EventTable.<PROVIDER>.csv and StationTable.<PROVIDER>.csv files; fill audit IDs for events; optionally augment with generation mechanism + faulting style
  • GMM/GMPE evaluation helpers: evaluate Sa(T)/PGA from single-TRT GMPE logic tree seeds (OpenQuake hazardlib required at runtime). See examples/gmm/01_*..04_*.
  • GMDB Vs30 helpers: query and update station Vs30 from GMDB StationTable.*.csv, combining provider (Vs30.owner), neighbor-inferred (Vs30.neighbors) and USGS-proxy (Vs30.USGS) values into a synthetic StationVs30
  • STREC + Slab2 integration (optional but supported): bootstrap Slab2 grids and write auditable *.STREC columns for reviewer-friendly subduction subtype tagging
  • Interactive maps: Folium-based maps with beachball focal mechanisms, distance rings and rich tooltips
  • Global cache: Download catalogs and fault databases once, reuse across projects with incremental updates
  • Advanced visualizations: Heatmaps, clustered markers, epicentral circles, fault overlays
  • Auxiliary data: GEM Active Faults, USGS Quaternary Faults, EFSM20 fault databases
  • Global ISC stations: packaged CSV (~41k stations) automatically clipped to the map radius, with a dedicated layer
  • Multi-fault datasets: combine GEM, USGS Quaternary and EFSM20 faults (and local GeoJSONs) in a single color-coded layer
  • Reproducible projects: every map writes the catalogs actually used to ./maps and (optionally) ./data for auditability

Installation

Requires Python 3.8+.

pip install kashima

Development version:

git clone https://github.com/averrik/kashima.git
cd kashima
pip install -e .

Quickstart

  1. Install kashima (see above).
  2. Initialize the global cache (earthquake catalogs + optional fault databases). Run this once per machine:
from kashima.mapper import downloadAllCatalogs

# First-time setup: fills ~/.cache/kashima (Linux),
# ~/Library/Caches/kashima (macOS) or %LOCALAPPDATA%\kashima\Cache\ (Windows)
downloadAllCatalogs(include_faults=True)
  3. Build your first map around a site of interest:
from kashima.mapper import buildMap

result = buildMap(
    latitude=-32.86758,
    longitude=-68.88867,
    radius_km=500,
    project_name="Mendoza seismicity",
    client="Example Mining Co.",
)

print("HTML map:", result["html"])  # ./maps/index.html
print("Events CSV:", result["csv"])  # ./maps/epicenters.csv

Open the generated index.html in a browser to explore earthquakes, faults and stations interactively.
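
The cache locations listed in the first-time setup comment follow common per-platform conventions. As a rough illustration only, the sketch below resolves those three paths with the standard library. `guess_cache_dir` is a hypothetical helper written for this example and is not part of kashima; to find the real location, call `get_cache_dir()` from `kashima.mapper`.

```python
import os
import sys
from pathlib import PurePosixPath, PureWindowsPath


def guess_cache_dir(platform=sys.platform, home=None, local_appdata=None):
    """Hypothetical helper: mirror the cache paths named in the Quickstart comment."""
    home = home or os.path.expanduser("~")
    if platform.startswith("win"):
        # %LOCALAPPDATA%\kashima\Cache on Windows
        base = local_appdata or os.environ.get("LOCALAPPDATA", home)
        return str(PureWindowsPath(base) / "kashima" / "Cache")
    if platform == "darwin":
        # ~/Library/Caches/kashima on macOS
        return str(PurePosixPath(home) / "Library" / "Caches" / "kashima")
    # ~/.cache/kashima on Linux (assumed for any other platform as well)
    return str(PurePosixPath(home) / ".cache" / "kashima")


print(guess_cache_dir("linux", home="/home/user"))
```

Passing the platform and home directory as arguments keeps the function easy to check on any machine; the actual library may also honour environment overrides that this sketch ignores.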

Usage

GMDB EventTable workflows (audit + enrichment)

The kashima.gmdb module provides workflows for maintaining provider EventTables.

  • updateEventTable(...): fill usgsEventId / iscEventId, optionally remove non-resolvable rows, and augment rows with:
    • faultStyle (from moment tensors)
    • genMech + auditable *.STREC columns (STREC + Slab2)

Bootstrap STREC+Slab2 once per machine:

from kashima.gmdb import ensureStrecData

# Creates ~/.strec/config.ini and downloads Slab2 grids into the configured slab folder.
ensureStrecData(createConfig=True, ensureSlab2=True, downloadSlab2=True)

See docs/gmdb_eventtable.md, docs/gmdb_stationtable.md, and examples/gmdb/06_update_all_eventtables_with_strec.py.

GMDB Vs30 helpers (station site conditions)

GMDB StationTable.*.csv files carry several Vs30-related columns (Vs30.owner, Vs30.neighbors, Vs30.USGS, StationVs30). The kashima.gmdb helpers expose these through a small, file-based API so you do not have to re-implement column semantics.

Query helpers:

from kashima.gmdb import (
    getStationVs30,
    getStationVs30Sources,
    getStationVs30Index,
)

# Single-station Vs30 (synthetic by default: owner > neighbors > USGS)
vs30 = getStationVs30(
    "NWZ",
    "WEL/THZ",
    indexDir="~/kashimaDB/gmdb.v2/index",
    source="synthetic",  # or "owner" / "neighbors" / "usgs"
)

# Full provenance for one station
sources = getStationVs30Sources("NWZ", "WEL/THZ", indexDir="~/kashimaDB/gmdb.v2/index")

# Bulk index: StationID -> Vs30Sources
idx = getStationVs30Index("NWZ", indexDir="~/kashimaDB/gmdb.v2/index")
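
The "synthetic" source is described above as a priority rule (owner > neighbors > USGS). The following is a minimal sketch of that precedence, assuming missing values are represented as None or NaN. `Vs30Record` and `synthetic_vs30` are hypothetical names used only for illustration; they are not the library's own types or functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vs30Record:
    """Illustrative container; field names are assumptions, not kashima's API."""
    owner: Optional[float] = None      # provider-reported value (Vs30.owner)
    neighbors: Optional[float] = None  # inferred from nearby stations (Vs30.neighbors)
    usgs: Optional[float] = None       # USGS proxy value (Vs30.USGS)


def synthetic_vs30(rec: Vs30Record) -> Optional[float]:
    """Return the first usable value in priority order owner > neighbors > USGS."""
    for value in (rec.owner, rec.neighbors, rec.usgs):
        # `value > 0` is False for NaN, so NaN entries are skipped too.
        if value is not None and value > 0:
            return value
    return None


print(synthetic_vs30(Vs30Record(owner=None, neighbors=420.0, usgs=300.0)))
```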

Update helpers (flatfiles → Vs30.owner → full pipeline):

from kashima.gmdb import (
    updateStationVs30OwnerFromFlatfiles,
    updateStationVs30Pipeline,
)

# Refresh Vs30.owner from provider flatfiles
status = updateStationVs30OwnerFromFlatfiles(
    "NGAW",
    rootDir="~/kashimaDB/gmdb.v2",
    indexDir="~/kashimaDB/gmdb.v2/index",
)

# Run the full Vs30 pipeline (owner + neighbors + USGS + synthetic rebuild)
r = updateStationVs30Pipeline(
    "NGAW",
    rootDir="~/kashimaDB/gmdb.v2",
    indexDir="~/kashimaDB/gmdb.v2/index",
    distanceKm=1.0,
    includeLocOnly=True,
    usgsGrid=None,  # default: downloads global_vs30.grd into the user cache (if missing)
)

See docs/gmdb_vs30.md and examples/gmdb/10_vs30_helpers.py for more details.

GMDB provider ingestion (no download)

kashima.gmdb also includes short helpers to ingest already-downloaded provider data into a GMDB root:

  • ingestFDSNProvider(...): restage + normalize + base tables + RecordTables
  • updateProviderIndex(...): completion step for EventTable/StationTable/Vs30
  • validateProvider(...): RecordTable + filesystem validation
  • auditProviderFDSN(...): optional online audit against FDSN endpoints

MiniSEED payload policy:

  • MiniSEED-based owners (NCEDC/SCEDC/NN/PNSN) are normalized to ASCII .txt payloads in raw.owner/.
  • On apply/rewriteExisting, the original .mseed files are removed (no intermediate artifacts left).

PGA sanity workflow (recommended for MiniSEED owners):

  1. Rebuild RecordTables from payloads.
  2. Audit outliers.
  3. If sentinel-derived spikes exist in already-materialized .txt, repair payloads and rebuild again.
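
To give a feel for what the audit step might look for, here is a small sketch that flags suspicious PGA values with a robust z-score (median and MAD) on log10(PGA). This is an assumed technique chosen for illustration; it is not necessarily the method used by the audit script listed below, and `flag_pga_outliers` and its threshold are hypothetical.

```python
import math
import statistics


def flag_pga_outliers(pga_values, threshold=6.0):
    """Return indices of non-positive values or values whose log10 robust z-score exceeds `threshold`."""
    logs = [math.log10(v) for v in pga_values if v > 0]
    med = statistics.median(logs)
    mad = statistics.median(abs(x - med) for x in logs)
    # 1.4826 scales the MAD to a standard deviation for normal data;
    # fall back to a tiny scale if all values are identical.
    scale = (1.4826 * mad) or 1e-12
    flagged = []
    for i, v in enumerate(pga_values):
        if v <= 0 or abs(math.log10(v) - med) / scale > threshold:
            flagged.append(i)
    return flagged


# A sentinel-like spike (9999.0) among otherwise plausible PGA values
print(flag_pga_outliers([0.01, 0.02, 0.015, 0.03, 0.012, 9999.0]))
```

Working in log space reflects the roughly log-normal spread of ground-motion amplitudes, so a sentinel-derived spike stands out by several orders of magnitude rather than being masked by a few large genuine records.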

Scripts:

  • Audit PGA outliers: examples/gmdb/23_audit_recordtable_pga_outliers.py
  • Repair .txt sentinels (dry-run default; requires --apply): examples/gmdb/24_repair_fdsn_txt_sentinels.py

Examples:

  • examples/gmdb/19_ingest_fdsn_provider.py
  • examples/gmdb/21_ingest_fdsn_provider_end_to_end.py
  • examples/gmdb/22_normalize_raw_owner.py

Map layers and concepts

Each map produced by buildMap is composed of several layers that you can turn on/off in the Folium LayerControl:

  • Events: epicentral points coloured and sized by magnitude, coming from USGS/GCMT/ISC or an optional user CSV.
  • Clustered view: an alternative representation where nearby events are grouped into clusters to reduce overplotting.
  • Heatmap: a smoothed density field of events, controlled by the heatmap_* parameters.
  • Beachballs: focal mechanisms (from GCMT) drawn as beachball symbols for events above a given magnitude.
  • Faults: line features from global fault databases (GEM, USGS Quaternary, EFSM20) selected via fault_sets, plus any local GeoJSON passed in faults_files, all clipped to the same geographic window as the events.
  • Stations: global ISC stations from the packaged CSV (or your own station_csv_path), clipped to the map window and rendered as square markers.
  • Site marker: a star symbol at the site location (latitude, longitude).
  • Epicentral circles: concentric distance rings around the site, controlled by epicentral_circles.

The parameters of buildMap let you decide which of these layers to include and how they look.

High-level map API: buildMap

The main entry point is kashima.mapper.buildMap. It:

  • Copies the latest cached USGS/ISC/GCMT catalogs into a project-local data/ directory
  • Optionally merges global fault databases (GEM, USGS Quaternary, EFSM20) and user GeoJSON faults
  • Adds a global ISC stations layer by default (unless you override it)
  • Builds a Folium map and writes maps/index.html + maps/epicenters.csv

Minimal call (requires a pre-populated cache, see Quickstart):

from kashima.mapper import buildMap

result = buildMap(
    latitude=-32.86758,
    longitude=-68.88867,
)

A more realistic example using multiple layers, fault sets and local faults:

from kashima.mapper import buildMap

result = buildMap(
    latitude=-12.90795,
    longitude=+15.24845,
    radius_km=3500,
    # Layer visibility
    show_events_default=True,
    show_cluster_default=False,
    show_heatmap_default=True,
    show_beachballs_default=True,
    show_faults_default=True,
    show_epicentral_circles_default=True,
    # Fault datasets: global cache + local GeoJSONs
    fault_sets=["gem", "usgs", "efsm20"],
    faults_files=[
        "examples/mapper/faults/Angola1982.geojson",
        "examples/mapper/faults/Escosa2024.geojson",
    ],
    # Stations: default ISC CSV from cache, custom title
    station_layer_title="ISC + local stations",
    # Keep ./data snapshot for documentation
    keep_data=True,
)

print(result)

Key parameter groups (see help(buildMap) for the full list and defaults):

  • Location & radius (latitude, longitude, radius_km, event_radius_multiplier): define the geographic window of the map. radius_km sets the base radius, and event_radius_multiplier scales that radius when computing the spatial window used for events, faults and stations.
  • Layers (show_events_default, show_cluster_default, show_heatmap_default, show_beachballs_default, show_faults_default, show_stations_default, show_epicentral_circles_default): control which layers are visible when the map opens. Users can still toggle them later via the Folium LayerControl.
  • Catalogs & data (user_events_csv, keep_data, output_dir): override the global catalogs with your own CSV, preserve the ./data snapshot for auditability and choose where maps/ and data/ are written.
  • Fault configuration (fault_sets, faults_files, regional_faults_color, regional_faults_weight, faults_coordinate_system): select which cached fault databases (any subset of "gem", "usgs", "efsm20") are merged and which extra GeoJSON faults to add (for example the Angola files used in examples/mapper/longonjo.py), and how they are styled.
  • Stations (station_csv_path, station_coordinate_system, station_layer_title, show_stations_default): keep the default global ISC stations or replace them with your own CSV, adjusting CRS and layer title for the stations layer. Use show_stations_default=False to start with the stations layer turned off.
  • Styling & legend (mag_bins, dot_palette, dot_sizes, beachball_sizes, fault_style_meta, color_palette, color_reversed, scaling_factor, legend_title, legend_position): control how magnitudes map to colours and sizes and how the legend is rendered. Much of the visual power of examples like examples/mapper/longonjo.py comes from careful tuning of these parameters.
  • XY coordinates (x_col, y_col, location_crs): work in projected coordinates (for example local UTM) instead of latitude/longitude, useful when your input catalogs are already in a local CRS.
  • Tooltips (tooltip_fields, legend_map): choose which event fields appear in the tooltip and how they are labelled.
  • Map behavior (base_zoom_level, min_zoom_level, max_zoom_level, default_tile_layer, auto_fit_bounds, lock_pan, epicentral_circles): control the initial view (zoom levels and base tile layer) and how many distance rings are drawn around the site via epicentral_circles. auto_fit_bounds and lock_pan exist for future map-behaviour controls and may have no visible effect in some versions; use help(buildMap) for the authoritative description.
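
The Styling & legend group above maps magnitudes to colours and sizes through bins. The sketch below shows the general idea with a plain lookup; the list formats, edge values, colours and the `style_for_magnitude` helper are assumptions made for illustration, so check help(buildMap) for the formats that mag_bins, dot_palette and dot_sizes actually accept.

```python
from bisect import bisect_right

# Hypothetical configuration: 4 edges define 5 magnitude bins.
MAG_EDGES = [4.0, 5.0, 6.0, 7.0]
COLOURS = ["#ffffb2", "#fecc5c", "#fd8d3c", "#f03b20", "#bd0026"]  # one per bin
SIZES = [3, 5, 8, 12, 17]                                          # marker radius per bin


def style_for_magnitude(mag):
    """Pick the colour and marker radius of the bin that contains `mag`."""
    i = bisect_right(MAG_EDGES, mag)  # 0 for mag < 4.0, 4 for mag >= 7.0
    return {"bin": i, "colour": COLOURS[i], "radius": SIZES[i]}


for m in (3.2, 5.0, 7.5):
    print(m, style_for_magnitude(m))
```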

buildMap returns a small dictionary:

{
    "html": "path/to/index.html",
    "csv": "path/to/epicenters.csv",
    "event_count": 1234,
}
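
In a scripted pipeline it can be useful to sanity-check this dictionary before publishing the map. The sketch below assumes only the three keys shown above; `check_map_result` is a hypothetical helper, not something kashima provides.

```python
from pathlib import Path


def check_map_result(result):
    """Return a list of human-readable problems; an empty list means the outputs look sane."""
    problems = []
    for key in ("html", "csv"):
        value = result.get(key)
        if not value or not Path(value).is_file():
            problems.append(f"missing {key} output: {value!r}")
    if int(result.get("event_count", 0)) == 0:
        problems.append("no events inside the map window")
    return problems


# Example with paths that do not exist and an empty catalog
demo = {"html": "no_such_dir/index.html", "csv": "no_such_dir/epicenters.csv", "event_count": 0}
for problem in check_map_result(demo):
    print("WARNING:", problem)
```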

Catalog API: buildCatalog

For scripted data pipelines you can call buildCatalog directly to fetch and save catalogs without generating maps.

from kashima.mapper import buildCatalog

# Radial USGS query around a site
result = buildCatalog(
    source="usgs",
    output_path="data/usgs-events.csv",
    latitude=-32.86758,
    longitude=-68.88867,
    max_radius_km=500,
    min_magnitude=5.0,
    start_time="2010-01-01",
    end_time="2024-12-31",
)
print(f"Downloaded {result['event_count']} events from {result['source']}")

# Full global catalog (no spatial filter)
result = buildCatalog(
    source="gcmt",
    output_path="data/gcmt-full.csv",
    min_magnitude=5.5,
)

Supported sources are "usgs", "gcmt" and "isc"; "blast" is planned for a future release (see the buildCatalog docstring for details and current status).

Global cache & updates

kashima maintains a global cache so catalogs and fault databases are downloaded once and reused across all projects.

from kashima.mapper import (
    downloadAllCatalogs,
    updateAllCatalogs,
    get_cache_dir,
    clear_cache,
)

# One-time setup (or when you want to pre-populate everything)
catalogs = downloadAllCatalogs(include_faults=True)
print("Cache directory:", catalogs["cache_dir"])

# Incremental update (new events only + refreshed fault databases)
updated = updateAllCatalogs(include_faults=True)
print("New USGS events:", updated["usgs_new"])

# Inspect cache location
print("Cache lives in:", get_cache_dir())

# Optional: clear a catalog if needed
# clear_cache("usgs")

On first use, downloadAllCatalogs also copies any bundled data shipped inside the wheel into the cache, so initial setup is often instant.
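
Conceptually, an incremental update appends only events that are not yet in the cached catalog. The sketch below illustrates that idea with de-duplication by event ID on plain dictionaries. It is an assumption about the general approach, not a description of how updateAllCatalogs is implemented, and `merge_new_events` and the "id" key are hypothetical.

```python
def merge_new_events(cached, fetched, id_key="id"):
    """Append rows from `fetched` whose ID is not already in `cached`.

    Returns the merged list and the number of newly added rows.
    """
    seen = {row[id_key] for row in cached}
    new_rows = []
    for row in fetched:
        if row[id_key] not in seen:
            seen.add(row[id_key])  # also guards against duplicates within `fetched`
            new_rows.append(row)
    return cached + new_rows, len(new_rows)


cached = [{"id": "us1000abcd", "mag": 6.1}]
fetched = [{"id": "us1000abcd", "mag": 6.1}, {"id": "us7000wxyz", "mag": 5.4}]
merged, added = merge_new_events(cached, fetched)
print(added, [row["id"] for row in merged])
```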

Fault databases

Global fault datasets live in the cache as GeoJSON files and are consumed automatically by buildMap when show_faults_default=True. You can also work with them explicitly via:

  • buildGEMActiveFaults()
  • buildUSGSQuaternaryFaults()
  • buildEFSM20Faults()

Use fault_sets to choose which cached datasets to merge (any subset of "gem", "usgs", "efsm20") and faults_files to add custom GeoJSON faults (for example the Angola examples in examples/mapper/faults/).
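
Merging several fault datasets into one layer amounts to concatenating GeoJSON FeatureCollections while remembering where each feature came from. The following sketch does this on already-parsed GeoJSON dictionaries; the `dataset` property name and the `merge_fault_collections` helper are assumptions for illustration and may differ from what buildMap does internally.

```python
def merge_fault_collections(collections):
    """Merge FeatureCollections, tagging each feature with its dataset label.

    `collections` maps a label (e.g. "gem") to a parsed GeoJSON FeatureCollection.
    """
    merged = {"type": "FeatureCollection", "features": []}
    for label, fc in collections.items():
        for feature in fc.get("features", []):
            props = dict(feature.get("properties") or {})
            props["dataset"] = label  # hypothetical property used for colour-coding
            merged["features"].append({**feature, "properties": props})
    return merged


line = {"type": "LineString", "coordinates": [[15.0, -12.0], [15.5, -12.5]]}
gem = {"type": "FeatureCollection",
       "features": [{"type": "Feature", "geometry": line, "properties": {"name": "F1"}}]}
local = {"type": "FeatureCollection",
         "features": [{"type": "Feature", "geometry": line, "properties": None}]}

merged = merge_fault_collections({"gem": gem, "local": local})
print([f["properties"]["dataset"] for f in merged["features"]])
```

Copying the properties before tagging leaves the input collections unmodified, which matters if the same cached dataset is reused for several maps in one session.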

Station layer

By default buildMap adds a global ISC stations layer:

  • The CSV isc_stations.csv is bundled inside the package and copied to the cache on first use.
  • When you do not pass station_csv_path, buildMap reads stations from the cache, clips them to the same geographic window as the events and adds them as a toggleable layer.
  • If you pass station_csv_path, your CSV is used instead and the default ISC stations are ignored.
  • Note: passing an empty string for station_csv_path raises an error; omit it to use the default ISC stations.
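
Clipping stations to the map window can be pictured as a great-circle distance filter around the site. The sketch below assumes a circular window and rows with "lat" and "lon" keys; the actual window shape, the column names in isc_stations.csv, and the helper names here are assumptions, not kashima's implementation.

```python
import math

MEAN_EARTH_RADIUS_KM = 6371.0  # assumed value for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * MEAN_EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def clip_to_radius(rows, site_lat, site_lon, radius_km):
    """Keep rows whose (lat, lon) lies within `radius_km` of the site."""
    return [r for r in rows
            if haversine_km(site_lat, site_lon, r["lat"], r["lon"]) <= radius_km]


stations = [
    {"code": "AAA", "lat": -32.9, "lon": -68.9},  # a few km from the site
    {"code": "BBB", "lat": -33.4, "lon": -70.6},  # roughly 170 km away
    {"code": "CCC", "lat": 40.0, "lon": -3.7},    # another continent
]
nearby = clip_to_radius(stations, -32.86758, -68.88867, radius_km=500)
print([s["code"] for s in nearby])
```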

Examples

Complete, runnable workflows live in examples/mapper/:

  • Catalog setup & maintenance
    • 00_download_catalogs.py, 00_update_catalogs.py
    • 01_usgs_catalog.py, 02_gcmt_catalog.py, 03_isc_catalog.py, 03_update_catalogs.py, 04_rebuild_cache.py
  • Basic and intermediate maps
    • 04_minimal_map.py, 05_map_with_beachballs.py, 06_map_with_custom_legend.py, 07_map_with_heatmap.py, 08_map_with_faults.py, 09_map_advanced_config.py, longonjo.py
  • Fault databases & stations
    • 05_custom_faults.py, 06_update_active_faults.py, 07_compile_all_fault_databases.py, 08_update_all_catalogs_and_faults.py
  • Custom catalogs
    • 10_blast_catalog.py

Use these scripts as living documentation of typical workflows and advanced configuration.

Offline GMDB workflows live in examples/gmdb/:

  • 01_export_offline_catalog.py, 02_resolve_single_event.py
  • 03_enrich_eventtable_csv.py, 04_enrich_with_robust_inputs.py
  • 05_update_eventtable_csv.py (recommended for maintaining EventTable.<PROVIDER>.csv)

GMM evaluation examples live in examples/gmm/:

  • 01_sa_longtable_distances.py
  • 02_sa_longtable_planar_corners.py
  • 03_sa_longtable_from_r.py

Advanced example: examples/mapper/longonjo.py

This script demonstrates a project-style map centred on Angola with:

  • custom magnitude bins, colour scales and point sizes tuned for satellite imagery;
  • a mix of global fault databases and multiple regional fault GeoJSON files passed via faults_files;
  • a large search radius (radius_km=3500), satellite base tiles, locked panning and keep_data=True so the generated data/ directory can be inspected or versioned.

Use it as a template for real engineering projects: copy the script, adjust coordinates, radius, faults_files and project metadata, and you will obtain a map suitable for inclusion in technical reports.

API overview

The public API of kashima.mapper is defined by what is exported from kashima/mapper/__init__.py. The most important entry points are:

  • Map & catalogs
    • buildMap(): high-level map builder
    • buildCatalog(): generic catalog builder (source="usgs" | "gcmt" | "isc")
    • buildUSGSCatalog(), buildGCMTCatalog(), buildISCCatalog()
  • Faults & auxiliary data
    • buildGEMActiveFaults(), buildUSGSQuaternaryFaults(), buildEFSM20Faults()
  • Cache management
    • downloadAllCatalogs(), updateAllCatalogs()
    • updateUSGSCatalog(), updateGCMTCatalog(), updateISCCatalog()
    • updateGEMActiveFaults(), updateUSGSQuaternaryFaults(), updateEFSM20Faults()
    • get_cache_dir(), clear_cache()
  • Core classes & configuration
    • MapConfig, EventConfig, FaultConfig, StationConfig, BlastConfig
    • BaseMap, USGSCatalog, GCMTCatalog, BlastCatalog, EventMap
    • Constants: EARTH_RADIUS_KM, TILE_LAYERS, calculate_zoom_level()

For the definitive parameter list and defaults, always refer to the Python docstrings:

from kashima.mapper import buildMap, buildCatalog
help(buildMap)
help(buildCatalog)

Help & documentation

  • GitHub Pages: the rendered docs site includes a landing page at docs/ (see docs/index.md).
  • API help: use Python's built-in tools, for example python -m pydoc kashima.mapper.buildMap or help(buildMap) / help(buildCatalog) from an interactive session.
  • User and internal docs: additional notes live in the docs/ directory of the source repository:
    • docs/user_guide.md: offline resolver workflows
    • docs/gmdb_provider_ingestion.md: provider ingestion (no download) + payload policy + QA scripts
    • docs/gmdb_eventtable.md: GMDB provider EventTable maintenance via kashima.gmdb.updateEventTable(...)
    • docs/gmdb_stationtable.md: GMDB provider StationTable maintenance via kashima.gmdb.updateStationTable(...)
    • docs/gmdb_vs30.md: GMDB Vs30 query/update helpers (getStationVs30(...), updateStationVs30Pipeline(...))
    • docs/gmdb_naming_conventions.md, docs/naming_conventions.md, docs/mapper_layers_plan.md
  • Examples as a manual: the scripts under examples/mapper/ illustrate end-to-end workflows and advanced configuration.

Documentation for developers

Some internal design notes and naming rules live in docs/:

  • docs/naming_conventions.md: what is considered public API vs. internal helpers and the naming scheme used across the package.
  • docs/mapper_layers_plan.md: design notes for the ISC stations layer, fault handling and cache behavior.

These documents are primarily for contributors; end users should not rely on internal helpers, only on the public API listed above.

Command-line interface / man page

At the moment kashima is a Python library only: it does not install a standalone kashima command and does not ship a Unix man page.

For built-in help, use Python's introspection:

# From the shell
python -m pydoc kashima.mapper.buildMap
python -m pydoc kashima.mapper.buildCatalog

or from a Python session:

from kashima.mapper import buildMap
help(buildMap)

Dependencies

Core runtime dependencies (installed automatically by pip install kashima):

  • Python (>= 3.8)
  • pandas, numpy, folium, geopandas, pyproj
  • requests, branca, geopy, matplotlib
  • obspy (beachball rendering), pyarrow (parquet cache)

Some fault builders download large GeoJSONs or talk to WFS services and therefore require a working internet connection the first time you run them.

License

MIT License - see LICENSE (to be added)

Citation

@software{kashima2022,
  author = {Verri Kozlowski, Alejandro},
  title = {kashima: Interactive Seismic Event Mapping and Catalog Management},
  year = {2022},
  version = {3.0.0},
  url = {https://averrik.github.io/kashima/}
}

Author

Alejandro Verri Kozlowski
Email: averri@fi.uba.ar
ORCID: 0000-0002-8535-1170
Affiliation: Universidad de Buenos Aires, Facultad de Ingeniería
