Machine Learning Tools for Geotechnical Earthquake Engineering.
kashima
Interactive Seismic Event Mapping and Catalog Management
Last updated: February 26, 2026
kashima is a Python library for seismic event visualization and catalog processing that produces interactive Folium-based web maps from global earthquake catalogs and auxiliary datasets.
What is it?
kashima focuses on the mapping workflow for engineering seismology: given one or more sites of interest, it builds reproducible web maps that combine:
- Global earthquake catalogs (USGS ComCat, Global CMT NDK, ISC Bulletin)
- Auxiliary fault databases (GEM, USGS Quaternary, EFSM20) and user GeoJSON faults
- Global ISC station layer (from a packaged CSV, filtered to the map window)
- Optional user event catalogs (e.g. blasts) in CSV or Parquet form
The heavy lifting (data download, caching, clipping and styling) is encapsulated in a small public API under kashima.mapper.
Who is this for? (TL;DR)
kashima is aimed at engineering seismology, seismic hazard and mining/energy projects where you need reproducible, shareable web maps of earthquakes, faults and stations around one or more sites.
The typical workflow is:
- Install the library: pip install kashima.
- Pre-populate the global cache once: downloadAllCatalogs(include_faults=True).
- Build a map for your site with buildMap(...) and open the generated maps/index.html in a browser.
Features
- Multi-catalog support: USGS, Global CMT (NDK method), ISC, and custom blasts
- GMDB maintenance helpers: update provider EventTable.<PROVIDER>.csv and StationTable.<PROVIDER>.csv files; fill audit IDs for events; optionally augment with generation mechanism + faulting style
- GMM/GMPE evaluation helpers: evaluate Sa(T)/PGA from single-TRT GMPE logic tree seeds (OpenQuake hazardlib required at runtime). See examples/gmm/01_*..04_*.
- GMDB Vs30 helpers: query and update station Vs30 from GMDB StationTable.*.csv, combining provider (Vs30.owner), neighbor-inferred (Vs30.neighbors) and USGS-proxy (Vs30.USGS) values into a synthetic StationVs30
- STREC + Slab2 integration (optional but supported): bootstrap Slab2 grids and write auditable *.STREC columns for reviewer-friendly subduction subtype tagging
- Interactive maps: Folium-based maps with beachball focal mechanisms, distance rings and rich tooltips
- Global cache: Download catalogs and fault databases once, reuse across projects with incremental updates
- Advanced visualizations: Heatmaps, clustered markers, epicentral circles, fault overlays
- Auxiliary data: GEM Active Faults, USGS Quaternary Faults, EFSM20 fault databases
- Global ISC stations: packaged CSV (~41k stations) automatically clipped to the map radius, with a dedicated layer
- Multi-fault datasets: combine GEM, USGS Quaternary and EFSM20 faults (and local GeoJSONs) in a single color-coded layer
- Reproducible projects: every map writes the catalogs actually used to ./maps and (optionally) ./data for auditability
Installation
Requires Python 3.8+.
pip install kashima
Development version:
git clone https://github.com/averrik/kashima.git
cd kashima
pip install -e .
Quickstart
- Install kashima (see above).
- Initialize the global cache (earthquake catalogs + optional fault databases). Run this once per machine:
from kashima.mapper import downloadAllCatalogs
# First-time setup: fills ~/.cache/kashima (Linux),
# ~/Library/Caches/kashima (macOS) or %LOCALAPPDATA%\kashima\Cache\ (Windows)
downloadAllCatalogs(include_faults=True)
- Build your first map around a site of interest:
from kashima.mapper import buildMap
result = buildMap(
    latitude=-32.86758,
    longitude=-68.88867,
    radius_km=500,
    project_name="Mendoza seismicity",
    client="Example Mining Co.",
)
print("HTML map:", result["html"]) # ./maps/index.html
print("Events CSV:", result["csv"]) # ./maps/epicenters.csv
Open the generated index.html in a browser to explore earthquakes, faults and stations interactively.
Usage
GMDB EventTable workflows (audit + enrichment)
The kashima.gmdb module provides workflows for maintaining provider EventTables.
- updateEventTable(...): fill usgsEventId/iscEventId, optionally remove non-resolvable rows, and augment rows with:
  - faultStyle (from moment tensors)
  - genMech + auditable *.STREC columns (STREC + Slab2)
Bootstrap STREC+Slab2 once per machine:
from kashima.gmdb import ensureStrecData
# Creates ~/.strec/config.ini and downloads Slab2 grids into the configured slab folder.
ensureStrecData(createConfig=True, ensureSlab2=True, downloadSlab2=True)
See docs/gmdb_eventtable.md, docs/gmdb_stationtable.md, and examples/gmdb/06_update_all_eventtables_with_strec.py.
GMDB Vs30 helpers (station site conditions)
GMDB StationTable.*.csv files carry several Vs30-related columns (Vs30.owner, Vs30.neighbors, Vs30.USGS, StationVs30).
The kashima.gmdb helpers expose these through a small, file-based API so you do not have to re-implement column semantics.
Query helpers:
from kashima.gmdb import (
    getStationVs30,
    getStationVs30Sources,
    getStationVs30Index,
)
# Single-station Vs30 (synthetic by default: owner > neighbors > USGS)
vs30 = getStationVs30(
    "NWZ",
    "WEL/THZ",
    indexDir="~/kashimaDB/gmdb.v2/index",
    source="synthetic",  # or "owner" / "neighbors" / "usgs"
)
# Full provenance for one station
sources = getStationVs30Sources("NWZ", "WEL/THZ", indexDir="~/kashimaDB/gmdb.v2/index")
# Bulk index: StationID -> Vs30Sources
idx = getStationVs30Index("NWZ", indexDir="~/kashimaDB/gmdb.v2/index")
Update helpers (flatfiles → Vs30.owner → full pipeline):
from kashima.gmdb import (
    updateStationVs30OwnerFromFlatfiles,
    updateStationVs30Pipeline,
)
# Refresh Vs30.owner from provider flatfiles
status = updateStationVs30OwnerFromFlatfiles(
    "NGAW",
    rootDir="~/kashimaDB/gmdb.v2",
    indexDir="~/kashimaDB/gmdb.v2/index",
)
# Run the full Vs30 pipeline (owner + neighbors + USGS + synthetic rebuild)
r = updateStationVs30Pipeline(
    "NGAW",
    rootDir="~/kashimaDB/gmdb.v2",
    indexDir="~/kashimaDB/gmdb.v2/index",
    distanceKm=1.0,
    includeLocOnly=True,
    usgsGrid=None,  # default: downloads global_vs30.grd into the user cache (if missing)
)
See docs/gmdb_vs30.md and examples/gmdb/10_vs30_helpers.py for more details.
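The precedence behind the synthetic value (owner > neighbors > USGS) can be illustrated without touching any GMDB files. The sketch below is not kashima code: the function names, the dictionary keys and the rule that a usable value is a finite positive number are all assumptions made for illustration.

```python
import math

# Precedence stated in the text for the synthetic value: owner > neighbors > USGS.
PRECEDENCE = ("owner", "neighbors", "usgs")


def is_valid(value):
    """True for a usable Vs30 value (assumed here: a finite, positive number)."""
    return isinstance(value, (int, float)) and math.isfinite(value) and value > 0


def pick_synthetic_vs30(sources):
    """Return (value, source_name) for the first valid source, else (None, None)."""
    for name in PRECEDENCE:
        value = sources.get(name)
        if is_valid(value):
            return float(value), name
    return None, None


# Hypothetical station: no provider value, but neighbor and USGS-proxy values exist.
station = {"owner": float("nan"), "neighbors": 412.0, "usgs": 380.0}
print(pick_synthetic_vs30(station))  # (412.0, 'neighbors')
```

Returning the source name next to the value keeps the provenance visible, which is the same idea as the separate Vs30.* columns in the StationTable files.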
GMDB provider ingestion (no download)
kashima.gmdb also includes short helpers to ingest already-downloaded provider data into a GMDB root:
- ingestFDSNProvider(...): restage + normalize + base tables + RecordTables
- updateProviderIndex(...): completion step for EventTable/StationTable/Vs30
- validateProvider(...): RecordTable + filesystem validation
- auditProviderFDSN(...): optional online audit against FDSN endpoints
MiniSEED payload policy:
- MiniSEED-based owners (NCEDC/SCEDC/NN/PNSN) are normalized to ASCII .txt payloads in raw.owner/.
- On apply/rewriteExisting, the original .mseed files are removed (no intermediate artifacts left).
PGA sanity workflow (recommended for MiniSEED owners):
- Rebuild RecordTables from payloads.
- Audit outliers.
- If sentinel-derived spikes exist in already-materialized .txt, repair payloads and rebuild again.
Scripts:
- Audit PGA outliers: examples/gmdb/23_audit_recordtable_pga_outliers.py
- Repair .txt sentinels (dry-run default; requires --apply): examples/gmdb/24_repair_fdsn_txt_sentinels.py
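Why a sentinel shows up as a PGA spike can be seen on a toy trace. This is a minimal sketch and does not reproduce the scripts above: the sentinel value -12345.0, the plausibility limit and all function names are assumptions, and the samples are unitless.

```python
def peak_abs(samples):
    """Peak absolute amplitude of a trace (PGA when samples are accelerations)."""
    return max(abs(s) for s in samples)


def strip_sentinels(samples, sentinel=-12345.0, tol=1e-6):
    """Drop samples equal to a gap-fill sentinel (value assumed for illustration)."""
    return [s for s in samples if abs(s - sentinel) > tol]


def flag_outliers(pga_by_record, limit):
    """Return the record IDs whose PGA exceeds a plausibility limit."""
    return sorted(rid for rid, pga in pga_by_record.items() if pga > limit)


# Toy trace with one sentinel sample left in the payload.
trace = [0.01, -0.03, -12345.0, 0.02, 0.05]
raw_pga = peak_abs(trace)                     # dominated by the sentinel
clean_pga = peak_abs(strip_sentinels(trace))  # peak of the real samples

# Audit step: only the unrepaired record is flagged.
print(flag_outliers({"raw": raw_pga, "repaired": clean_pga}, limit=50.0))  # ['raw']
```

The order matters: auditing first tells you which payloads need repair, and rebuilding afterwards ensures the stored PGA reflects the repaired samples.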
Examples:
- examples/gmdb/19_ingest_fdsn_provider.py
- examples/gmdb/21_ingest_fdsn_provider_end_to_end.py
- examples/gmdb/22_normalize_raw_owner.py
Map layers and concepts
Each map produced by buildMap is composed of several layers that you can turn on/off in the Folium LayerControl:
- Events: epicentral points coloured and sized by magnitude, coming from USGS/GCMT/ISC or an optional user CSV.
- Clustered view: an alternative representation where nearby events are grouped into clusters to reduce overplotting.
- Heatmap: a smoothed density field of events, controlled by the heatmap_* parameters.
- Beachballs: focal mechanisms (from GCMT) drawn as beachball symbols for events above a given magnitude.
- Faults: line features from global fault databases (GEM, USGS Quaternary, EFSM20) selected via fault_sets, plus any local GeoJSON passed in faults_files, all clipped to the same geographic window as the events.
- Stations: global ISC stations from the packaged CSV (or your own station_csv_path), clipped to the map window and rendered as square markers.
- Site marker: a star symbol at the site location (latitude, longitude).
- Epicentral circles: concentric distance rings around the site, controlled by epicentral_circles.
These layers showcase most of the power of kashima; the parameters of buildMap let you decide which ones to include and how they look.
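The Events layer colours and sizes points by magnitude. One common way to implement that kind of mapping is a sorted list of bin edges with a binary search, sketched below. The edges, colours and radii are made-up values, not kashima defaults, and style_for is a hypothetical helper rather than part of the library.

```python
from bisect import bisect_right

# Hypothetical bin edges, colours and radii; not kashima defaults.
MAG_EDGES = [4.0, 5.0, 6.0, 7.0]  # boundaries between the five bins
COLOURS = ["#ffffb2", "#fecc5c", "#fd8d3c", "#f03b20", "#bd0026"]
RADII = [2, 4, 6, 9, 12]          # marker radius in pixels


def style_for(magnitude):
    """Return (colour, radius) for the bin that contains the magnitude.

    Bins are closed on the left: a magnitude equal to an edge falls in the
    bin that starts at that edge.
    """
    i = bisect_right(MAG_EDGES, magnitude)
    return COLOURS[i], RADII[i]


for m in (3.2, 5.0, 6.9, 7.4):
    print(m, style_for(m))
```

With N edges there are N + 1 bins, so the colour and size lists must each have one more entry than the edge list.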
High-level map API: buildMap
The main entry point is kashima.mapper.buildMap. It:
- Copies the latest cached USGS/ISC/GCMT catalogs into a project-local data/ directory
- Optionally merges global fault databases (GEM, USGS Quaternary, EFSM20) and user GeoJSON faults
- Adds a global ISC stations layer by default (unless you override it)
- Builds a Folium map and writes maps/index.html + maps/epicenters.csv
Minimal call (requires a pre-populated cache, see Quickstart):
from kashima.mapper import buildMap
result = buildMap(
    latitude=-32.86758,
    longitude=-68.88867,
)
A more realistic example using multiple layers, fault sets and local faults:
from kashima.mapper import buildMap
result = buildMap(
    latitude=-12.90795,
    longitude=+15.24845,
    radius_km=3500,
    # Layer visibility
    show_events_default=True,
    show_cluster_default=False,
    show_heatmap_default=True,
    show_beachballs_default=True,
    show_faults_default=True,
    show_epicentral_circles_default=True,
    # Fault datasets: global cache + local GeoJSONs
    fault_sets=["gem", "usgs", "efsm20"],
    faults_files=[
        "examples/mapper/faults/Angola1982.geojson",
        "examples/mapper/faults/Escosa2024.geojson",
    ],
    # Stations: default ISC CSV from cache, custom title
    station_layer_title="ISC + local stations",
    # Keep ./data snapshot for documentation
    keep_data=True,
)
print(result)
Key parameter groups (see help(buildMap) for the full list and defaults):
- Location & radius (latitude, longitude, radius_km, event_radius_multiplier): define the geographic window of the map. radius_km sets the base radius, and event_radius_multiplier scales that radius when computing the spatial window used for events, faults and stations.
- Layers (show_events_default, show_cluster_default, show_heatmap_default, show_beachballs_default, show_faults_default, show_stations_default, show_epicentral_circles_default): control which layers are visible when the map opens. Users can still toggle them later via the Folium LayerControl.
- Catalogs & data (user_events_csv, keep_data, output_dir): override the global catalogs with your own CSV, preserve the ./data snapshot for auditability and choose where maps/ and data/ are written.
- Fault configuration (fault_sets, faults_files, regional_faults_color, regional_faults_weight, faults_coordinate_system): select which cached fault databases (any subset of "gem", "usgs", "efsm20") are merged and which extra GeoJSON faults to add (for example the Angola files used in examples/mapper/longonjo.py), and how they are styled.
- Stations (station_csv_path, station_coordinate_system, station_layer_title, show_stations_default): keep the default global ISC stations or replace them with your own CSV, adjusting CRS and layer title for the stations layer. Use show_stations_default=False to start with the stations layer turned off.
- Styling & legend (mag_bins, dot_palette, dot_sizes, beachball_sizes, fault_style_meta, color_palette, color_reversed, scaling_factor, legend_title, legend_position): control how magnitudes map to colours and sizes and how the legend is rendered. Much of the visual power of examples like examples/mapper/longonjo.py comes from careful tuning of these parameters.
- XY coordinates (x_col, y_col, location_crs): work in projected coordinates (for example local UTM) instead of latitude/longitude, useful when your input catalogs are already in a local CRS.
- Tooltips (tooltip_fields, legend_map): choose which event fields appear in the tooltip and how they are labelled.
- Map behavior (base_zoom_level, min_zoom_level, max_zoom_level, default_tile_layer, auto_fit_bounds, lock_pan, epicentral_circles): control the initial view (zoom levels and base tile layer) and how many distance rings are drawn around the site via epicentral_circles. auto_fit_bounds and lock_pan exist for future map-behaviour controls and may have no visible effect in some versions; use help(buildMap) for the authoritative description.
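The interaction between radius_km and event_radius_multiplier in the first group can be pictured as a great-circle distance filter. The sketch below is an illustration only: the haversine formula, the Earth radius value, the default multiplier of 1.0 and the event field names are assumptions, and kashima's own clipping may differ in detail.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed value for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def clip_to_window(events, site_lat, site_lon, radius_km, multiplier=1.0):
    """Keep events within radius_km * multiplier of the site."""
    limit = radius_km * multiplier
    return [
        e for e in events
        if haversine_km(site_lat, site_lon, e["lat"], e["lon"]) <= limit
    ]


# Hypothetical events 3, 6 and 12 degrees of latitude north of the site
# (about 334, 667 and 1334 km away).
SITE = (-32.86758, -68.88867)
events = [
    {"id": "near", "lat": SITE[0] + 3.0, "lon": SITE[1]},
    {"id": "mid", "lat": SITE[0] + 6.0, "lon": SITE[1]},
    {"id": "far", "lat": SITE[0] + 12.0, "lon": SITE[1]},
]
base = clip_to_window(events, *SITE, radius_km=500)
wide = clip_to_window(events, *SITE, radius_km=500, multiplier=1.5)
print([e["id"] for e in base], [e["id"] for e in wide])
```

Applying one window to events, faults and stations keeps the layers spatially consistent, which is what the text describes for the real parameters.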
buildMap returns a small dictionary:
{
    "html": "path/to/index.html",
    "csv": "path/to/epicenters.csv",
    "event_count": 1234,
}
Catalog API: buildCatalog
For scripted data pipelines you can call buildCatalog directly to fetch and save catalogs without generating maps.
from kashima.mapper import buildCatalog
# Radial USGS query around a site
result = buildCatalog(
    source="usgs",
    output_path="data/usgs-events.csv",
    latitude=-32.86758,
    longitude=-68.88867,
    max_radius_km=500,
    min_magnitude=5.0,
    start_time="2010-01-01",
    end_time="2024-12-31",
)
print(f"Downloaded {result['event_count']} events from {result['source']}")
# Full global catalog (no spatial filter)
result = buildCatalog(
    source="gcmt",
    output_path="data/gcmt-full.csv",
    min_magnitude=5.5,
)
Supported sources are "usgs", "gcmt" and "isc"; "blast" is planned for a future release (see the docstring for details and current status).
Global cache & updates
kashima maintains a global cache so catalogs and fault databases are downloaded once and reused across all projects.
from kashima.mapper import (
    downloadAllCatalogs,
    updateAllCatalogs,
    get_cache_dir,
    clear_cache,
)
# One-time setup (or when you want to pre-populate everything)
catalogs = downloadAllCatalogs(include_faults=True)
print("Cache directory:", catalogs["cache_dir"])
# Incremental update (new events only + refreshed fault databases)
updated = updateAllCatalogs(include_faults=True)
print("New USGS events:", updated["usgs_new"])
# Inspect cache location
print("Cache lives in:", get_cache_dir())
# Optional: clear a catalog if needed
# clear_cache("usgs")
On first use, downloadAllCatalogs also copies any bundled data shipped inside the wheel into the cache, so initial setup is often instant.
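An incremental update of the "new events only" kind usually comes down to merging freshly fetched rows into the cached table without duplicating what is already there. How kashima detects new events is not described here, so the sketch below assumes the simplest scheme, de-duplication on an event ID column; the function and the field names are hypothetical.

```python
def merge_new_events(cached, fetched, key="id"):
    """Append fetched rows whose ID is not cached yet; return (merged, n_new)."""
    seen = {row[key] for row in cached}
    new_rows = []
    for row in fetched:
        if row[key] not in seen:
            seen.add(row[key])  # also guards against duplicates inside `fetched`
            new_rows.append(row)
    return cached + new_rows, len(new_rows)


# Hypothetical rows: one overlap with the cache, one repeated new event.
cached = [{"id": "us1000", "mag": 5.1}, {"id": "us1001", "mag": 6.0}]
fetched = [
    {"id": "us1001", "mag": 6.0},
    {"id": "us1002", "mag": 5.4},
    {"id": "us1002", "mag": 5.4},
]
merged, n_new = merge_new_events(cached, fetched)
print(n_new, [row["id"] for row in merged])
```

The function returns a new list and leaves the cached rows untouched, so a failed update can be discarded without corrupting the cache.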
Fault databases
Global fault datasets live in the cache as GeoJSON files and are consumed automatically by buildMap when show_faults_default=True. You can also work with them explicitly via:
- buildGEMActiveFaults()
- buildUSGSQuaternaryFaults()
- buildEFSM20Faults()
Use fault_sets to choose which cached datasets to merge (any subset of "gem", "usgs", "efsm20") and faults_files to add custom GeoJSON faults (for example the Angola examples in examples/mapper/faults/).
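Conceptually, merging several fault datasets into one colour-coded layer means concatenating GeoJSON features while remembering where each one came from. The sketch below shows that idea with plain dictionaries. It is not the library's implementation: the "source" and "colour" property names, the fallback colour and the function name are assumptions.

```python
def merge_fault_sets(collections, colours, default_colour="#808080"):
    """Merge GeoJSON FeatureCollections, tagging each feature with its origin.

    collections maps a dataset name to a FeatureCollection dict; colours maps
    a dataset name to a line colour. Input features are not modified.
    """
    features = []
    for name, fc in collections.items():
        for feat in fc.get("features", []):
            props = dict(feat.get("properties") or {})
            props["source"] = name
            props["colour"] = colours.get(name, default_colour)
            features.append({**feat, "properties": props})
    return {"type": "FeatureCollection", "features": features}


# Two tiny hypothetical datasets: one global, one local file without properties.
gem = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"name": "F1"},
     "geometry": {"type": "LineString", "coordinates": [[0, 0], [1, 1]]}},
]}
local = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": None,
     "geometry": {"type": "LineString", "coordinates": [[2, 2], [3, 3]]}},
]}
merged = merge_fault_sets({"gem": gem, "local": local}, {"gem": "#d95f02"})
print([f["properties"] for f in merged["features"]])
```

Storing the origin on each feature lets a single map layer style lines per dataset and still show the source in a tooltip.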
Station layer
By default buildMap adds a global ISC stations layer:
- The CSV isc_stations.csv is bundled inside the package and copied to the cache on first use.
- When you do not pass station_csv_path, buildMap reads stations from the cache, clips them to the same geographic window as the events and adds them as a toggleable layer.
- If you pass station_csv_path, your CSV is used instead and the default ISC stations are ignored.
- Note: passing an empty string for station_csv_path raises an error; omit it to use the default ISC stations.
Examples
Complete, runnable workflows live in examples/mapper/:
- Catalog setup & maintenance
  - 00_download_catalogs.py, 00_update_catalogs.py
  - 01_usgs_catalog.py, 02_gcmt_catalog.py, 03_isc_catalog.py, 03_update_catalogs.py, 04_rebuild_cache.py
- Basic and intermediate maps
  - 04_minimal_map.py, 05_map_with_beachballs.py, 06_map_with_custom_legend.py, 07_map_with_heatmap.py, 08_map_with_faults.py, 09_map_advanced_config.py, longonjo.py
- Fault databases & stations
  - 05_custom_faults.py, 06_update_active_faults.py, 07_compile_all_fault_databases.py, 08_update_all_catalogs_and_faults.py
- Custom catalogs
  - 10_blast_catalog.py
Use these scripts as living documentation of typical workflows and advanced configuration.
Offline GMDB workflows live in examples/gmdb/:
- 01_export_offline_catalog.py, 02_resolve_single_event.py
- 03_enrich_eventtable_csv.py, 04_enrich_with_robust_inputs.py
- 05_update_eventtable_csv.py (recommended for maintaining EventTable.<PROVIDER>.csv)
GMM evaluation examples live in examples/gmm/:
- 01_sa_longtable_distances.py
- 02_sa_longtable_planar_corners.py
- 03_sa_longtable_from_r.py
Advanced example: examples/mapper/longonjo.py
This script demonstrates a project-style map centred on Angola with:
- custom magnitude bins, colour scales and point sizes tuned for satellite imagery;
- a mix of global fault databases and multiple regional fault GeoJSON files passed via faults_files;
- a large search radius (radius_km=3500), satellite base tiles, locked panning and keep_data=True so the generated data/ directory can be inspected or versioned.
Use it as a template for real engineering projects: copy the script, adjust coordinates, radius, faults_files and project metadata, and you will obtain a map suitable for inclusion in technical reports.
API overview
The public API of kashima.mapper is defined by what is exported from kashima/mapper/__init__.py. The most important entry points are:
- Map & catalogs
  - buildMap(): high-level map builder
  - buildCatalog(): generic catalog builder (source="usgs" | "gcmt" | "isc")
  - buildUSGSCatalog(), buildGCMTCatalog(), buildISCCatalog()
- Faults & auxiliary data
  - buildGEMActiveFaults(), buildUSGSQuaternaryFaults(), buildEFSM20Faults()
- Cache management
  - downloadAllCatalogs(), updateAllCatalogs()
  - updateUSGSCatalog(), updateGCMTCatalog(), updateISCCatalog()
  - updateGEMActiveFaults(), updateUSGSQuaternaryFaults(), updateEFSM20Faults()
  - get_cache_dir(), clear_cache()
- Core classes & configuration
  - MapConfig, EventConfig, FaultConfig, StationConfig, BlastConfig
  - BaseMap, USGSCatalog, GCMTCatalog, BlastCatalog, EventMap
  - Constants: EARTH_RADIUS_KM, TILE_LAYERS, calculate_zoom_level()
For the definitive parameter list and defaults, always refer to the Python docstrings:
from kashima.mapper import buildMap, buildCatalog
help(buildMap)
help(buildCatalog)
Help & documentation
- GitHub Pages: the rendered docs site includes a landing page at docs/ (see docs/index.md).
- API help: use Python's built-in tools, for example python -m pydoc kashima.mapper.buildMap or help(buildMap) / help(buildCatalog) from an interactive session.
- User and internal docs: additional notes live in the docs/ directory of the source repository:
  - docs/user_guide.md: offline resolver workflows
  - docs/gmdb_provider_ingestion.md: provider ingestion (no download) + payload policy + QA scripts
  - docs/gmdb_eventtable.md: GMDB provider EventTable maintenance via kashima.gmdb.updateEventTable(...)
  - docs/gmdb_stationtable.md: GMDB provider StationTable maintenance via kashima.gmdb.updateStationTable(...)
  - docs/gmdb_vs30.md: GMDB Vs30 query/update helpers (getStationVs30(...), updateStationVs30Pipeline(...))
  - docs/gmdb_naming_conventions.md, docs/naming_conventions.md, docs/mapper_layers_plan.md
- Examples as a manual: the scripts under examples/mapper/ illustrate end-to-end workflows and advanced configuration.
Documentation for developers
Some internal design notes and naming rules live in docs/:
- docs/naming_conventions.md: what is considered public API vs. internal helpers and the naming scheme used across the package.
- docs/mapper_layers_plan.md: design notes for the ISC stations layer, fault handling and cache behavior.
These documents are primarily for contributors; end users should not rely on internal helpers, only on the public API listed above.
Command-line interface / man page
At the moment kashima is a Python library only: it does not install a standalone kashima command and does not ship a Unix man page.
For built-in help, use Python's introspection:
# From the shell
python -m pydoc kashima.mapper.buildMap
python -m pydoc kashima.mapper.buildCatalog
or from a Python session:
from kashima.mapper import buildMap
help(buildMap)
Dependencies
Core runtime dependencies (installed automatically by pip install kashima):
- Python (>= 3.8)
- pandas, numpy, folium, geopandas, pyproj
- requests, branca, geopy, matplotlib
- obspy (beachball rendering), pyarrow (parquet cache)
Some fault builders download large GeoJSONs or talk to WFS services and therefore require a working internet connection the first time you run them.
License
MIT License - see LICENSE (to be added)
Citation
@software{kashima2022,
  author = {Verri Kozlowski, Alejandro},
  title = {kashima: Interactive Seismic Event Mapping and Catalog Management},
  year = {2022},
  version = {3.0.0},
  url = {https://averrik.github.io/kashima/}
}
Author
Alejandro Verri Kozlowski
Email: averri@fi.uba.ar
ORCID: 0000-0002-8535-1170
Affiliation: Universidad de Buenos Aires, Facultad de Ingeniería