A library for building and parsing Seismology API message bodies.
postprocessing_seismo_lib
postprocessing_seismo_lib is a lightweight Python library for building and parsing structured API messages, especially the nested JSON structures used in event-based data systems. Currently, the library builds the Response format for seismology associator outputs and extracts the body from that Response format.
This library has been vetted against Python 3.10.5. It does not work with Python versions below 3.10 (it was specifically tested against Python 3.6.5 and Python 3.8.10 and found not to work).
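A script that depends on this library can check the interpreter version before doing anything else. The following is a minimal sketch using only the standard library; the helper name python_is_supported and the constant MIN_PYTHON are illustrative and are not part of postprocessing_seismo_lib.

```python
import sys

# Minimum interpreter version stated in this README (assumed to mean 3.10.0 and later).
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=None):
    """Return True when the given (or running) interpreter is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    print("Warning: postprocessing_seismo_lib expects Python 3.10 or newer.")
```

Passing an explicit tuple, such as (3, 8, 10), makes the check easy to exercise without switching interpreters.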
Features
- Extract the body section from a structured JSON file using extract_body_from_file
- Create a request for a body object using wrap_data, with provided associator or pickfilter files
- Validate that input and output formats are to specification using wrap_data
- Build a full message with status, headers and body using convert_file_to_json, with provided csv, arcout or quakeml files
This library also provides utilities for converting between different seismic pick data formats, including:
- SOA → SOA (ANSS-informed enhancements)
- SOA → ANSS
- ANSS → SOA
- PhaseNet CSV → SOA
These functions are useful for normalizing pick data across different pipelines and tools. See the section How to Run the Conversion Examples for details.
Use cases of this library
- Individual users
- Pipeline scripts
Installation
pip install postprocessing-seismo-lib
OR
pip install --upgrade postprocessing-seismo-lib
The package ships with sample files that can be used to exercise the library's three main functions (extract_body_from_file, wrap_data, and convert_file_to_json). After installation, run the script below to inspect the contents of each file and confirm that the outputs are generated locally:
import json, importlib.resources
from postprocessing_seismo_lib import wrap_data, extract_body_from_file, convert_file_to_json
pick_file = json.load(importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('xxxx_file_containing_picks.json').open('r'))
print("Picks")
print(pick_file)
filtered_pick_file = json.load(importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('xxxx_file_containing_filtered_picks.json').open('r'))
print("Filtered picks")
print(filtered_pick_file)
json_path = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('40584759_csv.json')
print("JSON file with body")
print(json_path)
#FIRST USE CASE: Extract body from a JSON file
body_data = extract_body_from_file(str(json_path))
print("Body extracted:")
print(body_data)
#SECOND USE CASE: Create RetrieveParameter wrapping around input data for various modules
## FOR THE ASSOCIATOR MODULE:
input_path = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('xxxx_file_containing_filtered_picks.json')
wrap_data(
input_file_path=str(input_path),
output_file_path='output_associator.json',
evid='evid_filtered_picks',
module='associator'
)
## FOR THE PICK FILTER MODULE:
### THE BELOW SCENARIO filters picks under the default conditions:
# [1] mode='hypoPN'
# [2] testType='local'
# [3] logging='False'
input_path = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('xxxx_file_containing_picks.json')
wrap_data(
input_file_path=str(input_path),
output_file_path='output_pickfilter.json',
evid='evid_picks',
module='pickfilter'
)
### THE BELOW SCENARIO shows that we can adjust those conditions within the pickfilter:
# [1] mode='st-proc'
# [2] testType='local'
# [3] logging='True'
wrap_data(
input_file_path = '[YOUR INPUT FILE PATH]',
output_file_path = '[YOUR OUTPUT FILE PATH]',
module = 'pickfilter',
evid = '[NAME OF EVID USED]',
mode = 'st-proc',
testType = 'local',
logging = 'True'
)
#THIRD USE CASE: Create Response wrapping around known data
gamma_events = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('40584759_gamma_events.csv')
gamma_picks = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('40584759_gamma_picks.csv')
xml_file_nosignifier = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('40584759_events_testGOUA')
xml_file_signifier = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('40584759_events_test.xml')
arcout_file = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('z78966423_api_stproc_9999.arcout')
print("GaMMa events")
print(gamma_events)
print(type(gamma_events))
print("GaMMa picks")
print(gamma_picks)
print(type(gamma_picks))
# For CSV
convert_file_to_json(
input_file="", # not used for CSV
output_file="response_csv.json",
id="id_testing",
event_file=str(gamma_events),
pick_file=str(gamma_picks),
error_log_file="csv_error_log.txt"
)
# For QuakeML XML (this input file has no XML signifiers but was parsed successfully as XML here)
convert_file_to_json(
input_file=str(xml_file_nosignifier),
output_file="response_quakeml_nosignifiers.json",
id="id testing",
error_log_file="quakeml_error_log_one.txt"
)
#Conventional QuakeML XML here
convert_file_to_json(
input_file=str(xml_file_signifier),
output_file="response_quakeml_signifiers.json",
id="id testing",
error_log_file="quakeml_error_log_two.txt"
)
# For ArcOut
convert_file_to_json(
input_file=str(arcout_file),
output_file="response_arcout.json",
id="id testing",
error_log_file="arcout_error_log.txt"
)
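The script above passes str(...) of a packaged resource to functions that expect file paths. That works when the package is installed as ordinary files on disk. When a concrete filesystem path has to be guaranteed, the standard library offers importlib.resources.as_file. The sketch below uses the standard-library json package as a stand-in so that it runs without this library installed; substituting 'postprocessing_seismo_lib.example_data' and a sample file name is left as an assumption about your environment.

```python
import importlib.resources
from pathlib import Path

# Stand-in package and resource so the sketch runs anywhere; with the library
# installed you would use 'postprocessing_seismo_lib.example_data' and one of
# its sample file names instead.
resource = importlib.resources.files("json").joinpath("__init__.py")

# as_file yields a concrete filesystem path for the duration of the block,
# extracting the resource to a temporary file if that is needed.
with importlib.resources.as_file(resource) as real_path:
    resource_path = Path(real_path)
    resource_exists = resource_path.exists()
    print(resource_path.name, resource_exists)
```

Any call that needs the path should happen inside the with block, since a temporary copy may be removed when the block exits.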
Example Scenarios
Extraction of body
The function below extracts the body from an output response file:
from postprocessing_seismo_lib import extract_body_from_file
body_data = extract_body_from_file("output_response_association.json")
body_data = extract_body_from_file("output_response_pickfilter.json")
where, as an example, output_response_association.json is:
{
"status": 404,
"headers": {
"Content-Type": "application/json"
},
"body": {
"id": "78604159",
"format": "none.noeventsfound",
"data": []
}
}
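To make the structure concrete, here is a rough standard-library sketch of what pulling the body out of such a file involves. It assumes that the extracted body is simply the value under the top-level "body" key; read_body_sketch is a hypothetical stand-in and not the implementation of extract_body_from_file.

```python
import json
import os
import tempfile


def read_body_sketch(path):
    """Hypothetical stand-in: load a Response JSON file and return its 'body'."""
    with open(path, "r", encoding="utf-8") as handle:
        response = json.load(handle)
    if "body" not in response:
        raise KeyError(f"No 'body' key found in {path}")
    return response["body"]


# The Response shown above, written to a temporary file for the demonstration.
sample_response = {
    "status": 404,
    "headers": {"Content-Type": "application/json"},
    "body": {"id": "78604159", "format": "none.noeventsfound", "data": []},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "output_response_association.json")
    with open(sample_path, "w", encoding="utf-8") as handle:
        json.dump(sample_response, handle)
    body = read_body_sketch(sample_path)

print(body["format"])
```

For this sample, the body carries an empty data list together with a format string that signals no events were found.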
Creation of the request for a body object
The function below creates the request from the body object, which can be extracted with the function above. All four arguments listed below must be specified:
from postprocessing_seismo_lib import wrap_data
#creating the request for the associator input
wrap_data(
input_file_path='[xxxx_file_containing_filtered_picks].json',
output_file_path='output_associator.json',
evid='[Name of choice]',
module='associator'
)
#creating the request for the pickfilter input
## Pickfilter default settings:
# [1] mode='hypoPN'
# [2] testType='local'
# [3] logging='False'
wrap_data(
input_file_path='[xxxx_file_containing_picks].json',
output_file_path='output_pickfilter.json',
evid='[Name of choice]',
module='pickfilter'
)
### Pickfilter, adjusting various settings:
# [1] mode='st-proc'
# [2] testType='local'
# [3] logging='True'
wrap_data(
input_file_path = '[YOUR INPUT FILE PATH]',
output_file_path = '[YOUR OUTPUT FILE PATH]',
module = 'pickfilter',
evid = '[NAME OF EVID USED]',
mode = 'st-proc',
testType = 'local',
logging = 'True'
)
The request format differs for each module. Currently, the module argument accepts 'associator' and 'pickfilter'; more modules will be added in future updates.
Specifically, this function reads a list of pick dictionaries from a JSON file specified by input_file_path, validates them against a schema, wraps the data into a module-specific JSON structure, validates the output, and writes it to a new file specified by output_file_path. Any errors are logged to a file named wrap_data_errors.log.
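The validation step can be pictured with a small standard-library sketch. The library's actual schema is not reproduced in this README, so the required keys below are an assumption chosen from the sample pick in this document, and find_invalid_picks is a hypothetical helper. The sketch reports problems through the logging module rather than writing wrap_data_errors.log.

```python
import logging

# Hypothetical required keys, chosen from the sample pick in this README.
REQUIRED_PICK_KEYS = ("Phase", "Site", "Time", "Type")


def find_invalid_picks(picks, required_keys=REQUIRED_PICK_KEYS):
    """Return (index, missing_keys) pairs for picks that fail the check."""
    problems = []
    for index, pick in enumerate(picks):
        if not isinstance(pick, dict):
            # A non-dictionary entry is treated as missing every key.
            problems.append((index, list(required_keys)))
            continue
        missing = [key for key in required_keys if key not in pick]
        if missing:
            problems.append((index, missing))
    return problems


logger = logging.getLogger("wrap_data_sketch")

picks = [
    {"Phase": "S", "Site": {"Station": "WOR"}, "Time": "2025-04-22T21:51:15.148Z", "Type": "Pick"},
    {"Phase": "P", "Type": "Pick"},  # deliberately missing Site and Time
]

invalid = find_invalid_picks(picks)
for index, missing in invalid:
    logger.error("Pick %d is missing keys: %s", index, ", ".join(missing))
```

Running a check like this before calling wrap_data can make it quicker to locate a malformed pick in a large input file.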
As an example, our input_file_path='[xxxx_file_containing_picks].json' might look like this (as a list of dictionaries):
[
{
"Amplitude": {
"Amplitude": 1039.6302490234,
"SNR": 11.074
},
"Filter": [
{
"HighPass": 1.0,
"Type": "HighPass"
}
],
"Onset": "emergent",
"Phase": "S",
"Picker": "deep-learning",
"Polarity": "no-result",
"Quality": [
{
"Standard": "PhaseNet",
"Value": 0.851
},
{
"Standard": "hypoinverse",
"Value": 2
}
],
"Site": {
"Channel": "HHE",
"Location": "",
"Network": "CI",
"Station": "WOR"
},
"Source": {
"AgencyID": "CI",
"Author": "hypoPN"
},
"Time": "2025-04-22T21:51:15.148Z",
"Type": "Pick"
},
{
...
}
]
and its output would be the necessary format to POST into the associator API endpoint:
{
"RetrieveParameters": {
"pickFile": "Ryan_testingAgainPicks_picks.json",
"pickDataStr": [
{
"Amplitude": {
"Amplitude": 1039.6302490234,
"SNR": 11.074
},
"Filter": [
{
"HighPass": 1.0,
"Type": "HighPass"
}
],
"Onset": "emergent",
"Phase": "S",
"Picker": "deep-learning",
"Polarity": "no-result",
"Quality": [
{
"Standard": "PhaseNet",
"Value": 0.851
},
{
"Standard": "hypoinverse",
"Value": 2
}
],
"Site": {
"Channel": "HHE",
"Location": "",
"Network": "CI",
"Station": "WOR"
},
"Source": {
"AgencyID": "CI",
"Author": "hypoPN"
},
"Time": "2025-04-22T21:51:15.148Z",
"Type": "Pick"
},
...
]
}
}
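The shape of the associator request above can be sketched in a few lines. This is an illustration of the structure only: wrap_picks_sketch is a hypothetical helper, and the way the pickFile name is derived from evid is an assumption, not something this README confirms. The pickfilter request has a different format and is not covered here.

```python
import json


def wrap_picks_sketch(picks, evid):
    """Hypothetical stand-in for the associator wrapping step."""
    return {
        "RetrieveParameters": {
            # Assumption: the file name is derived from evid in this way.
            "pickFile": f"{evid}_picks.json",
            "pickDataStr": picks,
        }
    }


picks = [
    {
        "Phase": "S",
        "Site": {"Channel": "HHE", "Location": "", "Network": "CI", "Station": "WOR"},
        "Time": "2025-04-22T21:51:15.148Z",
        "Type": "Pick",
    }
]

request = wrap_picks_sketch(picks, evid="evid_filtered_picks")
request_text = json.dumps(request, indent=2)
print(request_text)
```

The picks are passed through unchanged, so the list under pickDataStr is identical to the input list.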
Creation of full response format
The examples below show how to build the Response format from provided files. In every case, you provide an ID, an output file name (a .json file), and an error log file name. If any errors occur, the error log is written under the name you specified. If no errors occur, the output JSON file is generated in the directory where you run the Python script.
To convert from csv to json, pass the _events.csv and _picks.csv files generated by calling the associator API as event_file and pick_file, and leave input_file blank. To convert quakeML or arcout to json, specify input_file.
from postprocessing_seismo_lib import convert_file_to_json
# For CSV
convert_file_to_json(
input_file="", # not used for CSV
output_file="[Output file name].json",
id="[Name of choice]",
event_file="[xxxx]_gamma_events.csv",
pick_file="[xxxx]_gamma_picks.csv",
error_log_file="csv_error_log.txt"
)
# For QuakeML XML (this input file has no XML signifiers but was parsed successfully as XML here)
convert_file_to_json(
input_file="[xxxx]_events_test",
output_file="[xxxx]_quakeml.json",
id="[Name of choice]",
error_log_file="quakeml_error_log.txt"
)
#Conventional QuakeML XML here
convert_file_to_json(
input_file="[xxxx]_events_test.xml",
output_file="[xxxx]_quakeml.json",
id="[Name of choice]",
error_log_file="quakeml_error_log.txt"
)
# For ArcOut
convert_file_to_json(
input_file="[xxxx]_api_stproc_9999.arcout",
output_file="[Output file name].json",
id="[Name of choice]",
error_log_file="arcout_error_log.txt"
)
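After a conversion, a script can tell success from failure by looking at which file was produced. The sketch below assumes the behaviour described above (an error log on failure, an output JSON file on success); summarize_conversion is a hypothetical helper, and the status value 200 in the hand-written demonstration file is illustrative only.

```python
import json
import os
import tempfile


def summarize_conversion(output_file, error_log_file):
    """Report whether a conversion left an output file or an error log."""
    if os.path.exists(error_log_file) and os.path.getsize(error_log_file) > 0:
        with open(error_log_file, "r", encoding="utf-8") as handle:
            return {"ok": False, "errors": handle.read().splitlines()}
    if os.path.exists(output_file):
        with open(output_file, "r", encoding="utf-8") as handle:
            response = json.load(handle)
        return {"ok": True, "status": response.get("status"), "body": response.get("body")}
    return {"ok": False, "errors": ["neither output nor error log was found"]}


# Demonstration with a hand-written Response file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    output_file = os.path.join(tmp_dir, "response_csv.json")
    error_log_file = os.path.join(tmp_dir, "csv_error_log.txt")
    with open(output_file, "w", encoding="utf-8") as handle:
        json.dump({"status": 200, "headers": {}, "body": {"id": "id_testing", "data": []}}, handle)
    summary = summarize_conversion(output_file, error_log_file)

print(summary["ok"])
```

In a pipeline script, the same check can decide whether to continue with the Response body or to stop and surface the logged errors.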
Conversion of Formats
This library also allows for conversion between SOA and NEIC Pick formats. The following function calls help to achieve this:
from postprocessing_seismo_lib import (
anss_to_soa_pick_format,
soa_to_anss_pick_format,
phasenet_csv_to_soa_pick_format,
soa_to_soa_pick_format_using_anss_libraries
)
These functions rely on anss-formats==0.1.0, the latest stable version, which provides the Pick and Detection classes used within this library.
How to Run the Conversion Examples
The library includes example datasets under postprocessing_seismo_lib.example_data. You can use these to test each conversion workflow.
Start by importing dependencies:
import json
import importlib.resources
import pandas as pd
Example 1: SOA → SOA (ANSS-Informed)
Enhances SOA picks using ANSS-informed logic.
pick_file_one_contents = json.load(
importlib.resources.files('postprocessing_seismo_lib.example_data')
.joinpath('79765767_picks.json')
.open('r')
)
print("Inspecting the pick contents, example one")
print(pick_file_one_contents)
pick_file_two_contents = json.load(
importlib.resources.files('postprocessing_seismo_lib.example_data')
.joinpath('60209491_picks.json')
.open('r')
)
print("Inspecting the pick contents, example two")
print(pick_file_two_contents)
pick_file_one = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('79765767_picks.json')
print("Path for the pick file, example one")
print(pick_file_one)
pick_file_two = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('60209491_picks.json')
print("Path for the pick file, example two")
print(pick_file_two)
soa_picks_informed_by_anss = soa_to_soa_pick_format_using_anss_libraries(pick_file_one)
print("SOA picks (ANSS-informed):")
print(soa_picks_informed_by_anss)
Example 2: SOA → ANSS
Converts SOA-formatted picks into ANSS format using station metadata.
archive_stations = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('archive_stations_loc_dates.csv')
pick_file_two = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('60209491_picks.json')
print("Path for the archive station file")
print(archive_stations)
df = pd.read_csv(archive_stations)
print("Archive station contents:")
print(df)
formatted_anss_picks = soa_to_anss_pick_format(pick_file_two, archive_stations)
print("ANSS picks, from SOA")
print(formatted_anss_picks)
# List of picks in ANSS format
picks_anss = formatted_anss_picks["picks"]
string_dict_picks = json.dumps(picks_anss, indent=4, default=str)
# saved file
with open("picks_ANSS_from_library.json", "w") as f:
f.write(string_dict_picks)
Example 3: PhaseNet CSV → SOA
Converts PhaseNet TensorFlow CSV output into SOA pick format.
picks_phasenet_tensorflow = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('picks_phasenet_tensorflow.csv')
print("Path for the picks_phasenet_tensorflow file")
print(picks_phasenet_tensorflow)
df = pd.read_csv(picks_phasenet_tensorflow)
print("Phasenet picks, tensorflow, contents:")
print(df)
highpass_filt = 1.0
pick_list = phasenet_csv_to_soa_pick_format(picks_phasenet_tensorflow, highpass_filt)
print("SOA picks, from Phasenet CSV")
print(pick_list)
Example 4: ANSS → SOA
Converts ANSS-formatted picks into SOA format.
anss_file_contents = json.load(importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('picks_ANSS.json').open('r'))
print("Inspecting the pick contents, ANSS format")
print(anss_file_contents)
anss_file_path = importlib.resources.files('postprocessing_seismo_lib.example_data').joinpath('picks_ANSS.json')
print("Path for the pick file, ANSS format")
print(anss_file_path)
soa_picks=anss_to_soa_pick_format(anss_file_path)
# Access list of picks
print("SOA picks, from ANSS")
picks_soa=soa_picks["picks"]
print(picks_soa)
Notes
- All conversion functions return a dictionary containing a "picks" key.
- You can directly manipulate or save this list depending on your workflow.
- Ensure input files match expected formats (JSON or CSV as required).
- These utilities are especially useful for integrating:
  - PhaseNet outputs
  - ANSS datasets
  - SOA-based processing pipelines
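As a small illustration of working with the returned list, the sketch below filters picks by phase and saves the result. The conversion_result dictionary is a hand-written stand-in for a conversion function's return value, and the "Phase" key follows the SOA sample pick shown earlier; picks in other formats may use different keys.

```python
import json
import os
import tempfile

# Stand-in for the dictionary returned by a conversion function.
conversion_result = {
    "picks": [
        {"Phase": "P", "Time": "2025-04-22T21:51:10.000Z", "Type": "Pick"},
        {"Phase": "S", "Time": "2025-04-22T21:51:15.148Z", "Type": "Pick"},
    ]
}

# Keep only S-phase picks; the "Phase" key follows the SOA sample pick.
s_picks = [pick for pick in conversion_result["picks"] if pick.get("Phase") == "S"]

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "s_picks.json")
    with open(out_path, "w", encoding="utf-8") as handle:
        # default=str mirrors Example 2, in case values are not JSON-native.
        json.dump(s_picks, handle, indent=4, default=str)
    with open(out_path, "r", encoding="utf-8") as handle:
        reloaded = json.load(handle)

print(len(reloaded))
```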