
a set of tools for geophysical data processing

Project description

gdp: Geophysical Data Processing

gdp provides a set of tools, available through a command-line interface (CLI), to process and/or convert common geophysical data types.

Release notes

Version 0.1.0

This is the first version published on PyPI. It includes the following tools:

Tool        Description
cat         concatenate/reformat numerical or non-numerical data
union       generate the union of input data files
intersect   generate the intersection of input data files
difference  generate the difference of input data files
split       split a concatenated dataset into multiple data files
min         calculate the minimum of values in numerical column(s)
max         calculate the maximum of values in numerical column(s)
sum         calculate the sum of values in numerical column(s)
mean        calculate the mean of values in numerical column(s)
median      calculate the median of values in numerical column(s)
std         calculate the standard deviation of values in numerical column(s)
pip         output points inside/outside a polygon (ray-tracing method)
gridder     grid/interpolate 2D/map data with Gaussian smoothing applied
mseed2sac   convert MSEED to SAC; also handles data fragmentation issues
sac2dat     convert SAC to dat (ASCII); output format: time, amplitude
nc2dat      convert NetCDF (nc) data to dat (ASCII)

Examples

Example gdp commands are explained below:

gdp mseed2sac mseed_dataset/* --reformat --offset -500 --resample 10 -o sac_dataset

Description: This command converts the MSEED files in 'mseed_dataset' to SAC files in a directory named 'sac_dataset'. The '--reformat' flag creates sub-folders in the output directory named in 'YYJJJHHMMSS' format, and the SAC files within these sub-directories are renamed to 'YYJJJHHMMSS_STA.CHN', where 'STA' is the station code and 'CHN' is the channel code. When reformatting is enabled, the offset time can be adjusted using '--offset'. Finally, '--resample 10' resamples the output time series to 10 Hz.

gdp sac2dat sac_dataset/* -o timeseries --timerange 0 3600

Description: This command outputs the first hour (0-3600 s) of the SAC data in 'sac_dataset/*' to the 'timeseries' directory as ASCII (time, amplitude) files.

gdp nc2dat model.nc --metadata
gdp nc2dat model.nc -v vs vp --fmt .2 .6 -o model.dat

Description: This tool converts NetCDF files to ASCII format. The first command prints the metadata of 'model.nc'; this is useful for identifying the data fields to extract from the nc file (here, 'vs' and 'vp'). The second command writes those fields to 'model.dat' using the numeric formats given by '--fmt'.

gdp data cat file* -x 1 2 -v 5 3 4 --header 2 --footer 4 --fmt .2 .4 --sort --uniq --noextra -o concatenated.txt

Description: This command concatenates files in the current directory whose names match 'file*'. While reading, 2 header lines and 4 footer lines are omitted from each file. The positional columns are the first and second columns ('-x 1 2'), and the value/numerical columns are columns 5, 3, and 4 ('-v 5 3 4'). Positional columns are printed in '%.2f' format and value columns in '%.4f'. If the files have extra (non-numerical) columns beyond these five, '--noextra' suppresses them. The '-o' flag sets the output file name; if it is not specified, results are printed to standard output. Many of these flags are shared by the commands below.
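As a rough illustration of the column selection and formatting that 'cat' performs, consider this hypothetical helper (a sketch of the idea, not gdp's own code):

```python
def format_row(line, pos_cols, val_cols, pos_fmt=".2", val_fmt=".4"):
    """Pick positional (-x) and value (-v) columns from a whitespace-split
    row and render them with fixed formats (--fmt); 1-based column indices."""
    fields = line.split()
    pos = [f"{float(fields[i - 1]):{pos_fmt}f}" for i in pos_cols]
    val = [f"{float(fields[i - 1]):{val_fmt}f}" for i in val_cols]
    return " ".join(pos + val)

# Columns 1 and 2 are positional, column 4 carries the value;
# the non-numerical column 3 is dropped (as with --noextra).
print(format_row("10.123 20.456 extra 1.5", [1, 2], [4]))
# → 10.12 20.46 1.5000
```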

gdp data union file_1.dat file_2.dat file_3.dat

Description: Output the union of a set of numerical data files (two or more), keyed on the positional columns (default [1 2]); value columns default to [3]. These defaults can be changed with the '-x' and '-v' flags.

gdp data intersect file_1.dat file_2.dat file_3.dat

Description: Output the intersection of a set of numerical data files (two or more), keyed on the positional columns (set with '-x'); for each common point, the value from the first file is output. Note that the first value of the '--fmt' flag matters here, since positions are compared after formatting.

gdp data difference file_1.dat file_2.dat file_3.dat

Description: Output the difference of a set of numerical data files (two or more), keyed on the positional columns. Here, data points that are unique to 'file_1.dat' are output.
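The three set operations above all key rows on their formatted positional columns. A minimal sketch of that idea, with in-memory lists standing in for files (function names are illustrative, not gdp's):

```python
def keyed(lines, x=(1, 2), fmt=".2"):
    """Map each line to a key built from its positional columns,
    formatted so that nearby coordinates compare equal (cf. --fmt)."""
    table = {}
    for line in lines:
        f = line.split()
        k = tuple(format(float(f[i - 1]), fmt + "f") for i in x)
        table.setdefault(k, line)
    return table

def union(first, *rest):
    merged = dict(keyed(first))
    for d in rest:
        for k, v in keyed(d).items():
            merged.setdefault(k, v)          # first occurrence wins
    return list(merged.values())

def intersect(first, *rest):
    common = set(keyed(first))
    for d in rest:
        common &= set(keyed(d))
    return [v for k, v in keyed(first).items() if k in common]

def difference(first, *rest):
    others = set().union(*(keyed(d) for d in rest))
    return [v for k, v in keyed(first).items() if k not in others]

f1 = ["1.0 1.0 10", "2.0 2.0 20"]
f2 = ["1.0 1.0 99", "3.0 3.0 30"]
print(union(f1, f2))       # three unique points
print(intersect(f1, f2))   # ['1.0 1.0 10']  (value from the first file)
print(difference(f1, f2))  # ['2.0 2.0 20']
```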

gdp data split dataset.dat --method ncol --number 4 --start -2 --name 3 -o outdir

Description: This command splits/unmerges a concatenated dataset ('dataset.dat'). Two methods can be chosen: (1) 'nrow', split on a fixed number of rows, and (2) 'ncol', split on rows that have a unique number of columns, used as identifiers. For method 'ncol' above: '--number 4' specifies that the identifying rows have 4 columns (reference rows); '--start -2' gives the start-line offset relative to the reference row; '--name 3' gives the row offset, relative to the start line, of the row used for output file names; '-o outdir' sets the output directory (omit it to print to standard output).
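The 'ncol' splitting logic can be sketched as follows (the function and its return shape are illustrative assumptions, not gdp's actual behavior):

```python
def split_ncol(lines, number, start=0, name=0):
    """Split a concatenated dataset at every reference row that has
    exactly `number` columns. Each block begins `start` rows after its
    reference row (negative = before), and the row at offset `name`
    from the block start supplies the block's name."""
    refs = [i for i, ln in enumerate(lines) if len(ln.split()) == number]
    blocks = {}
    for j, r in enumerate(refs):
        lo = max(r + start, 0)
        hi = refs[j + 1] + start if j + 1 < len(refs) else len(lines)
        block = lines[lo:hi]
        blocks[block[name].split()[0]] = block
    return blocks

data = ["STA1 10 20 0",   # 4-column reference row
        "1 2", "3 4",
        "STA2 30 40 0",   # next reference row starts a new block
        "5 6"]
blocks = split_ncol(data, number=4)
print(sorted(blocks))  # ['STA1', 'STA2']
```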

gdp data min *.xyz -v 1 2 3
gdp data max *.xyz -v 1 2 3
gdp data sum *.xyz -v 1 2 3
gdp data mean *.xyz -v 1 2 3
gdp data median *.xyz -v 1 2 3
gdp data std *.xyz -v 1 2 3

Description: Output the min, max, sum, mean, median, or std of the first three columns of the *.xyz files.

gdp data pip *.xyz --polygon polygon.dat
gdp data pip *.xyz --polygon polygon.dat -i

Description: Only output points inside the given polygon (or outside it, with '-i'). Alternatively, the '--lonrange' and '--latrange' flags can be used to define the polygon.
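The ray-tracing (ray-casting) test behind 'pip' works by casting a horizontal ray from the query point and counting edge crossings; an odd count means the point is inside. A sketch of the standard algorithm (not necessarily gdp's exact implementation):

```python
def point_in_polygon(x, y, poly):
    """Ray-casting point-in-polygon test; `poly` is a list of (x, y)
    vertices. Counts crossings of a rightward horizontal ray."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                            # crossing is to the right
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))  # True
print(point_in_polygon(5, 2, square))  # False
```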

gdp data gridder vs_model/depth* --spacing 0.2 --smoothing 50 --polygon polygon.dat -o outdir

Description: This command performs gridding (2D interpolation) on the input xyz-format data files. In the command above: '--spacing 0.2' sets the grid spacing along both longitude and latitude to 0.2 degrees (two values may also be given: [lon_spacing, lat_spacing]); '--smoothing 50' applies a 50 km Gaussian smoothing to the output data; '--polygon polygon.dat' is optional: if given, only points inside the given polygon are printed out.
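Gridding with Gaussian smoothing can be sketched as a distance-weighted average over grid nodes; the toy version below uses a flat-earth distance approximation, and gdp's actual scheme may differ:

```python
import math

def gaussian_grid(points, grid, sigma_km=50.0, km_per_deg=111.0):
    """Interpolate scattered (lon, lat, val) points onto grid nodes with
    Gaussian distance weights (sigma_km plays the role of --smoothing)."""
    out = []
    for gx, gy in grid:
        wsum = vsum = 0.0
        for x, y, v in points:
            d_km = math.hypot(gx - x, gy - y) * km_per_deg  # flat-earth approx.
            w = math.exp(-0.5 * (d_km / sigma_km) ** 2)
            wsum += w
            vsum += w * v
        out.append((gx, gy, vsum / wsum))
    return out

pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]
# A node midway between two equal-weight points averages their values.
print(gaussian_grid(pts, [(0.5, 0.0)]))
```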

Project details


Download files

Download the file for your platform.

Source Distribution

gdp-0.1.0.tar.gz (129.7 kB)

Uploaded Source

Built Distributions

gdp-0.1.0-cp310-abi3-macosx_10_9_universal2.whl (291.2 kB)

Uploaded CPython 3.10+ macOS 10.9+ universal2 (ARM64, x86-64)

gdp-0.1.0-cp38-abi3-macosx_10_9_x86_64.whl (218.3 kB)

Uploaded CPython 3.8+ macOS 10.9+ x86-64

File details

Details for the file gdp-0.1.0.tar.gz.

File metadata

  • Download URL: gdp-0.1.0.tar.gz
  • Upload date:
  • Size: 129.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.7.5

File hashes

Hashes for gdp-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6bd3355072333225d03d9f3602fbc0452388874d2dbc90e4d9053b10c79cbbb3
MD5 d9404658b0786773ff349bee62ae01a0
BLAKE2b-256 6256307fa780731c5748c74a11f0d14f5cc0dbc251ffdb71837d2cbf6a93fe8f

See more details on using hashes here.

File details

Details for the file gdp-0.1.0-cp310-abi3-macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for gdp-0.1.0-cp310-abi3-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 cf0a3a897ae0812d6475e7ba972a2287f4997a360d4c01fec7096495075ab175
MD5 d44d3825fcfa645cf7006a9d5f02b903
BLAKE2b-256 b39e34985d24006685dc4b873d2d6fbc815b525ec1d3d71ecae1e25aadf327c4


File details

Details for the file gdp-0.1.0-cp38-abi3-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for gdp-0.1.0-cp38-abi3-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 8a407efed6a3310d37041f6d15f9670589a17ad2e9a83f4ef63c1ce58f1b5e9d
MD5 95e0b8b277927aef119582a80cb34fb1
BLAKE2b-256 07eb6e7b358a51c4e968bac6b75849bc8c26a4b0ecddc5321c85089b3e3edf92

