Python library for fast access to seismic data using TileDB
TileDB-Segy
TileDB-Segy is a small, MIT-licensed Python library for easy interaction with seismic data, powered by TileDB. It combines an intuitive, segyio-like API with a powerful storage engine.
Feature summary
Available features
- Converting from SEG-Y and Seismic Unix formatted seismic data to TileDB arrays.
- Simple and powerful read-only API, closely modeled after segyio.
- 100% unit test coverage.
- Fully type-annotated.
Currently missing features
- API for write operations.
- Converting back to SEG-Y.
- TileDB configuration and performance tuning.
- Comprehensive documentation.
- Real-world usage.
Installation
TileDB-Segy can be installed:
- from PyPI by pip:
  pip install tiledb-segy
- from source by cloning the Git repository:
  git clone https://github.com/TileDB-Inc/TileDB-Segy.git
  cd TileDB-Segy
  pip install .
You may run the test suite with:
python setup.py test
Converting from SEG-Y
TileDB-Segy comes with a command-line interface (CLI) called segy2tiledb for converting
SEG-Y and Seismic Unix formatted files to TileDB formatted arrays. At minimum it takes
an input file and generates a directory in the same parent directory as the input, with
the same base name and the extension .tsgy:
$ segy2tiledb a123.sgy
$ du -sh a123.*
73M a123.sgy
55M a123.tsgy
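The default output location described above can be sketched in Python. This only illustrates the naming rule (same parent directory, extension replaced by .tsgy); default_output_dir is a hypothetical helper that is not part of tiledb-segy, and the CLI's own logic may differ in edge cases.

```python
from pathlib import Path


def default_output_dir(input_path: str) -> Path:
    """Return the input path with its extension swapped for .tsgy.

    Hypothetical helper: it mirrors the documented default only and
    does not call into tiledb-segy.
    """
    return Path(input_path).with_suffix(".tsgy")


# The output directory sits next to the input file.
print(default_output_dir("surveys/a123.sgy"))
```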
To see the full list of options run:
$ segy2tiledb -h
usage: segy2tiledb [-h] [-o] [-g {auto,structured,unstructured}] [--su]
                   [--iline ILINE] [--xline XLINE]
                   [--endian {big,msb,little,lsb}] [-s TILE_SIZE]
                   input [output]

Convert a SEG-Y file to tiledb-segy format

positional arguments:
  input                 Input SEG-Y file path
  output                Output directory path (default: None)

optional arguments:
  -h, --help            show this help message and exit
  -o, --overwrite       Overwrite the output directory if it already exists (default: False)
  -g {auto,structured,unstructured}, --geometry {auto,structured,unstructured}
                        Output geometry:
                        - auto: same as the input SEG-Y.
                        - structured: same as `auto` but abort if a geometry cannot be inferred.
                        - unstructured: opt out on building geometry information.
                        (default: auto)

segyio options:
  --su                  Open a seismic unix file instead of SEG-Y (default: False)
  --iline ILINE         Inline number field in the trace headers (default: 189)
  --xline XLINE         Crossline number field in the trace headers (default: 193)
  --endian {big,msb,little,lsb}
                        File endianness, big/msb (default) or little/lsb (default: big)

tiledb options:
  -s TILE_SIZE, --tile-size TILE_SIZE
                        Tile size in bytes.
                        Larger tile size improves disk access time at the cost of higher memory (default: 4000000)
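When conversions are scripted, it may help to assemble the command line programmatically. The sketch below builds an argument list using only the flags and defaults shown in the help text above; build_segy2tiledb_cmd is a hypothetical helper, not something the library provides, and actually running the command assumes segy2tiledb is on the PATH.

```python
import shlex
from typing import List, Optional


def build_segy2tiledb_cmd(
    input_path: str,
    output_path: Optional[str] = None,
    *,
    overwrite: bool = False,
    geometry: str = "auto",
    su: bool = False,
    iline: int = 189,
    xline: int = 193,
    endian: str = "big",
    tile_size: int = 4_000_000,
) -> List[str]:
    """Assemble a segy2tiledb argument list (hypothetical helper)."""
    # Reject values that the help text does not list as valid choices.
    if geometry not in {"auto", "structured", "unstructured"}:
        raise ValueError(f"unknown geometry: {geometry!r}")
    if endian not in {"big", "msb", "little", "lsb"}:
        raise ValueError(f"unknown endianness: {endian!r}")
    cmd = [
        "segy2tiledb",
        "-g", geometry,
        "--iline", str(iline),
        "--xline", str(xline),
        "--endian", endian,
        "-s", str(tile_size),
    ]
    if overwrite:
        cmd.append("-o")
    if su:
        cmd.append("--su")
    # Positional arguments go last: input, then the optional output.
    cmd.append(input_path)
    if output_path is not None:
        cmd.append(output_path)
    return cmd


cmd = build_segy2tiledb_cmd("a123.sgy", overwrite=True, geometry="structured")
print(shlex.join(cmd))
# To run the conversion (requires tiledb-segy to be installed):
#     import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run rather than a shell string avoids quoting problems with paths that contain spaces.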
API
TileDB-Segy generally follows the segyio API; you may consult its documentation to learn
about the public attributes (ilines, xlines, offsets, samples) and addressing modes
(trace, header, attributes, iline, xline, fast, slow, depth_slice, gather, text, bin).
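As a rough illustration of the public attributes named above, the following sketch counts the entries along each axis of a structured file object. It assumes these attributes are sized sequences, as they are in segyio; describe_geometry is a hypothetical helper, and a stand-in object is used so the snippet runs without tiledb-segy installed.

```python
from types import SimpleNamespace
from typing import Any, Dict


def describe_geometry(f: Any) -> Dict[str, int]:
    """Count the entries along each axis of a structured file object.

    Hypothetical helper; it assumes ilines, xlines, offsets and
    samples all support len().
    """
    names = ("ilines", "xlines", "offsets", "samples")
    return {name: len(getattr(f, name)) for name in names}


# Stand-in for a structured file. With the library installed, f would
# instead come from tiledb.segy.open(dir_path).
fake = SimpleNamespace(
    ilines=range(1, 51),
    xlines=range(1, 21),
    offsets=[1, 2, 3],
    samples=range(100),
)
print(describe_geometry(fake))
# → {'ilines': 50, 'xlines': 20, 'offsets': 3, 'samples': 100}
```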
You can find usage examples in the following Jupyter notebooks:
Differences from segyio
- Addressing modes that return a generator of numpy arrays in segyio return a single numpy array of higher dimension in tiledb-segy. For example, in a SEG-Y with 50 ilines, 20 xlines, 100 samples, and 3 offsets:
  - f.iline[0:5]:
    - segyio returns a generator that yields 5 2D numpy arrays of (20, 100) shape
    - tiledb-segy returns a 3D numpy array of (5, 20, 100) shape
  - f.iline[0:5, :]:
    - segyio returns a generator that yields 15 2D numpy arrays of (20, 100) shape
    - tiledb-segy returns a 4D numpy array of (5, 3, 20, 100) shape
- The mappings returned by bin, header and attributes(name) have string keys instead of segyio.TraceField enums or integers.
- tiledb.segy.open(dir_path), the segyio.open(file_path) equivalent, does not take any optional parameters (e.g. strict or ignore_geometry).
- Unstructured and structured SEG-Y are represented as instances of two different classes, tiledb.segy.Segy and tiledb.segy.StructuredSegy respectively. StructuredSegy extends Segy, so the whole unstructured API is inherited by the structured one.
  - All attributes and addressing modes specific to structured files (e.g. ilines or gather) are available only to StructuredSegy. In contrast, segyio returns None or raises an exception if these properties are accessed on unstructured files.
  - segyio.tools.dt is exposed as the Segy.dt(fallback=4000.0) method.
  - segyio.tools.cube is exposed as the StructuredSegy.cube() method.
  - There is no unstructured attribute; use not isinstance(f, StructuredSegy) instead.
- There is no tracecount attribute; use len(trace) instead.
- There is no ext_headers attribute; use len(text[1:]) instead.
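The differences listed above can be summarised in a short sketch for code being ported from segyio. The first function reproduces the shape arithmetic of the iline example; the rest are hypothetical shims for the attributes that tiledb-segy does not provide. The Segy and StructuredSegy classes here are empty stand-ins so the snippet runs without the library; real code would use tiledb.segy.Segy and tiledb.segy.StructuredSegy instead.

```python
from typing import Any, Optional, Tuple


def iline_result(
    n_il: int, n_xl: int, n_samples: int, n_off: Optional[int] = None
) -> Tuple[int, Tuple[int, ...]]:
    """Return (number of 2D arrays segyio yields, shape tiledb-segy returns).

    n_il and n_off are the numbers of selected inlines and offsets;
    n_off is None when the offset axis is not indexed, as in f.iline[0:5].
    """
    if n_off is None:
        return n_il, (n_il, n_xl, n_samples)
    return n_il * n_off, (n_il, n_off, n_xl, n_samples)


# Stand-ins for tiledb.segy.Segy and tiledb.segy.StructuredSegy.
class Segy:
    def __init__(self, trace: list, text: list) -> None:
        self.trace = trace
        self.text = text


class StructuredSegy(Segy):
    pass


def is_unstructured(f: Any) -> bool:
    """Replacement for segyio's unstructured attribute."""
    return not isinstance(f, StructuredSegy)


def tracecount(f: Any) -> int:
    """Replacement for segyio's tracecount attribute."""
    return len(f.trace)


def ext_headers(f: Any) -> int:
    """Replacement for segyio's ext_headers attribute."""
    return len(f.text[1:])


print(iline_result(5, 20, 100))        # f.iline[0:5]    → (5, (5, 20, 100))
print(iline_result(5, 20, 100, 3))     # f.iline[0:5, :] → (15, (5, 3, 20, 100))

f = StructuredSegy(trace=[0] * 1000, text=["textual header", "ext 1", "ext 2"])
print(is_unstructured(f), tracecount(f), ext_headers(f))  # → False 1000 2
```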