QC module for seismic trace detection — part of seismoai

seismoai-qc

Quality Control module for seismic trace detection — part of the seismoai pipeline.

Built on real data from the Forge 2D Survey (2017): 166 SGY files, 27,722 total traces.

Install

pip install seismoai-qc

Usage

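from seismoai_io import load_sgy
from seismoai_qc import detect_dead_traces, detect_noisy_traces, qc_report

# traces: 2D numpy array of shape (n_traces, n_samples),
# e.g. (167, 4001) for a single SGY file (output of seismoai_io)
traces = load_sgy("file.sgy")

report = qc_report(traces)
print(report['label'].value_counts())
# Totals across all 166 SGY files (27,722 traces):
# good     21373
# dead      6183
# noisy      166

# Pass to seismoai_model:
labels = report['label']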

Functions

| Function | What it does |
| --- | --- |
| detect_dead_traces(traces) | Flags traces with std_dev < 1e-4 (no signal) |
| detect_noisy_traces(traces) | Flags traces with max_amp > 50 (spikes reach 758) |
| qc_report(traces) | Returns a full DataFrame with labels for seismoai_model |
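
The two detection rules can be illustrated with plain NumPy. This is a minimal sketch, not the package's actual implementation: the helper names `sketch_dead_mask` and `sketch_noisy_mask` are hypothetical, and it assumes that std_dev is the per-trace standard deviation and that max_amp is the per-trace peak absolute amplitude.

```python
import numpy as np

DEAD_STD_THRESHOLD = 1e-4    # std_dev < 1e-4, as listed in the table
NOISY_AMP_THRESHOLD = 50.0   # max_amp > 50, as listed in the table


def sketch_dead_mask(traces: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: True where a trace's std is below the dead threshold."""
    return traces.std(axis=1) < DEAD_STD_THRESHOLD


def sketch_noisy_mask(traces: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: True where a trace's peak |amplitude| exceeds the noisy threshold."""
    return np.abs(traces).max(axis=1) > NOISY_AMP_THRESHOLD


# Three synthetic traces (not survey data): normal, flat, and spiked.
rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, 4001)
dead = np.zeros(4001)
noisy = good.copy()
noisy[100] = 758.22
traces = np.stack([good, dead, noisy])

dead_mask = sketch_dead_mask(traces)
noisy_mask = sketch_noisy_mask(traces)

# Dead takes precedence over noisy; everything else is good.
labels = np.where(dead_mask, "dead", np.where(noisy_mask, "noisy", "good"))
print(labels.tolist())  # ['good', 'dead', 'noisy']
```

In practice, call `detect_dead_traces`, `detect_noisy_traces`, or `qc_report` from the package; the sketch only shows the kind of per-trace statistic each threshold is compared against.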

Real Data Stats

Analyzed across all 166 SGY files (27,722 total traces):

| Category | Count | Percentage |
| --- | --- | --- |
| Good traces | 21,373 | 77.1% |
| Dead traces | 6,183 | 22.3% |
| Noisy traces | 166 | 0.6% |
| Total | 27,722 | 100% |
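
The percentages follow directly from the counts. The short check below recomputes them. It also checks that 166 files of 167 traces each give the stated total, assuming every file holds 167 traces as the (167, 4001) shape in Usage suggests.

```python
counts = {"good": 21373, "dead": 6183, "noisy": 166}

total = sum(counts.values())
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)        # 27722
print(percentages)  # {'good': 77.1, 'dead': 22.3, 'noisy': 0.6}

# Assumption: every SGY file contributes the same 167 traces.
n_files, traces_per_file = 166, 167
print(n_files * traces_per_file == total)  # True
```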

Why are some traces dead?

Dead traces (std_dev < 0.0001) occur at far offsets, where seismic source energy does not reach the receiver. In the Forge 2D Survey, 22.3% of traces are dead, mostly at far-offset receivers.

Why do some traces reach 758?

Normal traces stay below 15 (99th percentile = 10.71). Noisy traces spike as high as 758.22 because of electrical interference or instrument malfunction during acquisition. Only 0.6% of traces are noisy.

Thresholds (derived from real data)

# Dead threshold
std_dev < 0.0001
# Dead trace std range: 0.00000021 to 0.0001
# Live trace std range: > 0.001 (clear gap)

# Noisy threshold  
max_amp > 50.0
# Normal 99th percentile: 10.71
# Noisy minimum: 200+ (all 166 noisy traces exceed 200)
# Worst spike: 758.22
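
One way to check that thresholds like these separate the populations cleanly is to compare the statistics on either side of each cutoff. The sketch below does this on synthetic data, not on the survey. The amplitudes are made up, and it assumes the 99th percentile is taken over per-trace peak amplitudes, which the source does not specify.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 4001

# Synthetic stand-ins: 50 live traces, 10 near-flat traces, 2 spiked traces.
live = rng.normal(0.0, 3.0, size=(50, n_samples))
flat = rng.normal(0.0, 1e-6, size=(10, n_samples))
spiked = rng.normal(0.0, 3.0, size=(2, n_samples))
spiked[:, 500] = 300.0
traces = np.vstack([live, flat, spiked])

std_dev = traces.std(axis=1)           # per-trace standard deviation
max_amp = np.abs(traces).max(axis=1)   # per-trace peak absolute amplitude

# Dead threshold: the two std populations should sit on either side of 1e-4.
dead = std_dev < 1e-4
print("largest dead std :", std_dev[dead].max())
print("smallest live std:", std_dev[~dead].min())

# Noisy threshold: normal peaks should sit well below 50, noisy peaks well above.
noisy = max_amp > 50.0
p99_normal = np.percentile(max_amp[~dead & ~noisy], 99)
print("99th percentile of normal peaks:", p99_normal)
print("smallest noisy peak            :", max_amp[noisy].min())
```

A wide gap on both sides of a cutoff, like the one reported above for the real data, means the exact threshold value is not critical.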

How Pair 4 Uses This

from seismoai_io import load_sgy
from seismoai_qc import qc_report

# Load traces
traces = load_sgy("file.sgy")

# Get QC labels
report = qc_report(traces)

# Hand off to seismoai_model
labels = report['label']   # 'good', 'dead', 'noisy'
features = traces[~report['is_dead'].to_numpy()]  # drop dead traces (boolean mask, one entry per trace)
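
After dead traces are dropped, `labels` still has one entry per original trace, so it no longer lines up with `features` row for row. The sketch below shows one way to filter both with the same mask. It uses a small synthetic array and a hand-built DataFrame in place of real `qc_report()` output, and assumes only the `label` and `is_dead` columns used above, with `is_dead` holding booleans.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the traces array and the qc_report() output.
traces = np.arange(12, dtype=float).reshape(4, 3)
report = pd.DataFrame({
    "label": ["good", "dead", "noisy", "good"],
    "is_dead": [False, True, False, False],
})

# One boolean mask, applied to both the traces and the labels.
keep = ~report["is_dead"].to_numpy()
features = traces[keep]
labels = report.loc[keep, "label"].to_numpy()

print(features.shape)   # (3, 3)
print(labels.tolist())  # ['good', 'noisy', 'good']
```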

Run Tests

pytest tests/ -v

Real Dataset Test

python test_real_data.py

Reflection

We built the seismoai_qc module, which detects dead and noisy traces in seismic data from the Forge 2D Survey. Dead traces have std_dev below 0.0001. Out of 27,722 total traces, 6,183 (22.3%) were dead because far-offset receivers did not receive enough source energy. Noisy traces have amplitudes above 50, with spikes as high as 758 caused by electrical interference or instrument malfunction. We found 166 noisy traces (0.6%). We derived these thresholds by analyzing the real dataset rather than guessing. Our qc_report() produces labeled output that Pair 4 will use directly to train their noise classifier on 21,373 good traces.
