WATex: machine learning research in water exploration

Life is much better with potable water

Overview

WATex is a Python-based library primarily designed for Groundwater Exploration (GWE). It introduces innovative strategies aimed at minimizing losses encountered during hydro-geophysical exploration projects. Integrating methods from Direct-current (DC) resistivity—including Electrical Profiling (ERP) and Vertical Electrical Sounding (VES)—alongside short-period electromagnetic (EM), geology, and hydrogeology, WATex leverages Machine Learning techniques to enhance exploration outcomes. Key features include:

  • Automating the identification of optimal drilling locations to reduce the incidence of unsuccessful drillings and unsustainable boreholes.
  • Predicting well water content, including groundwater flow rates and water inrush levels.
  • Restoring EM signal integrity in areas plagued by significant interference noise.
  • And more.

Documentation

For comprehensive information and additional resources, visit the WATex library website. To quickly navigate through the software's API reference, access the API reference page. Explore the examples section for a preview of potential results. Additionally, a detailed step-by-step guide is provided to tackle real-world engineering challenges, such as computing DC parameters and predicting the k-parameter.

License

WATex is distributed under the BSD-3-Clause License.

Installation

WATex is best supported on Python 3.9 or later.

From pip

Install WATex directly from the Python Package Index (PyPI) with the following command:

pip install watex

From conda

For users who prefer the conda ecosystem, WATex can be installed from the conda-forge distribution channel:

conda install -c conda-forge watex

From source

To access the most current development version of the code, install from source. Clone the repository with the following command, then install from the cloned directory as described in the Installation Guide:

git clone https://github.com/WEgeophysics/watex.git

Additional Information

For a comprehensive installation guide, including how to manage dependencies effectively, please refer to our Installation Guide.
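After installing by any of the routes above, it can help to confirm that the package is visible to the interpreter and that the interpreter meets the Python 3.9 minimum stated earlier. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `python_is_supported` are illustrative and are not part of WATex.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="watex"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def python_is_supported(minimum=(3, 9)):
    """Compare the running interpreter with the minimum version stated above."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("watex version:", installed_version() or "not installed")
```

Querying the package metadata rather than importing the package keeps the check fast and avoids loading heavy dependencies just to read a version number.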

Some Demos

1. Drilling Location Auto-detection

In this demonstration, we showcase the process of automatically detecting optimal locations for drilling by generating 50 stations of synthetic ERP resistivity data. The data is characterized by minimum and maximum resistivity values set at 10 ohm.m and 10,000 ohm.m, respectively:

import watex as wx
data = wx.make_erp(n_stations=50, max_rho=1e4, min_rho=10., as_frame=True, seed=42)

Naive Auto-detection (NAD)

The NAD method identifies a suitable drilling location without considering any restrictions or constraints that might be present at the survey site during Groundwater Exploration (GWE). A location is deemed "suitable" if it is expected to yield a flow rate of at least 1 m³/hr:

from watex.methods import ResistivityProfiling
robj = ResistivityProfiling(auto=True).fit(data)
robj.sves_
Out[1]: 'S025'

The algorithm proposes station S025 as the optimal drilling location; the result is stored in the sves_ attribute.

Auto-detection with Constraints (ADC)

In contrast, the ADC method accounts for constraints observed in the survey area during the Drilling Water Supply Chain (DWSC). These constraints are often encountered in real-world scenarios. For example, a station near a heritage site may be excluded due to drilling restrictions. When multiple constraints exist, they should be compiled into a dictionary detailing the reasons for each and passed to the constraints parameter. This ensures that these stations are disregarded during the automatic detection process:

restrictions = {
    'S10': 'Household waste site, avoid contamination',
    'S27': 'Municipality site, no authorization for drilling',
    'S29': 'Heritage site, drilling prohibited',
    'S42': 'Anthropic polluted place, potential future contamination risk',
    'S46': 'Marsh zone, likely borehole dry-up during dry season'
}
robj = ResistivityProfiling(constraints=restrictions, auto=True).fit(data)
robj.sves_
# Output: 'S033'

With the specified constraints taken into account, the suitable drilling location moves to station S033. If a station lies near a restricted area, WATex raises a warning advising against drilling at that location.
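The idea of excluding restricted stations before selecting a candidate can be sketched in plain Python. This is a toy illustration, not WATex's selection logic: it assumes the candidate is simply the allowed station with the lowest apparent resistivity, and the helpers `make_toy_profile`, `normalize`, and `pick_station` are hypothetical names introduced here. It also assumes that short names such as 'S10' and zero-padded names such as 'S010' refer to the same station.

```python
import random


def make_toy_profile(n_stations=50, min_rho=10.0, max_rho=1e4, seed=42):
    """Build a hypothetical {station: apparent resistivity in ohm.m} mapping."""
    rng = random.Random(seed)
    return {f"S{i:03d}": rng.uniform(min_rho, max_rho) for i in range(n_stations)}


def normalize(name):
    """Map 'S10' and 'S010' to the same key (assumed naming convention)."""
    return f"S{int(name.lstrip('S')):03d}"


def pick_station(profile, constraints=None):
    """Return the allowed station with the lowest resistivity (toy criterion)."""
    banned = {normalize(s) for s in (constraints or {})}
    allowed = {s: rho for s, rho in profile.items() if s not in banned}
    if not allowed:
        raise ValueError("every station is excluded by the constraints")
    return min(allowed, key=allowed.get)


profile = make_toy_profile()
naive = pick_station(profile)  # no constraints
# Ban the naive choice to see the selection move to the next-best station.
restricted = pick_station(profile, {naive: "Heritage site, drilling prohibited"})
print(naive, restricted)
```

Keeping the reasons as dictionary values, as in the restrictions example above, costs nothing at selection time and leaves a record of why each station was skipped.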

Important Reminder: Prior to initiating drilling operations, ensure a DC-sounding (VES) is conducted at the identified location. WATex calculates an additional parameter known as ohmic-area (ohmS) to evaluate the presence and effectiveness of fracture zones at that site. For further information, refer to the WATex documentation.

2. EM Tensor Recovery and Analysis

This demonstration outlines the process of recovering and analyzing electromagnetic (EM) tensor data. We begin by fetching 20 audio-frequency magnetotelluric (AMT) data points stored as EDI objects from the Huayuan area in Hunan Province, China, known for multiple interference noises:

import watex as wx
e = wx.fetch_data('huayuan', samples=20, key='noised')  # Returns an EM object
edi_data = e.data  # Retrieve the array of EDI objects

Before restoring EM data, it's crucial to assess the data quality and evaluate the confidence intervals to ensure reliability at each station. Typically, this quality control (QC) analysis focuses on errors within the resistivity tensor:

from watex.methods import EMAP
po = EMAP().fit(edi_data)  # Creates an EM Array Profiling processing object
r = po.qc(tol=0.2, return_ratio=True)  # tol=0.2: data are deemed good from the 80% level upward
r
Out[9]: 0.95

To visualize the confidence intervals at the 20 AMT stations:

from watex.utils import plot_confidence_in
plot_confidence_in(edi_data)

For a more thorough quality control, we use the qc function to filter out invalid data and interpolate frequencies. To determine the number of frequencies dropped during this analysis:

from watex.utils import qc
QCo = qc(edi_data, tol=.2, return_qco=True)  # Returns the quality control object
len(e.emo.freqs_)  # Original number of frequencies in noisy data
Out[10]: 56
len(QCo.freqs_)  # Number of frequencies in valid data after QC
Out[11]: 53
QCo.invalid_freqs_  # Frequencies discarded based on the tolerance parameter
Out[12]: array([81920.0, 48.53, 5.625])  # 81920.0, 48.53, and 5.625 Hz

The plot_confidence_in function helps assess whether the tensor values at these frequencies are recoverable at each station. Data are considered unrecoverable when the confidence level falls below 50%.
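To make the role of the tolerance concrete, here is a small standalone sketch of a tolerance-based quality check. It rests on an assumption about how tol is interpreted, namely that a frequency is dropped when the fraction of bad values across stations exceeds tol, and that the QC ratio is the share of valid values overall. The function `qc_sketch` and the sample grid are hypothetical and do not reproduce the WATex implementation.

```python
import math


def qc_sketch(grid, freqs, tol=0.2):
    """Toy QC over a frequency-by-station grid where NaN marks a bad value.

    Returns (ratio of valid values, kept frequencies, dropped frequencies).
    """
    if not 0 <= tol < 1:
        raise ValueError("tol must be in [0, 1)")
    total = sum(len(row) for row in grid)
    valid = sum(1 for row in grid for v in row if not math.isnan(v))
    kept, dropped = [], []
    for f, row in zip(freqs, grid):
        missing = sum(math.isnan(v) for v in row) / len(row)
        (dropped if missing > tol else kept).append(f)
    return valid / total, kept, dropped


nan = float("nan")
freqs = [100.0, 10.0, 1.0]  # hypothetical frequencies in Hz
grid = [
    [nan, nan, 1.0, 2.0],  # 50% missing: above tol, dropped
    [3.0, 4.0, 5.0, 6.0],  # complete: kept
    [7.0, nan, 8.0, 9.0],  # 25% missing: above tol, dropped
]
ratio, kept, dropped = qc_sketch(grid, freqs, tol=0.2)
print(ratio, kept, dropped)  # 0.75 [10.0] [100.0, 1.0]
```

Under this reading, lowering tol makes the check stricter: more frequencies are discarded, but those that remain are better covered across stations.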

Should the initial QC rate of 95% not meet our standards, we can proceed to restore the impedance tensor Z:

Z = po.zrestore()  # Returns 3D tensors for XX, XY, YX, and YY components
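As a rough intuition for what filling gaps in a tensor component involves, the sketch below linearly interpolates missing values along one axis. It is a simplified stand-in rather than the method behind zrestore: it works on a single 1D sequence, interpolates over index position instead of actual frequency spacing, and the function name `fill_gaps` is hypothetical.

```python
import math


def fill_gaps(values):
    """Linearly interpolate NaN gaps; edge gaps copy the nearest valid value."""
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not known:
        raise ValueError("no valid value to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if not math.isnan(v):
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)  # relative position inside the gap
            out[i] = values[left] + w * (values[right] - values[left])
    return out


nan = float("nan")
print(fill_gaps([1.0, nan, 3.0]))  # [1.0, 2.0, 3.0]
```

A real restoration would also need to treat the complex-valued tensor components and the uneven spacing of frequencies, which this sketch deliberately ignores.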

Evaluating the new QC ratio post-restoration confirms the effectiveness of our recovery efforts:

r, = wx.qc(Z)
r
Out[13]: 1.0

After restoration the QC ratio reaches 100% across all stations, up from the initial 95%. To visualize the improvement in confidence levels:

plot_confidence_in(Z)

For more on EM tensor restoration, phase tensor analysis, strike plotting, data filtering, and related topics, see the examples section of the WATex documentation.

Citations

Should you find the WATex software beneficial for your research or any published work, we kindly ask you to cite the following article:

Kouadio, K.L., Liu, J., Liu, R., 2023. watex: machine learning research in water exploration. SoftwareX, 101367. https://doi.org/10.1016/j.softx.2023.101367

In publications that mention WATex, acknowledging scikit-learn may also be relevant due to its integral role in the software's development.

For additional insights and examples, refer to our compilation of case history papers that utilized WATex.

Contributions

The development and success of WATex have been made possible through contributions from the following institutions:

  1. Department of Geophysics, School of Geosciences & Info-physics, Central South University, China.
  2. Hunan Key Laboratory of Nonferrous Resources and Geological Hazards Exploration, Changsha, Hunan, China.
  3. Laboratoire de Geologie, Ressources Minerales et Energetiques, UFR des Sciences de la Terre et des Ressources Minières, Université Félix Houphouët-Boigny, Côte d'Ivoire.

For inquiries, suggestions, or contributions, please reach out to the main developer, LKouadio at etanoyau@gmail.com.
