WATex: machine learning research in water exploration
Life is much better with potable water
Overview
WATex is a Python-based library primarily designed for Groundwater Exploration (GWE). It introduces innovative strategies aimed at minimizing losses encountered during hydro-geophysical exploration projects. Integrating methods from Direct-current (DC) resistivity—including Electrical Profiling (ERP) and Vertical Electrical Sounding (VES)—alongside short-period electromagnetic (EM), geology, and hydrogeology, WATex leverages Machine Learning techniques to enhance exploration outcomes. Key features include:
- Automating the identification of optimal drilling locations to reduce the incidence of unsuccessful drillings and unsustainable boreholes.
- Predicting well water content, including groundwater flow rates and water inrush levels.
- Restoring EM signal integrity in areas plagued by significant interference noise.
- And more.
Documentation
For comprehensive information and additional resources, visit the WATex library website. To quickly navigate through the software's API reference, access the API reference page. Explore the examples section for a preview of potential results. Additionally, a detailed step-by-step guide is provided to tackle real-world engineering challenges, such as computing DC parameters and predicting the k-parameter.
License
WATex is distributed under the BSD-3-Clause License.
Installation
WATex is best supported on Python 3.9 or later.
From pip
Install WATex directly from the Python Package Index (PyPI) with the following command:
pip install watex
From conda
For users who prefer the conda ecosystem, WATex can be installed from the conda-forge distribution channel:
conda install -c conda-forge watex
From Source
To access the most current development version of the code, install from source. Clone the repository, then install the package from the cloned directory:
git clone https://github.com/WEgeophysics/watex.git
cd watex
pip install .
Additional Information
For a comprehensive installation guide, including how to manage dependencies effectively, please refer to our Installation Guide.
Some Demos
1. Drilling Location Auto-detection
This demonstration automatically detects the optimal drilling location along a profile of
50 stations of synthetic ERP resistivity data. The minimum and maximum resistivity values
are set at 10 ohm.m and 10,000 ohm.m, respectively:
import watex as wx
data = wx.make_erp(n_stations=50, max_rho=1e4, min_rho=10., as_frame=True, seed=42)
Naive Auto-detection (NAD)
The NAD method identifies a suitable drilling location without considering any restrictions or constraints that might be present at the survey site during Groundwater Exploration (GWE). A location is deemed "suitable" if it is expected to yield a flow rate of at least 1 m³/hr:
from watex.methods import ResistivityProfiling
robj = ResistivityProfiling(auto=True).fit(data)
robj.sves_
Out[1]: 'S025'
The algorithm proposes station S25 (returned as 'S025') as the optimal drilling location, which is stored in the sves_ attribute.
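To make the idea of an unconstrained pick concrete, here is a minimal sketch that selects the station with the lowest apparent resistivity and formats its label. It is a simplified stand-in, not WATex's actual selection algorithm: the helpers `station_name` and `naive_pick`, the 0-based station numbering, the sample values, and the lowest-resistivity rule are all assumptions made for illustration.

```python
def station_name(number, width=3):
    """Format a station number as a zero-padded label, e.g. 25 -> 'S025'."""
    return f"S{number:0{width}d}"


def naive_pick(resistivities):
    """Return the label of the station with the lowest apparent resistivity.

    Assumption: stations are numbered from 0 along the profile, and the most
    conductive station is treated as the best candidate.
    """
    if not resistivities:
        raise ValueError("at least one resistivity value is required")
    best = min(range(len(resistivities)), key=resistivities.__getitem__)
    return station_name(best)


# Hypothetical apparent resistivities (ohm.m) for six stations.
profile = [850.0, 430.0, 120.0, 95.0, 310.0, 1200.0]
print(naive_pick(profile))  # S003
```

The zero-padded label mirrors the 'S025' form shown in the output above; everything else is illustrative only.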
Auto-detection with Constraints (ADC)
In contrast, the ADC method accounts for constraints observed in the survey area during
the Drilling Water Supply Chain (DWSC). These constraints are often encountered in real-world
scenarios. For example, a station near a heritage site may be excluded due to drilling restrictions.
When multiple constraints exist, they should be compiled into a dictionary detailing the reasons for
each and passed to the constraints parameter. This ensures that these stations are disregarded during
the automatic detection process:
restrictions = {
    'S10': 'Household waste site, avoid contamination',
    'S27': 'Municipality site, no authorization for drilling',
    'S29': 'Heritage site, drilling prohibited',
    'S42': 'Anthropic polluted place, potential future contamination risk',
    'S46': 'Marsh zone, likely borehole dry-up during dry season'
}
robj = ResistivityProfiling(constraints=restrictions, auto=True).fit(data)
robj.sves_
# Output: 'S033'
This method revises the suitable drilling location to station S33 (returned as 'S033'), taking into account
the specified constraints. Should a station be near a restricted area, the system raises a warning
to advise against risking drilling operations at that location.
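The effect of a constraints dictionary can be sketched in a few lines: restricted stations are removed from the candidate list before the pick, and a warning is issued when the chosen station sits next to a restricted one. This is a hypothetical illustration under stated assumptions, not WATex's implementation: the helpers `normalize` and `constrained_pick`, the lowest-resistivity rule, the 0-based numbering, and the "adjacent station" definition of nearness are all invented for the example.

```python
import warnings


def normalize(label, width=3):
    """Zero-pad a station label so that 'S10' and 'S010' compare equal."""
    return f"S{int(label.lstrip('Ss')):0{width}d}"


def constrained_pick(resistivities, constraints):
    """Pick the lowest-resistivity station that is not restricted.

    Assumptions: stations are numbered from 0, and a pick directly next to a
    restricted station triggers a warning.
    """
    banned = {normalize(name) for name in constraints}
    labels = [f"S{i:03d}" for i in range(len(resistivities))]
    allowed = [i for i, name in enumerate(labels) if name not in banned]
    if not allowed:
        raise ValueError("every station is restricted")
    best = min(allowed, key=resistivities.__getitem__)
    neighbours = {labels[j] for j in (best - 1, best + 1) if 0 <= j < len(labels)}
    if neighbours & banned:
        warnings.warn(f"{labels[best]} is next to a restricted station")
    return labels[best]


# Hypothetical profile and a single restriction.
profile = [850.0, 430.0, 120.0, 95.0, 310.0, 1200.0]
restrictions = {"S3": "Heritage site, drilling prohibited"}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    choice = constrained_pick(profile, restrictions)
print(choice, len(caught))  # S002 1
```

Normalizing the labels first means a key written as 'S3' still excludes the station labelled 'S003'.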
Important Reminder: Prior to initiating drilling operations, ensure a DC-sounding (VES) is conducted at the identified location. WATex calculates an additional parameter known as ohmic-area (ohmS) to evaluate the presence and effectiveness of fracture zones at that site. For further information, refer to the WATex documentation.
2. EM Tensor Recovery and Analysis
This demonstration outlines the process of recovering and analyzing electromagnetic (EM) tensor data. We begin by fetching 20 audio-frequency magnetotelluric (AMT) data points stored as EDI objects from the Huayuan area in Hunan Province, China, known for multiple interference noises:
import watex as wx
e = wx.fetch_data('huayuan', samples=20, key='noised') # Returns an EM object
edi_data = e.data # Retrieve the array of EDI objects
Before restoring EM data, it's crucial to assess the data quality and evaluate the confidence intervals to ensure reliability at each station. Typically, this quality control (QC) analysis focuses on errors within the resistivity tensor:
from watex.methods import EMAP
po = EMAP().fit(edi_data) # Creates an EM Array Profiling processing object
r = po.qc(tol=0.2, return_ratio=True) # tol=0.2: data is deemed good from the 80% significance level
r
Out[9]: 0.95
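One simple way to read such a ratio is as the share of usable entries in a frequency-by-station table. The sketch below computes that share with the standard library only. It is an assumption-laden illustration, not the formula WATex uses: the function `good_data_ratio`, the NaN encoding of missing values, and the sample numbers are all hypothetical.

```python
import math


def good_data_ratio(table):
    """Fraction of non-missing entries in a frequency-by-station table.

    Missing entries are represented as NaN (an assumption of this sketch).
    """
    values = [v for row in table for v in row]
    if not values:
        raise ValueError("table is empty")
    good = sum(1 for v in values if not math.isnan(v))
    return good / len(values)


nan = float("nan")
# Hypothetical resistivity values: 4 frequencies (rows) x 5 stations (columns).
table = [
    [120.0, 98.0, 101.0, 87.0, 110.0],
    [nan, 95.0, 99.0, 90.0, 105.0],
    [130.0, 97.0, nan, 85.0, 108.0],
    [125.0, 96.0, 102.0, 88.0, 107.0],
]
print(good_data_ratio(table))  # 0.9
```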
To visualize the confidence intervals at the 20 AMT stations:
from watex.utils import plot_confidence_in
plot_confidence_in(edi_data)
For a more thorough quality control, we use the qc function to filter out invalid data and
interpolate frequencies. To determine the number of frequencies dropped during this analysis:
from watex.utils import qc
QCo = qc(edi_data, tol=.2, return_qco=True) # Returns the quality control object
len(e.emo.freqs_) # Original number of frequencies in noisy data
Out[10]: 56
len(QCo.freqs_) # Number of frequencies in valid data after QC
Out[11]: 53
QCo.invalid_freqs_ # Frequencies discarded based on the tolerance parameter
Out[12]: array([81920.0, 48.53, 5.625]) # 81920.0, 48.53, and 5.625 Hz
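The role of the tolerance in discarding frequencies can be illustrated with a small sketch: a frequency is dropped when too many of its stations have missing values. The rule used here (drop when the missing share exceeds `tol`), the function `split_frequencies`, and the sample data are assumptions for illustration and may differ from what WATex does internally.

```python
import math


def split_frequencies(freqs, table, tol=0.2):
    """Separate frequencies into kept and discarded lists.

    A frequency is discarded when the share of missing (NaN) stations in its
    row is greater than `tol`. This rule is an assumption of the sketch.
    """
    kept, dropped = [], []
    for freq, row in zip(freqs, table):
        if not row:
            raise ValueError("each frequency needs at least one station")
        missing = sum(1 for v in row if math.isnan(v)) / len(row)
        (dropped if missing > tol else kept).append(freq)
    return kept, dropped


nan = float("nan")
# Hypothetical frequencies (Hz) and a 4 x 5 table of tensor values.
freqs = [1000.0, 100.0, 10.0, 1.0]
table = [
    [nan, nan, 101.0, 87.0, 110.0],   # 40% missing: dropped
    [118.0, 95.0, 99.0, 90.0, 105.0],  # complete: kept
    [130.0, 97.0, 100.0, 85.0, 108.0], # complete: kept
    [nan, nan, nan, 88.0, 107.0],      # 60% missing: dropped
]
kept, dropped = split_frequencies(freqs, table, tol=0.2)
print(len(kept), dropped)  # 2 [1000.0, 1.0]
```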
The plot_confidence_in function is crucial for assessing whether tensor values for these
frequencies are recoverable at each station. Data is considered unrecoverable if the
confidence level falls below 50%.
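The 50% rule can be expressed as a simple threshold check. The sketch below flags stations whose confidence falls below that level; the function `is_recoverable`, the station labels, and the confidence values are hypothetical, and the 0-to-1 scale is an assumption.

```python
def is_recoverable(confidence, threshold=0.5):
    """Return True when a confidence level (0 to 1) meets the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return confidence >= threshold


# Hypothetical per-station confidence levels.
station_confidence = {"S00": 0.92, "S01": 0.48, "S02": 0.50}
unrecoverable = [
    name for name, level in station_confidence.items()
    if not is_recoverable(level)
]
print(unrecoverable)  # ['S01']
```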
Should the initial QC rate of 95% not meet our standards, we can proceed to restore the
impedance tensor Z:
Z = po.zrestore() # Returns 3D tensors for XX, XY, YX, and YY components
Evaluating the new QC ratio post-restoration confirms the effectiveness of our recovery efforts:
r, = wx.qc(Z)
r
Out[13]: 1.0
As observed, the tensor restoration achieves a 100% success rate across all stations, significantly improving upon the initial analysis. To visualize this enhancement in confidence levels:
plot_confidence_in(Z)
For further exploration of EM tensor restoration, phase tensor analysis, strike plotting, data filtering, and more, users are encouraged to consult the detailed examples in the WATex documentation.
Citations
Should you find the WATex software beneficial for your research or any published work, we kindly ask you to cite the following article:
Kouadio, K.L., Liu, J., Liu, R., 2023. watex: machine learning research in water exploration. SoftwareX, 101367(2023). https://doi.org/10.1016/j.softx.2023.101367
In publications that mention WATex, acknowledging scikit-learn may also be relevant due to its integral role in the software's development.
For additional insights and examples, refer to our compilation of case history papers that utilized WATex.
Contributions
The development and success of WATex have been made possible through contributions from the following institutions:
- Department of Geophysics, School of Geosciences & Info-physics, Central South University, China.
- Hunan Key Laboratory of Nonferrous Resources and Geological Hazards Exploration, Changsha, Hunan, China.
- Laboratoire de Geologie, Ressources Minerales et Energetiques, UFR des Sciences de la Terre et des Ressources Minières, Université Félix Houphouët-Boigny, Côte d'Ivoire.
For inquiries, suggestions, or contributions, please reach out to the main developer, LKouadio at etanoyau@gmail.com.