Package to prepare well log data for ML projects.


akerbp.mlpet

Preprocessing tools for Petrophysics ML projects at Eureka

Installation

Install the package by running the following (requires Python 3.9 or later):

    pip install akerbp-mlpet

Quick start

For a short example of how to use the mlpet Dataset class for pre-processing data, see below. For more examples, as well as example settings.yaml files, please refer to the tests folder of this repository:

    import os
    from akerbp.mlpet import Dataset
    from akerbp.mlpet import utilities

    # Instantiate an empty dataset object using the example settings and mappings provided
    ds = Dataset(
            settings=os.path.abspath("settings.yaml"), # Absolute file paths are required
            folder_path=os.path.abspath(r"./"), # Absolute file paths are required
    )

    # Populate the dataset with data from a file (support exists for multiple file formats and direct CDF data retrieval)
    ds.load_from_pickle(r"data.pkl") # Absolute file paths are preferred

    # The original data will be kept in ds.df_original and will remain unchanged
    print(ds.df_original.head())

    # Split the data into train and test sets
    df_train, df_test = utilities.train_test_split(
            df=ds.df_original,
            target_column=ds.label_column,
            id_column=ds.id_column,
            test_size=0.3,
    )

    # Preprocess the data for training according to default workflow
    # print(ds.default_preprocessing_workflow) <- Uncomment to see what the workflow does
    df_preprocessed = ds.preprocess(df_train)

The procedure is exactly the same for any other dataset; the only difference lies in the settings. For a full list of possible settings keys, see either the built documentation or the akerbp.mlpet.Dataset class docstring. Make sure that the curve names are consistent with those in the dataset.

The loaded data is NOT mapped at load time but rather at preprocessing time (i.e. when preprocess is called).
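For orientation, a minimal settings.yaml might look like the sketch below. The key names shown are illustrative only (drawn from the attributes used in the quick start); consult the akerbp.mlpet.Dataset docstring or the tests folder for the authoritative schema:

```yaml
# Hypothetical sketch only -- verify key names against the Dataset docstring
id_column: well_name       # column identifying each well
label_column: LITHOLOGY    # target column used for training
curves:                    # curve names must match those in the dataset
  - GR
  - RHOB
  - NPHI
keep_columns:              # columns to load but leave out of preprocessing
  - DEPTH
```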

Recommended workflow for preprocessing

Due to the operations performed by certain preprocessing methods in akerbp.mlpet, the order in which the different preprocessing steps are applied can sometimes be important for achieving the desired results. Below is a simple guide that should be followed for most use cases:

  1. Misrepresented missing data should always be handled first (using set_as_nan).
  2. This should be followed by the data cleaning methods (e.g. remove_outliers, remove_noise, remove_small_negative_values).
  3. Depending on your use case, once the data is clean you can impute missing values (see imputers.py). Note however that some features depend on the presence of missing values to provide better estimates (e.g. calculate_VSH).
  4. Add new features (using methods from feature_engineering.py), or use process_wells from preprocessors.py if the features should be well specific.
  5. Fill any missing values that still exist or were created during step 4 (using fillna_with_fillers).
  6. Scale whichever features you want (using scale_curves from preprocessors.py). In some use cases this step can also come before step 5.
  7. Encode the GROUP & FORMATION columns if you want to use them for training (using encode_columns from preprocessors.py).
  8. Select or drop the specific features you want to keep for model training (using select_columns or drop_columns from preprocessors.py).
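The mlpet helpers named above each have their own signatures (see the API documentation). As a library-independent illustration of why this ordering matters, the same sequence can be sketched with plain pandas on a toy dataframe; this is not the mlpet API, just the ordering principle:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "GR":   [55.0, -9999.0, 60.0, 5000.0, 58.0],  # -9999 is a sentinel, 5000 an outlier
    "RHOB": [2.3, 2.4, -0.01, 2.5, 2.45],         # small negative value is invalid
    "GROUP": ["A", "A", "B", "B", "B"],
})

# 1. Misrepresented missing data first: turn sentinels into real NaNs
df = df.replace(-9999.0, np.nan)

# 2. Clean: mark outliers and small negative values as missing
df.loc[df["GR"] > 1000, "GR"] = np.nan
df.loc[df["RHOB"] < 0, "RHOB"] = np.nan

# 3. Impute missing values (here: column median)
df[["GR", "RHOB"]] = df[["GR", "RHOB"]].fillna(df[["GR", "RHOB"]].median())

# 4. Add a new feature from the cleaned curves
df["GR_RHOB"] = df["GR"] * df["RHOB"]

# 5. Fill anything still missing with a filler value
df = df.fillna(0.0)

# 6. Scale the numeric features (min-max)
for col in ["GR", "RHOB", "GR_RHOB"]:
    lo, hi = df[col].min(), df[col].max()
    df[col] = (df[col] - lo) / (hi - lo)

# 7. Encode the categorical GROUP column
df["GROUP"] = df["GROUP"].astype("category").cat.codes

# 8. Select the features to keep for training
df = df[["GR", "RHOB", "GR_RHOB", "GROUP"]]
```

Running the cleaning steps before imputation is what makes the sentinel and the outlier disappear from the median; swapping steps 1-2 with step 3 would leak them into the imputed values.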

NOTE: The Dataset class drops all input columns that are not explicitly named in the settings.yaml file or settings dictionary passed to it at instantiation. This ensures that the data is not polluted with unused features. If you have features that are loaded into the Dataset class but are not preprocessed, they must therefore be explicitly listed under the keep_columns key in your settings.yaml file or settings dictionary.

API Documentation

Full API documentation of the package can be found under the docs folder once you have run the make html command (see the For developers section below).

For developers

This repository uses uv for dependency management.

  • create or update the local environment with all optional groups:

      uv sync --all-groups
    
  • run checks in the uv environment:

      uv run pytest
      uv run ruff check .
      uv run ruff format .
      uv run mypy --config-file pyproject.toml
    
  • to make the API documentation, from the root directory of the project run:

      cd docs/
      uv run make html
    
  • requirements.txt is generated from the lockfile; update it with:

      uv export --all-groups --no-hashes --no-editable --no-annotate --format requirements-txt --output-file requirements.txt
    
  • to install mlpet in editable mode in another environment, use:

      uv pip install -e /path/to/expres-ml-mlpet
      # or: pip install -e .
    

License

akerbp.mlpet Copyright 2021 AkerBP ASA

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

