GCS driver for the Khiops tool
Khiops driver for Google Cloud Storage aka GCS
This repository hosts the source code for the Khiops filesystem driver, which enables transparent manipulation of data stored in GCS buckets.
Quickstart
If you just want to start using Khiops with your data located on GCS, simply install the driver package next to Khiops. If you installed Khiops the standard way, the driver package can be installed via conda like so:
conda install -c conda-forge khiops-driver-gcs
If you installed Khiops with your system package manager instead, install the driver the same way. On Debian/Ubuntu:
CODENAME=$(lsb_release -cs) && \
TEMP_DEB="$(mktemp)" && \
wget -O "$TEMP_DEB" "https://github.com/KhiopsML/khiopsdriver-gcs/releases/download/0.0.14/khiops-driver-gcs_0.0.14-1-${CODENAME}.amd64.deb" && \
sudo dpkg -i "$TEMP_DEB" && \
rm -f "$TEMP_DEB"
On Rocky Linux:
sudo yum update -y && sudo yum install wget -y && \
CENTOS_VERSION=$(rpm -E %{rhel}) && \
TEMP_RPM="$(mktemp).rpm" && \
wget -O "$TEMP_RPM" "https://github.com/KhiopsML/khiopsdriver-gcs/releases/download/0.0.14/khiops-driver-gcs_0.0.14-1.el${CENTOS_VERSION}.x86_64.rpm" && \
sudo yum install "$TEMP_RPM" -y && \
rm -f "$TEMP_RPM"
You can check that the driver is installed properly by running:
khiops -s
You should see an output similar to this:
Khiops 10.3.0
Drivers:
GCS driver (0.0.14) for URI scheme 'gs'
Environment variables:
None
Internal environment variables:
None
which indicates that the driver was loaded properly and will be used for datafiles following the gs:// pattern.
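If you want to run this check from a script, the status output can be parsed for driver lines. The sketch below assumes the output keeps the line format shown above; the function names list_drivers and has_scheme are illustrative and are not part of Khiops.

```python
import re

# Matches a driver line such as: GCS driver (0.0.14) for URI scheme 'gs'
# (format taken from the sample output above; it may differ in other versions)
DRIVER_LINE = re.compile(
    r"^\s*(?P<name>.+?) \((?P<version>[^)]+)\) for URI scheme '(?P<scheme>[^']+)'\s*$"
)


def list_drivers(status_text):
    """Return (name, version, scheme) tuples found in `khiops -s` output."""
    drivers = []
    for line in status_text.splitlines():
        match = DRIVER_LINE.match(line)
        if match:
            drivers.append((match["name"], match["version"], match["scheme"]))
    return drivers


def has_scheme(status_text, scheme="gs"):
    """True if one of the listed drivers handles the given URI scheme."""
    return any(found == scheme for _, _, found in list_drivers(status_text))
```

The text to parse can be captured with subprocess.run(["khiops", "-s"], capture_output=True, text=True).stdout and passed to has_scheme.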
Authentication
To access data stored in a GCS bucket, valid authentication is required in most cases. By default, the Khiops GCS driver uses the standard Application Default Credentials (ADC) mechanism. This means that once valid credentials are set up in your environment, Khiops uses them exactly as your Python scripts or Google-provided tools such as gcloud or gsutil do.
To set up your local environment with these credentials (assuming you have installed the gcloud CLI), run the following:
gcloud init
gcloud auth application-default login
Voilà! You now have access to your data in GCS buckets! The exact same authentication mechanism will allow a containerized Khiops script to run on the Google infrastructure.
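To confirm that local credentials are in place before launching Khiops, you can look for the credentials file that ADC reads. This is a minimal sketch, not the driver's own lookup: it only checks the GOOGLE_APPLICATION_CREDENTIALS variable and the default file written by gcloud, and it assumes the gcloud configuration directory has not been relocated. The function name find_adc_file is illustrative. On Google infrastructure, credentials usually come from the attached service account instead, so a None result there does not mean authentication will fail.

```python
import os
from pathlib import Path


def find_adc_file(environ=None, home=None):
    """Return the path of a local ADC credentials file, or None if none is found."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # 1. An explicitly configured credentials file takes precedence.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None

    # 2. Default file written by `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(environ.get("APPDATA", home)) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    candidate = base / "application_default_credentials.json"
    return candidate if candidate.is_file() else None
```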
Logging
You can log information, warnings, errors and debug traces to a file using the following environment variables (they must both be defined to log anything):
- GCS_DRIVER_LOGLEVEL: available values are off, critical, error, warning, info, debug, trace (these are the level names of the spdlog logging library).
- GCS_DRIVER_LOGFILE: path to the log file (which does not need to already exist).
Tip: you can set GCS_DRIVER_LOGFILE to /dev/stderr or /dev/stdout if you want to log to standard error or standard output, respectively.
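When Khiops is launched from Python, both variables can be set on the child process environment. The helper below is a sketch under that assumption; the name driver_logging_env is illustrative, and the list of levels is copied from the description above.

```python
import os

# Level names as listed for GCS_DRIVER_LOGLEVEL above.
LOG_LEVELS = ("off", "critical", "error", "warning", "info", "debug", "trace")


def driver_logging_env(level, logfile, base_env=None):
    """Return a copy of the environment with both driver logging variables set."""
    if level not in LOG_LEVELS:
        raise ValueError(f"unknown log level {level!r}; expected one of {LOG_LEVELS}")
    env = dict(os.environ if base_env is None else base_env)
    # Both variables must be defined, otherwise nothing is logged.
    env["GCS_DRIVER_LOGLEVEL"] = level
    env["GCS_DRIVER_LOGFILE"] = str(logfile)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when starting khiops.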
Example usage
Khiops usage (low level)
khiops -b -i gs://mydatabucket/khiops_samples/scenario.kh
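The same call can be assembled from Python. The sketch below only builds the argument list and checks that the scenario path looks like gs://bucket/object; the helper name khiops_batch_command is illustrative, and the gs-only restriction is a choice made for this example, not a Khiops requirement.

```python
from urllib.parse import urlsplit


def khiops_batch_command(scenario_uri):
    """Build the argument list for `khiops -b -i <scenario>` after checking the URI."""
    parts = urlsplit(scenario_uri)
    # Require a gs:// scheme, a bucket name and a non-empty object path.
    if parts.scheme != "gs" or not parts.netloc or not parts.path.strip("/"):
        raise ValueError(f"expected gs://<bucket>/<object>, got {scenario_uri!r}")
    return ["khiops", "-b", "-i", scenario_uri]
```

The resulting list can be handed to subprocess.run(..., check=True) to execute the scenario.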
Python sample
# Imports
import os
from khiops import core as kh
# Set the file paths
dictionary_file_path = "gs://mydatabucket/khiops_samples/Adult/Adult.kdic"
data_table_path = "gs://mydatabucket/khiops_samples/Adult/Adult.txt"
results_dir = "khiops_output"
# Train the predictor
kh.train_predictor(
    dictionary_file_path,
    "Adult",
    data_table_path,
    "class",
    results_dir,
    max_trees=0,
)
Development: Coverage reports
Coverage targets are available on Linux in non-Release builds when BUILD_TESTS=ON.
Tests are executed through ctest so coverage matches the test registry.
Configure and build in Debug mode:
cmake --preset ninja-dbg -DBUILD_TESTS=ON
cmake --build --preset ninja-dbg
Run tests directly with ctest (optional baseline check):
ctest --preset ninja-dbg --output-on-failure
Generate unit-only coverage (tests labeled unit):
cmake --build --preset ninja-dbg --target khiops-gcs_coverage_unit
cmake --build --preset ninja-dbg --target khiops-gcs_cobertura_unit
Generate full coverage (all tests known by ctest):
cmake --build --preset ninja-dbg --target khiops-gcs_coverage_full
cmake --build --preset ninja-dbg --target khiops-gcs_cobertura_full
Artifacts are generated under build/debug/:
- HTML reports: build/debug/coverage-unit/index.html and build/debug/coverage-full/index.html
- Cobertura XML: build/debug/coverage-unit.xml and build/debug/coverage-full.xml
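For a quick look at the overall figure without opening the HTML report, the Cobertura XML can be read directly. This sketch assumes the standard Cobertura layout, where the root coverage element carries a line-rate attribute between 0 and 1; the function name line_coverage_percent is illustrative.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(cobertura_path):
    """Return the overall line coverage (0-100) stored in a Cobertura XML file."""
    root = ET.parse(cobertura_path).getroot()
    if root.tag != "coverage" or "line-rate" not in root.attrib:
        raise ValueError("not a Cobertura report: missing <coverage line-rate=...>")
    # line-rate is a fraction; convert it to a percentage.
    return 100.0 * float(root.attrib["line-rate"])
```

For example, line_coverage_percent("build/debug/coverage-unit.xml") reads the unit-only report listed above.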
Legacy targets are still available and map to full coverage:
cmake --build --preset ninja-dbg --target khiops-gcs_coverage
cmake --build --preset ninja-dbg --target khiops-gcs_cobertura
Development: GitHub CI coverage UX
Coverage reporting in CI uses only native GitHub capabilities (no external service).
On Linux workflow runs:
- The workflow writes a Coverage Report section to the run summary.
- Pull requests receive a single updatable comment with current coverage status.
- Coverage artifacts are uploaded only when the expected reports are generated.
Artifact names in GitHub Actions:
- coverage-unit-ubuntu-latest
- coverage-full-ubuntu-latest
If coverage generation fails or is skipped, the upload is skipped as well, and the summary and comment explicitly indicate the missing reports.
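The "upload only when the expected reports exist" rule can be reproduced locally before pushing. The sketch below is not the workflow's actual logic; it only checks for the report paths listed in the coverage section, relative to a build directory, and the names EXPECTED_REPORTS and reports_ready are illustrative.

```python
from pathlib import Path

# Report paths relative to the build directory (build/debug in the commands above).
EXPECTED_REPORTS = {
    "unit": ("coverage-unit/index.html", "coverage-unit.xml"),
    "full": ("coverage-full/index.html", "coverage-full.xml"),
}


def reports_ready(build_dir, kind):
    """True only if every expected report of the given kind exists in build_dir."""
    return all((Path(build_dir) / rel).is_file() for rel in EXPECTED_REPORTS[kind])
```

Calling reports_ready("build/debug", "unit") after the coverage targets have run tells you whether the unit artifact would have anything to upload.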