# spark-fluvius
PySpark custom data sources for the Fluvius Energy API.
Read energy measurements and mandates directly into Spark DataFrames.
## Installation

```bash
pip install pyspark-fluvius
```
## Quick Start

```python
from pyspark.sql import SparkSession
from pyspark_fluvius import register_datasources

# Create SparkSession
spark = SparkSession.builder.appName("MyApp").getOrCreate()

# Register Fluvius data sources
register_datasources()

# Read mandates
mandates_df = spark.read.format("fluvius.mandates") \
    .option("status", "Approved") \
    .load()

# Read energy data
energy_df = spark.read.format("fluvius.energy") \
    .option("ean", "541234567890123456") \
    .option("period_type", "readTime") \
    .option("granularity", "daily") \
    .option("from_date", "2024-01-01") \
    .option("to_date", "2024-01-31") \
    .load()
```
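
The energy read takes `ean` and `period_type` plus optional date bounds. A minimal sketch of a helper that assembles and checks these options before they are passed to Spark is shown here. `energy_options` is a hypothetical name and not part of pyspark-fluvius; the rule that `from_date` must not be later than `to_date` is an assumption about the API, not documented behaviour.

```python
from datetime import date


def energy_options(ean, period_type, from_date=None, to_date=None, **extra):
    """Build an options dict for the fluvius.energy source (hypothetical helper)."""
    # period_type values are the two listed in this README.
    if period_type not in ("readTime", "insertTime"):
        raise ValueError(f"period_type must be 'readTime' or 'insertTime', got {period_type!r}")

    options = {"ean": ean, "period_type": period_type, **extra}

    # date.fromisoformat raises ValueError on malformed or impossible dates.
    if from_date is not None:
        options["from_date"] = date.fromisoformat(from_date).isoformat()
    if to_date is not None:
        options["to_date"] = date.fromisoformat(to_date).isoformat()

    # Assumption: the range must be ordered. ISO date strings sort chronologically.
    if from_date is not None and to_date is not None:
        if options["from_date"] > options["to_date"]:
            raise ValueError("from_date must not be after to_date")
    return options
```

The resulting dict can be handed to the reader in one call, for example `spark.read.format("fluvius.energy").options(**energy_options("541234567890123456", "readTime", "2024-01-01", "2024-01-31")).load()`.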
## Authentication

Credentials can be provided via environment variables or Spark options.

### Environment Variables

```bash
# Required
export FLUVIUS_SUBSCRIPTION_KEY="your-subscription-key"
export FLUVIUS_CLIENT_ID="your-client-id"
export FLUVIUS_TENANT_ID="your-tenant-id"
export FLUVIUS_SCOPE="your-scope"
export FLUVIUS_DATA_ACCESS_CONTRACT_NUMBER="your-contract-number"

# For sandbox (client secret auth)
export FLUVIUS_CLIENT_SECRET="your-client-secret"

# For production (certificate auth)
export FLUVIUS_CERTIFICATE_THUMBPRINT="your-thumbprint"
export FLUVIUS_PRIVATE_KEY="-----BEGIN RSA PRIVATE KEY-----..."
# Or use a file path:
export FLUVIUS_PRIVATE_KEY_PATH="/path/to/private_key.pem"
```
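
Before starting a job it can help to confirm that these variables are actually set. The following is a minimal sketch of such a check. The variable names come from the list above, but `missing_fluvius_vars` is a hypothetical helper and not part of pyspark-fluvius, and the rule that either the client secret or the certificate pair must be present is an assumption drawn from the sandbox/production comments.

```python
import os

# Names taken from the environment variable list in this README.
REQUIRED_VARS = (
    "FLUVIUS_SUBSCRIPTION_KEY",
    "FLUVIUS_CLIENT_ID",
    "FLUVIUS_TENANT_ID",
    "FLUVIUS_SCOPE",
    "FLUVIUS_DATA_ACCESS_CONTRACT_NUMBER",
)


def missing_fluvius_vars(env=None):
    """Return the credential settings that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]

    # Assumption: one of the two authentication methods must be fully configured.
    has_secret = bool(env.get("FLUVIUS_CLIENT_SECRET"))
    has_key = bool(env.get("FLUVIUS_PRIVATE_KEY") or env.get("FLUVIUS_PRIVATE_KEY_PATH"))
    has_certificate = bool(env.get("FLUVIUS_CERTIFICATE_THUMBPRINT")) and has_key
    if not (has_secret or has_certificate):
        missing.append("FLUVIUS_CLIENT_SECRET or certificate settings")
    return missing
```

Calling `missing_fluvius_vars()` with no argument inspects the current process environment; passing a dict makes the check easy to unit test.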
### Spark Options

```python
df = spark.read.format("fluvius.mandates") \
    .option("subscription_key", "...") \
    .option("client_id", "...") \
    .option("tenant_id", "...") \
    .option("scope", "...") \
    .option("data_access_contract_number", "...") \
    .option("client_secret", "...") \
    .load()
```
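
When credentials arrive as a mapping (for example, values fetched from a secrets store under the same names as the environment variables), the option chain can be built from a dict instead of repeated `.option` calls. This is a sketch only: `credential_options` is a hypothetical helper, not part of pyspark-fluvius, and it covers just the six option names that appear in this README.

```python
import os

# Environment variable name -> Spark option name, as both appear in this README.
ENV_TO_OPTION = {
    "FLUVIUS_SUBSCRIPTION_KEY": "subscription_key",
    "FLUVIUS_CLIENT_ID": "client_id",
    "FLUVIUS_TENANT_ID": "tenant_id",
    "FLUVIUS_SCOPE": "scope",
    "FLUVIUS_DATA_ACCESS_CONTRACT_NUMBER": "data_access_contract_number",
    "FLUVIUS_CLIENT_SECRET": "client_secret",
}


def credential_options(source=None):
    """Translate a mapping of FLUVIUS_* values into Spark reader options."""
    source = os.environ if source is None else source
    # Skip entries that are missing or empty so they are not sent as blank options.
    return {
        option: source[var]
        for var, option in ENV_TO_OPTION.items()
        if source.get(var)
    }
```

The dict can then be applied with PySpark's `DataFrameReader.options`, for example `spark.read.format("fluvius.mandates").options(**credential_options(secrets)).load()`, where `secrets` is whatever mapping holds the values.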
## Data Sources

### fluvius.mandates

Read mandate data from the Fluvius API.

Options:

| Option | Description |
|---|---|
| reference_number | Filter by custom reference number |
| ean | Filter by GSRN EAN-code |
| data_service_types | Comma-separated list (e.g., "VH_dag,VH_kwartier_uur") |
| energy_type | "E" (electricity) or "G" (gas) |
| status | Requested, Approved, Rejected, or Finished |
| mandate_expiration_date | ISO format date filter |
| renewal_status | ToBeRenewed, RenewalRequested, or Expired |
| last_updated_from | ISO format datetime |
| last_updated_to | ISO format datetime |
| environment | "sandbox" (default) or "production" |

Schema:
| Column | Type |
|---|---|
| reference_number | string |
| status | string |
| ean | string |
| energy_type | string |
| data_period_from | timestamp |
| data_period_to | timestamp |
| data_service_type | string |
| mandate_expiration_date | timestamp |
| renewal_status | string |
### fluvius.energy

Read energy measurement data from the Fluvius API.

Required Options:

| Option | Description |
|---|---|
| ean | GSRN EAN-code (required) |
| period_type | "readTime" or "insertTime" (required) |

Optional Options:

| Option | Description |
|---|---|
| reference_number | Custom reference number |
| granularity | e.g., "daily", "hourly_quarterhourly" |
| complex_energy_types | e.g., "active,reactive" |
| from_date | ISO format date (e.g., "2024-01-01") |
| to_date | ISO format date (e.g., "2024-01-31") |
| environment | "sandbox" (default) or "production" |

Schema:
| Column | Type | Description |
|---|---|---|
| ean | string | EAN code of the installation |
| energy_type | string | "E" or "G" |
| metering_type | string | Type of metering installation |
| measurement_start | timestamp | Start of measurement period |
| measurement_end | timestamp | End of measurement period |
| granularity | string | daily, hourly, or quarter_hourly |
| meter_seq_number | string | Physical meter sequence (if applicable) |
| meter_id | string | Physical meter ID (if applicable) |
| subheadpoint_ean | string | Subheadpoint EAN (for submetering) |
| subheadpoint_type | string | auxiliary, offtake, or production |
| subheadpoint_seq_number | string | Subheadpoint sequence number |
| offtake_total_value | double | Offtake measurement value |
| offtake_total_unit | string | Measurement unit (e.g., kWh) |
| offtake_total_validation_state | string | READ, EST, VAL, or NVAL |
| offtake_total_gas_conversion_factor | string | P, D, or C (gas only) |
| offtake_day_value | double | Day tariff offtake |
| offtake_day_unit | string | Unit |
| offtake_day_validation_state | string | Validation state |
| offtake_night_value | double | Night tariff offtake |
| offtake_night_unit | string | Unit |
| offtake_night_validation_state | string | Validation state |
| injection_total_value | double | Injection measurement |
| injection_total_unit | string | Unit |
| injection_total_validation_state | string | Validation state |
| injection_day_value | double | Day tariff injection |
| injection_day_unit | string | Unit |
| injection_day_validation_state | string | Validation state |
| injection_night_value | double | Night tariff injection |
| injection_night_unit | string | Unit |
| injection_night_validation_state | string | Validation state |
| production_total_value | double | Production measurement |
| production_total_unit | string | Unit |
| production_total_validation_state | string | Validation state |
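
The measurement columns in this schema follow a `<direction>_<tariff>_<field>` naming pattern, which makes it possible to generate column lists instead of typing them out. The sketch below reproduces the names from the table; `measurement_columns` is a hypothetical helper, not part of pyspark-fluvius, and it does not cover `offtake_total_gas_conversion_factor`, which falls outside the pattern.

```python
# Direction -> tariffs, as they appear in the schema table.
TARIFFS = {
    "offtake": ("total", "day", "night"),
    "injection": ("total", "day", "night"),
    "production": ("total",),
}
FIELDS = ("value", "unit", "validation_state")


def measurement_columns(direction=None, field=None):
    """List measurement column names, optionally limited to one direction or field."""
    directions = TARIFFS if direction is None else {direction: TARIFFS[direction]}
    fields = FIELDS if field is None else (field,)
    return [
        f"{name}_{tariff}_{suffix}"
        for name, tariffs in directions.items()
        for tariff in tariffs
        for suffix in fields
    ]
```

For example, `energy_df.select("ean", "measurement_start", *measurement_columns(field="value"))` would keep only the numeric measurement columns alongside the identifying fields.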
## Requirements

- Python 3.13+
- PySpark 4.0+
- fluvius-energy-api 0.1.1+

## License

AGPL-3.0-or-later