Photovoltaic per site modeling

Photovoltaic (PV) Site Prediction Model

ease of contribution: hard

This repo contains code to train and evaluate a model that forecasts the energy production of solar panels (PV). It provides a framework to forecast ahead using PV data from sites, weather data (NWPs as multidimensional geospatial zarrs) and satellite imagery (from the EUMETSAT geostationary satellite).

Organisation of the repo

.
├── exp_reports         # Experiment reports - markdown notes about experiments we have made
├── exp_results         # Default output for the {train,eval}_model.py scripts
├── notebooks           # Diverse notebooks
├── data                # Placeholder for data files
├── dashboards          # Experimental streamlit dashboards
└── psp                 # Main python package
    ├── clients         # Client specific code
    ├── data_sources    # Data sources (PV, NWP, Satellite, etc.)
    ├── exp_configs     # Experimentation configs - a config defines the different options for
    │                   # training and evaluation models. This directory contains many ready
    │                   # configs where the paths point to the data on Leonardo.
    ├── models          # The machine learning code
    ├── scripts         # Scripts (entry points)
    └── tests           # Unit tests

Training and evaluating a model

poetry run python psp/scripts/train_model.py \
    --exp-config-name test_config1 \
    -n test

poetry run python psp/scripts/eval_model.py \
    -n test

# This will have generated a model and test results in `exp_results/test`.

# You can then look at the results in the `expriment_analysis.ipynb` and
# `sample_analysis.ipynb` notebooks by setting EXP_NAMES=["test"] in the first cells.

# Call the scripts with `--help` to see more options, in particular to run on more than one CPU.

# The script run_exp.sh can be used to train and then evaluate a model, for example
./run_exp.sh exp_config_to_use name_for_exp

Setting up the experiment configuration

The configuration and parameters for a specific model setup are defined in a Python file saved under psp/exp_configs. In this configuration file you can:

Set up data sources

  • Set the location for the PV data and:
    • Set the names of the different variables in this dataset.
    • Set a lag associated with using PV data in real time.
  • Specify how to use the tilt, orientation and capacity metadata.
  • Set the location for the NWP sources, define the coordinate system they are in, rename variables and set the variables that the model should use.
  • Specify the location for the satellite imagery data:
    • Also indicate the patch size used for satellite images.

Set up specific model configuration

  • Set the interval between forecasts (duration) and the number of horizons to forecast for (num_horizons).
  • Choose whether to normalise the target and features.
  • Select the number of training samples.
  • Choose how much NWP and PV data to drop (set to NaNs during training) to help the model handle NaNs in production.
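
The last two ideas, forecast horizons and NaN dropout, can be sketched in a few lines. This is only an illustration under stated assumptions: `make_horizons` and `drop_to_nan` are hypothetical helpers, not functions from the psp package, and the sketch assumes each horizon is a window of `duration` minutes starting right after the previous one.

```python
import math
import random


def make_horizons(duration_minutes, num_horizons):
    """Return (start, end) minute offsets for each forecast horizon.

    Hypothetical helper: assumes consecutive, equally sized windows.
    """
    return [
        (i * duration_minutes, (i + 1) * duration_minutes)
        for i in range(num_horizons)
    ]


def drop_to_nan(values, drop_probability, rng):
    """Replace each value with NaN with probability `drop_probability`.

    Hypothetical helper mimicking data dropout during training.
    """
    return [
        math.nan if rng.random() < drop_probability else value
        for value in values
    ]


# Four 15-minute horizons covering the next hour.
horizons = make_horizons(duration_minutes=15, num_horizons=4)

# Randomly blank out roughly half of some illustrative PV feature values.
rng = random.Random(0)
pv_features = drop_to_nan([1.0, 2.0, 3.0, 4.0], drop_probability=0.5, rng=rng)
```

Training on samples where some inputs are already NaN means the model has seen missing data before it meets it in production.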

PV inputs

This model forecasts the power produced by a specific solar site. When forecasting at 15-minute intervals, it is best to train on 15-minute data, which may require resampling the data. The timestamp associated with each generation value should be the middle of its window. More information on how the model resamples the PV data can be found in the training.py file.
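
As a rough illustration of mid-window timestamps, the sketch below averages readings into 15-minute windows and labels each window by its midpoint. It is not the resampling code from training.py: `resample_mid_window` is a hypothetical helper, and it assumes a reading stamped 12:00 belongs to the window starting at 12:00 and that the window length divides a day evenly.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_mid_window(readings, window_minutes=15):
    """Average (timestamp, power) readings into fixed windows.

    Each output timestamp is the middle of its window.
    """
    window = timedelta(minutes=window_minutes)
    buckets = defaultdict(list)
    for ts, power in readings:
        # Floor the timestamp to the start of its window, counted from midnight.
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        index = (ts - midnight) // window
        buckets[midnight + index * window].append(power)
    return [
        (start + window / 2, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Illustrative 5-minute readings (made-up values).
readings = [
    (datetime(2023, 6, 1, 12, 0), 1.0),
    (datetime(2023, 6, 1, 12, 5), 2.0),
    (datetime(2023, 6, 1, 12, 10), 3.0),
    (datetime(2023, 6, 1, 12, 15), 4.0),
    (datetime(2023, 6, 1, 12, 20), 5.0),
    (datetime(2023, 6, 1, 12, 25), 6.0),
]
resampled = resample_mid_window(readings)
# The first window, 12:00 to 12:15, is labelled 12:07:30.
```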

Inputs to the model:

  • PV features such as the generation over the last 30 minutes and what happened in the previous days. (This can be modified in the recent_history.py model)
  • Additional PV data can be passed as a feature. Respective PV lags should be used to simulate production conditions.
  • Clear sky irradiance is calculated using PVLib to give the total irradiance on the site's Plane Of Array (POA). It is used both as a feature and to normalise the PV data.
  • Recent PV power is added as a feature. It is calculated from recent_power_minutes, set in the recent_history class, as the average of the data available within the last recent_power_minutes.
  • num_days_history can also be set to calculate the historical mean, median and maximum at that time over the past number of days.
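
The recent-power and history features described above can be sketched as follows. This is a minimal sketch, not the code from recent_history.py: `recent_power` and `history_stats` are hypothetical helpers, time is expressed as plain minute offsets, and missing values are assumed to be NaN.

```python
import math
from statistics import mean, median


def recent_power(samples, now_minute, recent_power_minutes):
    """Mean of the non-NaN power values in the last `recent_power_minutes`.

    `samples` is a list of (minute, power) pairs.
    """
    recent = [
        power
        for minute, power in samples
        if now_minute - recent_power_minutes < minute <= now_minute
        and not math.isnan(power)
    ]
    return mean(recent) if recent else math.nan


def history_stats(same_time_values):
    """Mean, median and max of the power at the same time on previous days."""
    valid = [v for v in same_time_values if not math.isnan(v)]
    if not valid:
        return {"mean": math.nan, "median": math.nan, "max": math.nan}
    return {"mean": mean(valid), "median": median(valid), "max": max(valid)}


# Illustrative values: one reading is missing (NaN) and is ignored.
samples = [(0, 1.0), (10, math.nan), (20, 3.0), (30, 5.0)]
recent = recent_power(samples, now_minute=30, recent_power_minutes=30)

# Power at the same time of day over the past four days, one day missing.
stats = history_stats([2.0, 4.0, math.nan, 9.0])
```

Averaging only the values that are available keeps the feature defined when some recent readings are missing.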

Training, Validating and Testing

Training, validation and testing can be split across different pv_ids; the ratios can be specified in the make_pv_splits function in the experiment configuration. This way the model is trained on one set of pv_ids and then validated and tested on an unseen set of pv_ids.

During training, these pv_ids are output. The time range is the same for the train and validation pv_id sets.

When there is only a single site or a small number of sites, pv_split=None should be passed in the config to avoid splitting up the pv_ids.
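
To make the idea of a ratio-based split concrete, here is a minimal sketch. It is not the make_pv_splits implementation: `split_pv_ids` and its parameters are hypothetical, and the real function may shuffle, round or name the splits differently.

```python
import random


def split_pv_ids(pv_ids, train_ratio, valid_ratio, seed=0):
    """Split pv_ids into disjoint train, valid and test sets.

    Hypothetical helper: whatever is left after train and valid goes to test.
    """
    ids = sorted(pv_ids)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_ratio)
    n_valid = int(len(ids) * valid_ratio)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        "test": ids[n_train + n_valid:],
    }


# Ten illustrative site ids split 80 / 10 / 10.
splits = split_pv_ids(range(10), train_ratio=0.8, valid_ratio=0.1)
```

Because the three sets are disjoint, validation and test scores measure how well the model generalises to sites it has never seen.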

Inference and Backtests

When the eval_model.py script is run, different parameters can be passed to specify the conditions of the backtest. One option is to simulate a backtest without features generated from live PV data.

To do this, the --no-live-pv flag can be used when running the script, which sets the live PV features to NaNs. If this flag is used, it is important that the model has seen some NaN PV values during training.
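
Conceptually, removing live PV amounts to overwriting the PV-derived features with NaN before they reach the model. The sketch below illustrates that idea only: `mask_live_pv` and the feature names are hypothetical and do not reflect how eval_model.py implements the flag.

```python
import math


def mask_live_pv(features, live_pv_keys):
    """Return a copy of `features` with the live PV entries set to NaN."""
    return {
        name: (math.nan if name in live_pv_keys else value)
        for name, value in features.items()
    }


# Hypothetical feature names and values for a single sample.
features = {"recent_power": 0.4, "h_max": 0.9, "poa_global": 650.0}
masked = mask_live_pv(features, live_pv_keys={"recent_power"})
```

Returning a copy leaves the original features untouched, so the same sample can be evaluated with and without live PV.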

Prerequisites

Development

# Installation of the dependencies.
poetry install

# Formatting
make format

# Linting
make lint

# Running the tests.
make test

# Starting the jupyter notebooks.
make notebook
