A Python package for quality control (QC) checks on BSRN station-to-archive files.

bsrn

This GitHub repository is dazhiyang/bsrn: the source code and development tooling for the bsrn Python package.

bsrn is a community-developed toolbox that provides a set of robust functions and classes for processing and analyzing solar radiation data. The core mission of bsrn is to provide an open, reliable, interoperable, and benchmark-standard set of tools tailored specifically for the Baseline Surface Radiation Network (BSRN).

It features automated quality control (QC), high-precision solar geometry, clear-sky modeling, clear-sky detection (CSD), cloud enhancement event (CEE) detection, irradiance separation, and comprehensive data retrieval and visualization capabilities.

📖 Documentation (Read the Docs)

🚀 Getting Started

Installation

The core bsrn package is designed to be lightweight and fast. You can install it using pip:

From PyPI (stable release):

pip install bsrn

From GitHub (latest development version):

pip install git+https://github.com/dazhiyang/bsrn.git

Optional Visualization Tools

If you want to use the built-in plotting features (like data availability charts or clear-sky calendars), you will need to install the optional visualization dependencies (plotnine, matplotlib, and scipy):

pip install "bsrn[viz]"

Usage

For standard quality control and clear-sky modeling, simply import the base package:

import bsrn

# Access core modules like bsrn.qc, bsrn.modeling, bsrn.io

If you installed the [viz] extra and want to generate plots, you must explicitly import the visualization submodule:

import bsrn.visualization

# Access plotting tools like bsrn.visualization.calendar.plot_calendar()
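
When plotting is optional in a script, it can help to check for the extra before importing. The following is a minimal sketch using only the standard library. The helper name `has_module` is hypothetical and not part of bsrn, and the sketch assumes that plotnine being importable is a reasonable proxy for the [viz] extra being installed.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Assumption: plotnine being importable indicates the [viz] extra is present.
if has_module("bsrn") and has_module("plotnine"):
    import bsrn.visualization  # plotting tools become available
else:
    print("Plotting extras not installed; skipping figures.")
```

This keeps the QC part of a pipeline usable on machines where only the core package is installed.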

Quick Example (Single-File Workflow)

from bsrn.io.retrieval import download_bsrn_stn, get_bsrn_file_inventory
from bsrn.io.reader import read_station_to_archive
from bsrn.physics.geometry import add_solpos_columns
from bsrn.modeling.clear_sky import add_clearsky_columns
from bsrn.qc.wrapper import run_qc

# 1. See what data is available
inventory = get_bsrn_file_inventory(["QIQ"], username="your_user", password="your_pass")

# 2. Download data for a station
download_bsrn_stn("QIQ", "data/QIQ", username="your_user", password="your_pass")

# 3. Read a single monthly file (one file at a time)
df = read_station_to_archive("data/QIQ/qiq0124.dat.gz")

# 4. Add solar position (recommended before time-averaging or clear-sky)
df = add_solpos_columns(df, "QIQ")

# 5. Add clear-sky reference columns (defaults to Ineichen)
df = add_clearsky_columns(df, "QIQ")

# 6. Run Quality Control (QC)
df = run_qc(df, "QIQ")

# 7. Add satellite-derived CAMS CRS all-sky columns
from bsrn.io.crs import add_crs_columns
df = add_crs_columns(df, "QIQ")

# 8. Visualize with plotnine
from bsrn.visualization.clearsky_models import plot_clearsky_models
plot_clearsky_models(df, "QIQ", date="2024-06-20", save_path="clearsky_qiq.pdf")
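
Because the workflow handles one monthly file at a time, it is often useful to recover the station and month from a file name such as qiq0124.dat.gz. The sketch below is not part of bsrn: `parse_archive_filename` is a hypothetical helper, and the century rule (two-digit years 92 to 99 map to the 1990s, all others to the 2000s) is an assumption based on BSRN records starting in 1992.

```python
import re
from pathlib import Path

# Pattern xxxmmyy.dat(.gz): station code, two-digit month, two-digit year.
_NAME_RE = re.compile(r"^([a-z]{3})(\d{2})(\d{2})\.dat(?:\.gz)?$", re.IGNORECASE)


def parse_archive_filename(path):
    """Split a name like 'qiq0124.dat.gz' into (station, year, month)."""
    match = _NAME_RE.match(Path(path).name)
    if match is None:
        raise ValueError(f"not a station-to-archive file name: {path!r}")
    station, mm, yy = match.groups()
    month = int(mm)
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in file name: {path!r}")
    # Assumption: 92-99 means 1992-1999, anything else means 20yy.
    year = 1900 + int(yy) if int(yy) >= 92 else 2000 + int(yy)
    return station.upper(), year, month


print(parse_archive_filename("data/QIQ/qiq0124.dat.gz"))
```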

🛠 Features

The QC implementation is primarily based on the BSRN Operations Manual (2018) and Forstinger et al. (2021); see the code for other references. The QC tests are organized into six levels:

  • Level 1 (Physically Possible): Absolute physical bounds for $G_h, B_n, D_h$, and $L_d$.
  • Level 2 (Extremely Rare): Climatological limits for specific regimes.
  • Level 3 (Comparison): Consistency checks ($G_h$ vs $B_n \cos Z + D_h$) with zenith-dependent thresholds.
  • Level 4 (Diffuse Ratio): Diffuse-fraction and $k$–$k_t$ checks combining $G_h$, $D_h$, and extraterrestrial irradiance.
  • Level 5 (K-Indices): Advanced clearness-index and $k_b$/$k_t$ index tests using clear-sky benchmarks and site elevation.
  • Level 6 (Tracker-Off Detection): Identify tracking errors by comparing measured values with clear-sky and extraterrestrial irradiance.
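
To make Levels 1 and 3 concrete, here is a minimal standalone sketch. It does not use or reproduce the bsrn.qc code. The coefficients are the widely cited BSRN-recommended limits and are an assumption here; the package's actual thresholds, function names, and flag conventions may differ.

```python
import math


def ppl_upper_limits(zenith_deg, bni_extra):
    """Upper physically possible bounds (W/m^2) for GHI, DHI and BNI."""
    mu0 = max(math.cos(math.radians(zenith_deg)), 0.0)  # clamp below horizon
    return {
        "ghi": 1.5 * bni_extra * mu0 ** 1.2 + 100.0,
        "dhi": 0.95 * bni_extra * mu0 ** 1.2 + 50.0,
        "bni": bni_extra,
    }


def passes_ppl(ghi, dhi, bni, zenith_deg, bni_extra, lower=-4.0):
    """Level 1 style test: True per component if it lies within bounds."""
    upper = ppl_upper_limits(zenith_deg, bni_extra)
    values = {"ghi": ghi, "dhi": dhi, "bni": bni}
    return {name: lower <= v <= upper[name] for name, v in values.items()}


def passes_closure(ghi, dhi, bni, zenith_deg):
    """Level 3 style test: compare GHI with DHI + BNI * cos(Z).

    Returns None when the test is not applicable (low sun or low signal).
    """
    total = dhi + bni * math.cos(math.radians(zenith_deg))
    if zenith_deg >= 93.0 or total <= 50.0:
        return None
    tolerance = 0.08 if zenith_deg < 75.0 else 0.15  # zenith-dependent
    return abs(ghi / total - 1.0) <= tolerance


print(passes_ppl(800.0, 100.0, 800.0, 30.0, 1361.0))
print(passes_closure(800.0, 100.0, 800.0, 30.0))
```

Returning None for the non-applicable case keeps "not tested" distinct from "failed", which matters when flags from several levels are later combined.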

Other important features include:

  • Solar Geometry: Native NREL SPA implementation for high-precision solar position calculations.
  • Clear-Sky Models: Ineichen (monthly Linke turbidity), McClear (CAMS SoDa API, from 2004 onward), and REST2 (MERRA-2 from Hugging Face).
  • Satellite Data: Load CAMS solar radiation service (CRS) and National Solar Radiation Database (NSRDB) all-sky irradiance directly from Hugging Face into memory.
  • Clear-Sky Detection (CSD): Reno, Ineichen, Lefevre, and BrightSun methods to identify clear-sky periods from irradiance time series.
  • Cloud Enhancement Event (CEE) Detection: Killinger, Yang, and Gueymard methods to detect events when measured GHI significantly exceeds references.
  • Irradiance Separation: Erbs, BRL, Engerer2, and Yang4 models to estimate diffuse fraction and DHI/BNI from GHI.
  • Robust Retrieval: High-level API for FTP downloads from BSRN-AWI with exponential backoff retries (analysis functions assume one station-to-archive file at a time).
  • Station-to-archive formatting: The bsrn.archive subpackage provides logical-record specifications (LR_SPECS), Fortran-style validation, and ASCII get_bsrn_format output for BSRN header and data records (LR0001–LR4000), with BSRNRecord in api and concrete LR* classes in formatter.
  • Visualization: Data availability heatmaps and k vs kt separation plots via the very pretty plotnine (which reminds me of the good old R days).
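
As an illustration of what irradiance separation does, the sketch below implements the piecewise Erbs correlation in plain Python. It is a standalone sketch, not the bsrn.modeling.separation code; the function names and the handling of night-time values are assumptions.

```python
import math


def erbs_diffuse_fraction(kt):
    """Diffuse fraction k = DHI / GHI from clearness index kt (Erbs et al.)."""
    if kt < 0:
        raise ValueError("clearness index must be non-negative")
    if kt <= 0.22:
        return 1.0 - 0.09 * kt
    if kt <= 0.80:
        return (0.9511 - 0.1604 * kt + 4.388 * kt ** 2
                - 16.638 * kt ** 3 + 12.336 * kt ** 4)
    return 0.165


def separate_ghi(ghi, zenith_deg, bni_extra):
    """Estimate (dhi, bni) from GHI; returns (0.0, 0.0) when the sun is down."""
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0.0 or ghi <= 0.0:
        return 0.0, 0.0  # assumption: no separation at night
    kt = ghi / (bni_extra * cos_z)  # clearness index
    dhi = erbs_diffuse_fraction(kt) * ghi
    bni = (ghi - dhi) / cos_z  # closure: GHI = DHI + BNI * cos(Z)
    return dhi, bni


dhi, bni = separate_ghi(500.0, 60.0, 1361.0)
print(round(dhi, 1), round(bni, 1))
```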

📂 File Structure

Note: Not all files are tracked in Git. Data files and intermediate outputs are excluded via .gitignore.

bsrn-qc/
├── pyproject.toml
├── LICENSE
├── README.md
├── .gitignore
├── .readthedocs.yaml              # Read the Docs build config
├── src/
│   └── bsrn/
│       ├── __init__.py
│       ├── constants.py               # Station database, Linke turbidity & physical constants
│       ├── archive/                   # Station-to-archive logical records (WRMC-style LR layouts)
│       │   ├── specs.py               # LR_SPECS + station directory & A3–A7 code tables
│       │   ├── api.py                 # BSRNRecord (assignment validation); get_azimuth_elevation
│       │   ├── formatter.py           # LR0001–LR4000 classes, get_bsrn_format, lr0001_format helpers
│       │   └── validation.py          # Field validators (R validateFunc parity)
│       ├── io/
│       │   ├── reader.py              # Read xxxmmyy.dat.gz station-to-archive files
│       │   ├── retrieval.py           # FTP downloads with retries
│       │   ├── merra2.py              # MERRA-2 parquet fetch (Hugging Face → RAM)
│       │   ├── mcclear.py             # SoDa McClear client helpers
│       │   ├── crs.py                 # SoDa CAMS solar radiation service (CRS) client helpers
│       │   ├── nsrdb.py               # NREL NSRDB all-sky data client helpers
│       │   └── writers.py             # Export results
│       ├── physics/
│       │   ├── spa.py                 # Native NREL SPA (solar position algorithm)
│       │   └── geometry.py            # Solar position and extraterrestrial irradiance
│       ├── qc/
│       │   ├── ppl.py                 # Physically possible limits (Level 1)
│       │   ├── erl.py                 # Extremely rare limits (Level 2)
│       │   ├── closure.py             # Internal consistency checks (Level 3)
│       │   ├── diff_ratio.py          # Diffuse ratio checks (Level 4)
│       │   ├── k_index.py             # Radiometric index tests (Level 5)
│       │   ├── tracker.py             # Solar tracker off detection (Level 6)
│       │   └── wrapper.py             # High-level QC pipeline
│       ├── visualization/
│       │   ├── availability.py        # File coverage heatmaps (plotnine)
│       │   ├── qc_table.py            # QC result tables
│       │   ├── separation.py          # Decomposition visualization
│       │   └── timeseries.py          # Time series plots
│       ├── utils/
│       │   ├── calculations.py        # Supporting math
│       │   ├── quality.py             # Quality utilities
│       │   ├── clear_sky_detection.py # Clear-sky detection (Reno, Ineichen, Lefevre, BrightSun)
│       │   └── cee_detection.py       # Cloud enhancement detection (Killinger, Yang, Gueymard)
│       └── modeling/
│           ├── clear_sky.py           # Ineichen clear-sky model
│           └── separation.py          # Irradiance separation (Erbs, BRL, Engerer2, Yang4)
├── docs/
│   ├── conf.py                        # Sphinx config; source dir = docs/ (tutorials + sphinx/ RST)
│   ├── index.rst                      # Site homepage (root index.html for Read the Docs)
│   ├── requirements.txt               # Sphinx / Read the Docs dependencies
│   ├── examples/                      # Examples landing page (index.rst) + optional scripts
│   │   └── index.rst
│   ├── tutorials/                     # Jupyter tutorials + index.rst (nbsphinx)
│   │   ├── 1.data_downloading.ipynb
│   │   ├── 2.quality_control.ipynb
│   │   ├── 3.time_averaging.ipynb
│   │   ├── 4.clear_sky_detection.ipynb
│   │   └── 5.cloud_enhancement_event.ipynb
│   └── sphinx/                        # RST (user_guide, api, _static); not the doc homepage
│       ├── api/                       # API reference (io, qc, physics, …)
│       └── user_guide/                # installation, getting_started, package_overview, …

📖 Examples

Solar Position

import pandas as pd
from bsrn.physics.geometry import get_solar_position, get_bni_extra

times = pd.date_range("2024-07-01", periods=1440, freq="1min", tz="UTC")
solpos = get_solar_position(times, lat=47.80, lon=124.49, elev=170)

print(solpos[["zenith", "apparent_zenith", "azimuth"]].head())

Extraterrestrial Irradiance

from bsrn.physics.geometry import get_bni_extra

bni_extra = get_bni_extra(times)  # Spencer (1971) method
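
For readers unfamiliar with the Spencer (1971) approach, the following standalone sketch shows the Fourier-series eccentricity correction it refers to. It is not the package's implementation: the function name `spencer_bni_extra` is hypothetical, and the solar constant of 1361.0 W/m² is an assumed value that may differ from the one in bsrn.constants.

```python
import math
from datetime import date

SOLAR_CONSTANT = 1361.0  # W/m^2; assumed value, not taken from bsrn


def spencer_bni_extra(day_of_year, solar_constant=SOLAR_CONSTANT):
    """Extraterrestrial normal irradiance (W/m^2) for a given day of year."""
    gamma = 2.0 * math.pi * (day_of_year - 1) / 365.0  # day angle, radians
    # Spencer's series for the Earth-Sun distance correction factor.
    e0 = (1.000110
          + 0.034221 * math.cos(gamma)
          + 0.001280 * math.sin(gamma)
          + 0.000719 * math.cos(2.0 * gamma)
          + 0.000077 * math.sin(2.0 * gamma))
    return solar_constant * e0


doy = date(2024, 7, 1).timetuple().tm_yday
print(doy, round(spencer_bni_extra(doy), 1))
```

The correction is largest in early January (perihelion) and smallest in early July (aphelion), so the value for 1 July sits a few percent below the solar constant.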

Clear-Sky GHI (Ineichen)

from bsrn.modeling.clear_sky import add_clearsky_columns

# Solar geometry is computed automatically if missing, but calling
# `add_solpos_columns(df, "QIQ")` first is strongly recommended for 1-minute data.
df = add_clearsky_columns(df, "QIQ")
# Adds columns: ghi_clear, bni_clear, dhi_clear

Clear-Sky GHI from McClear (CAMS)

from bsrn.modeling.clear_sky import add_clearsky_columns

# McClear data are available from 2004-01-01 onward.
df = add_clearsky_columns(
    df,
    station_code="QIQ",
    model="mcclear",
    mcclear_email="your_email@example.com",  # SoDa / CAMS account email
)
# Adds columns: ghi_clear, bni_clear, dhi_clear based on CAMS McClear

Clear-Sky GHI from REST2 (MERRA-2 via Hugging Face)

REST2 takes its MERRA-2 atmospheric inputs exclusively from the Hugging Face dataset dazhiyang/bsrn-merra2 (hourly Parquet files per station, stored as station_code/*.parquet). When model="rest2" is used, the bsrn package fetches these files into RAM without caching them on disk.

from bsrn.modeling.clear_sky import add_clearsky_columns

# MERRA-2 is fetched from Hugging Face into RAM automatically.
df = add_clearsky_columns(df, station_code="QIQ", model="rest2")
# Adds columns: ghi_clear, bni_clear, dhi_clear based on REST2 + MERRA-2

The dataset README for Hugging Face is maintained in this repo at data/bsrn_static_assets/README.md (published to the Hub separately from PyPI).

All-Sky GHI from NSRDB (NREL via Hugging Face)

Similar to REST2, NSRDB all-sky data is fetched directly from the Hugging Face dataset dazhiyang/bsrn-nsrdb-conus (and other variants).

from bsrn.io.nsrdb import add_nsrdb_columns

# Fetch NSRDB all-sky GHI/DNI/DHI from Hugging Face
df = add_nsrdb_columns(df, station_code="QIQ", variant="conus")
# Adds columns: ghi_nsrdb, bni_nsrdb, dhi_nsrdb

Clear-Sky Detection

from bsrn.utils import detect_clearsky

# Requires GHI and clear-sky GHI (e.g. from add_clearsky_columns)
out = detect_clearsky("reno", ghi=df["ghi"], ghi_clear=df["ghi_clear"], times=df.index)
# out["is_clearsky"] is True/False/NA; out["cloud_flag"] is 0/1/NaN
# Other methods: "ineichen", "lefevre", "brightsun" (different inputs)

Cloud Enhancement Event (CEE) Detection

from bsrn.utils.cee_detection import detect_cee

# Killinger CEE detection: requires 1-min GHI, clear-sky GHI, zenith, and a 1-min index
out_cee_k = detect_cee(
    "killinger",
    ghi=df["ghi"],
    ghi_clear=df["ghi_clear"],
    zenith=df["zenith"],
    times=df.index,
)
# out_cee_*["is_enhancement"] is True/False/NA; out_cee_*["cee_flag"] is 0/1/NaN
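
The core idea behind cloud enhancement detection, measured GHI exceeding a clear-sky reference by some margin, can be sketched in a few lines. This is deliberately not the Killinger, Yang, or Gueymard method: the function name, the 1.05 ratio, and the 85° zenith cut-off are all hypothetical placeholders chosen for illustration.

```python
import math


def flag_enhancement(ghi, ghi_clear, zenith_deg, ratio=1.05, max_zenith=85.0):
    """Return a 0/1/None flag per sample (None where any input is missing)."""
    flags = []
    for g, gc, z in zip(ghi, ghi_clear, zenith_deg):
        if any(v is None or math.isnan(v) for v in (g, gc, z)):
            flags.append(None)  # missing input: no decision
        elif z >= max_zenith or gc <= 0.0:
            flags.append(0)  # low sun or no reference: never an enhancement
        else:
            flags.append(1 if g > ratio * gc else 0)
    return flags


print(flag_enhancement(
    ghi=[900.0, 1100.0, float("nan"), 50.0],
    ghi_clear=[1000.0, 1000.0, 1000.0, 40.0],
    zenith_deg=[30.0, 30.0, 30.0, 88.0],
))
```

The published methods add further conditions on top of a simple ratio test, so results from this sketch should not be compared with the output of detect_cee.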

Data Availability Heatmap

from bsrn.visualization.availability import plot_bsrn_availability

fig = plot_bsrn_availability(inventory_df, station_code="QIQ")
fig.save("availability.png", dpi=300)

Station-to-archive logical records (bsrn.archive)

Use LR_SPECS for field names, formats, and validators; build text with LR* classes (formatter) or helpers such as lr0001_format, lr0100_data_format.

from bsrn.archive import LR_SPECS, lr0001_format

# Required keys for LR0001 are listed in LR_SPECS["LR0001"]
# out = lr0001_format({"stationNumber": 94, "month": 1, "year": 2024, "version": 1})

📜 License

MIT License. See LICENSE for details.
