This module provides functions for downloading metadata and data files from

Aeolus

Download, standardise and store air quality data from UK monitoring networks.

Features

  • Simple, clean API for downloading air quality data
  • Support for multiple UK regulatory networks
  • Automatic retry logic for resilient downloads
  • Standardised data format (using Pandas)
  • Composable data transformations
  • Database storage support (SQLAlchemy/SQLModel)

Quick Start

import aeolus
from datetime import datetime

# List available data sources
sources = aeolus.list_sources()
print(sources)  # ['AQE', 'AURN', 'LMAM', 'LOCAL', 'NI', 'SAQD', 'SAQN', 'WAQN']

# Get site metadata
sites = aeolus.get_metadata("AURN")
print(f"Found {len(sites)} monitoring sites")

# Download air quality data
data = aeolus.download(
    sources="AURN",
    sites=["MY1"],  # Marylebone Road, London
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31)
)

print(f"Downloaded {len(data)} measurements")
print(data.head())

Installation

To install Aeolus, run the following command in your terminal:

pip install aeolus-aq

Wheels and source distributions are available under Releases on GitHub.

Data Sources

Currently, Aeolus supports downloading data from the following networks:

UK Networks

  • AURN (DEFRA's Automatic Urban and Rural Network)
  • SAQN (Scottish Air Quality Network)
  • WAQN (Wales Air Quality Network)
  • NI (Northern Ireland Air Quality Network)
  • AQE (Air Quality England)
  • LOCAL (Local regulatory networks in England)
  • Breathe London (requires API key: BL_API_KEY)

Global Networks

Data from regulatory networks is sourced via the OpenAir project (using RData files provided by each regulatory network). My thanks to David Carslaw and all other contributors (see Carslaw & Ropkins, 2012 for further information).

Data from Breathe London is licensed under the Open Government Licence v3.0. For further information, see https://www.breathelondon.org.

Setup

API Keys

Some data sources require API keys. Copy .env.example to .env and add your keys:

cp .env.example .env
# Edit .env and add your API keys

Required for:

  • Breathe London (BL_API_KEY)

The .env file is git-ignored for security.
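
How Aeolus itself reads the .env file is not described here. As a minimal sketch, assuming the key ends up as an environment variable named BL_API_KEY (the name given in the Data Sources list), a script can check for it before requesting Breathe London data. The helper require_api_key is hypothetical and not part of Aeolus:

```python
import os


def require_api_key(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell"
        )
    return value
```

Calling require_api_key("BL_API_KEY") at the top of a script fails early with a readable message instead of partway through a download.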

Usage Examples

Download from OpenAQ (Global Data)

import aeolus
from datetime import datetime

# Download from any OpenAQ location worldwide
# Find location IDs at: https://explore.openaq.org/
data = aeolus.download(
    sources="OpenAQ",
    sites=["2178"],  # Example: a monitoring station
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31)
)

# Data is automatically standardized to match other sources
print(data.head())

Download from Multiple Sources

import aeolus
from datetime import datetime

# Download from multiple networks at once
data = aeolus.download(
    sources=["AURN", "SAQN"],
    sites=["MY1", "GLA4"],
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31)
)

# Data is automatically combined into one DataFrame
print(data['source_network'].unique())  # ['AURN', 'SAQN']

# Can also combine UK and global sources
data = aeolus.download(
    sources=["AURN", "OpenAQ"],
    sites=["MY1", "2178"],
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31)
)

Get Separate DataFrames per Source

import aeolus
from datetime import datetime

# Get data separated by source
data_by_source = aeolus.download(
    sources=["AURN", "SAQN"],
    sites=["MY1", "GLA4"],
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31),
    combine=False
)

# Returns a dictionary
for source, df in data_by_source.items():
    print(f"{source}: {len(df)} records")

Filter and Transform Data

Aeolus provides composable transformation functions:

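from datetime import datetime

import aeolus
from aeolus.transforms import pipe, filter_rows, select_columns, sort_values

start_date = datetime(2024, 1, 1)
end_date = datetime(2024, 1, 31)

# Download and transform in one go
no2_data = pipe(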
    aeolus.download("AURN", ["MY1"], start_date, end_date),
    filter_rows(lambda df: df["measurand"] == "NO2"),
    filter_rows(lambda df: df["value"].notna()),
    select_columns("site_code", "date_time", "value", "units"),
    sort_values("date_time")
)

Work with Site Metadata

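from datetime import datetime

import aeolus

start_date = datetime(2024, 1, 1)
end_date = datetime(2024, 1, 31)

# Get all sites for a network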
sites = aeolus.get_metadata("AURN")

# Filter to urban background sites
urban_sites = sites[sites["location_type"] == "Urban Background"]

# Get site codes for download
site_codes = urban_sites["site_code"].tolist()

# Download data for those sites
data = aeolus.download("AURN", site_codes, start_date, end_date)

Data Format

Site Metadata

Metadata is returned as a pandas DataFrame with the following columns:

  • site_code: Unique site identifier
  • site_name: Human-readable site name
  • latitude: Site latitude (decimal degrees)
  • longitude: Site longitude (decimal degrees)
  • source_network: Name of the source network
  • location_type: Type of location (e.g., "Urban Background", "Roadside")
  • owner: Organization operating the site

Air Quality Data

Data is returned as a pandas DataFrame with the following columns:

  • site_code: Site identifier
  • date_time: Measurement timestamp
  • measurand: Pollutant/parameter measured (e.g., "NO2", "PM2.5", "O3")
  • value: Measured value
  • units: Units of measurement (typically "ug/m3")
  • source_network: Name of source network
  • ratification: Ratification status
  • created_at: When record was created
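
Because the data is in long format (one row per site, timestamp and measurand), per-pollutant summaries come down to grouping on the measurand column. The sketch below uses hand-written example rows (made-up values, not real measurements) as plain dictionaries with a subset of the columns listed above, so it runs without pandas. On a real DataFrame the equivalent would normally be df.groupby("measurand")["value"].mean().

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only; the values are invented, not real measurements.
rows = [
    {"site_code": "MY1", "measurand": "NO2", "value": 40.0, "units": "ug/m3"},
    {"site_code": "MY1", "measurand": "NO2", "value": 50.0, "units": "ug/m3"},
    {"site_code": "MY1", "measurand": "O3", "value": 20.0, "units": "ug/m3"},
    {"site_code": "MY1", "measurand": "O3", "value": None, "units": "ug/m3"},
]


def mean_by_measurand(records):
    """Average the non-missing values for each measurand."""
    grouped = defaultdict(list)
    for record in records:
        if record["value"] is not None:  # skip missing measurements
            grouped[record["measurand"]].append(record["value"])
    return {measurand: mean(values) for measurand, values in grouped.items()}


print(mean_by_measurand(rows))  # {'NO2': 45.0, 'O3': 20.0}
```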

Advanced Features

Automatic Retry Logic

Aeolus automatically retries failed network requests with exponential backoff, making downloads resilient to temporary network issues.
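
The retry implementation itself is not shown here (tenacity appears in the requirements, which suggests it is used for this). Purely to illustrate the idea, the following standard-library sketch retries a callable and doubles the wait after each failure. The function retry_with_backoff and its parameters are hypothetical and not part of the Aeolus API:

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: re-raise the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep applies.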

Composable Transformations

Build custom data processing pipelines:

from aeolus.transforms import compose, filter_rows, add_column

# Create a reusable pipeline
my_pipeline = compose(
    filter_rows(lambda df: df["value"] > 0),
    add_column("year", lambda df: df["date_time"].dt.year),
    # ... more transformations
)

# Apply to any DataFrame
processed_data = my_pipeline(raw_data)
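
The internals of compose and pipe are not shown here. A minimal sketch of the usual pattern follows, assuming each transform is a function that takes a value and returns a new one, and that transforms are applied in the order given. The names compose_sketch and pipe_sketch are illustrative, and the demonstration uses a list rather than a DataFrame:

```python
from functools import reduce


def compose_sketch(*transforms):
    """Combine transforms into one function, applied left to right."""
    def pipeline(value):
        return reduce(lambda acc, transform: transform(acc), transforms, value)
    return pipeline


def pipe_sketch(value, *transforms):
    """Apply the transforms to value immediately."""
    return compose_sketch(*transforms)(value)


def keep_positive(values):
    return [v for v in values if v > 0]


def double(values):
    return [v * 2 for v in values]


print(pipe_sketch([-1, 2, 3], keep_positive, double))  # [4, 6]
```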

Database Storage

Store data in a database (SQLite, PostgreSQL, etc.):

from aeolus import add_sites_to_database, add_data_to_database

# Store site metadata (the DataFrame returned by aeolus.get_metadata)
add_sites_to_database(sites, database_file="air_quality.db")

# Store measurement data (the DataFrame returned by aeolus.download)
add_data_to_database(data, database_file="air_quality.db")

Requirements

  • Python >= 3.11
  • pandas >= 2.3.3
  • rdata >= 0.11
  • requests >= 2.32.5
  • sqlmodel >= 0.0.27
  • tenacity >= 8.2.0

Architecture

Aeolus uses a functional architecture with:

  • Type-safe interfaces: TypedDicts and type aliases for consistency
  • Composable transformations: Small, pure functions that combine into pipelines
  • Source registry: Extensible system for adding new data sources
  • Automatic retries: Network resilience built-in

For more details, see the CHANGES.md file.

Contributing

Contributions are welcome! The codebase is designed to be extensible. To add a new data source:

  1. Create a fetcher function following the DataFetcher type signature
  2. Create a normalizer using the composable transforms
  3. Register your source with the registry

See src/aeolus/sources/regulatory.py for examples.
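
The three steps could look roughly like the sketch below. Every name in it (Fetcher, Normalizer, SOURCES, register_source, download_sketch and the "DEMO" source) is a hypothetical stand-in; the actual DataFetcher signature and registry functions are defined in the Aeolus source and may differ. Records are plain dictionaries here to keep the sketch dependency-free.

```python
from datetime import datetime
from typing import Callable

# Hypothetical type aliases; the real DataFetcher signature may differ.
Record = dict
Fetcher = Callable[[list[str], datetime, datetime], list[Record]]
Normalizer = Callable[[list[Record]], list[Record]]

SOURCES: dict[str, tuple[Fetcher, Normalizer]] = {}


def register_source(name: str, fetcher: Fetcher, normalizer: Normalizer) -> None:
    """Step 3: make a source available under `name`."""
    SOURCES[name] = (fetcher, normalizer)


def fetch_demo(sites, start_date, end_date):
    """Step 1: a fetcher returning raw records (hard-coded here)."""
    return [{"site": code, "time": start_date, "no2": 12.5} for code in sites]


def normalise_demo(raw):
    """Step 2: map raw fields onto the standard column names."""
    return [
        {"site_code": r["site"], "date_time": r["time"], "measurand": "NO2",
         "value": r["no2"], "units": "ug/m3", "source_network": "DEMO"}
        for r in raw
    ]


def download_sketch(source, sites, start_date, end_date):
    """Look up a registered source, fetch, then normalise."""
    fetcher, normalizer = SOURCES[source]
    return normalizer(fetcher(sites, start_date, end_date))


register_source("DEMO", fetch_demo, normalise_demo)
```

Keeping the fetcher and the normalizer separate means the network code and the column mapping can be tested independently.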

Licence

Aeolus is licensed under the GNU General Public License v3.0 or later. For further information, see https://www.gnu.org/licenses/gpl-3.0.en.html.

Citation

If you use Aeolus in your research, please cite:

Carslaw, D. C. and Ropkins, K. (2012). openair: an R package for air quality data analysis. Environmental Modelling & Software, 27-28, 52-61.

(For the OpenAir project which provides the underlying data for regulatory networks)

Contact

For any questions or feedback, please contact Ruaraidh Dobson at ruaraidh.dobson@gmail.com.

Changelog

See CHANGES.md for version history and recent improvements.
