Tools for downloading and processing Ocean Networks Canada hydrophone data

🌊 ONC Hydrophone Data Tools

Python 3.9+ · License: MIT

Tools for downloading and processing Ocean Networks Canada hydrophone data, including spectrograms, FLAC audio files, and custom spectrogram generation.

📦 Installation

pip install onc-hydrophone-data

If you want CPU-only PyTorch (recommended for spectrogram generation on most hosts):

pip install onc-hydrophone-data \
  --index-url https://download.pytorch.org/whl/cpu \
  --extra-index-url https://pypi.org/simple

For development:

git clone https://github.com/Spiffical/onc-hydrophone-data.git
cd onc-hydrophone-data
pip install -e .

⚙️ Configuration

  1. Get your ONC API token from: https://data.oceannetworks.ca/Profile

  2. Create a .env file in your project directory:

ONC_TOKEN=your_onc_token_here
DATA_DIR=./data
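
If you want to check what your .env file resolves to before running a download, the following is a minimal sketch using only the standard library. It is not the package's load_config implementation; the helper names read_env_file and resolve_settings, the ./data fallback, and the rule that real environment variables win over the file are all assumptions made for illustration.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_settings(path=".env"):
    """Return (token, data_dir); environment variables override the file (assumed)."""
    file_values = read_env_file(path)
    token = os.environ.get("ONC_TOKEN") or file_values.get("ONC_TOKEN")
    data_dir = os.environ.get("DATA_DIR") or file_values.get("DATA_DIR", "./data")
    if not token:
        raise RuntimeError("ONC_TOKEN is not set; add it to .env or the environment")
    return token, Path(data_dir)
```

Calling resolve_settings() from the project directory raises early if the token is missing, which is easier to diagnose than a rejected request.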

🚀 Quick Start

📓 Tutorial Notebook - The best way to get started with interactive examples.

Python API

from onc_hydrophone_data.onc.common import load_config
from onc_hydrophone_data.data import HydrophoneDownloader
from onc_hydrophone_data.audio import SpectrogramGenerator

# Load credentials from .env file
onc_token, data_dir = load_config()

# Download spectrograms using intelligent sampling
downloader = HydrophoneDownloader(onc_token, data_dir)
downloader.download_spectrograms_with_sampling_schedule(
    deviceCode="ICLISTENHF6020",
    start_date=(2021, 1, 1),
    threshold_num=100
)

# Generate custom spectrograms from audio files
generator = SpectrogramGenerator(win_dur=2.0, overlap=0.75)
generator.process_directory("data/DEVICE/audio/", "output/spectrograms/")
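
To get a feel for what win_dur=2.0 and overlap=0.75 mean for the output size, here is a rough sketch of the usual short-time Fourier transform bookkeeping. It assumes win_dur is a window length in seconds and overlap is a fraction of the window, which this README does not state explicitly. The function stft_frame_layout is hypothetical and not part of the package, and the 64 kHz sample rate and 5-minute duration in the usage note are example values only.

```python
def stft_frame_layout(n_samples, sample_rate, win_dur=2.0, overlap=0.75):
    """Derive window length, hop size, and frame count for an STFT.

    Assumes win_dur is in seconds and overlap is a fraction in [0, 1).
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    win_len = int(round(win_dur * sample_rate))        # samples per window
    hop = max(1, int(round(win_len * (1 - overlap))))  # samples between window starts
    # Only full windows are counted; a trailing partial window is dropped.
    n_frames = 0 if n_samples < win_len else 1 + (n_samples - win_len) // hop
    return {
        "win_len": win_len,
        "hop": hop,
        "n_frames": n_frames,
        "freq_resolution_hz": sample_rate / win_len,
        "time_step_s": hop / sample_rate,
    }


layout = stft_frame_layout(n_samples=300 * 64_000, sample_rate=64_000)
print(layout)
```

With these example values the window is 128000 samples, the hop is 32000 samples, and the file yields 597 frames at 0.5 Hz frequency resolution and a 0.5 s time step. Longer windows sharpen frequency resolution at the cost of time resolution, and higher overlap increases the number of frames.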

Command Line

# Interactive mode (guided setup - recommended)
python scripts/download_hydrophone_data.py

# Download spectrograms with specific parameters
python scripts/download_hydrophone_data.py --mode sampling \
    --device ICLISTENHF6020 --start-date 2021 1 1 --threshold 500

# Include FLAC audio files
python scripts/download_hydrophone_data.py --mode sampling \
    --device ICLISTENHF6020 --start-date 2021 1 1 --threshold 100 --download-audio

# Generate custom spectrograms
python scripts/generate_spectrograms.py --input-dir data/DEVICE/audio/ --win-dur 2.0

Deployment Availability Visualization

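from onc_hydrophone_data.onc.common import load_config
from onc_hydrophone_data.data.deployment_checker import HydrophoneDeploymentChecker
from onc_hydrophone_data.utils import (
    plot_deployment_availability_timeline,
    plot_availability_calendar,
)

# Load the ONC token from the .env file (same helper as in the Python API example)
onc_token, data_dir = load_config()

# Query per-day data availability for one device, then plot it two ways
checker = HydrophoneDeploymentChecker(onc_token)
availability = checker.get_device_availability("ICLISTENHF6324", bin_size="day")
plot_deployment_availability_timeline(availability)
plot_availability_calendar(availability)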
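
The idea behind day-level binning can be sketched independently of the package. The example below assumes availability is known as a list of (start, end) datetime pairs; that representation is an assumption for illustration and is not necessarily what get_device_availability returns. The helpers covered_days and missing_days are hypothetical.

```python
from datetime import date, datetime, timedelta


def covered_days(intervals):
    """Return the sorted calendar days touched by any (start, end) interval."""
    days = set()
    for start, end in intervals:
        current = start.date()
        # The end day is treated as covered, even if only partially.
        while current <= end.date():
            days.add(current)
            current += timedelta(days=1)
    return sorted(days)


def missing_days(intervals, first, last):
    """Return the days in [first, last] that no interval touches."""
    covered = set(covered_days(intervals))
    gaps = []
    current = first
    while current <= last:
        if current not in covered:
            gaps.append(current)
        current += timedelta(days=1)
    return gaps


intervals = [(datetime(2021, 1, 1, 12), datetime(2021, 1, 3, 6))]
print(missing_days(intervals, date(2021, 1, 1), date(2021, 1, 5)))
```

Here the single interval touches January 1 to 3, so January 4 and 5 are reported as gaps. A calendar view is essentially this per-day membership test rendered as a grid.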

✨ Features

  • Smart Sampling: Intelligently distributes downloads across date ranges
  • Parallel ONC Requests: Submits many requests at once so ONC processes them in parallel, then downloads when ready (faster than sequential requests)
  • Audio Downloads: Download raw audio (FLAC/WAV) alongside spectrograms
  • Custom Spectrograms: Generate spectrograms with configurable parameters
  • Deployment Validation: Ensures data exists for requested time periods
  • Deployment Availability Visuals: Timeline/calendar views of data availability by device
  • Interactive Mode: Guided CLI for easy setup
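
As an illustration of what distributing downloads across a date range can look like, the sketch below picks evenly spaced timestamps at the midpoints of equal time slices. This is one simple strategy, not necessarily the schedule the package uses; the function evenly_spaced_timestamps and its count parameter are hypothetical, and no relationship to threshold_num is implied.

```python
from datetime import datetime


def evenly_spaced_timestamps(start, end, count):
    """Split [start, end] into `count` equal slices and return each slice's midpoint."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if end <= start:
        raise ValueError("end must be after start")
    step = (end - start) / count
    return [start + step * (i + 0.5) for i in range(count)]


samples = evenly_spaced_timestamps(datetime(2021, 1, 1), datetime(2021, 1, 31), 3)
for ts in samples:
    print(ts.isoformat())
```

For January 1 to January 31 with three samples, the slices are 10 days long and the midpoints fall on January 6, 16, and 26. Using midpoints rather than slice starts avoids clustering samples at the beginning of the range.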

📁 Output Structure

Downloads are grouped by device, then by download mode and date range:

data/
└── ICLISTENHF6020/
    └── sampling_2021-01-01_to_2021-01-31/
        ├── onc_spectrograms/     # ONC-downloaded spectrograms (MAT/PNG)
        │   ├── *.mat             # Spectrogram data files
        │   └── anomaly_report.txt # Any validation issues (if found)
        ├── audio/                # Downloaded audio files
        │   └── *.flac
        └── custom_spectrograms/  # Locally-generated spectrograms
            ├── mat/              # Custom MAT files
            └── png/              # Custom PNG plots
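
To locate files from a previous run, the layout above can be walked with pathlib. This is a sketch that assumes the folder naming shown in the tree (mode, start date, "to", end date); the helpers run_directory and list_outputs are hypothetical and not part of the package.

```python
from datetime import date
from pathlib import Path


def run_directory(data_dir, device_code, mode, start, end):
    """Build the per-run folder path, following the naming shown in the tree."""
    name = f"{mode}_{start.isoformat()}_to_{end.isoformat()}"
    return Path(data_dir) / device_code / name


def list_outputs(run_dir):
    """Collect the files of each kind; missing folders yield empty lists."""
    run_dir = Path(run_dir)

    def find(folder, pattern):
        return sorted(folder.glob(pattern)) if folder.is_dir() else []

    return {
        "onc_spectrograms": find(run_dir / "onc_spectrograms", "*.mat"),
        "audio": find(run_dir / "audio", "*.flac"),
        "custom_mat": find(run_dir / "custom_spectrograms" / "mat", "*.mat"),
        "custom_png": find(run_dir / "custom_spectrograms" / "png", "*.png"),
    }


run_dir = run_directory("data", "ICLISTENHF6020", "sampling",
                        date(2021, 1, 1), date(2021, 1, 31))
print(run_dir.as_posix())
```

The printed path matches the example tree: data/ICLISTENHF6020/sampling_2021-01-01_to_2021-01-31.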

🛠️ Troubleshooting

| Issue | Solution |
|---|---|
| Invalid ONC Token | Verify token in .env file |
| No data found | Use --check-deployments to verify coverage |
| Memory errors | Reduce --spectrograms-per-batch |
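
Why a smaller batch helps with memory errors: if files are processed in groups, only one group needs to be held in memory at a time, so peak usage scales with the batch size rather than the total number of files. The sketch below shows the general batching pattern; the batched helper is hypothetical, and how --spectrograms-per-batch is implemented internally is not described here.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


files = ["a.mat", "b.mat", "c.mat", "d.mat", "e.mat"]
for batch in batched(files, 2):
    # Process one batch, then let it go out of scope before loading the next.
    print(batch)
```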

📚 Documentation

Docs site: https://spiffical.github.io/onc-hydrophone-data/
See the Tutorial Notebook for comprehensive examples including:

  • Different download modes (sampling, range, specific times)
  • Parallel download optimization
  • Custom spectrogram generation
  • JSON timestamp requests
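
The submit-first, download-later pattern behind parallel requests can be sketched generically. In the example below, submit, is_ready, and download are hypothetical callables that you would supply; they stand in for whatever the remote service provides and are not functions from this package or the ONC client.

```python
import time


def submit_then_collect(requests, submit, is_ready, download,
                        poll_interval=0.0, max_polls=100):
    """Queue every request up front, then poll and download as each finishes.

    Returns (results, unfinished): results maps each request to its download,
    unfinished lists requests that were still pending after max_polls rounds.
    """
    # Phase 1: submit everything so the server can work on all jobs at once.
    pending = {submit(req): req for req in requests}
    results = {}
    # Phase 2: poll the outstanding jobs and fetch each one when it is ready.
    for _ in range(max_polls):
        if not pending:
            break
        for job_id in list(pending):
            if is_ready(job_id):
                results[pending.pop(job_id)] = download(job_id)
        if pending:
            time.sleep(poll_interval)
    return results, list(pending.values())
```

Compared with submitting one request, waiting, and downloading before starting the next, this overlaps the server-side processing time of all requests, which is where the speedup comes from.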

📄 License

MIT License - see LICENSE for details.
