Modern CLI-based environmental sensor collector using Polars, Arrow, and Hive-partitioned Parquet for Enviro+

OpenSensor Enviroplus

Modern, CLI-based environmental sensor data collector using Polars, Apache Arrow, and Hive-partitioned Parquet for Raspberry Pi Enviro+.

Part of the OpenSensor.Space network for open environmental data.

Features

  • UUID v7 Station IDs: Time-ordered UUIDs for better database performance
  • Modern Stack: Polars streaming, Apache Arrow, Hive-partitioned Parquet
  • Memory Efficient: Optimized for Raspberry Pi with limited RAM
  • CLI-First: Simple Python commands replace bash scripts
  • Smart Logging: Rich console output for easy debugging
  • Cloud Sync: Built-in sync using obstore (50% faster than boto3)
  • Prefix-based IAM: S3 bucket access control per station
  • Type Safe: Pydantic settings with validation
  • Production Ready: Graceful error handling, automatic retries
  • Browser-queryable: DuckDB-wasm compatible Parquet output
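
Prefix-based IAM means each station's credentials only reach its own key prefix inside the shared bucket. Below is a minimal sketch of such a policy, built as a Python dict. The bucket and prefix are the example values from the Configuration section, `station_policy` is a hypothetical helper, and the list of allowed actions is an assumption; see .env.example for the project's own policy templates.

```python
import json


def station_policy(bucket: str, prefix: str) -> dict:
    """Build an S3 IAM policy restricted to one station's key prefix.

    Hypothetical helper for illustration; the action list is an assumption.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access only under the station's prefix
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Listing is a bucket-level action, so it is narrowed by a condition
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


policy = station_policy("my-sensor-bucket", "sensors/station-019ab383")
print(json.dumps(policy, indent=2))
```

With a policy of this shape, a leaked station key cannot read or overwrite data under another station's prefix.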

Quick Start

Prerequisites

# Update system packages
sudo apt-get update

# Install git
sudo apt-get install -y git

# Install UV package manager (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env  # older uv installers wrote $HOME/.cargo/env instead

Installation

# Clone the repository
git clone https://github.com/walkthru-earth/opensensor-enviroplus.git
cd opensensor-enviroplus

# Install dependencies with UV
uv sync

# Activate virtual environment (optional - uv run handles this)
source .venv/bin/activate

Setup

# Interactive setup (creates .env configuration)
opensensor setup

# Or non-interactive
opensensor setup --station-id "01234567-89ab-cdef-0123-456789abcdef" --no-interactive
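
A UUID v7 stores a millisecond Unix timestamp in its first 48 bits, which is why station IDs sort by creation time. The sketch below reads that timestamp back using only the standard library. It uses the example station ID from the Configuration section, and `uuid7_timestamp` is a hypothetical helper, not part of the opensensor CLI.

```python
import uuid
from datetime import datetime, timezone


def uuid7_timestamp(value: str) -> datetime:
    """Return the creation time embedded in a UUID v7 string."""
    parsed = uuid.UUID(value)
    if parsed.version != 7:
        raise ValueError(f"expected UUID v7, got version {parsed.version}")
    # The top 48 of the 128 bits hold milliseconds since the Unix epoch.
    millis = parsed.int >> 80
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


created = uuid7_timestamp("019ab383-d789-74e2-a460-bb92b1c13681")
print(created.date())
```

For the example ID this decodes to 24 November 2025 (UTC).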

Usage

Run as Systemd Service (Recommended)

# Quick setup (install + enable + start) - automatically handles sudo
opensensor service setup

# View service status
opensensor service status

# View live logs
opensensor service logs --follow

# Restart service
opensensor service restart

# Stop service
opensensor service stop

# Complete removal
opensensor service remove

Note: Service commands automatically request sudo when needed.

Manual Commands

# Start collecting data
opensensor start

# Run in foreground (for debugging)
opensensor start --foreground

# View status
opensensor status

# Sync to cloud
opensensor sync

# View logs
opensensor logs

# Follow logs in real-time
opensensor logs --follow

# View configuration
opensensor config

Service Management

The service commands automatically detect your user, project path, and virtual environment:

# View auto-detected configuration
opensensor service info

# Individual service commands (automatically handle sudo)
opensensor service install    # Create systemd service
opensensor service enable     # Enable on boot
opensensor service start      # Start service
opensensor service stop       # Stop service
opensensor service restart    # Restart service
opensensor service disable    # Disable on boot
opensensor service uninstall  # Remove service

Configuration

Configuration via .env file (auto-generated by opensensor setup):

# Station identification (UUID v7 - auto-generated)
OPENSENSOR_STATION_ID=019ab383-d789-74e2-a460-bb92b1c13681

# Data collection
OPENSENSOR_READ_INTERVAL=5              # Seconds between sensor reads
OPENSENSOR_BATCH_DURATION=900           # 15-minute batches

# Temperature compensation (for Raspberry Pi CPU heat)
OPENSENSOR_TEMP_COMPENSATION_ENABLED=true
OPENSENSOR_TEMP_COMPENSATION_FACTOR=2.25

# Output settings
OPENSENSOR_OUTPUT_DIR=output
OPENSENSOR_COMPRESSION=snappy           # Fast compression (snappy, zstd, gzip)

# Cloud sync (optional)
OPENSENSOR_SYNC_ENABLED=true
OPENSENSOR_SYNC_INTERVAL_MINUTES=15

# S3/MinIO storage
OPENSENSOR_STORAGE_BUCKET=my-sensor-bucket
OPENSENSOR_STORAGE_PREFIX=sensors/station-019ab383  # For IAM scoping
OPENSENSOR_STORAGE_REGION=us-west-2
OPENSENSOR_STORAGE_ENDPOINT=            # Optional: for MinIO/custom S3

# AWS credentials
OPENSENSOR_AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
OPENSENSOR_AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCY

# Logging
OPENSENSOR_LOG_LEVEL=INFO
OPENSENSOR_LOG_DIR=logs

See .env.example for a complete template with IAM policy examples.
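
The compensation factor corrects for heat from the Raspberry Pi CPU warming the temperature sensor. The sketch below follows the correction used in Pimoroni's Enviro+ example scripts, which also default to 2.25. Whether opensensor-enviroplus applies exactly this formula is an assumption, and `compensate_temperature` is a hypothetical name.

```python
def compensate_temperature(
    raw_temp_c: float, cpu_temp_c: float, factor: float = 2.25
) -> float:
    """Subtract a share of the CPU-to-sensor temperature gap from the raw reading."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return raw_temp_c - (cpu_temp_c - raw_temp_c) / factor


# Example: the sensor reads 30.0 °C while the CPU sits at 48.0 °C
print(compensate_temperature(30.0, 48.0))  # 22.0
```

A larger factor applies a smaller correction, so the value is usually tuned per enclosure against a reference thermometer.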

Architecture

Data Flow

Sensors (5s) -> Polars Collector -> Hive-Partitioned Parquet (15min) -> S3/MinIO (obstore)
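
With the default settings, each 15-minute batch holds 900 / 5 = 180 readings. The sketch below shows one way to buffer readings and flush them when a reading falls into a new window. It is a plain-Python stand-in for illustration (the project itself uses Polars), and `BatchBuffer` and its fields are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BatchBuffer:
    """Accumulate readings and hand back a full batch when the window rolls over."""

    batch_duration: int = 900  # seconds, mirrors OPENSENSOR_BATCH_DURATION
    window_start: Optional[int] = None
    rows: list = field(default_factory=list)

    def add(self, epoch_seconds: int, reading: dict) -> Optional[list]:
        """Store a reading; return the previous batch if a new window began."""
        # Align windows to multiples of batch_duration (:00, :15, :30, :45 in UTC)
        window = epoch_seconds - epoch_seconds % self.batch_duration
        flushed = None
        if self.window_start is not None and window != self.window_start:
            flushed, self.rows = self.rows, []
        self.window_start = window
        self.rows.append(reading)
        return flushed


buffer = BatchBuffer()
batches = []
# Simulate 30 minutes of readings at the default 5-second interval
for t in range(0, 1800 + 1, 5):
    done = buffer.add(t, {"t": t})
    if done is not None:
        batches.append(done)

print([len(b) for b in batches])  # [180, 180]
```

Keeping only the current window in memory bounds the buffer at one batch, which suits a device with limited RAM.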

Output Format (Hive-Partitioned Parquet)

output/
  station=019ab383-d789-74e2-a460-bb92b1c13681/
    year=2025/
      month=11/
        day=24/
          data_1430.parquet  # Batch written at 14:30
          data_1445.parquet  # Batch written at 14:45

Benefits:

  • Browser-queryable with DuckDB-wasm
  • Partition pruning for fast time-range queries
  • Simple, universal format (no proprietary transaction logs)
  • Perfect for append-only time-series data
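
The partition layout above can be derived from the station ID and the batch timestamp. The sketch below does this with the standard library. `partition_path` is a hypothetical helper, and both the zero-padding of month and day and the use of UTC are assumptions, since the example tree shows no single-digit values.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def partition_path(
    output_dir: str, station_id: str, batch_time: datetime
) -> PurePosixPath:
    """Build the Hive-style key=value path for one batch file."""
    return PurePosixPath(
        output_dir,
        f"station={station_id}",
        f"year={batch_time.year}",
        f"month={batch_time.month:02d}",  # zero-padding is an assumption
        f"day={batch_time.day:02d}",
        f"data_{batch_time:%H%M}.parquet",
    )


path = partition_path(
    "output",
    "019ab383-d789-74e2-a460-bb92b1c13681",
    datetime(2025, 11, 24, 14, 30, tzinfo=timezone.utc),
)
print(path)
```

Query engines that understand Hive partitioning can expose station, year, month, and day as columns and skip directories that fall outside a filter.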

See ARCHITECTURE.md for detailed diagrams and scalability analysis.

Differences from Original

| Feature | Old (enviroplus-python) | New (opensensor-enviroplus) |
|---|---|---|
| Station IDs | Manual/random | UUID v7 (time-ordered) |
| Data library | pandas + DuckDB | Polars + Apache Arrow |
| Storage | Partitioned Parquet | Hive-partitioned Parquet |
| Configuration | bash scripts + env vars | Pydantic Settings + .env |
| Setup | install.sh | opensensor setup CLI |
| Cloud sync | rclone (process spawn) | obstore (Rust, 50% faster) |
| IAM policies | N/A | Prefix-based scoping |
| Logging | print statements | Rich + structured logging |
| Memory usage | Higher (pandas) | 50% lower (Polars streaming) |
| CLI | None | Typer with 7 commands |
| Read interval | 1 second | 5 seconds (configurable) |
| Batch duration | Variable | 15 minutes (900s) |

Development

# Install with dev dependencies
uv sync --group dev

# Format code
uv run ruff format .

# Lint code
uv run ruff check .

# Run with UV (no venv activation needed)
uv run opensensor --help

Tech Stack

  • Python 3.10+ - Modern Python with type hints
  • UV - Fast Rust-based package manager (10-100x faster than pip)
  • Polars 1.35+ - High-performance DataFrames with streaming
  • PyArrow 22+ - Columnar memory format (zero-copy operations)
  • uuid6 - RFC 9562 UUID v7 implementation
  • obstore - Rust-powered object storage (S3/GCS/Azure)
  • Pydantic Settings - Type-safe configuration
  • Typer + Rich - Beautiful CLI with auto-completion
  • Ruff - Extremely fast Python linter and formatter

License

MIT License - see LICENSE file for details

Credits

Built by the WalkThru Earth team for the OpenSensor.Space network.

Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Run tests and linting (uv run ruff check .)
  4. Commit your changes (git commit -m 'feat: add amazing feature')
  5. Push to the branch (git push origin feature/amazing-feature)
  6. Open a Pull Request
