
Automated registration package for urban sensor endpoints into metadata catalogs


Wrench 🔧

A powerful framework for building automated sensor registration pipelines.

Python 3.12+ | Code style: ruff | License: MIT

Overview

Wrench is a modular, extensible workflow framework designed to streamline the process of harvesting, enriching, and registering sensor metadata from diverse IoT sources into urban data catalogs. It provides a standardized pipeline architecture with interchangeable components to help make sensor data more discoverable and valuable.

Features

  • 🔄 Automated Metadata Harvesting: Extract metadata from various IoT data sources with minimal configuration
  • 📊 Standardized Data Models: Type-safe data structures using Pydantic for consistent handling of metadata
  • 🔍 Advanced Classification: Group similar sensors using machine learning and taxonomy-based approaches
  • Metadata Enrichment: Enhance sensor descriptions with contextual information using LLM technologies
  • 🏗️ Modular Architecture: Compose workflows from interchangeable components for maximum flexibility
  • 🔌 Extensible Interfaces: Easily add support for new data sources and catalog systems
  • 🤖 LLM Integration: Leverage AI capabilities for automatic content generation and classification

Installation

pip install auto-wrench

To install with specific component dependencies:

pip install 'auto-wrench[teleclass]'

Core Components

Wrench consists of three main component types that can be combined in a pipeline:

  1. Harvesters: Extract metadata from IoT data sources (e.g., SensorThings API)
  2. Groupers: Classify and organize sensors into meaningful groups
  3. Catalogers: Register the processed metadata into data catalogs (e.g., SDDI/CKAN)

Each component type follows a standardized interface, making it easy to extend with custom implementations.
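To illustrate what a standardized interface for the three stages could look like, here is a minimal sketch using `typing.Protocol`. Every name in it (`SensorItem`, `SensorGroup`, `harvest`, `group`, `register`, `run_pipeline`) is hypothetical and chosen for illustration; Wrench's actual base classes and method names may differ.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class SensorItem:
    """Hypothetical stand-in for one harvested sensor metadata record."""
    id: str
    name: str


@dataclass
class SensorGroup:
    """Hypothetical stand-in for a named group of sensors."""
    name: str
    items: list[SensorItem] = field(default_factory=list)


class Harvester(Protocol):
    def harvest(self) -> list[SensorItem]: ...


class Grouper(Protocol):
    def group(self, items: list[SensorItem]) -> list[SensorGroup]: ...


class Cataloger(Protocol):
    def register(self, groups: list[SensorGroup]) -> None: ...


def run_pipeline(harvester: Harvester, grouper: Grouper, cataloger: Cataloger) -> list[SensorGroup]:
    """Run the three stages in order and return the groups that were registered."""
    items = harvester.harvest()
    groups = grouper.group(items)
    cataloger.register(groups)
    return groups


# Toy implementations (assumptions, not part of Wrench) that satisfy the protocols.
class StaticHarvester:
    def harvest(self) -> list[SensorItem]:
        return [SensorItem("1", "air quality"), SensorItem("2", "traffic counter")]


class SingleGroupGrouper:
    def group(self, items: list[SensorItem]) -> list[SensorGroup]:
        return [SensorGroup(name="all sensors", items=list(items))]


class InMemoryCataloger:
    def __init__(self) -> None:
        self.registered: list[SensorGroup] = []

    def register(self, groups: list[SensorGroup]) -> None:
        self.registered.extend(groups)
```

Because the stages only depend on the method signatures, any object with a matching method can be swapped in, which is the property the paragraph above describes.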

Quick Start

The following example sets up a complete pipeline with a SensorThings API harvester, a TELEClass grouper for classification, and an SDDI cataloger for registration:

from wrench.cataloger import SDDICataloger
from wrench.common.pipeline import Pipeline
from wrench.grouper import TELEClassGrouper
from wrench.harvester import SensorThingsHarvester
from wrench.utils import ContentGenerator

# Initialize components with their respective configurations
harvester = SensorThingsHarvester(
    config="config/sta_config.yaml",
    content_generator=ContentGenerator(config="config/generator_config.yaml")
)

grouper = TELEClassGrouper(config="config/teleclass_config.yaml")

cataloger = SDDICataloger(config="config/sddi_config.yaml")

# Assemble and run the pipeline
pipeline = Pipeline(
    harvester=harvester,
    grouper=grouper,
    cataloger=cataloger
)

pipeline.run()

Configuration

Each component can be configured via YAML files. Here's a basic example for the SensorThings harvester:

# sta_config.yaml
base_url: "https://example.org/v1.1"
identifier: "city_sensors"
title: "City Sensor Network"
description: "Environmental sensors across the city"

pagination:
  page_delay: 0.2
  timeout: 60
  batch_size: 100

translator:
  url: "https://translate.example.org"
  source_lang: "de"
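As a rough illustration of how the `pagination` settings might be used, the sketch below pages through a SensorThings collection with `$top` and follows the `@iot.nextLink` that the OGC SensorThings API returns for paged results. How Wrench applies `batch_size` and `page_delay` internally is an assumption here, and `iter_entities` and `fetch_page` are hypothetical names. The fetch function is passed in so the loop can be exercised without a network; a real one would also apply the configured `timeout`.

```python
import time
from typing import Callable, Iterator, Optional

# Hypothetical type: takes a URL and returns the decoded JSON page as a dict.
FetchPage = Callable[[str], dict]


def iter_entities(
    base_url: str,
    entity: str,
    fetch_page: FetchPage,
    batch_size: int = 100,
    page_delay: float = 0.2,
) -> Iterator[dict]:
    """Yield every entity, following `@iot.nextLink` until no further page is offered."""
    url: Optional[str] = f"{base_url}/{entity}?$top={batch_size}"
    while url:
        page = fetch_page(url)
        yield from page.get("value", [])
        url = page.get("@iot.nextLink")
        if url:
            # Pause between requests to avoid overloading the server.
            time.sleep(page_delay)
```

For example, `iter_entities("https://example.org/v1.1", "Things", fetch_page)` would yield each Thing in turn, where `fetch_page` is whatever HTTP helper you supply.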

Component Overview

Harvesters

Harvesters connect to data sources and extract metadata. Wrench includes:

  • SensorThingsHarvester: Connects to OGC SensorThings API endpoints
  • Extensible base class for creating custom harvesters

Groupers

Groupers organize sensors into logical groups:

  • TELEClassGrouper: Taxonomy-enhanced classification using LLMs and corpus-based methods
  • Can be extended with custom grouping algorithms
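A custom grouping algorithm can be very simple. The sketch below groups sensors by keyword matches in their name and description. It is a standalone illustration, not TELEClass and not Wrench's grouper base class: the function name `group_by_keyword`, the dict-based sensor records, and the keyword table are all assumptions.

```python
from collections import defaultdict


def group_by_keyword(
    sensors: list[dict],
    keywords: dict[str, list[str]],
    fallback: str = "other",
) -> dict[str, list[str]]:
    """Map each group label to the ids of sensors whose text mentions one of its keywords."""
    groups: dict[str, list[str]] = defaultdict(list)
    for sensor in sensors:
        text = f"{sensor.get('name', '')} {sensor.get('description', '')}".lower()
        # The first group with a matching keyword wins; unmatched sensors go to the fallback.
        label = next(
            (group for group, words in keywords.items() if any(w in text for w in words)),
            fallback,
        )
        groups[label].append(sensor["id"])
    return dict(groups)


# Hypothetical sample data for illustration only.
sensors = [
    {"id": "s1", "name": "PM10 monitor", "description": "Air quality station"},
    {"id": "s2", "name": "Loop detector", "description": "Counts traffic on Main St"},
    {"id": "s3", "name": "Unknown device", "description": ""},
]
keywords = {"environment": ["air quality", "pm10"], "mobility": ["traffic", "parking"]}
result = group_by_keyword(sensors, keywords)
```

To plug logic like this into a pipeline, it would need to be wrapped in whatever interface the grouper component type defines.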

Catalogers

Catalogers register metadata into data catalogs:

  • SDDICataloger: Registers metadata into SDDI/CKAN-based catalogs
  • Extensible interface for supporting other catalog systems

Advanced Features

Translation Support

Wrench includes built-in support for translating metadata using services like LibreTranslate:

# Translation is configured in the harvester configuration
translator:
  url: "https://translate.example.org"
  source_lang: "auto"  # Automatically detect source language
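For context, the sketch below shows roughly what a request to a LibreTranslate-style service looks like: a JSON `POST` to `/translate` with `q`, `source`, `target`, and `format` fields, answered with a `translatedText` field. The helper names and the `"en"` target language are assumptions for illustration; how Wrench's translator actually issues requests is not shown in this README.

```python
import json
from urllib import request


def build_translate_request(
    base_url: str,
    text: str,
    source_lang: str = "auto",
    target_lang: str = "en",  # assumed target language
) -> request.Request:
    """Build (but do not send) a POST request for a LibreTranslate-style /translate endpoint."""
    payload = {"q": text, "source": source_lang, "target": target_lang, "format": "text"}
    return request.Request(
        url=f"{base_url.rstrip('/')}/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url: str, text: str, source_lang: str = "auto", timeout: float = 60.0) -> str:
    """Send the request and return the translated text from the JSON response."""
    req = build_translate_request(base_url, text, source_lang)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["translatedText"]
```

Separating request construction from sending keeps the payload easy to inspect and test without contacting the translation service.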

LLM-Enhanced Content Generation

Generate rich descriptions for sensor groups using LLM services:

content_generator = ContentGenerator(config="config/generator_config.yaml")
harvester = SensorThingsHarvester(
    config="config/sta_config.yaml",
    content_generator=content_generator
)

Development

Setting up the Development Environment

# Clone the repository
git clone https://github.com/yourusername/wrench.git
cd wrench

# Run the make target
make setup

# Install component dependencies
uv pip install -e ".[teleclass,sensorthings]"

Code Style

This project uses Ruff for formatting and linting. Format your code and check it for lint errors with:

ruff format .
ruff check .

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Please ensure your code follows our coding standards and includes appropriate tests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support and Documentation

For support, please:
