Automated registration package for urban sensor endpoints into metadata catalogs
Wrench 🔧
A powerful framework for building automated sensor registration pipelines.
Overview
Wrench is a modular, extensible workflow framework designed to streamline the process of harvesting, enriching, and registering sensor metadata from diverse IoT sources into urban data catalogs. It provides a standardized pipeline architecture with interchangeable components to help make sensor data more discoverable and valuable.
Features
- 🔄 Automated Metadata Harvesting: Extract metadata from various IoT data sources with minimal configuration
- 📊 Standardized Data Models: Type-safe data structures using Pydantic for consistent handling of metadata
- 🔍 Advanced Classification: Group similar sensors using machine learning and taxonomy-based approaches
- ✨ Metadata Enrichment: Enhance sensor descriptions with contextual information using LLM technologies
- 🏗️ Modular Architecture: Compose workflows from interchangeable components for maximum flexibility
- 🔌 Extensible Interfaces: Easily add support for new data sources and catalog systems
- 🤖 LLM Integration: Leverage AI capabilities for automatic content generation and classification
Installation
pip install auto-wrench
To install with specific component dependencies:
# For SensorThings support
pip install 'auto-wrench[sensorthings]'
# For KINETIC grouper
pip install 'auto-wrench[kinetic]'
# Multiple components
pip install 'auto-wrench[sensorthings,kinetic]'
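To confirm which version ended up in the current environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of Wrench.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "auto-wrench" is the distribution name used in the pip commands above.
print(installed_version("auto-wrench") or "auto-wrench is not installed")
```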
Core Components
Wrench consists of four main component types that can be combined in a pipeline:
- Harvesters: Extract metadata from IoT data sources (e.g., SensorThings API)
- Groupers: Classify and organize sensors into meaningful groups using various ML approaches
- MetadataEnrichers: Build spatial and temporal metadata for services and sensor groups
- Catalogers: Register the processed metadata into data catalogs (e.g., SDDI/CKAN)
Each component type follows a standardized interface, making it easy to extend with custom implementations.
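As a rough illustration of how four interchangeable stages can share a standardized interface, the sketch below chains them with `typing.Protocol`. Every name in it (`Item`, `Group`, `run_pipeline`, and the toy implementations) is hypothetical; Wrench's real base classes and data models are not shown in this README and may differ, and its pipeline runs asynchronously while this sketch is synchronous.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Item:
    """Hypothetical stand-in for one harvested sensor record."""
    id: str
    description: str


@dataclass
class Group:
    """Hypothetical stand-in for a set of related items."""
    name: str
    items: list = field(default_factory=list)


class Harvester(Protocol):
    def harvest(self) -> list: ...


class Grouper(Protocol):
    def group(self, items: list) -> list: ...


class Enricher(Protocol):
    def enrich(self, groups: list) -> list: ...


class Cataloger(Protocol):
    def register(self, records: list) -> int: ...


def run_pipeline(h: Harvester, g: Grouper, e: Enricher, c: Cataloger) -> int:
    """Chain the four stages; each one only sees the previous stage's output."""
    return c.register(e.enrich(g.group(h.harvest())))


# Toy implementations, only to show that any object with the right method fits.
class StaticHarvester:
    def __init__(self, items):
        self._items = items

    def harvest(self):
        return list(self._items)


class FirstWordGrouper:
    def group(self, items):
        groups = {}
        for item in items:
            words = item.description.split()
            key = words[0].lower() if words else "unknown"
            groups.setdefault(key, Group(name=key)).items.append(item)
        return list(groups.values())


class CountEnricher:
    def enrich(self, groups):
        return [{"name": g.name, "sensor_count": len(g.items)} for g in groups]


class MemoryCataloger:
    def __init__(self):
        self.records = []

    def register(self, records):
        self.records.extend(records)
        return len(records)
```

Because each stage depends only on the method signature of its neighbor, swapping one implementation for another does not require touching the rest of the chain.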
Quick start
The following example sets up a complete pipeline with a SensorThings API harvester, a KINETIC grouper for classification, and an SDDI cataloger for registration:
import asyncio

from wrench.cataloger.sddi import SDDICataloger
from wrench.grouper.kinetic import KINETIC
from wrench.harvester.sensorthings import SensorThingsHarvester
from wrench.metadataenricher.sensorthings import SensorThingsMetadataEnricher
from wrench.pipeline.sensor_pipeline import SensorRegistrationPipeline
from wrench.utils.config import LLMConfig

llm_config = LLMConfig(
    base_url="https://my-llm.com",
    model="llama3.3:70b-instruct-q4_K_M",
)

# Initialize components
harvester = SensorThingsHarvester(
    base_url="https://example.org/v1.1",
    pagination_config={"page_delay": 0.2, "timeout": 60, "batch_size": 100},
)
grouper = KINETIC(
    llm_config=llm_config,
    embedder="intfloat/multilingual-e5-large-instruct",
    resolution=1,
)
metadata_enricher = SensorThingsMetadataEnricher(
    base_url="https://example.org/v1.1",
    title="City Sensor Network",
    description="Environmental sensors across the city",
    llm_config=llm_config,
)
cataloger = SDDICataloger(
    base_url="https://catalog.example.org",
    api_key="your-api-key",
    owner_org="your-organization",
)

# Assemble and run the pipeline
pipeline = SensorRegistrationPipeline(
    harvester=harvester,
    grouper=grouper,
    metadataenricher=metadata_enricher,
    cataloger=cataloger,
)
result = asyncio.run(pipeline.run_async())
Configuration
Wrench supports YAML-based pipeline configuration through PipelineRunner.from_config_file().
The top-level template_ key selects the pipeline template. Environment variables are
resolved using ${VAR_NAME} syntax.
# pipeline_config.yaml
template_: SensorPipeline
harvester:
  sensorthings:
    base_url: "https://example.org/v1.1"
    pagination_config:
      page_delay: 0.2
      timeout: 60
      batch_size: 100
grouper:
  kinetic:
    llm_config:
      model: ${OLLAMA_MODEL}
      base_url: ${OLLAMA_URL}
      api_key: ${OLLAMA_API_KEY}
metadataenricher:
  sensorthings:
    base_url: "https://example.org/v1.1"
    title: "City Sensor Network"
    description: "Environmental sensors across the city"
    llm_config:
      model: ${OLLAMA_MODEL}
      base_url: ${OLLAMA_URL}
      api_key: ${OLLAMA_API_KEY}
cataloger:
  noop: {}
Run a YAML-configured pipeline with:
import asyncio
from wrench.pipeline.config import PipelineRunner
runner = PipelineRunner.from_config_file("pipeline_config.yaml")
result = asyncio.run(runner.run({}))
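To make the `${VAR_NAME}` syntax concrete, here is a minimal sketch of how such placeholders could be substituted in an already-parsed configuration tree. It is not Wrench's own resolver: the function name `resolve_env` is hypothetical, and raising an error on an unset variable is an assumption, since the README does not say how missing variables are handled.

```python
import os
import re

# Matches ${VAR_NAME} where VAR_NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value, env=None):
    """Recursively replace ${VAR_NAME} placeholders in a parsed config tree."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                # Assumption: fail loudly rather than leave the placeholder in place.
                raise KeyError(f"environment variable {name!r} is not set")
            return env[name]
        return _PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {key: resolve_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env(item, env) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


config = {"grouper": {"kinetic": {"llm_config": {"model": "${OLLAMA_MODEL}"}}}}
resolved = resolve_env(config, env={"OLLAMA_MODEL": "llama3.3:70b-instruct-q4_K_M"})
```

Passing `env` explicitly, as in the usage lines, keeps the function easy to test; omitting it falls back to `os.environ`.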
Component Overview
Harvesters
Harvesters connect to data sources and extract metadata. Wrench includes:
- SensorThingsHarvester: Connects to OGC SensorThings API endpoints
- Extensible base class for creating custom harvesters
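For orientation, the sketch below shows one way a harvester could page through a SensorThings collection by following the `@iot.nextLink` field that the OGC SensorThings API returns with each page. It is not the implementation of SensorThingsHarvester. The function names are hypothetical, and the mapping of `batch_size`, `page_delay`, and `timeout` onto `$top`, a sleep between pages, and the HTTP timeout is an assumption about what those `pagination_config` keys mean.

```python
import json
import time
import urllib.request
from typing import Callable, Iterator, Optional


def fetch_json(url: str, timeout: float) -> dict:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def iter_entities(
    base_url: str,
    entity: str = "Things",
    batch_size: int = 100,
    page_delay: float = 0.2,
    timeout: float = 60,
    fetch: Callable[[str, float], dict] = fetch_json,
) -> Iterator[dict]:
    """Yield every entity of a SensorThings collection, following @iot.nextLink."""
    url: Optional[str] = f"{base_url.rstrip('/')}/{entity}?$top={batch_size}"
    while url:
        page = fetch(url, timeout)
        yield from page.get("value", [])
        url = page.get("@iot.nextLink")  # absent on the last page
        if url:
            time.sleep(page_delay)  # pause between pages to spare the server
```

The `fetch` parameter exists so the paging logic can be exercised with canned pages instead of a live endpoint.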
Groupers
Groupers organize sensors into logical groups using various machine learning approaches:
- KINETIC: Keyword-Informed, Network-Enhanced Topical Intelligence Classifier with hierarchical clustering
- LDAGrouper: Latent Dirichlet Allocation for topic modeling and device grouping
- BERTopicGrouper: BERTopic-based clustering with HDBSCAN and UMAP for topic discovery
- Can be extended with custom grouping algorithms
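To show the shape of the problem a grouper solves (sensor descriptions in, named groups out), here is a deliberately simple keyword-taxonomy sketch. It is far cruder than KINETIC, LDA, or BERTopic and does not use Wrench's grouper interface; the taxonomy, the function name, and the `ungrouped` fallback are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical taxonomy: group name -> keywords that signal membership.
TAXONOMY = {
    "air_quality": {"pm10", "pm2.5", "no2", "ozone"},
    "weather": {"temperature", "humidity", "wind", "rain"},
    "traffic": {"vehicle", "traffic", "parking"},
}


def group_by_keywords(descriptions, taxonomy=TAXONOMY):
    """Assign each sensor id to every taxonomy group whose keywords it mentions."""
    groups = defaultdict(list)
    for sensor_id, text in descriptions.items():
        tokens = set(text.lower().replace(",", " ").split())
        matched = [name for name, keywords in taxonomy.items() if tokens & keywords]
        # Sensors that match nothing are collected under a fallback group.
        for name in matched or ["ungrouped"]:
            groups[name].append(sensor_id)
    return dict(groups)
```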
MetadataEnrichers
MetadataEnrichers build spatial and temporal metadata for items and groups:
- SensorThingsMetadataEnricher: Builds metadata for SensorThings API data sources
- Extensible base class for different data source types
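One plausible reading of "spatial and temporal metadata" is a bounding box and an overall time range computed over the members of a group. The sketch below computes both from plain tuples. The function names and input shapes are assumptions; SensorThingsMetadataEnricher may derive or represent these extents differently.

```python
from datetime import datetime, timezone


def spatial_extent(points):
    """Bounding box (min_lon, min_lat, max_lon, max_lat) of (lon, lat) pairs."""
    if not points:
        raise ValueError("at least one point is required")
    lons, lats = zip(*points)
    return (min(lons), min(lats), max(lons), max(lats))


def temporal_extent(intervals):
    """Earliest start and latest end across (start, end) datetime pairs."""
    if not intervals:
        raise ValueError("at least one interval is required")
    starts, ends = zip(*intervals)
    return (min(starts), max(ends))


# Example inputs: three sensor locations and two observation periods (UTC).
bbox = spatial_extent([(11.57, 48.14), (11.60, 48.10), (11.50, 48.18)])
period = temporal_extent([
    (datetime(2023, 1, 1, tzinfo=timezone.utc), datetime(2023, 6, 1, tzinfo=timezone.utc)),
    (datetime(2022, 9, 1, tzinfo=timezone.utc), datetime(2023, 3, 1, tzinfo=timezone.utc)),
])
```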
Catalogers
Catalogers register metadata into data catalogs:
- SDDICataloger: Registers metadata into SDDI/CKAN-based catalogs
- Extensible interface for supporting other catalog systems
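Since SDDI catalogs are CKAN-based, registration ultimately comes down to calls against the CKAN Action API. The sketch below only builds a `package_create` request with the standard library and does not send it. It is not how SDDICataloger is implemented: the helper name is hypothetical, and a real SDDI catalog may require additional fields beyond the four shown here.

```python
import json
import urllib.request


def build_package_create_request(base_url, api_key, owner_org, name, title, notes):
    """Build (but do not send) a CKAN Action API request that creates a dataset."""
    payload = {
        "name": name,  # URL-safe identifier: lowercase letters, digits, - and _
        "title": title,
        "notes": notes,  # CKAN's field for the dataset description
        "owner_org": owner_org,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/3/action/package_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_package_create_request(
    base_url="https://catalog.example.org",
    api_key="your-api-key",
    owner_org="your-organization",
    name="city-sensor-network",
    title="City Sensor Network",
    notes="Environmental sensors across the city",
)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen(request, timeout=60)` and checking the `success` flag in the JSON response.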
Advanced Features
Advanced grouping with ML
Different groupers offer various approaches for sensor classification:
from wrench.utils.config import LLMConfig

# KINETIC for hierarchical topic clustering
from wrench.grouper.kinetic import KINETIC

grouper = KINETIC(
    llm_config=LLMConfig(
        base_url="https://my-llm.com",
        model="llama3.3:70b-instruct-q4_K_M",
    ),
    embedder="intfloat/multilingual-e5-large-instruct",
    lang="en",
    resolution=1,
)

# LDA for topic modeling (no extra dependencies required)
from wrench.grouper.lda import LDAGrouper
from wrench.grouper.lda.models import LDAConfig

grouper = LDAGrouper(config=LDAConfig(n_topics=10, alpha=0.1, beta=0.01))

# BERTopic for density-based clustering
# (requires sentence-transformers, hdbscan, umap-learn, bertopic)
from wrench.grouper.bertopic import BERTopicGrouper
from wrench.grouper.bertopic.models import BERTopicConfig

grouper = BERTopicGrouper(config=BERTopicConfig(min_topic_size=10))
Development
Setting up the Development Environment
# Clone the repository
git clone https://github.com/urbansense/wrench.git
cd wrench
# Run the make target for full setup
make setup
# Install component dependencies for development
uv pip install -e ".[sensorthings,kinetic]"
Code Style and Testing
This project follows the Ruff code style and provides make targets for formatting, linting, testing, and type checking:
# Format and lint code
make format
make lint
# Run tests with coverage
make test
# Run specific test types
make test_unit
make test_e2e
# Type checking
make lint_types
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Please ensure your code follows our coding standards and includes appropriate tests.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Support and Documentation
For support, please:
- Open an issue in the GitHub repository
- Check the documentation
- Contact the development team at jeffrey.limnardy@tum.de