A lightweight and polyglot stream-processing library, to be used as a data backplane-, message relay-, or pipeline-subsystem.

About

LorryStream is a lightweight and polyglot stream-processing library, to be used as a data backplane-, message relay-, or pipeline-subsystem, in the spirit of socat and GStreamer. It is based on Streamz, Dask, and other Python libraries.

You can use LorryStream to store data received from the network into databases, or to relay it back to the network, for example into different bus systems. It can be used both as a standalone program, and as a library.

  • Use as protocol translator, bridge

It is conceived to generalize and improve the corresponding subsystems of programs and frameworks like Kotori, Wetterdienst, Luftdatenpumpe, amqp-forward, ttnlogger, Kahn, or mqttwarn.

Details

  • Data sources are message bus systems like AMQP, Kafka, MQTT, ZeroMQ, and network listener endpoints for TCP, UDP, HTTP, and WebSocket.

  • Data sinks are RDBMS databases supported by SQLAlchemy, or other message brokers.

Motivation

  • Implement a reusable solution, simple to install and operate, that doesn’t depend on vendor-provided infrastructure, and can easily be embedded into existing frameworks and software stacks, or integrated otherwise by running it as a separate service.

  • Help the community and industry to modernize their aging DAQ backend systems designed within the previous decades.

Background

Flow-Based Programming (FBP) is a programming paradigm that uses a “data processing factory” metaphor for designing and building applications. It is a special case of dataflow programming characterized by asynchronous, concurrent processes “under the covers”.

FBP has been found to support improved development time and maintainability, reusability, rapid prototyping, simulation, improved performance, and good communication among developers, maintenance staff, users, systems people, and management - not to mention that FBP naturally takes advantage of multiple cores, without the programmer having to struggle with the intricacies of multitasking.

(Quoted from: Flow-based Programming)

Caveat

Please note that LorryStream is alpha-quality software, and a work in progress. Contributions of all kinds are very welcome, in order to make it more solid.

Breaking changes should be expected until a 1.0 release, so version pinning is recommended, especially when you use it as a library.

Only a few of the features sketched out in this README have been implemented so far.

Synopsis

The canonical command is lorry relay <source> <sink>. Please note that %23 is the URL-encoded form of #, the MQTT multi-level wildcard character.

lorry relay \
    "mqtt://localhost/testdrive/%23" \
    "crate://localhost/?table=testdrive"
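
The note about %23 can be illustrated with Python's standard library. This is a minimal sketch that only uses `urllib.parse`; the helper name `encode_topic` is made up for this example and is not part of LorryStream. It shows how the escaped form is derived, and why a literal # would be lost when the address is parsed as a URL.

```python
from urllib.parse import quote, unquote, urlsplit


def encode_topic(topic: str) -> str:
    """Percent-encode an MQTT topic filter for use in a URL path (hypothetical helper)."""
    # Keep "/" (topic level separator) and "+" (single-level wildcard) readable.
    # "#" must be escaped, because it would otherwise start a URL fragment.
    return quote(topic, safe="/+")


encoded = encode_topic("testdrive/#")
url = f"mqtt://localhost/{encoded}"

# Without encoding, the "#" is treated as the start of a fragment and
# does not end up in the path component.
raw = urlsplit("mqtt://localhost/testdrive/#")
escaped = urlsplit(url)

print(url)
print(repr(raw.path))
print(repr(unquote(escaped.path)))
```

Decoding the escaped path yields the original topic filter again, while the unescaped variant silently drops the wildcard.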

If you prefer a GStreamer-like pipeline definition syntax, use lorry launch.

lorry launch "mqttsrc location=mqtt://localhost/testdrive/%23 ! sqlsink location=crate://localhost/?table=testdrive"
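
To make the shape of such a pipeline description more tangible, here is a rough sketch of how a string of the form `element key=value ! element key=value` could be split into its parts. It is an illustration only, not LorryStream's actual parser: the function name `split_pipeline` is hypothetical, and quoting or escaping rules are deliberately ignored.

```python
from typing import Dict, List, Tuple

Element = Tuple[str, Dict[str, str]]


def split_pipeline(description: str) -> List[Element]:
    """Split 'name key=value ! name key=value' into (name, properties) pairs."""
    elements: List[Element] = []
    # Elements are separated by "!", as in GStreamer's launch syntax.
    for chunk in description.split("!"):
        tokens = chunk.split()
        if not tokens:
            raise ValueError("empty pipeline element")
        name = tokens[0]
        props: Dict[str, str] = {}
        for token in tokens[1:]:
            # Split on the first "=" only, so values like "?table=testdrive" survive.
            key, sep, value = token.partition("=")
            if not sep:
                raise ValueError(f"expected key=value, got {token!r}")
            props[key] = value
        elements.append((name, props))
    return elements


pipeline = split_pipeline(
    "mqttsrc location=mqtt://localhost/testdrive/%23 ! "
    "sqlsink location=crate://localhost/?table=testdrive"
)
for name, props in pipeline:
    print(name, props)
```

Each element ends up as a name plus a dictionary of properties, here a single `location` per element.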

Quickstart

If you are in a hurry, and want to run LorryStream without any installation, just use the OCI image on Podman or Docker.

docker run --rm --network=host ghcr.io/daq-tools/lorrystream \
    lorry relay \
    "mqtt://localhost/testdrive/%23" \
    "crate://localhost/?table=testdrive"

Setup

Install lorrystream from PyPI.

pip install lorrystream

Usage

This section outlines some example invocations of LorryStream, both on the command line and as a library. In addition to the resources available on the web, testing data can be acquired from the repository’s testdata folder.

Prerequisites

To run some of the example invocations outlined below, you will need a few servers. The easiest way to spin up those instances is to use Podman or Docker.

docker run --name=mosquitto --rm -it --publish=1883:1883 \
    eclipse-mosquitto:2.0.15 mosquitto -c /mosquitto-no-auth.conf

https://github.com/docker-library/docs/blob/master/eclipse-mosquitto/README.md

docker run --name=cratedb --rm -it --publish=4200:4200 --publish=5432:5432 \
    crate:5.2 -Cdiscovery.type=single-node

https://github.com/docker-library/docs/blob/master/crate/README.md

Command line use

Help

lorry --help
lorry info
lorry relay --help

Bus to storage

# Relay messages from MQTT to CrateDB.
lorry relay \
    "mqtt://localhost/testdrive/%23" \
    "crate://localhost/?table=testdrive"

Bus to bus

# Relay messages from AMQP to MQTT.
lorry relay \
    "amqp://localhost/testdrive/demo" \
    "mqtt://localhost/testdrive/demo"

Library use

>>> from lorrystream import parse_launch
>>> parse_launch("mqttsrc location=mqtt://localhost/testdrive/%23 ! sqlsink location=crate://localhost/?table=testdrive")
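
When the source and sink addresses are only known at runtime, the pipeline description can be assembled before handing it to `parse_launch`. The following is a minimal sketch under that assumption; `launch_description` is a hypothetical helper, not a function provided by LorryStream, and it only covers the `mqttsrc ! sqlsink` combination shown above.

```python
def launch_description(source_url: str, sink_url: str) -> str:
    """Build an 'mqttsrc ! sqlsink' pipeline description from two URLs."""
    for url in (source_url, sink_url):
        # Whitespace or "!" inside a URL would be ambiguous within the pipeline syntax.
        if any(ch.isspace() for ch in url) or "!" in url:
            raise ValueError(f"unsupported character in URL: {url!r}")
    return f"mqttsrc location={source_url} ! sqlsink location={sink_url}"


description = launch_description(
    "mqtt://localhost/testdrive/%23",
    "crate://localhost/?table=testdrive",
)
print(description)
# The resulting string can then be passed on, as in `parse_launch(description)`.
```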

OCI

OCI images are available on the GitHub Container Registry (GHCR). We are publishing image variants for general availability- and nightly-releases, and pull requests.

To always run the latest nightly development version without typing the full command each time, define a shell alias for lorry, and store the data source and sink URIs in variables. This saves a few keystrokes on subsequent invocations.

docker pull ghcr.io/daq-tools/lorrystream:nightly
alias lorry="docker run --rm --interactive ghcr.io/daq-tools/lorrystream:nightly lorry"
SOURCE=mqtt://localhost/testdrive/%23
SINK=crate://crate@localhost:4200/?table=testdrive

lorry relay "${SOURCE}" "${SINK}"
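
The sink URI above packs several pieces of information into one string. As a sketch of its anatomy, the following example takes it apart with Python's generic URL parser. How LorryStream itself interprets each component is not covered here; the helper `describe_uri` and the keys of the returned dictionary are chosen for illustration only.

```python
from urllib.parse import parse_qs, urlsplit


def describe_uri(uri: str) -> dict:
    """Break a source or sink URI into its generic URL components."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # parse_qs returns lists; take the first "table" value, if any.
        "table": query.get("table", [None])[0],
    }


sink = describe_uri("crate://crate@localhost:4200/?table=testdrive")
print(sink)
```

Components that are absent from a URI, such as the port in `crate://localhost/?table=testdrive`, come back as `None`.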

Project information

Resources

Contributions

The LorryStream library is an open source project, and is managed on GitHub. Every kind of contribution, feedback, or patch is very welcome. Create an issue or submit a patch if you think we should include a new feature, or to report or fix a bug.

Development

In order to set up a development environment on your workstation, please head over to the development sandbox documentation. When you see the software tests succeed, you should be ready to start hacking.

License

The project is licensed under the terms of the LGPL license, see LICENSE.

Prior art

We are maintaining a list of other projects whose goals are the same as, or similar to, those of LorryStream.

Kudos
