Time-SERies COMmunication using gRPC for data science and machine learning applications.

TSERCOM

Time SERies COMmunication

Tsercom is a Python library designed to simplify the transmission and management of time-series data across networks using gRPC. It provides tools for establishing communication between clients and servers, handling data serialization, managing persistent client identities, and synchronizing timestamps, making it suitable for distributed data science and machine learning applications.

Key Features

  • Simplified gRPC Management: Abstracts away much of the boilerplate for setting up and managing gRPC services and clients.
  • ZeroConf Client/Server Discovery: Automatically discover and connect clients and servers on the network using mDNS (optional, via discovery module).
  • Automatic Reconnection: Includes utilities to help build resilient clients that can handle network disruptions and attempt reconnection (e.g., ClientDisconnectionRetrier).
  • Persistent Client Identity: Provides mechanisms for managing a consistent CallerId for clients.
  • Timestamp Synchronization: Offers tools for synchronizing timestamps between server and client instances.
  • Serialization Utilities: Includes helpers for serializing common data types to and from protobufs for gRPC transmission.
  • Process/Thread Isolation: Supports running communication logic in separate processes or threads, isolating it from the main application, particularly when using the RuntimeManager system.
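To make the Automatic Reconnection feature more concrete, here is a minimal sketch of the general retry-with-backoff pattern such utilities tend to build on. It uses only the standard library. The names `call_with_retry` and `flaky_connect` are hypothetical and are not part of Tsercom's API; how ClientDisconnectionRetrier is actually implemented is not covered here.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    connect: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Exponential backoff capped at max_delay, with jitter so that
            # many clients do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2**attempt)
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical connection function that fails twice, then succeeds.
attempts: List[int] = []


def flaky_connect() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("server unavailable")
    return "connected"


# Pass a no-op sleep so the demo finishes immediately.
result = call_with_retry(flaky_connect, sleep=lambda _: None)
print(result, "after", len(attempts), "attempts")
```

Injecting `sleep` as a parameter keeps the retry logic testable without real delays.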

Installation

You can install Tsercom using pip:

pip install tsercom

For development, clone the repository and install in editable mode with development dependencies:

git clone https://github.com/rwkeane/tsercom.git
cd tsercom
pip install -e ".[dev]"

This will also install tools like pytest, black, ruff, mypy, and pylint.
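To confirm that the install succeeded, one option is to query the package metadata from Python. This sketch uses only the standard library; `installed_version` is a hypothetical helper name, not something Tsercom provides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("tsercom:", installed_version("tsercom") or "not installed")
```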

How It Works / The Idea

Tsercom simplifies building systems that exchange time-series data by providing a framework and tools for common networking tasks. The core philosophy is to:

  • Abstract Complexity: Hide the intricacies of network programming (gRPC setup, service discovery, reconnection logic) behind more straightforward APIs. This allows developers to focus on their application-specific data handling and business logic.
  • Promote Modularity: Encourage separation of concerns. Communication logic can be developed and managed independently of the core application (e.g., a machine learning model or data processing pipeline). Tsercom's RuntimeManager system (shown in older examples, and used internally for more complex scenarios) particularly facilitates running communication components in separate threads or processes, isolating them and improving robustness.
  • Ensure Robustness: Incorporate features like persistent client identifiers (CallerId) and utilities for automatic reconnections to help build more resilient distributed systems.
  • Facilitate Integration: Offer utilities for data serialization (especially for torch.Tensor if PyTorch is installed) and timestamp synchronization, which are common needs in time-series applications.
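The modularity point above (communication logic isolated from the main application) can be illustrated with a small standard-library sketch. It does not use RuntimeManager or any other Tsercom API. The worker function and queue names are hypothetical, and the "transmit" step is a stand-in that only records the samples.

```python
import queue
import threading
from typing import List

_STOP = object()  # Sentinel telling the worker to shut down.


def communication_worker(outbox: "queue.Queue[object]", sent: List[object]) -> None:
    """Stand-in for communication logic; runs on its own thread."""
    while True:
        item = outbox.get()
        if item is _STOP:
            break
        sent.append(item)  # A real worker would transmit the sample here.


outbox: "queue.Queue[object]" = queue.Queue()
sent: List[object] = []
worker = threading.Thread(
    target=communication_worker, args=(outbox, sent), daemon=True
)
worker.start()

# The main application only enqueues samples; it never waits on the network.
for sample in (0.1, 0.2, 0.3):
    outbox.put(sample)

outbox.put(_STOP)
worker.join(timeout=5)
print(sent)  # [0.1, 0.2, 0.3]
```

The same shape carries over to process isolation by swapping in `multiprocessing.Process` and `multiprocessing.Queue`, at the cost of requiring picklable messages.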

Typical Use Cases:

  • Distributed Machine Learning: Streaming inference requests to model servers or aggregating training data from multiple sources.
  • Sensor Networks: Collecting and processing data from many distributed sensors.
  • Real-time Data Pipelines: Building systems where components need to exchange data with low latency.

Basic Usage

The basic steps for using Tsercom in a gRPC-backed client-server architecture are as follows:

  1. Define a simple gRPC service.
  2. Host this service using GrpcServicePublisher.
  3. Create a client that connects to the service.
  4. Send a request and receive a response.
  5. Manage Tsercom's global event loop.

For example usage, see the Quick Start Script in this repo.

To run it, save the script as quick_start_test.py and execute python quick_start_test.py.
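Separately from the Quick Start script, the last step in the list (managing an event loop) follows a general asyncio pattern that can be sketched with the standard library alone. This is not Tsercom's API: `handle_request` is a hypothetical coroutine standing in for a gRPC call, and how Tsercom creates or exposes its global event loop is not shown here.

```python
import asyncio
import threading


async def handle_request(payload: str) -> str:
    """Hypothetical async handler standing in for a gRPC call."""
    await asyncio.sleep(0)  # Yield to the loop, as real I/O would.
    return f"echo: {payload}"


# Start one event loop on a dedicated thread.
loop = asyncio.new_event_loop()
loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
loop_thread.start()

# Submit work from the main (synchronous) thread and wait for the result.
future = asyncio.run_coroutine_threadsafe(handle_request("ping"), loop)
response = future.result(timeout=5)
print(response)  # echo: ping

# Stop the loop and release its resources when the application shuts down.
loop.call_soon_threadsafe(loop.stop)
loop_thread.join(timeout=5)
loop.close()
```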

Architectural Flexibility: While Tsercom provides components like GrpcServicePublisher for straightforward client-server setups (as shown in the Quick Start), it also supports more advanced architectures. For instance, the discovery module (using mDNS via zeroconf) allows for dynamic discovery of services. A common pattern in some Tsercom applications involves "client" processes (data sources) advertising themselves, and "server" processes (data aggregators) discovering and connecting to them. This can be useful for systems where data sources may join or leave the network dynamically. The library provides building blocks that can be composed to fit various distributed system designs.

That said, Tsercom has a suggested architecture, described below.

Suggested Architecture: Maximizing Tsercom's Potential

While Tsercom supports straightforward client-server setups (as demonstrated in the Quick Start guide), its design is aimed at more dynamic, distributed environments. The recommended architecture is a "client-advertises, server-discovers" model: data sources (Tsercom "Clients") advertise themselves, and aggregators (Tsercom "Servers") discover and connect to them. Flipping the traditional roles in this way offers several advantages:

  • Dynamic Discovery: Data sources can join (or leave) the network, and the aggregator will automatically discover and connect to them (or handle their disappearance) without manual reconfiguration. This is ideal for environments with ephemeral or mobile nodes.
  • Resilience to Network Changes: Data sources can change IP addresses or ports (e.g., due to DHCP or dynamic port assignment). As long as they can re-advertise via mDNS, the aggregator can re-discover and reconnect to them.
  • Decoupling: Data producers (Tsercom "Clients") and consumers (Tsercom "Servers") are highly decoupled. They only need to agree on the service definition and the discovery mechanism, not on static network locations.
  • Scalability: New data sources can be easily added to the system. They simply start advertising themselves, and the aggregator(s) can discover and integrate them. Similarly, multiple aggregators can discover the same set of data sources.
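The flow behind these advantages can be sketched with a toy in-memory registry standing in for mDNS. Everything in this sketch is hypothetical (`ServiceInfo`, `InMemoryRegistry`, `connect_to_source`, and the addresses); it does not use zeroconf, InstancePublisher, or InstanceListener, and only illustrates the order of events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ServiceInfo:
    name: str
    address: str
    port: int


class InMemoryRegistry:
    """Toy stand-in for mDNS: sources advertise, listeners get callbacks."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceInfo] = {}
        self._listeners: List[Callable[[ServiceInfo], None]] = []

    def advertise(self, info: ServiceInfo) -> None:
        # Re-advertising under the same name replaces the old entry, which
        # models a source coming back after an IP or port change.
        self._services[info.name] = info
        for listener in self._listeners:
            listener(info)

    def listen(self, on_found: Callable[[ServiceInfo], None]) -> None:
        self._listeners.append(on_found)
        # Report sources that were already advertising before we started.
        for info in self._services.values():
            on_found(info)


registry = InMemoryRegistry()
connections: Dict[str, str] = {}


def connect_to_source(info: ServiceInfo) -> None:
    """Aggregator side: record (or refresh) the endpoint to dial."""
    connections[info.name] = f"{info.address}:{info.port}"


registry.advertise(ServiceInfo("sensor-a", "192.168.1.10", 50051))
registry.listen(connect_to_source)  # Aggregator starts after sensor-a.
registry.advertise(ServiceInfo("sensor-b", "192.168.1.11", 50051))  # New source.
registry.advertise(ServiceInfo("sensor-a", "192.168.1.42", 50052))  # Address change.
print(connections)
```

After the last call, the aggregator holds the refreshed endpoint for sensor-a and the endpoint for sensor-b, without either side being configured with a static address.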

For more details about this approach, see Suggested Architecture in this repo.

Simpler Models Still Viable:

It's important to note that Tsercom still fully supports traditional client-server models where the client initiates a connection to a well-known server address, as shown in the Quick Start. This is perfectly suitable for simpler applications or when dynamic discovery is not a requirement.

However, adopting the "client-advertises, server-discovers" architecture with RuntimeManager, InstancePublisher, and InstanceListener unlocks Tsercom's more advanced capabilities for building robust, scalable, and adaptive distributed systems for time-series data communication.

Real-World Examples:

Coming soon! These repos have not yet been made public!

If you use this library, please submit a PR to add a link to your project here!

Dependencies

Tsercom relies on several key libraries:

  • grpcio, grpcio-status, grpcio-tools: For the core gRPC communication framework.
  • protobuf: For working with Protocol Buffers, the data serialization format used by gRPC.
  • zeroconf: For mDNS-based service discovery (used by the tsercom.discovery module).
  • ntplib: Used by the tsercom.timesync module for network time synchronization.
  • psutil: For system utilities, which can be used internally for process management or monitoring.
  • typing-extensions: Provides access to newer typing features for older Python versions.
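As background for the ntplib dependency: NTP-style synchronization estimates the offset between two clocks from four timestamps taken during one request/response exchange. The sketch below shows that standard calculation in plain Python. It is an illustration only; whether tsercom.timesync applies exactly this formula, and how it exposes the result, is an assumption not confirmed by this README. The function names are hypothetical.

```python
def clock_offset(t0: float, t1: float, t2: float, t3: float) -> float:
    """Estimate how far the server clock is ahead of the client clock.

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server send time (server clock)
    t3: client receive time (client clock)
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def round_trip_delay(t0: float, t1: float, t2: float, t3: float) -> float:
    """Network round-trip time, excluding the server's processing time."""
    return (t3 - t0) - (t2 - t1)


# Example: the server clock is 5 s ahead, each network leg takes 0.25 s,
# and the server spends 0.25 s processing the request.
timestamps = (100.0, 105.25, 105.5, 100.75)
print(clock_offset(*timestamps))      # 5.0
print(round_trip_delay(*timestamps))  # 0.5
```

The offset estimate is exact only when the outbound and return legs take the same time; asymmetric paths introduce an error of up to half the round-trip delay.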

Optional Dependencies:

  • PyTorch (the torch package): If PyTorch is installed, Tsercom provides utilities for serializing and deserializing torch.Tensor objects.
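For readers unfamiliar with what tensor serialization involves, the sketch below packs a shape and a flat list of float32 values into bytes and restores them, using only the standard library. It is not Tsercom's wire format (which is protobuf-based and not described here) and does not require PyTorch. `pack_samples` and `unpack_samples` are hypothetical names.

```python
import struct
from typing import List, Sequence, Tuple


def _element_count(shape: Sequence[int]) -> int:
    count = 1
    for dim in shape:
        count *= dim
    return count


def pack_samples(shape: Sequence[int], values: Sequence[float]) -> bytes:
    """Encode a shape and flat float32 values as little-endian bytes."""
    if _element_count(shape) != len(values):
        raise ValueError("shape does not match number of values")
    # Header: number of dimensions, then each dimension, as unsigned 32-bit ints.
    header = struct.pack(f"<I{len(shape)}I", len(shape), *shape)
    body = struct.pack(f"<{len(values)}f", *values)
    return header + body


def unpack_samples(data: bytes) -> Tuple[Tuple[int, ...], List[float]]:
    """Decode bytes produced by pack_samples back into (shape, values)."""
    (ndim,) = struct.unpack_from("<I", data, 0)
    shape = struct.unpack_from(f"<{ndim}I", data, 4)
    count = _element_count(shape)
    values = struct.unpack_from(f"<{count}f", data, 4 + 4 * ndim)
    return shape, list(values)


# Values chosen to be exactly representable in float32.
payload = pack_samples((2, 2), [1.0, 2.5, -3.0, 0.25])
shape, values = unpack_samples(payload)
print(shape, values)  # (2, 2) [1.0, 2.5, -3.0, 0.25]
```

Values that are not exactly representable in float32 will come back rounded, which is the usual trade-off of sending 32-bit floats.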

If you encounter issues with gRPC versions, you might need to regenerate the protobuf-generated Python files. If you have the Tsercom repository cloned, you can do this by running the scripts/generate_protos.py script. This may require installing mypy-protobuf (pip install mypy-protobuf) and ensuring protoc-gen-mypy is in your PATH.

Contributing

Contributions are welcome! Whether it's bug reports, feature requests, documentation improvements, or code contributions, please feel free to open an issue or submit a pull request on the GitHub repository.

When contributing code, please ensure that:

  • Your changes pass all existing tests.
  • You add new tests for any new functionality.
  • The code adheres to our style guidelines. We use black for formatting, ruff for linting, mypy for type checking, and pylint for further static analysis. Please run these tools locally before submitting your changes.
    • black .
    • ruff check . --fix
    • mypy .
    • pylint tsercom quick_start_test.py (or specify relevant modules/files)

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
