Sharded queue implementation

Sharded queue

Introduction

A sharded job queue is a distributed queue that processes large-scale jobs across a network of worker nodes. Each queue shard is handled by a separate node, which allows jobs to be processed in parallel and resources to be used efficiently. This is achieved with handlers that contain the logic for routing and performing a job. A handler can split its requests across any number of threads. In addition, a route can define a processing order value.

Installation

Install using pip

pip install sharded-queue

Getting started

First, define your handler. Handler methods work on batches of requests to reduce I/O latency per message. Let's start with a simple notification task.

from typing import NamedTuple

from sharded_queue import Handler, Queue, Route

class NotifyRequest(NamedTuple):
    '''
    In this example we have a simple notify request containing a user identifier
    In addition, the value is used to shard requests over worker threads
    '''
    user_id: int

class NotifyHandler(Handler):
    @classmethod
    async def route(cls, *requests: NotifyRequest) -> list[Route]:
        '''
        Spread requests across 3 threads that can be processed concurrently
        '''
        return [
            Route(thread=str(request.user_id % 3))
            for request in requests
        ]

    async def perform(self, *requests: NotifyRequest) -> None:
        '''
        Perform is called with a configurable batch size
        This allows you to reduce I/O per single request
        '''
        # users = await UserRepository.find_all([r.user_id for r in requests])
        # await mailer.send_all([construct_message(user) for user in users])

Usage example

Once a handler is defined, you can use the queue and worker APIs to manage and process tasks.

from asyncio import gather

from notifications import NotifyHandler, NotifyRequest
from sharded_queue import Queue, Worker


async def main():
    queue = Queue()

    # let's register notifications for the first 9 users
    await queue.register(NotifyHandler, *[NotifyRequest(n) for n in range(1, 10)])

    # now all requests are waiting for workers on 3 notify handler tubes
    # first tube contains notify request for users 1, 4, 7
    # second tube contains requests for 2, 5, 8 and so on
    # they were distributed using route handler method

    worker = Worker(queue)
    # we can run a worker with a processed message limit
    # in this example we run three coroutines that process messages
    # each worker binds to a free tube and processes all 3 of its messages
    # in addition, you can run workers on a distributed system
    futures = [
        worker.loop(3),
        worker.loop(3),
        worker.loop(3),
    ]

    # wait until all emails are sent
    await gather(*futures)
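
The binding behaviour described in the comments above can be imitated with plain asyncio. The following is a rough simulation under stated assumptions, not the library's worker: the tubes are in-memory lists, worker_loop is a hypothetical function, and "binding" is modelled as a shared set of tube names.

```python
import asyncio


async def worker_loop(tubes, bound, limit, sent):
    """Bind to one free, non-empty tube and process at most `limit` requests."""
    for name, tube in tubes.items():
        if name not in bound and tube:
            bound.add(name)  # no await between check and add, so two workers never share a tube
            break
    else:
        return  # every tube is empty or already bound

    processed = 0
    while tube and processed < limit:
        user_id = tube.pop(0)   # strict order inside a tube
        await asyncio.sleep(0)  # stand-in for real I/O; lets the other workers run
        sent.append(user_id)
        processed += 1
    bound.discard(name)


async def main():
    tubes = {"1": [1, 4, 7], "2": [2, 5, 8], "0": [3, 6, 9]}
    bound, sent = set(), []
    await asyncio.gather(*(worker_loop(tubes, bound, 3, sent) for _ in range(3)))
    return sent


sent = asyncio.run(main())
print(sorted(sent))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The three coroutines interleave, so the overall order of sent varies between tubes, but the requests of any single tube are still handled in the order they were registered.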

Routes

The route method returns a list of routes. Each route is defined using:

thread - a request pipe that is processed in strict order
order - defines the priority of your requests inside a thread
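
As a sketch of how the two fields might interact, the snippet below groups payloads by thread and sorts them by order within each thread. It is an illustration only: SketchRoute and processing_plan are hypothetical names, and the assumption that a lower order value is processed first (compared as strings, like the '0' defaults) is not confirmed by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRoute:
    """Hypothetical stand-in for Route; field names follow the list above."""
    thread: str = "0"
    order: str = "0"


def processing_plan(routed):
    """Return {thread: [payloads]}, sorted by order inside each thread.

    Assumption: a lower order value goes first. sorted() is stable, so
    payloads with the same order keep their arrival order.
    """
    plan = {}
    for route, payload in routed:
        plan.setdefault(route.thread, []).append((route.order, payload))
    return {
        thread: [payload for _, payload in sorted(items, key=lambda item: item[0])]
        for thread, items in plan.items()
    }


routed = [
    (SketchRoute("a", "1"), "low-1"),
    (SketchRoute("b"), "b-1"),
    (SketchRoute("a", "0"), "high-1"),
    (SketchRoute("a", "1"), "low-2"),
]
plan = processing_plan(routed)
print(plan)  # {'a': ['high-1', 'low-1', 'low-2'], 'b': ['b-1']}
```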

Handlers

As you may have noticed, routing is done by a class method, while perform is an instance method. When a worker starts processing requests, it can bootstrap and tear down the handler.

from typing import NamedTuple, Self

from sharded_queue import Handler, Route, settings


class ParseEventRequest(NamedTuple):
    '''
    Event identifier should be enough to get its contents from storage
    '''
    event: int

class ParseEventHandler(Handler):
    @classmethod
    async def create(cls) -> Self:
        '''
        define your own handler and dependency factory
        '''
        return cls()

    @classmethod
    async def route(cls, *requests: ParseEventRequest) -> list[Route]:
        '''
        override default single thread tube
        '''
        return [
            Route(settings.default_thread, settings.default_order)
            for request in requests
        ]

    async def start(self):
        '''
        run any code when the worker is bound to the queue
        '''

    async def perform(self, *requests: ParseEventRequest):
        '''
        the handler
        '''

    async def handle(self, *requests: ParseEventRequest) -> None:
        '''
        process requests batch
        '''

    async def stop(self):
        '''
        run any code when queue is empty and worker stops processing thread
        '''
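
The lifecycle can be pictured with a small stand-alone driver. This is a sketch under assumptions, not the worker's actual code: SketchHandler and run_thread are hypothetical, and the call sequence create, start, perform per batch, stop is inferred from the docstrings above.

```python
import asyncio


class SketchHandler:
    """Hypothetical handler that records the order of its lifecycle calls."""

    def __init__(self):
        self.calls = []

    @classmethod
    async def create(cls):
        return cls()

    async def start(self):
        self.calls.append("start")

    async def perform(self, *requests):
        self.calls.append(f"perform:{len(requests)}")

    async def stop(self):
        self.calls.append("stop")


async def run_thread(handler_cls, requests, batch_size):
    """Create the handler, feed it batches, and always call stop at the end."""
    handler = await handler_cls.create()
    await handler.start()
    try:
        for i in range(0, len(requests), batch_size):
            await handler.perform(*requests[i:i + batch_size])
    finally:
        await handler.stop()
    return handler.calls


calls = asyncio.run(run_thread(SketchHandler, list(range(5)), batch_size=2))
print(calls)  # ['start', 'perform:2', 'perform:2', 'perform:1', 'stop']
```

Keeping create as a class method means dependencies (connections, clients) can be built once per bound thread rather than once per request.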

Queue configuration

You can configure the sharded queue using environment variables:

QUEUE_COORDINATOR_DELAY=1 - coordinator delay in seconds on empty queues
QUEUE_DEFAULT_ORDER='0' - default queue order
QUEUE_DEFAULT_THREAD='0' - default queue thread
QUEUE_WORKER_BATCH_SIZE=128 - worker batch processing size
QUEUE_WORKER_EMPTY_LIMIT=16 - worker empty queue attempt limit before queue rebind
QUEUE_WORKER_EMPTY_PAUSE=0.1 - worker pause in seconds on empty queue

Or import and change settings object:

from asyncio import run
from sharded_queue import Queue, Worker, settings

settings.coordinator_delay = 5
settings.worker_batch_size = 64

async def main():
    worker = Worker(Queue())
    await worker.loop()

run(main())
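
For readers wondering how the environment variables listed above could map onto a settings object, here is one possible sketch. It is not the library's implementation. SketchSettings and load_settings are hypothetical, and the attribute names worker_empty_limit and worker_empty_pause are assumed by analogy with the names that do appear in this document.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class SketchSettings:
    """Hypothetical settings object; defaults copied from the list above."""
    coordinator_delay: float = 1.0
    default_order: str = "0"
    default_thread: str = "0"
    worker_batch_size: int = 128
    worker_empty_limit: int = 16    # attribute name is an assumption
    worker_empty_pause: float = 0.1  # attribute name is an assumption


def load_settings(environ=None) -> SketchSettings:
    """Override defaults with QUEUE_* variables, cast to each default's type."""
    environ = os.environ if environ is None else environ
    settings = SketchSettings()
    for field in fields(SketchSettings):
        raw = environ.get(f"QUEUE_{field.name.upper()}")
        if raw is not None:
            cast = type(getattr(settings, field.name))
            setattr(settings, field.name, cast(raw))
    return settings


custom = load_settings({"QUEUE_WORKER_BATCH_SIZE": "64", "QUEUE_COORDINATOR_DELAY": "5"})
print(custom.worker_batch_size, custom.coordinator_delay)  # 64 5.0
```

Passing a plain dict instead of os.environ keeps the loader easy to test without touching the real process environment.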


