gcloud-aio-pubsub

Python Client for Google Cloud Pub/Sub

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Internet

Project description

This is a shared codebase for gcloud-aio-pubsub and gcloud-rest-pubsub.

Installation

$ pip install --upgrade gcloud-{aio,rest}-pubsub

Usage

Subscriber

Currently, we have only implemented an asyncio version of SubscriberClient, since the subscription pattern does not work with asyncio by default. The official Google publisher returns a future which is mostly usable as-is. This patch is a no-op under gcloud-rest (i.e. when not using asyncio); in that case, using the official library is preferred.

An HTTP-oriented version, in keeping with the other gcloud-aio-* libraries, will likely be coming soon – though our current approach works reasonably well for allowing the official grpc client to be used under asyncio, we continue to see threading oddities now and again which we’ve not been able to solve. As such, we do not wholeheartedly recommend using the SubscriberClient of this library in production, though a resilient enough environment for your use-case may be possible.

Here’s the rough usage pattern for subscribing:

from gcloud.aio.pubsub import SubscriberClient
from google.cloud.pubsub_v1.subscriber.message import Message

client = SubscriberClient()
# create subscription if it doesn't already exist
client.create_subscription('subscription_name', 'topic_name')

async def message_callback(message: Message) -> None:
    try:
        # just an example: process the message however you need to here...
        result = handle(message)
        await upload_result(result)
    except Exception:
        message.nack()
    else:
        message.ack()

# subscribe to the subscription, receiving a Future that acts as a keepalive
keep_alive = client.subscribe('subscription_name', message_callback)

# have the client run forever, pulling messages from this subscription,
# passing them to the specified callback function, and wrapping it in an
# asyncio task.
client.run_forever(keep_alive)

Configuration

Our create_subscription method is a thin wrapper and thus supports all keyword configuration arguments from the official pubsub client, which you can find in the official Google documentation.

When subscribing to a subscription you can optionally pass in a FlowControl and/or Scheduler instance.

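# assumption: FlowControl is imported from the official client's types module
from google.cloud.pubsub_v1.types import FlowControl

example_flow_control = FlowControl(
    max_messages=1,
    resume_threshold=0.8,
    max_request_batch_size=1,
    max_request_batch_latency=0.1,
    max_lease_duration=10,
)

keep_alive = client.subscribe(
    'subscription_name',
    message_callback,
    flow_control=example_flow_control,
)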

Understanding how modifying FlowControl affects how your pubsub runtime will operate can be confusing so here’s a handy dandy guide!

Welcome to @TheKevJames’s guide to configuring Google Pubsub Subscription policies! Settle in, grab a drink, and stay a while.

The Subscriber is controlled by a FlowControl configuration tuple defined by the official client library. That configuration object, f, gets used by the Subscriber in the following ways:

Max Concurrency

The subscriber is allowed to lease new tasks whenever its currently leased tasks x satisfy:

(
    (len(x) < f.resume_threshold * f.max_messages)
    and (sum(x.bytes) < f.resume_threshold * f.max_bytes)
)

In practice, this means we should set these values with the following restrictions:

- the maximum number of concurrently leased tasks at peak is: (f.max_messages * f.resume_threshold) + f.max_request_batch_size
- the maximum memory usage of our leased tasks at peak is: (f.max_bytes * f.resume_threshold) + (f.max_request_batch_size * bytes_per_task)
- these values constrain each other, i.e. we are limited by whichever bound is reached first, depending on how max_tasks * bytes_per_task compares to max_memory
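As a rough illustration of the bounds above, the following sketch evaluates the resume condition and the two peak values for a made-up configuration. The Flow tuple and the can_lease, peak_tasks, and peak_bytes helpers are hypothetical names used only for this example; they mirror the FlowControl fields discussed above and are not part of this library or the official client.

```python
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    # Hypothetical stand-in mirroring the FlowControl fields used above.
    max_messages: int
    max_bytes: int
    resume_threshold: float
    max_request_batch_size: int


def can_lease(leased_sizes: Sequence[int], f: Flow) -> bool:
    # leased_sizes holds the byte size of each currently leased task.
    return (
        len(leased_sizes) < f.resume_threshold * f.max_messages
        and sum(leased_sizes) < f.resume_threshold * f.max_bytes
    )


def peak_tasks(f: Flow) -> float:
    # Upper bound on concurrently leased tasks.
    return f.max_messages * f.resume_threshold + f.max_request_batch_size


def peak_bytes(f: Flow, bytes_per_task: int) -> float:
    # Upper bound on memory held by leased tasks.
    return (f.max_bytes * f.resume_threshold
            + f.max_request_batch_size * bytes_per_task)


f = Flow(max_messages=100, max_bytes=1_000_000, resume_threshold=0.8,
         max_request_batch_size=10)
print(peak_tasks(f))              # 90.0
print(peak_bytes(f, 1538))        # 815380.0
print(can_lease([1538] * 79, f))  # True
print(can_lease([1538] * 80, f))  # False
```

With these example numbers the message count, not the byte limit, is the binding constraint: 80 tasks of 1538 bytes use far less than 80% of max_bytes.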

Aside: it seems like OCNs on Pubsub are ~1538 bytes each

Leasing Requests

When leasing new tasks, the Subscriber uses the following algorithm:

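import queue
import time

def lease_more_tasks(q, f):
    # sketch only: q is a queue.Queue of pending tasks, f is the FlowControl
    start = time.monotonic()
    yield q.get(block=True)  # always returns >=1

    for _ in range(f.max_request_batch_size - 1):
        remaining = f.max_request_batch_latency - (time.monotonic() - start)
        if remaining <= 0:
            break
        try:
            # wait for another task, but only until the latency budget is spent
            yield q.get(block=True, timeout=remaining)
        except queue.Empty:
            break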

In practice, this means we should set f.max_request_batch_size given the above concurrency concerns and set f.max_request_batch_latency given whatever latency ratio we are willing to accept.

The expected best-case time for Queue.get() off a full queue is no worse than 0.3ms. This Queue should be filling up as fast as grpc can make requests to Google Pubsub, which should be Fast Enough(tm) to keep it filled, given those requests are batched.

Therefore, we can expect:

avg_lease_latency: ~= f.max_request_batch_size * 0.0003
worst_case_latency: ~= f.max_request_batch_latency

Note that leasing occurs based on f.resume_threshold, so some of this latency is concurrent with task execution.
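To put numbers on those two estimates, here is a small sketch that plugs a configuration into the formulas above. The lease_latency helper and the sample values are assumptions for illustration; the 0.3ms figure is the best-case Queue.get() cost quoted earlier, not a measured constant.

```python
from typing import Tuple

QUEUE_GET_SECONDS = 0.0003  # best-case Queue.get() cost quoted above


def lease_latency(max_request_batch_size: int,
                  max_request_batch_latency: float) -> Tuple[float, float]:
    # Returns (average, worst-case) latency in seconds for one lease request.
    avg = max_request_batch_size * QUEUE_GET_SECONDS
    worst = max_request_batch_latency
    return avg, worst


avg, worst = lease_latency(max_request_batch_size=100,
                           max_request_batch_latency=0.1)
print(f'avg ~= {avg * 1000:.1f} ms, worst ~= {worst * 1000:.1f} ms')
```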

Task Expiry

Any task which has not been acked or nacked counts against the current leased task count. Our worker thread should ensure all tasks are acked or nacked, but the FlowControl config allows us to handle any other cases. Note that leasing works as follows:

- When a subscriber leases a task, Google Pubsub will not re-lease that task until subscription.ack_deadline_seconds (default 10, configurable per subscription) seconds have passed.
- If a client calls ack() on a task, it is immediately removed from Google Pubsub.
- If a client calls nack() on a task, it immediately allows Google Pubsub to re-lease that task to a new client. The client drops the task from its memory.
- If f.max_lease_duration passes between a message being leased and acked, the client will send a nack (see above workflow). It will NOT drop the task from its memory, e.g. the worker(task) process may still run.

Notes:

- all steps are best-effort, e.g. read “a task will be deleted” as “a task will probably get deleted, if the distributed-system luck is with you”
- in the above workflow, “Google Pubsub” refers to the server-side system, i.e. the one managed by Google where the tasks are actually stored.

In practice, we should thus set f.max_lease_duration to no lower than our 95th percentile task latency at high load. The lower this value is, the better our throughput will be in extreme cases.
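One way to derive that lower bound from measurements is sketched below, assuming you have recorded per-task latencies under load. The p95_latency helper and the sample data are hypothetical; the "inclusive" quantile method is chosen here only so the estimate never exceeds the largest observed value.

```python
import statistics
from typing import Sequence


def p95_latency(latencies_s: Sequence[float]) -> float:
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(latencies_s, n=20, method='inclusive')[-1]


# Hypothetical task latencies, in seconds, sampled under high load.
observed = [0.8, 1.1, 0.9, 1.4, 2.0, 1.2, 0.7, 3.5, 1.0, 1.3]
lower_bound = p95_latency(observed)
print(f'set max_lease_duration to at least {lower_bound:.2f}s')
```

statistics.quantiles requires Python 3.8 or newer and at least two data points.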

Confusion

f.max_requests is defined, but seems to be unused.

Publisher

The PublisherClient is a dead-simple alternative to the official Google Cloud Pub/Sub publisher client. The main design goal was to eliminate all the additional gRPC overhead implemented by the upstream client.

If migrating between this library and the official one, the main difference is this: the gcloud-aio-pubsub publisher’s .publish() method immediately publishes the messages you’ve provided, rather than maintaining its own publishing queue, implementing batching and flow control, etc. If you’re looking for a full-featured publishing library with all the bells and whistles built in, you may be interested in the upstream provider. If you’re looking to manage your own batching / timeouts / retry / threads / etc, this library should be a bit easier to work with.

Sample usage:

import asyncio

import aiohttp

from gcloud.aio.pubsub import PublisherClient
from gcloud.aio.pubsub import PubsubMessage


async def main() -> None:
    async with aiohttp.ClientSession() as session:
        client = PublisherClient(session=session)

        topic = client.topic_path('my-gcp-project', 'my-topic-name')

        messages = [
            PubsubMessage(b'payload', attribute='value'),
            PubsubMessage(b'other payload', other_attribute='whatever',
                          more_attributes='something else'),
        ]
        response = await client.publish(topic, messages)
        # response == {'messageIds': ['1', '2']}


asyncio.run(main())
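Since publish() sends whatever list it is given right away, any batching is up to the caller. A minimal sketch of one approach is to split the messages into fixed-size chunks and publish each chunk in turn. The chunked helper and the batch size of 2 are assumptions for this example, not part of the library.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar('T')


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    # Yield consecutive slices of at most `size` items, preserving order.
    if size < 1:
        raise ValueError('size must be >= 1')
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


batches = list(chunked(['a', 'b', 'c', 'd', 'e'], 2))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]

# Inside a coroutine, each batch could then be sent with the call shown above:
#     for batch in chunked(messages, 100):
#         await client.publish(topic, batch)
```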

Emulators

For testing purposes, you may want to use gcloud-aio-pubsub along with a local Pub/Sub emulator. Setting the $PUBSUB_EMULATOR_HOST environment variable to the local address of your emulator should be enough to do the trick.

For example, using the official Google Pubsub emulator:

gcloud beta emulators pubsub start --host-port=0.0.0.0:8681
export PUBSUB_EMULATOR_HOST='0.0.0.0:8681'

Any gcloud-aio-pubsub Publisher requests made with that environment variable set will query the emulator instead of the official Pub/Sub APIs.
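In a test suite it can be handy to confirm the variable is set and well-formed before any client is created. The sketch below is one possible check; the emulator_address helper is hypothetical and does not reflect how the library itself reads the variable.

```python
import os
from typing import Mapping, Optional, Tuple


def emulator_address(
    environ: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, int]]:
    # Returns (host, port) from PUBSUB_EMULATOR_HOST, or None when it is unset.
    value = environ.get('PUBSUB_EMULATOR_HOST')
    if not value:
        return None
    if ':' not in value:
        raise ValueError('expected PUBSUB_EMULATOR_HOST in host:port form')
    host, _, port = value.rpartition(':')
    return host, int(port)


print(emulator_address({'PUBSUB_EMULATOR_HOST': '0.0.0.0:8681'}))
# ('0.0.0.0', 8681)
```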

For easier ergonomics, you may be interested in messagebird/gcloud-pubsub-emulator.

Contributing

Please see our contributing guide.

Release history

6.3.0

Jul 17, 2025

6.3.0a0 pre-release

Jul 3, 2025

6.2.0

Jun 30, 2025

6.1.0

Mar 19, 2025

6.0.1

Mar 14, 2024

6.0.0

Sep 13, 2023

5.4.0

Apr 5, 2023

5.3.0

Mar 21, 2023

5.2.0

Dec 23, 2022

5.1.1

Oct 14, 2022

5.0.1

Jul 12, 2022

5.0.0

Apr 4, 2022

4.5.0

Dec 6, 2021

4.4.0

Apr 23, 2021

4.3.4

Mar 19, 2021

4.3.3

Mar 3, 2021

4.3.2

Feb 24, 2021

4.3.1

Feb 17, 2021

4.3.0

Feb 2, 2021

4.2.1

Jan 26, 2021

4.2.0

Jan 25, 2021

4.1.0

Jan 20, 2021

4.0.4

Jan 14, 2021

4.0.3

Jan 14, 2021

4.0.2

Jan 13, 2021

4.0.1

Jan 12, 2021

4.0.0

Jan 6, 2021

3.0.0

Oct 16, 2020

2.1.2

Oct 6, 2020

2.1.1

Oct 6, 2020

2.1.0

Oct 1, 2020

2.0.1

Sep 29, 2020

2.0.0

Sep 22, 2020

This version

1.2.3

Jul 13, 2020

1.2.2

Jul 10, 2020

1.2.0

Jul 10, 2020

1.1.2

Jun 29, 2020

1.1.1

Nov 26, 2019

1.1.0

Oct 18, 2019

1.0.2

Oct 15, 2019

1.0.1

Jul 17, 2019

1.0.0

Jun 21, 2019

0.5.1

Mar 22, 2018

0.5.0

Nov 14, 2017

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gcloud-aio-pubsub-1.2.3.tar.gz (15.6 kB view details)

Uploaded Jul 13, 2020 Source

Built Distribution

gcloud_aio_pubsub-1.2.3-py2.py3-none-any.whl (13.2 kB view details)

Uploaded Jul 13, 2020 Python 2, Python 3

File details

Details for the file gcloud-aio-pubsub-1.2.3.tar.gz.

File metadata

Download URL: gcloud-aio-pubsub-1.2.3.tar.gz
Upload date: Jul 13, 2020
Size: 15.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for gcloud-aio-pubsub-1.2.3.tar.gz
Algorithm	Hash digest
SHA256	`fa9d093b8190c223fc55e2a3e5ca1f181d5f056c2adbc570da564bfb22ae76cb`
MD5	`0df91b7da110337a2e61499b682593e6`
BLAKE2b-256	`29715f22597b1ef38fbde9f8f0b720472617779b3ca21ca8c899694772f6ab7d`

File details

Details for the file gcloud_aio_pubsub-1.2.3-py2.py3-none-any.whl.

File metadata

Download URL: gcloud_aio_pubsub-1.2.3-py2.py3-none-any.whl
Upload date: Jul 13, 2020
Size: 13.2 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.1 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for gcloud_aio_pubsub-1.2.3-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`c216dcb6a59558f2d7b177f4036455784ff3c2f44c78ebe27266049704578b39`
MD5	`2ba50bcec9c612decd45062cdb65a4f2`
BLAKE2b-256	`ba0a156f245cf4a3cc32915fc5943d1174261219520ba940d2922666a9743726`
