Skip to main content

Microsoft Azure Event Hubs Client Library for Python

Project description

Azure Event Hubs client library for Python

Azure Event Hubs is a highly scalable publish-subscribe service that can ingest millions of events per second and stream them to multiple consumers. This lets you process and analyze the massive amounts of data produced by your connected devices and applications. Once Event Hubs has collected the data, you can retrieve, transform, and store it by using any real-time analytics provider or with batching/storage adapters. If you would like to know more about Azure Event Hubs, you may wish to review: What is Event Hubs?

The Azure Event Hubs client library allows for publishing and consuming of Azure Event Hubs events and may be used to:

  • Emit telemetry about your application for business intelligence and diagnostic purposes.
  • Publish facts about the state of your application which interested parties may observe and use as a trigger for taking action.
  • Observe interesting operations and interactions happening within your business or other ecosystem, allowing loosely coupled systems to interact without the need to bind them together.
  • Receive events from one or more publishers, transform them to better meet the needs of your ecosystem, then publish the transformed events to a new stream for consumers to observe.

Source code | Package (PyPi) | API reference documentation | Product documentation

Getting started

Install the package

Install the Azure Event Hubs client library for Python with pip:

$ pip install --pre azure-eventhub

Prerequisites

  • Python 2.7, 3.5 or later.

  • Microsoft Azure Subscription: To use Azure services, including Azure Event Hubs, you'll need a subscription. If you do not have an existing Azure account, you may sign up for a free trial or use your MSDN subscriber benefits when you create an account.

  • Event Hubs namespace with an Event Hub: To interact with Azure Event Hubs, you'll also need to have a namespace and Event Hub available. If you are not familiar with creating Azure resources, you may wish to follow the step-by-step guide for creating an Event Hub using the Azure portal. There, you can also find detailed instructions for using the Azure CLI, Azure PowerShell, or Azure Resource Manager (ARM) templates to create an Event Hub.

Authenticate the client

Interaction with Event Hubs starts with an instance of the EventHubClient class. You need the host name, SAS/AAD credential and event hub name to instantiate the client object.

Obtain a connection string

For the Event Hubs client library to interact with an Event Hub, it will need to understand how to connect and authorize with it. The easiest means for doing so is to use a connection string, which is created automatically when creating an Event Hubs namespace. If you aren't familiar with shared access policies in Azure, you may wish to follow the step-by-step guide to get an Event Hubs connection string.

Create client

There are several ways to instantiate the EventHubClient object and the following code snippets demonstrate two ways:

Create client from connection string:

from azure.eventhub import EventHubClient

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
  • The from_connection_string method takes the connection string of the form Endpoint=sb://<yournamespace>.servicebus.windows.net/;SharedAccessKeyName=<yoursharedaccesskeyname>;SharedAccessKey=<yoursharedaccesskey> and entity name to your Event Hub instance. You can get the connection string from the Azure portal.

Create client using the azure-identity library:

from azure.eventhub import EventHubClient
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

host = '<< HOSTNAME OF THE EVENT HUB >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient(host, event_hub_path, credential)
  • This constructor takes the host name and entity name of your Event Hub instance and credential that implements the TokenCredential interface. There are implementations of the TokenCredential interface available in the azure-identity package. The host name is of the format <yournamespace.servicebus.windows.net>.

Key concepts

  • An Event Hub client is the primary interface for developers interacting with the Event Hubs client library, allowing for inspection of Event Hub metadata and providing a guided experience towards specific Event Hub operations such as the creation of producers and consumers.

  • An Event Hub producer is a source of telemetry data, diagnostics information, usage logs, or other log data, as part of an embedded device solution, a mobile device application, a game title running on a console or other device, some client or server based business solution, or a web site.

  • An Event Hub consumer picks up such information from the Event Hub and processes it. Processing may involve aggregation, complex computation, and filtering. Processing may also involve distribution or storage of the information in a raw or transformed fashion. Event Hub consumers are often robust and high-scale platform infrastructure parts with built-in analytics capabilities, like Azure Stream Analytics, Apache Spark, or Apache Storm.

  • A partition is an ordered sequence of events that is held in an Event Hub. Azure Event Hubs provides message streaming through a partitioned consumer pattern in which each consumer only reads a specific subset, or partition, of the message stream. As newer events arrive, they are added to the end of this sequence. The number of partitions is specified at the time an Event Hub is created and cannot be changed.

  • A consumer group is a view of an entire Event Hub. Consumer groups enable multiple consuming applications to each have a separate view of the event stream, and to read the stream independently at their own pace and from their own position. There can be at most 5 concurrent readers on a partition per consumer group; however it is recommended that there is only one active consumer for a given partition and consumer group pairing. Each active reader receives all of the events from its partition; if there are multiple readers on the same partition, then they will receive duplicate events.

For more concepts and deeper discussion, see: Event Hubs Features. Also, the concepts for AMQP are well documented in OASIS Advanced Messaging Queuing Protocol (AMQP) Version 1.0.

Examples

The following sections provide several code snippets covering some of the most common Event Hubs tasks, including:

Inspect an Event Hub

Get the partition ids of an Event Hub.

from azure.eventhub import EventHubClient

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
partition_ids = client.get_partition_ids()

Publish events to an Event Hub

Publish events to an Event Hub.

from azure.eventhub import EventHubClient, EventData

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
producer = client.create_producer(partition_id="0")

try:
 	event_list = []
 	for i in range(10):
 		event_list.append(EventData(b"A single event"))

 	with producer:
 	    producer.send(event_list)
except:
	raise
finally:
    pass

Consume events from an Event Hub

Consume events from an Event Hub.

import logging
from azure.eventhub import EventHubClient, EventData, EventPosition

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
consumer = client.create_consumer(consumer_group="$default", partition_id="0", event_position=EventPosition("-1"))

try:
    logger = logging.getLogger("azure.eventhub")
    with consumer:
        received = consumer.receive(max_batch_size=100, timeout=5)
        for event_data in received:
            logger.info("Message received:{}".format(event_data))
except:
    raise
finally:
    pass

Async publish events to an Event Hub

Publish events to an Event Hub asynchronously.

from azure.eventhub.aio import EventHubClient
from azure.eventhub import EventData

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
producer = client.create_producer(partition_id="0")

try:
 	event_list = []
 	for i in range(10):
 		event_list.append(EventData(b"A single event"))

	async with producer:
		await producer.send(event_list)
except:
	raise
finally:
    pass

Async consume events from an Event Hub

Consume events asynchronously from an EventHub.

import logging
from azure.eventhub.aio import EventHubClient
from azure.eventhub import EventData, EventPosition

connection_str = '<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>'
event_hub_path = '<< NAME OF THE EVENT HUB >>'
client = EventHubClient.from_connection_string(connection_str, event_hub_path)
consumer = client.create_consumer(consumer_group="$default", partition_id="0", event_position=EventPosition("-1"))

try:
    logger = logging.getLogger("azure.eventhub")
    async with consumer:
        received = await consumer.receive(max_batch_size=100, timeout=5)
        for event_data in received:
            logger.info("Message received:{}".format(event_data))
except:
    raise
finally:
    pass

Troubleshooting

General

The Event Hubs APIs generate the following exceptions.

  • AuthenticationError: Failed to authenticate because of wrong address, SAS policy/key pair, SAS token or azure identity.
  • ConnectError: Failed to connect to the EventHubs. The AuthenticationError is a type of ConnectError.
  • ConnectionLostError: Lose connection after a connection has been built.
  • EventDataError: The EventData to be sent fails data validation. For instance, this error is raised if you try to send an EventData that is already sent.
  • EventDataSendError: The Eventhubs service responds with an error when an EventData is sent.
  • EventHubError: All other Eventhubs related errors. It is also the root error class of all the above mentioned errors.

Next steps

Examples

These are the samples in our repo demonstraing the usage of the library.

Documentation

Reference documentation is available at https://azure.github.io/azure-sdk-for-python.

Logging

  • Enable azure.eventhub logger to collect traces from the library.
  • Enable uamqp logger to collect traces from the underlying uAMQP library.
  • Enable AMQP frame level trace by setting network_tracing=True when creating the client.

Provide Feedback

If you encounter any bugs or have suggestions, please file an issue in the Issues section of the project.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Release History

5.0.0b1 (2019-06-25)

Version 5.0.0b1 is a preview of our efforts to create a client library that is user friendly and idiomatic to the Python ecosystem. The reasons for most of the changes in this update can be found in the Azure SDK Design Guidelines for Python. For more information, please visit https://aka.ms/azure-sdk-preview1-python.

New features

  • Added new configuration parameters for creating EventHubClient.
    • credential: The credential object used for authentication which implements TokenCredential interface of getting tokens.
    • transport_type: The type of transport protocol that will be used for communicating with the Event Hubs service.
    • max_retries: The max number of attempts to redo the failed operation when an error happened.
    • for detailed information about the configuration parameters, please read the reference documentation.
  • Added new methods get_partition_properties and get_partition_ids to EventHubClient.
  • Added support for http proxy.
  • Added support for authentication using azure-identity credential.
  • Added support for transport using AMQP over WebSocket.

Breaking changes

  • New error hierarchy
    • azure.error.EventHubError
    • azure.error.ConnectionLostError
    • azure.error.ConnectError
    • azure.error.AuthenticationError
    • azure.error.EventDataError
    • azure.error.EventDataSendError
  • Renamed Sender/Receiver to EventHubProducer/EventHubConsumer.
    • Renamed add_sender to create_producer and add_receiver to create_consumer in EventHubClient.
    • EventHubConsumer is now iterable.
  • Rename class azure.eventhub.Offset to azure.eventhub.EventPosition.
  • Rename method get_eventhub_info to get_properties of EventHubClient.
  • Reorganized connection management, EventHubClient is no longer responsible for opening/closing EventHubProducer/EventHubConsumer.
    • Each EventHubProducer/EventHubConsumer is responsible for its own connection management.
    • Added support for context manager on EventHubProducer and EventHubConsumer.
  • Reorganized async APIs into "azure.eventhub.aio" namespace and rename to drop the "_async" suffix.
  • Updated uAMQP dependency to 1.2.

1.3.1 (2019-02-28)

BugFixes

  • Fixed bug where datetime offset filter was using a local timestamp rather than UTC.
  • Fixed stackoverflow error in continuous connection reconnect attempts.

1.3.0 (2019-01-29)

BugFixes

  • Added support for auto reconnect on token expiration and other auth errors (issue #89).

Features

  • Added ability to create ServiceBusClient from an existing SAS auth token, including providing a function to auto-renew that token on expiry.
  • Added support for storing a custom EPH context value in checkpoint (PR #84, thanks @konstantinmiller)

1.2.0 (2018-11-29)

  • Support for Python 2.7 in azure.eventhub module (azure.eventprocessorhost will not support Python 2.7).
  • Parse EventData.enqueued_time as a UTC timestamp (issue #72, thanks @vjrantal)

1.1.1 (2018-10-03)

  • Fixed bug in Azure namespace package.

1.1.0 (2018-09-21)

  • Changes to AzureStorageCheckpointLeaseManager parameters to support other connection options (issue #61):

    • The storage_account_name, storage_account_key and lease_container_name arguments are now optional keyword arguments.
    • Added a sas_token argument that must be specified with storage_account_name in place of storage_account_key.
    • Added an endpoint_suffix argument to support storage endpoints in National Clouds.
    • Added a connection_string argument that, if specified, overrides all other endpoint arguments.
    • The lease_container_name argument now defaults to "eph-leases" if not specified.
  • Fix for clients failing to start if run called multipled times (issue #64).

  • Added convenience methods body_as_str and body_as_json to EventData object for easier processing of message data.

1.0.0 (2018-08-22)

  • API stable.
  • Renamed internal _async module to async_ops for docs generation.
  • Added optional auth_timeout parameter to EventHubClient and EventHubClientAsync to configure how long to allow for token negotiation to complete. Default is 60 seconds.
  • Added optional send_timeout parameter to EventHubClient.add_sender and EventHubClientAsync.add_async_sender to determine the timeout for Events to be successfully sent. Default value is 60 seconds.
  • Reformatted logging for performance.

0.2.0 (2018-08-06)

  • Stability improvements for EPH.

  • Updated uAMQP version.

  • Added new configuration options for Sender and Receiver; keep_alive and auto_reconnect. These flags have been added to the following:

    • EventHubClient.add_receiver
    • EventHubClient.add_sender
    • EventHubClientAsync.add_async_receiver
    • EventHubClientAsync.add_async_sender
    • EPHOptions.keey_alive_interval
    • EPHOptions.auto_reconnect_on_error

0.2.0rc2 (2018-07-29)

  • Breaking change EventData.offset will now return an object of type ~uamqp.common.Offset rather than str. The original string value can be retrieved from ~uamqp.common.Offset.value.
  • Each sender/receiver will now run in its own independent connection.
  • Updated uAMQP dependency to 0.2.0
  • Fixed issue with IoTHub clients not being able to retrieve partition information.
  • Added support for HTTP proxy settings to both EventHubClient and EPH.
  • Added error handling policy to automatically reconnect on retryable error.
  • Added keep-alive thread for maintaining an unused connection.

0.2.0rc1 (2018-07-06)

  • Breaking change Restructured library to support Python 3.7. Submodule async has been renamed and all classes from this module can now be imported from azure.eventhub directly.
  • Breaking change Removed optional callback argument from Receiver.receive and AsyncReceiver.receive.
  • Breaking change EventData.properties has been renamed to EventData.application_properties. This removes the potential for messages to be processed via callback for not yet returned in the batch.
  • Updated uAMQP dependency to v0.1.0
  • Added support for constructing IoTHub connections.
  • Fixed memory leak in receive operations.
  • Dropped Python 2.7 wheel support.

0.2.0b2 (2018-05-29)

  • Added namespace_suffix to EventHubConfig() to support national clouds.
  • Added device_id attribute to EventData to support IoT Hub use cases.
  • Added message header to workaround service bug for PartitionKey support.
  • Updated uAMQP dependency to vRC1.

0.2.0b1 (2018-04-20)

  • Updated uAMQP to latest version.
  • Further testing and minor bug fixes.

0.2.0a2 (2018-04-02)

  • Updated uAQMP dependency.

0.2.0a1 (unreleased)

  • Swapped out Proton dependency for uAMQP.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

azure-eventhub-5.0.0b1.zip (203.8 kB view hashes)

Uploaded Source

Built Distribution

azure_eventhub-5.0.0b1-py2.py3-none-any.whl (187.7 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page