Prometheus instrumentation for Spark Streaming metrics.

Spark Streaming Metrics Prometheus Instrumentation

This project integrates PySpark with Prometheus so that you can monitor Spark Structured Streaming applications.

Note: this project focuses specifically on better metrics for Spark Structured Streaming. If you need other Spark metrics in Prometheus, such as executor memory, CPU, or GC times, refer to Spark's monitoring guide and its Prometheus support via JMX (Java Management Extensions).

Features

  • Collects metrics from PySpark Structured Streaming queries
  • Exposes metrics in Prometheus format
  • Easy integration with existing PySpark applications
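
To make "metrics in Prometheus format" concrete, here is a minimal sketch of how one streaming progress report could be rendered in the Prometheus text exposition format. It is not this package's implementation: the function name `render_progress` and the metric names are hypothetical, chosen only for illustration. The input keys (`name`, `id`, `numInputRows`, `inputRowsPerSecond`, `processedRowsPerSecond`) follow the progress dictionary that Spark reports for a streaming query.

```python
def render_progress(progress: dict) -> str:
    """Render a streaming-progress dict as Prometheus text exposition lines."""
    # Hypothetical metric names (assumption); the package may use different ones.
    fields = {
        "numInputRows": "spark_streaming_input_rows",
        "inputRowsPerSecond": "spark_streaming_input_rows_per_second",
        "processedRowsPerSecond": "spark_streaming_processed_rows_per_second",
    }
    # Label the sample with the query name, falling back to the query id.
    query = str(progress.get("name") or progress.get("id", "unknown"))
    # Escape backslash, double quote, and newline, as the text format requires.
    label = query.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    lines = []
    for field, metric in fields.items():
        if field not in progress:
            continue  # skip values that this progress report does not carry
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f'{metric}{{query="{label}"}} {float(progress[field])}')
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = {"name": "events", "numInputRows": 100, "inputRowsPerSecond": 50.0}
    print(render_progress(sample), end="")
```

Every value is exposed as a gauge here because each progress report describes a single micro-batch rather than a running total.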

Installation

To install the required dependencies, run:

pip install -r requirements.txt

Usage

  1. Import the necessary modules in your PySpark application:

    from pyspark_prometheus import with_prometheus_metrics
    
  2. Initialize the Prometheus metrics:

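    from pyspark.sql import SparkSession

    # Create (or reuse) a session, then wrap it so streaming metrics are reported to Prometheus.
    spark = SparkSession.builder.master("local").appName("MySparkApp").getOrCreate()
    spark = with_prometheus_metrics(spark, 'http://localhost:9091')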
    
  3. Start your PySpark job as usual. Metrics will be collected and exposed automatically.
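
For intuition about what "collected automatically" can mean, the sketch below shows one way to read the latest progress from a set of streaming queries. It is an illustration under stated assumptions, not how `with_prometheus_metrics` works internally: `snapshot_streaming_metrics` is a hypothetical helper, and the demo uses stand-in objects in place of real queries so that it runs without Spark. It relies only on the `name`, `id`, and `lastProgress` attributes that PySpark's `StreamingQuery` exposes.

```python
from types import SimpleNamespace


def snapshot_streaming_metrics(queries):
    """Return {query name or id: selected progress values} for the given queries."""
    snapshot = {}
    for query in queries:
        progress = query.lastProgress
        if progress is None:
            continue  # no micro-batch has completed yet for this query
        key = query.name or query.id  # unnamed queries fall back to their id
        snapshot[key] = {
            "batchId": progress.get("batchId"),
            "numInputRows": progress.get("numInputRows"),
            "processedRowsPerSecond": progress.get("processedRowsPerSecond"),
        }
    return snapshot


if __name__ == "__main__":
    # Stand-ins for StreamingQuery objects (assumption: demo data only).
    started = SimpleNamespace(
        name="events",
        id="q-1",
        lastProgress={"batchId": 3, "numInputRows": 120, "processedRowsPerSecond": 60.0},
    )
    pending = SimpleNamespace(name=None, id="q-2", lastProgress=None)
    print(snapshot_streaming_metrics([started, pending]))
```

In a real application, the same function could be called as `snapshot_streaming_metrics(spark.streams.active)`, since `spark.streams.active` lists the currently running streaming queries.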

Contributing

Contributions are welcome! Please submit a pull request or open an issue to discuss your ideas.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or support, please open an issue in the repository.
