Spark Streaming Metrics Prometheus Instrumentation

This project provides a seamless integration between PySpark and Prometheus for monitoring Spark Structured Streaming applications.

Note: this project focuses specifically on better metrics for Spark Structured Streaming. For other Spark metrics in Prometheus, such as executor memory, CPU, and GC times, refer to Spark's monitoring guide and its support for Prometheus using JMX (Java Management Extensions).

Features

  • Collects metrics from PySpark Streaming Queries
  • Exposes metrics in Prometheus format
  • Easy integration with existing PySpark applications
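
To make "Prometheus format" concrete, here is a minimal sketch of how one streaming progress report could be rendered in the Prometheus text exposition format. It is not this package's implementation. The field names (`numInputRows`, `inputRowsPerSecond`, `processedRowsPerSecond`, `name`) are the ones Spark reports in a query's progress, while the metric names, the `query` label, and the helper functions are assumptions made up for illustration.

```python
from typing import Any, Dict, List

# Hypothetical mapping from Spark progress fields to metric names.
# These metric names are illustrative; the package may use different ones.
METRICS = {
    "numInputRows": "spark_streaming_input_rows",
    "inputRowsPerSecond": "spark_streaming_input_rows_per_second",
    "processedRowsPerSecond": "spark_streaming_processed_rows_per_second",
}


def escape_label(value: str) -> str:
    """Escape a label value as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_progress(progress: Dict[str, Any]) -> str:
    """Render one progress dict as Prometheus text-format gauge samples."""
    query_name = escape_label(progress.get("name") or "unnamed")
    lines: List[str] = []
    for field, metric in METRICS.items():
        if field not in progress:
            continue  # skip fields this progress report does not carry
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f'{metric}{{query="{query_name}"}} {float(progress[field])}')
    return "".join(line + "\n" for line in lines)


# A hand-written sample shaped like a Structured Streaming progress report.
sample = {
    "name": "events",
    "batchId": 7,
    "numInputRows": 120,
    "inputRowsPerSecond": 60.0,
    "processedRowsPerSecond": 80.5,
}
print(render_progress(sample), end="")
```

Gauges are used here because each progress report describes a single micro-batch rather than a running total.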

Installation

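To install the package from PyPI, run:

pip install pyspark-prometheus

When working from a source checkout, install the required dependencies instead:

pip install -r requirements.txt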

Usage

  1. Import the necessary modules in your PySpark application:

    from pyspark.sql import SparkSession
    from pyspark_prometheus import with_prometheus_metrics
    
  2. Create a SparkSession and enable Prometheus metrics on it:

    spark = SparkSession.builder.master("local").appName("MySparkApp").getOrCreate()
    spark = with_prometheus_metrics(spark, 'http://localhost:9091')
    
  3. Start your PySpark job as usual. Metrics will be collected and exposed automatically.
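
The URL in step 2 uses port 9091, which is the default port of the Prometheus Pushgateway. Assuming the metrics are pushed to a Pushgateway (an assumption, since the description above does not say so), the sketch below shows what such a push looks like at the HTTP level. The `build_push_request` helper, the job name, and the sample metric are hypothetical. The request is only built here, not sent.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url: str, job: str, body: str) -> urllib.request.Request:
    """Build (but do not send) a Pushgateway request for the given job.

    PUT replaces all metrics previously pushed under the same job grouping.
    This sketch assumes the job name contains no "/" characters.
    """
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4; charset=utf-8"},
    )


# Hypothetical metric in the Prometheus text exposition format.
body = "# TYPE spark_streaming_input_rows gauge\nspark_streaming_input_rows 120.0\n"
request = build_push_request("http://localhost:9091", "MySparkApp", body)
print(request.get_method(), request.full_url)

# To actually send it, a Pushgateway must be reachable at that address:
# urllib.request.urlopen(request, timeout=5)
```

Prometheus then scrapes the Pushgateway like any other target, which suits jobs that have no stable address to scrape directly.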

Contributing

Contributions are welcome! Please submit a pull request or open an issue to discuss your ideas.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or support, please open an issue in the repository.


Download files

Source Distribution

pyspark_prometheus-0.1.2.tar.gz (4.1 kB, Source)

Built Distribution

pyspark_prometheus-0.1.2-py3-none-any.whl (4.7 kB, Python 3)

File details

Details for the file pyspark_prometheus-0.1.2.tar.gz.

File metadata

  • Download URL: pyspark_prometheus-0.1.2.tar.gz
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.12.5 Darwin/23.2.0

File hashes

Hashes for pyspark_prometheus-0.1.2.tar.gz

  • SHA256: 47a9fa1cc604bada70cb3bf9748db457c91870a830b6aabc5ccce481db7ee1b2
  • MD5: 030c810c022f56912adf35d2c5daae50
  • BLAKE2b-256: cc6153f1e080c29c7de7851bb47d0260b5c27509bf3041185e4d59ca60700252

File details

Details for the file pyspark_prometheus-0.1.2-py3-none-any.whl.

File hashes

Hashes for pyspark_prometheus-0.1.2-py3-none-any.whl

  • SHA256: 0a8414e19f4df8e5147b0c89c7b0fef40e76a7d0e646599fa754ddcbafbf84b4
  • MD5: 280bc4950764b64ce5e1fe28a39e0c25
  • BLAKE2b-256: 1f6161e034d45fa4e4f547c1b4e07185fdd4519fdb580006f9c6bb805a833f1c
