6 projects
sparkmeasure
Python API for sparkMeasure, a tool for performance troubleshooting of Apache Spark workloads.
pyspark-root-datasource
Python DataSource for Apache Spark 4 that reads ROOT files (used in High Energy Physics, HEP) as DataFrames, powered by uproot, awkward, and PyArrow.
PyLatencyMap
Visualize latency distributions over time in your terminal with frequency and intensity heat maps.
TPCDS-PySpark
TPCDS_PySpark is a TPC-DS workload generator implemented in Python and designed to run at scale using Apache Spark.
sparkhistogram
Sparkhistogram contains helper functions for generating data histograms with the Spark DataFrame API.
Test-CPU-parallel
test-CPU-parallel is a basic CPU workload generator.