spark-monitoring
A Python library to interact with the Spark History Server.
Quickstart
Basic
$ pip install spark-monitoring
import sparkmonitoring as sparkmon
monitoring = sparkmon.client('my.history.server')
print(monitoring.list_applications())
Pandas
$ pip install 'spark-monitoring[pandas]'
import sparkmonitoring as sparkmon
import matplotlib.pyplot as plt
monitoring = sparkmon.df('my.history.server')
apps = monitoring.list_applications()
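# Derive a "function" label from the text before the first "(" in each name
apps['function'] = apps['name'].str.split('(').str.get(0)
print(apps.head().stack())
# hist(by=...) creates its own figure, so no plt.figure() call is needed
apps['duration'].hist(by=apps['function'], figsize=(40, 20))
plt.show()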
jobs = monitoring.list_jobs(apps.iloc[0].id)
print(jobs.head().stack())
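The `function` column above is the part of each application name before the first `(`, and the histogram groups durations by that label. The same grouping can be sketched in plain Python. This assumes application names follow a `function(args)` pattern; `durations_by_function` is a hypothetical helper and the records are made up for illustration.

```python
from collections import defaultdict


def durations_by_function(apps):
    """Group durations by the part of the application name before the first '('."""
    grouped = defaultdict(list)
    for app in apps:
        function = app["name"].split("(")[0]
        grouped[function].append(app["duration"])
    return dict(grouped)


# Made-up records for illustration only.
apps = [
    {"name": "ingest(2020-01-01)", "duration": 900},
    {"name": "ingest(2020-01-02)", "duration": 1100},
    {"name": "report(weekly)", "duration": 300},
]
print(durations_by_function(apps))
# {'ingest': [900, 1100], 'report': [300]}
```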
Reference
sparkmonitoring.client
Returns a client for making calls to the Spark History Server.
Arguments
Name | Type | Description | Default |
---|---|---|---|
server | string | Hostname or IP pointing to the spark history server | |
port | int | Port which the spark history server is exposed on | 18080 |
is_https | bool | Whether or not to use https to communicate with the spark server | False |
api_version | int | API Version to interact with. Currently only 1 is supported | 1 |
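These four arguments presumably combine into the base URL of the History Server REST API, which Spark serves under `/api/v1`. A minimal sketch of that mapping follows, assuming the client simply assembles scheme, host, port, and version. `build_base_url` is a hypothetical helper for illustration and is not part of spark-monitoring.

```python
def build_base_url(server, port=18080, is_https=False, api_version=1):
    """Hypothetical helper: assemble the REST base URL from the client arguments."""
    if api_version != 1:
        # The argument table states that only version 1 is supported.
        raise ValueError("only api_version=1 is supported")
    scheme = "https" if is_https else "http"
    return f"{scheme}://{server}:{port}/api/v{api_version}"


print(build_base_url("10.0.0.10"))
# http://10.0.0.10:18080/api/v1
print(build_base_url("my-server", port=8080, is_https=True))
# https://my-server:8080/api/v1
```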
Response
Examples
Basic Endpoint
import sparkmonitoring as sparkmon
client = sparkmon.client('10.0.0.10')
Custom Endpoint
import sparkmonitoring as sparkmon
client = sparkmon.client('my-server', port=8080, is_https=True)
sparkmonitoring.df
Returns a client for making calls to the Spark History Server. This
client returns pandas DataFrames, as opposed to the dictionaries returned by the
standard client. It can be used when the spark-monitoring[pandas] extra is
installed.
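To illustrate the difference between the two return shapes: Spark's `/applications` endpoint reports each application with a nested list of attempts, so a tabular client has to flatten that nesting in some way. The sketch below assumes one row per attempt. `flatten_applications` is a hypothetical helper and the sample record is made up; how spark-monitoring itself flattens responses is not documented here.

```python
def flatten_applications(apps):
    """Hypothetical helper: one flat row (dict) per application attempt."""
    rows = []
    for app in apps:
        # Copy the top-level fields, leaving the nested list out.
        base = {k: v for k, v in app.items() if k != "attempts"}
        for attempt in app.get("attempts", []):
            rows.append({**base, **attempt})
    return rows


# Illustrative record shaped like a History Server response; values are made up.
sample = [
    {
        "id": "app-0001",
        "name": "etl(daily)",
        "attempts": [{"duration": 1200, "completed": True}],
    }
]

rows = flatten_applications(sample)
print(rows)
# [{'id': 'app-0001', 'name': 'etl(daily)', 'duration': 1200, 'completed': True}]
```

A list of flat dictionaries like `rows` can be passed directly to `pandas.DataFrame(rows)` to obtain one column per key.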
Arguments
Name | Type | Description | Default |
---|---|---|---|
server | string | Hostname or IP pointing to the spark history server | |
port | int | Port which the spark history server is exposed on | 18080 |
is_https | bool | Whether or not to use https to communicate with the spark server | False |
api_version | int | API Version to interact with. Currently only 1 is supported | 1 |
Response
Examples
Basic Endpoint
import sparkmonitoring as sparkmon
client = sparkmon.df('10.0.0.10')
Custom Endpoint
import sparkmonitoring as sparkmon
client = sparkmon.df('my-server', port=8080, is_https=True)
sparkmonitoring.api.ClientV1
A client to interact with the Spark History Server.
Generally this class is not instantiated directly, and is accessed via
sparkmonitoring.client(...).
Arguments
Name | Type | Description | Default |
---|---|---|---|
server | string | Hostname or IP pointing to the spark history server | |
port | int | Port which the spark history server is exposed on | |
is_https | bool | Whether or not to use https to communicate with the spark server | |
api_version | int | API Version to interact with. Currently only 1 is supported | |
Methods
list_applications(...)
get_application(...)
list_jobs(...)
get_job(...)
list_stages(...)
list_stage_attempts(...)
get_stage_attempt(...)
get_stage_attempt_summary(...)
get_stage_attempt_tasks(...)
list_active_executors(...)
list_executor_threads(...)
list_all_executors(...)
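The method names above closely mirror the endpoints of Spark's monitoring REST API. The sketch below pairs each method with the endpoint it most likely wraps. The paths are Spark's own, but the pairing is an assumption inferred from the names, and `ENDPOINTS` and `endpoint_path` are hypothetical names that are not part of spark-monitoring.

```python
# Paths are relative to the API root (for example http://host:18080/api/v1).
# The paths come from Spark's REST API; pairing them with these method
# names is an assumption based on the names alone.
_APP = "/applications/{app_id}"
_ATTEMPT = _APP + "/stages/{stage_id}/{attempt_id}"

ENDPOINTS = {
    "list_applications": "/applications",
    "get_application": _APP,
    "list_jobs": _APP + "/jobs",
    "get_job": _APP + "/jobs/{job_id}",
    "list_stages": _APP + "/stages",
    "list_stage_attempts": _APP + "/stages/{stage_id}",
    "get_stage_attempt": _ATTEMPT,
    "get_stage_attempt_summary": _ATTEMPT + "/taskSummary",
    "get_stage_attempt_tasks": _ATTEMPT + "/taskList",
    "list_active_executors": _APP + "/executors",
    "list_executor_threads": _APP + "/executors/{executor_id}/threads",
    "list_all_executors": _APP + "/allexecutors",
}


def endpoint_path(method, **ids):
    """Hypothetical helper: fill the identifiers into the path template."""
    return ENDPOINTS[method].format(**ids)


# The application and job identifiers below are placeholders.
print(endpoint_path("list_jobs", app_id="app-0001"))
# /applications/app-0001/jobs
print(endpoint_path("get_job", app_id="app-0001", job_id=3))
# /applications/app-0001/jobs/3
```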
sparkmonitoring.dataframes.PandasClient
A client to interact with the Spark History Server, returning pandas
DataFrames.
Generally this class is not instantiated directly, and is accessed via
sparkmonitoring.df(...).
Arguments
Name | Type | Description | Default |
---|---|---|---|
server | string | Hostname or IP pointing to the spark history server | |
port | int | Port which the spark history server is exposed on | 18080 |
is_https | bool | Whether or not to use https to communicate with the spark server | False |
api_version | int | API Version to interact with. Currently only 1 is supported | 1 |
Methods
list_applications(...)
get_application(...)
list_jobs(...)
get_job(...)
list_stages(...)