Query a Prometheus server and get a Pandas DataFrame
PromQL HTTP API
This python package provides a Prometheus HTTP API client library. It encapsulates and simplifies the collection of data from a Prometheus server. One major feature of this library is that responses to queries are returned as Pandas DataFrames.
Prometheus is an open-source system monitoring and alerting toolkit. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true. The Prometheus server exposes an HTTP API for querying the collected data, and a query language called PromQL.
This library is intended to help data scientists who want to harvest data from a Prometheus server for analysis and visualization. It is designed to be simple to use and to provide a convenient interface to the Prometheus HTTP API. It is also designed to be performant and scalable: it uses the requests library and caches HTTP connections to the Prometheus server between API accesses.
For unstable connections, the library supports retrying failed requests. The user may specify the number of retries, the time-out between retries, and the back-off factor for the retry interval.
Version 0.2.0 improves timezone awareness. Starting in this version, the HTTP API query is issued with a timestamp value in the 'time', 'start', and 'end' fields. In addition, the query schema now supports a 'timezone' element (provided as a timezone object), which controls the timestamp column formatting in the returned dataframe.
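To illustrate why a timezone-aware value matters here, the sketch below uses only the Python standard library to show that two datetime objects describing the same instant in different zones map to the same Unix timestamp. The fixed UTC-5 offset and the chosen date are illustrative assumptions; this is not the library's internal conversion code.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset used only for illustration (UTC-5, no DST rules).
utc_minus_5 = timezone(timedelta(hours=-5))

# The same instant expressed in two different zones.
t_utc = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
t_local = datetime(2023, 12, 31, 19, 0, tzinfo=utc_minus_5)

# Both map to the same number of seconds since the epoch.
ts_utc = t_utc.timestamp()
ts_local = t_local.timestamp()
print(ts_utc, ts_local)
```

A naive datetime (one without tzinfo) carries no such guarantee, because its timestamp depends on the local zone of the machine that evaluates it.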
Installation
To install as a root user:
python3 -m pip install promql-http-api
To install as a non-root user:
python3 -m pip install --user promql-http-api
To uninstall:
python3 -m pip uninstall promql-http-api
Usage Examples
Here is a basic usage example:
from datetime import datetime
import pytz
from promql_http_api import PromqlHttpApi

# Connect to the Prometheus server
api = PromqlHttpApi('http://localhost:9090')
# Use the current time as a timezone-aware datetime in UTC
tz = pytz.timezone('UTC')
q_time = datetime.now(tz)
q = api.query('up', q_time)
# Run the query and convert the result to a DataFrame
df = q.to_dataframe()
print(df)
After the imports, we first create a PromqlHttpApi object named api. This example assumes that a Prometheus server is running on the local host and listening on port 9090. Replace this URL as needed with the appropriate URL for your server.
Next, we use the api object to create a Query object named q. The query() function takes two parameters: a query string and a date-time value (here, a timezone-aware datetime object).
To execute the query explicitly, without converting the result to a DataFrame, you can use:
# Execute the query explicitly
promql_response_data = q()
# Convert the cached result to a DataFrame
df = q.to_dataframe()
Alternatively, calling the to_dataframe() method alone executes the query implicitly.
# Execute the query implicitly
df = q.to_dataframe()
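The difference between explicit and implicit execution can be pictured with a small stand-in class. This is only a sketch of the general lazy-evaluation pattern: LazyQuery, to_rows, and fake_fetch are hypothetical names, no server is contacted, and the sketch assumes that a cached result is reused rather than fetched again, which may not match the library's behavior.

```python
class LazyQuery:
    """Hypothetical stand-in for a query object that caches its result."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs the actual request
        self._cached = None      # result of the most recent execution

    def __call__(self):
        # Explicit execution: always run the request and cache the result.
        self._cached = self._fetch()
        return self._cached

    def to_rows(self):
        # Implicit execution: run the request only if nothing is cached yet.
        if self._cached is None:
            self()
        return list(self._cached)


calls = []

def fake_fetch():
    # Record each simulated request instead of contacting a server.
    calls.append(1)
    return [("up", 1)]

q = LazyQuery(fake_fetch)
rows = q.to_rows()        # triggers one simulated request
rows_again = q.to_rows()  # served from the cache
print(len(calls))
```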
Retries and a time-out can be specified only with explicit execution:
# Execute the query explicitly
# with 5 retries and retry intervals of 5, 10, 20, and 40 seconds
promql_response_data = q(retries=5, timeout=5, backoff=2)
# Convert the cached result to a DataFrame
df = q.to_dataframe()
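The retry intervals quoted in the comment above (5, 10, 20, and 40 seconds) are consistent with a geometric back-off. The helper below is a sketch of that arithmetic only; retry_intervals is a hypothetical function, and the formula is inferred from the comment rather than taken from the library's code.

```python
def retry_intervals(retries, timeout, backoff):
    """Return the waits, in seconds, between consecutive attempts."""
    # With `retries` attempts there are `retries - 1` waits between them.
    return [timeout * backoff ** i for i in range(retries - 1)]

intervals = retry_intervals(retries=5, timeout=5, backoff=2)
print(intervals)
```

Under this assumption the total waiting time before giving up is the sum of the intervals, which helps when choosing values for a flaky connection.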
HTTP Authentication (and other headers)
The PromqlHttpApi object takes an optional headers parameter. This parameter is a dictionary of HTTP headers to be included in the request, which is useful for including authentication information. Here is an example of how to use the headers parameter:
api = PromqlHttpApi('http://localhost:9090', headers={'Authorization': 'token 0123456789ABCDEF'})
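Servers that sit behind HTTP Basic authentication expect a different Authorization value than the token form shown above. The sketch below builds such a headers dictionary with the standard library; basic_auth_headers is a hypothetical helper and the credentials are placeholders.

```python
import base64

def basic_auth_headers(username, password):
    """Build a headers dictionary carrying HTTP Basic credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}

headers = basic_auth_headers("user", "secret")
print(headers)
# The dictionary could then be passed as: PromqlHttpApi(url, headers=headers)
```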
Working with schemas
The to_dataframe() method takes an optional schema parameter. The schema is a dictionary that controls several elements of the query. A schema may include the following element keys: columns, dtype, and timezone.
The columns element controls which PromQL response elements are included as columns in the returned DataFrame. The returned DataFrame will always include a timestamp column (in seconds since the epoch) and a value column. If columns is not provided, all the fields returned in a PromQL response will be included in the returned DataFrame. If columns is provided, only the fields listed in columns will be included in the returned DataFrame.
The dtype element controls the data type of the value column in the returned DataFrame. The default is str. The dtype element may be set to any valid Pandas data type.
The timezone element allows the user to request an additional datetime column which is formatted in the specified timezone. The timezone element must be a timezone object from the pytz library. If the timezone element is not provided, the returned DataFrame will not include a datetime column.
Here is an example of how to use a schema:
schema = {
    'columns': ['node', 'sensor'],
    'dtype': float,
    'timezone': pytz.timezone('US/Eastern')
}
df = q.to_dataframe(schema)
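To make the effect of the three schema elements concrete, the sketch below applies the same ideas to a hand-written sample in the shape Prometheus returns for an instant query. It is an illustration in plain Python, not the library's implementation: flatten is a hypothetical function, the temp metric and its label values are invented, and UTC stands in for a pytz timezone so that the example needs only the standard library.

```python
from datetime import datetime, timezone

# Sample data in the shape Prometheus returns for an instant query.
sample = {
    "resultType": "vector",
    "result": [
        {"metric": {"__name__": "temp", "node": "n1", "sensor": "cpu", "job": "ipmi"},
         "value": [1704067200, "41.5"]},
        {"metric": {"__name__": "temp", "node": "n2", "sensor": "cpu", "job": "ipmi"},
         "value": [1704067200, "39.0"]},
    ],
}

def flatten(data, columns=None, dtype=str, tz=None):
    """Turn an instant-vector result into one flat dictionary per series."""
    rows = []
    for series in data["result"]:
        timestamp, value = series["value"]
        labels = series["metric"]
        # Keep every label unless an explicit column list is given.
        keep = labels if columns is None else columns
        row = {"timestamp": timestamp}
        if tz is not None:
            # Optional extra column holding the timestamp as a datetime.
            row["datetime"] = datetime.fromtimestamp(timestamp, tz)
        row.update({name: labels.get(name) for name in keep})
        row["value"] = dtype(value)
        rows.append(row)
    return rows

rows = flatten(sample, columns=["node", "sensor"], dtype=float, tz=timezone.utc)
for row in rows:
    print(row)
```

A list of dictionaries like rows can be passed directly to pandas.DataFrame, which is one way to picture how the selected labels end up as columns.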
Debugging
If something goes wrong, you can look at the HTTP response and the PromQL response information. Here are some examples:
from datetime import datetime
import pytz
from promql_http_api import PromqlHttpApi

api = PromqlHttpApi('http://localhost:9090')
tz = pytz.timezone('UTC')
q_time = datetime.now(tz)
q = api.query('up', q_time)
# Execute the query explicitly before inspecting the response
q()
promql_response = q.response
http_response = promql_response.response
print(f'HTTP response status code = {http_response.status_code}')
print(f'HTTP response encoding = {http_response.encoding}')
print(f'PromQL response status = {promql_response.status()}')
print(f'PromQL response data = {promql_response.data()}')
print(f'PromQL response error type = {promql_response.error_type()}')
print(f'PromQL response error = {promql_response.error()}')
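The status, data, error type, and error values printed above correspond to fields of the JSON envelope that the Prometheus HTTP API wraps around every response. The sketch below parses such an envelope with the standard library; summarize is a hypothetical helper and the error body is written by hand for illustration.

```python
import json

def summarize(body):
    """Extract the status and error fields from a Prometheus API response body."""
    payload = json.loads(body)
    return {
        "status": payload.get("status"),
        "error_type": payload.get("errorType"),
        "error": payload.get("error"),
        "has_data": "data" in payload,
    }

# Example error body, written by hand for illustration.
error_body = '{"status": "error", "errorType": "bad_data", "error": "parse error"}'
summary = summarize(error_body)
print(summary)
```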
List of Supported APIs
API | Method |
---|---|
/api/v1/query | query(query, time) |
/api/v1/query_range | query_range(query, start, end, step) |
/api/v1/format_query | format_query(query) |
/api/v1/series | series(match) |
/api/v1/labels | labels() |
/api/v1/label/<label_name>/values | label_values(label) |
/api/v1/targets | targets(state) |
/api/v1/rules | rules(type) |
/api/v1/alerts | alerts() |
/api/v1/alertmanagers | alertmanagers() |
/api/v1/status/config | config() |
/api/v1/status/flags | flags() |
/api/v1/status/runtimeinfo | runtimeinfo() |
/api/v1/status/buildinfo | buildinfo() |
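Each method in the table maps onto one HTTP endpoint. As a rough picture of what a query_range request looks like on the wire, the sketch below composes the URL with the standard library. query_range_url is a hypothetical helper, the timestamps and step are example values, and the library may build its requests differently.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def query_range_url(base_url, query, start, end, step):
    """Compose a /api/v1/query_range URL with its four query parameters."""
    params = {"query": query, "start": start, "end": end, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"

url = query_range_url("http://localhost:9090", "up", 1704067200, 1704070800, "60s")
print(url)

# Parse the URL back to confirm that the parameters round-trip.
parsed = parse_qs(urlsplit(url).query)
print(parsed["step"])
```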
Testing
The package contains a limited set of unit tests. Run the tests from the package's top-level folder using:
pytest
Future work
Implement a CI/CD pipeline with a Prometheus instance in a Docker container to test API accesses.
If you use this library and would like to help, please contact the author.