SQL query and monitoring system metrics exporter for Prometheus
Project description
metREx
SQL query and monitoring system metrics exporter for Prometheus. A product of Reliability Engineering at The Home Depot.
Supported database engines include:
- Google BigQuery
- IBM DB2
- IBM Informix (experimental)
- Microsoft SQL Server
- MySQL
- Oracle
- PostgreSQL
- SQLite
Prometheus metrics can also be generated from the following monitoring systems:
Table of Contents
- Installation
- Local Environment Setup
- Configuration: Services
- Configuration: Jobs
- Pushing Metrics
- Running the Application
- Running in Docker
- Swagger
- Testing
- License
Installation
$ pip install metREx
Database Engines
The SQLAlchemy package used by this application enables connectivity to a variety of database engines. However, additional packages may need to be installed, depending on which engines are used.
A variation on the above pip install
command can be used to include specific SQLAlchemy driver packages as needed.*
In the following example, the dialect identifiers mysql
and postgres
, supplied in brackets, will install the pymysql
and pg8000
driver packages for MySQL and PostgreSQL database connectivity, respectively:
$ pip install metREx"[mysql,postgresql]"
For convenience, a selection of predefined dialects are available for commonly used SQLAlchemy driver packages:*
Dialect | Driver |
---|---|
bigquery |
pybigquery |
db2 |
ibm_db |
informix |
ifx_jdbc |
mssql |
pymssql |
mysql |
pymysql |
oracle |
cx_oracle |
postgresql |
pg8000 |
sqlite ** |
pysqlite |
To install any combination of these SQLAlchemy drivers, substitute their dialect identifiers, delimited by commas, within the brackets shown in the previous example (see Supported Drivers for details about other SQLAlchemy drivers not mapped to the predefined dialects above).
*Additional client software may also be required for certain database engines. Installation steps will vary depending on the host operating system.
**SQLite support is included by default; sqlite
dialect does not need to be installed as an extra.
Local Environment Setup
A .env
file can optionally be used to define environment variables available to the application at runtime. This can be useful when running in a local development environment.
The .env
file should be placed in the installed package root. Any of the .env.*
files included in the examples
folder can be renamed to .env
, as shown below:
$ cp examples/.env.macos-oracle metREx/.env
ENV Variables
Configuration: Services
This application was originally designed to run on Cloud Foundry (CF). However, it can also be configured for other environments.
When running in Cloud Foundry, the connection parameters used by metric exporter "jobs" are defined via user-provided service instances.
In non-CF environments, services can be defined via a YAML configuration file (default: env/services.yml
), as shown in the following example:
---
METREX_DB_MYSQL:
name: database_name
hostname: mysql.mydomain.com
port: 3306
username: user
password: password
encrypted: false
dialect: mysql
driver: pymysql
METREX_DB_BIGQUERY:
project: gcp-project-name
location: US
credentials_path: /path/to/service/account/credentials.json
dialect: bigquery
METREX_DB_SQLITE:
path: /path/to/database/file.db
dialect: sqlite
METREX_API_APPD:
hostname: appdynamics.mydomain.com
account: account
username: user
password: password
encrypted: false
vendor: appdynamics
METREX_API_EXTRAHOP:
hostname: extrahop.mydomain.com
apikey: extrahop_apikey
encrypted: false
vendor: extrahop
METREX_API_GITHUB:
apikey: github_apikey
encrypted: false
vendor: github
METREX_API_NEWRELIC:
hostname: api.newrelic.com
account_id: account_id
apikey: newrelic_apikey
encrypted: false
vendor: newrelic
Database Connection Parameters
The names assigned to services containing database connection details must begin with the prefix METREX_DB_
(or the value of the DB_PREFIX environment variable). The text that appears after this prefix will be prepended to the names of all Prometheus metrics generated from this service.
For database connections (except BigQuery and SQLite), the following parameters must be included:
name
: Database or service namehostname
: Server name or IP addressport
: Port numberusername
: User namepassword
: Passwordencrypted
: "true" (recommended) ifpassword
value is encrypted or "false" if it is not.dialect
: Dialect identifier (refer to table of Dialect/Driver mappings in Database Engines section above)driver
: SQLAlchemy driver (refer to table of Dialect/Driver mappings in Database Engines section above)timezone
: Localized database timezone to apply to "naive" (non-TZ-aware)timestamp_column
values returned by metric exporter jobs (optional, to override the DB_DEFAULT_LOCAL_TZ env variable)
In addition to the above, the following parameters for establishing SSL connections are supported for MySQL and PostgreSQL databases:
ssl_ca
: Full path to a file of concatenated CA certificates in PEM formatssl_cert
: Full path to a single file in PEM format containing the certificate as well as any number of CA certificates needed to establish the certificate’s authenticityssl_key
: Full path to a file containing the private key
For BigQuery connections, the following parameters are used:
project
: Name of the GCP project (defaults to the project specified in the credentials JSON file)location
: Specifies the dataset location (optional)credentials_path
: Full path to the service account credentials JSON file (optional)*credentials_info
: Contents of the service account credentials JSON file (optional)*encrypted
: "true" (recommended) if thecredentials_info
value is encrypted or "false" if it is not (optional, for use only withcredentials_info
)dialect
: "bigquery"
*NOTE: Either credentials_path
or credentials_info
can be used instead of or to override the GOOGLE_APPLICATION_CREDENTIALS ENV variable on a per-service basis. credentials_path
takes precedence if provided in combination with GOOGLE_APPLICATION_CREDENTIALS or credentials_info
.
For SQLite connections, the following parameters are used:
path
: Full path to the database file (optional)*dialect
: "sqlite"
*NOTE: If path
is not provided, an in-memory database will be used.
API Connection Parameters
The names assigned to services containing API connection details must begin with the prefix METREX_API_
(or the value of the API_PREFIX environment variable). Presently, connectivity to the following APIs is supported:
- AppDynamics
- ExtraHop
- GitHub*
- New Relic
- Prometheus-to-Graphite bridge**
- Prometheus Pushgateway**
- Wavefront**
*GitHub connectivity is provided only as a means to access metric exporter job definitions in a source repository (see Defining Jobs in a GitHub Repository section below). It cannot be used as a service
parameter in individual exporter job definitions.
**Prometheus-to-Graphite bridge, Pushgateway and Wavefront connectivity is provided to allow exporter job metrics to be sent to other services (see Pushing Metrics section below). These cannot be used as a service
parameter in individual exporter job definitions.
The parameters to be included for each API connection type are shown below.
AppDynamics API connections:
hostname
: AppD controller server name or IP addressport
: Port number (optional)account
: Account name (optional, default: "customer1")username
: User namepassword
: Passwordencrypted
: "true" (recommended) ifpassword
value is encrypted or "false" if it is notvendor
: "appdynamics"
ExtraHop API connections:
hostname
: ExtraHop server name or IP addressport
: Port number (optional)apikey
: ExtraHop API keyencrypted
: "true" (recommended) ifapikey
value is encrypted or "false" if it is notvendor
: "extrahop"
GitHub API connections:
hostname
: Custom GitHub Enterprise server name or IP address (only include if not using standard GitHub.com)port
: Port number (optional)apikey
: GitHub API keyencrypted
: "true" (recommended) ifapikey
value is encrypted or "false" if it is notvendor
: "github"
New Relic API connections:
hostname
: New Relic API server name ("api.newrelic.com" in most cases) or IP addressport
: Port number (optional)apikey
: New Relic user keyencrypted
: "true" (recommended) ifapikey
value is encrypted or "false" if it is notvendor
: "newrelic"
Prometheus-to-Graphite bridge API connections:
hostname
: Graphite bridge server name or IP addressport
: Port number (optional)prefix
: Prefix to use for all metric names - e.g., "dev.", "prod.system.", etc. (optional)vendor
: "graphite"
Prometheus Pushgateway API connections:
hostname
: Pushgateway server name or IP address (include "https://" for secure server connections)port
: Port number (optional)username
: User name (optional, for connections requiring Basic auth)password
: Password (optional, for connections requiring Basic auth)encrypted
: "true" (recommended) ifpassword
value is encrypted or "false" if it is not (optional, for connections requiring Basic auth)vendor
: "pushgateway"
Wavefront API connections:
hostname
: Wavefront server name or IP address (include "http://" or "https://" for direct ingestion, or "proxy://" for proxy server)port
: Port number (optional)apikey
: API token (optional)encrypted
: "true" (recommended) ifapikey
value is encrypted or "false" if it is notprefix
: Prefix to use for all metric names - e.g., "dev.", "prod.system.", etc. (optional)vendor
: "wavefront"
Encrypting Credentials
For security purposes, it is recommended that database passwords and API keys be encrypted, especially when stored in CF. This application supports the decryption of credentials encrypted via the cryptofy package.
To configure the application to decrypt a password or API key for a given connection at runtime, set the encrypted
property to "true", as shown in the following example:
METREX_DB_MYSQL:
name: database_name
hostname: mysql.mydomain.com
port: 3306
username: user
password: Zq4B/mjYV9a4sVBzWiCxz5+tjcXHK2yc4UDVhWJLP2g=
encrypted: true
dialect: mysql
driver: pymysql
METREX_API_EXTRAHOP:
hostname: extrahop.mydomain.com
apikey: V37REDWZithWkr9Qu2d/p6uurhpj6kShcYjX0Y6yBWI=
encrypted: true
vendor: extrahop
A "secret" must be supplied to the application in order to unlock encrypted credentials at runtime. This secret must be the same one that was used to encrypt each of the passwords/API keys (refer to the cryptofy documentation for instructions on generating secrets and encrypting credentials).
Secrets may be supplied either by means of the SECRET_KEY environment variable or via a YAML-formatted file, the location of which is referenced in the ENV variable SECRET_PATH.*
Method 1: Environment variable
SECRET_KEY=my_secret
Method 2: Secrets file (with file path environment variable)
ENV variable:
SECRET_PATH=secrets.yml
Contents of secrets.yml
:
secret-key: my_secret
*WARNING: Due to their sensitive nature, care should be taken to protect secret keys from being exposed. If defined via an ENV variable, make sure it is not visible to others; else if it is supplied as a secrets file, do NOT store the file in source code!
Configuration: Jobs
Metric exporter jobs can be defined via:
- Service definitions (user-provided services in CF or the services file)
- GitHub repository
Defining Jobs as Services
If running in Cloud Foundry, metric exporter jobs can be defined as user-provided services. All user-provided services containing exporter job definitions must begin with the prefix METREX_JOB_
(or the value of the JOB_PREFIX environment variable).
In non-CF environments, jobs can be added to the same YAML-formatted configuration file containing the connection services, as shown below:
METREX_JOB_QUERY_EXAMPLE:
services:
- METREX_DB_MYSQL
interval_minutes: 5
statement: SELECT col1, col2, col3 FROM my_table
value_columns:
- col1
- col2
static_labels:
label1: value
label2: value
METREX_JOB_APPD_EXAMPLE:
services:
- METREX_API_APPD
interval_minutes: 15
application: Application_Name
metric_path: Business Transaction Performance|Business Transactions|*|*|*
METREX_JOB_EXTRAHOP_EXAMPLE:
services:
- METREX_API_EXTRAHOP
interval_minutes: 60
metric_params:
cycle: auto
object_type: device
object_ids:
- 9363
metric_category: http_client
metric_specs:
- req
aggregation:
funcs:
- count
- sum
- 95
threshold:
operator: ">"
value: 10
METREX_JOB_NEWRELIC_EXAMPLE:
services:
- METREX_API_NEWRELIC
interval_minutes: 15
statement: FROM Transaction SELECT average(duration) AS 'avg_duration', appName FACET appName
value_attrs:
- avg_duration
Defining Jobs in a GitHub Repository
Metric exporter jobs can also be defined via a YAML-formatted config file hosted in GitHub like the one shown below:
---
QUERY_EXAMPLE:
services:
- METREX_DB_MYSQL
interval_minutes: 5
statement: SELECT col1, col2, col3 FROM my_table
value_columns:
- col1
- col2
static_labels:
label1: value
label2: value
APPD_EXAMPLE:
services:
- METREX_API_APPD
interval_minutes: 15
application: Application_Name
metric_path: Business Transaction Performance|Business Transactions|*|*|*
EXTRAHOP_EXAMPLE:
services:
- METREX_API_EXTRAHOP
interval_minutes: 60
metric_params:
cycle: auto
object_type: device
object_ids:
- 9363
metric_category: http_client
metric_specs:
- req
aggregation:
funcs:
- count
- sum
- 95
threshold:
operator: ">"
value: 10
NEWRELIC_EXAMPLE:
services:
- METREX_API_NEWRELIC
interval_minutes: 15
statement: FROM Transaction SELECT average(duration) AS 'avg_duration', appName FACET appName
value_attrs:
- avg_duration
There is no prefix requirement for GitHub-hosted job names. However, the name referenced in the job's service
parameter must match the full service name (prefix included) defined in CF.
In order for jobs defined in a GitHub repository to be consumed, the following environment variables must be provided:
JOBS_SOURCE_SERVICE=METREX_API_GITHUB
JOBS_SOURCE_ORG=org_name
JOBS_SOURCE_REPO=repo_name
JOBS_SOURCE_PATH=path/to/jobs.yml
Changes to the jobs config file referenced by these env variables will be picked up every x minutes, per the value assigned to JOBS_SOURCE_REFRESH_INTERVAL, and incorporated into the job scheduler. Unlike changes to services, no restart of the application is required for GitHub-hosted job file changes to take effect.
Job Parameters
The parameters to be included for each metric exporter job type are shown below.
Database queries:
services
: A list of one or more service names referencing database connections, from which the job metrics will be sourcedpush_services
: (optional) A list of one or more service names referencing push service API connections, to which the job metrics will be sentinterval_minutes
: The interval (in minutes) to wait between each execution of the jobstatement
: SELECT queryvalue_columns
: A list of one or more returned column names representing numeric metric values (any columns returned via the query which do not match the names in this list will be used as metric "labels")static_labels
: (optional) One or morekey: value
pairs to apply as static labels for all metrics (static label names must not conflict with returned column names)timestamp_column
: (optional) The name of a column returned by the SQL query to use as the metric timestamp (ignored when metrics are exposed via Pushgateway)
AppDynamics metrics:
services
: A list of one or more service names referencing AppDynamics API connections, from which the job metrics will be sourcedpush_services
: (optional) A list of one or more service names referencing push service API connections, to which the job metrics will be sentinterval_minutes
: The interval (in minutes) to wait between each execution of the jobapplication
: The Application name as it appears in AppD (for Database metrics, value is always "Database Monitoring")metric_path
: The AppD metrics path; may include wildcards (*
)static_labels
: (optional) One or morekey: value
pairs to apply as static labels for all metrics
ExtraHop metrics:
services
: A list of one or more service names referencing ExtraHop API connections, from which the job metrics will be sourcedpush_services
: (optional) A list of one or more service names referencing push service API connections, to which the job metrics will be sentinterval_minutes
: The interval (in minutes) to wait between each execution of the jobmetric_params
: The list of parameters passed to the ExtraHop/metrics
API endpointmetric_name
: The identifier to be used in the Prometheus metric collector nameaggregation
: The list of parameters used to aggregate the values returned from the ExtraHop API (see About Aggregation below)static_labels
: (optional) One or morekey: value
pairs to apply as static labels for all metrics
New Relic metrics:
services
: A list of one or more service names referencing New Relic API connections, from which the job metrics will be sourcedpush_services
: (optional) A list of one or more service names referencing push service API connections, to which the job metrics will be sentinterval_minutes
: The interval (in minutes) to wait between each execution of the jobstatement
: NRQL query (excluding SINCE/UNTIL and TIMESERIES clauses)value_attrs
: A list of one or more returned attribute labels representing numeric metric values (any attributes returned via the query which do not match the names in this list will be used as metric "labels")static_labels
: (optional) One or morekey: value
pairs to apply as static labels for all metrics (static label names must not conflict with returned attribute names)
About Aggregation
Certain job types (see "ExtraHop metrics" above) provide the ability to define aggregation functions to produce metrics from the result data returned by the API.
The aggregation
parameter group consists of two subgroups: funcs
and threshold
(see example in Defining Jobs as Services above)
The funcs
subgroup accepts any combination of the following aggregation functions:
avg
count
(default)min
max
sum
It can also take integer values between 0 and 100, representing the nth percentile.
The optional threshold
subgroup works much in the same way as a HAVING
clause in a SQL GROUP BY
: it filters by values matching defined criteria.* Here, the criteria are expressed via the operator
and value
params.
Supported operator
values are:
>
<
>=
<=
=
<>
!=
The value
param must be an integer.
Example: Match all values greater than 10
JOB_EXAMPLE:
aggregation:
threshold:
operator: ">"
value: 10
*NOTE: If no threshold
is specified, the aggregation function will include all returned values.
Pushing Metrics
By default, each metric exporter job exposes its own registry endpoint at /metrics/<job_id>
, which can be scraped by a Prometheus process. In addition, exporter job metrics can optionally be sent to one or more of the services detailed below.
Exposing Metrics to a Pushgateway
Metrics can be exposed to one or more Prometheus Pushgateway services, which may simplify the process of managing jobs in Prometheus.
Defining a Pushgateway service:
METREX_API_PUSHGATEWAY:
hostname: https://pushgateway.mydomain.com
vendor: pushgateway
Pushing Metrics to a Graphite Bridge
Metrics can also be pushed over TCP to a Graphite bridge service, which provides an option to expose metrics to systems other than Prometheus. If a Graphite bridge service is defined, all new exporter jobs will automatically be pushed to it.
Defining a Graphite bridge service:
METREX_API_GRAPHITE:
hostname: graphite.mydomain.com
port: 2003
prefix: my.prefix.
vendor: graphite
Pushing Metrics to Wavefront
As an alternative to Prometheus, metrics can be sent to one or more Wavefront services. If a Wavefront service is defined, all new exporter jobs will automatically be pushed to it.
Defining a Wavefront service:
METREX_API_WAVEFRONT:
hostname: mydomain.wavefront.com
port: 2878
prefix: my.prefix.
vendor: wavefront
Activating Push Services
Metrics can be configured to be sent to one or more push services on a per-job basis or on a global basis for all jobs.
To configure a specific job to send metrics to one or more push services, include the push_services
attribute in the job definition to list the services.
Example: Send metrics to push services for a job
JOB_EXAMPLE:
services:
- METREX_DB_MYSQL
push_services:
- METREX_API_GRAPHITE
interval_minutes: 5
statement: SELECT col1 FROM my_table
value_columns:
- col1
To send all exporter job metrics to one or more push services, list them (comma-delimited, if more than one) via the ENV variable SEND_ALL_JOBS_TO_SERVICES:
SEND_ALL_JOBS_TO_SERVICES=METREX_API_PUSHGATEWAY,METREX_API_WAVEFRONT
Running the Application
Execute the following command:
$ metrex
Running in Docker
This application can be run inside a Docker container. A prepackaged image is available on Docker Hub.
It was built with support for the following databases:
Sample docker-compose.yml
file:
version: "3.8"
services:
metrex:
image: homedepottech/metrex:latest
ports:
- 5000:5000
environment:
SECRET_PATH: /secrets.yml
SERVICES_PATH: /services.yml
GOOGLE_APPLICATION_CREDENTIALS: /gcp_credentials.json
volumes:
- /path/to/secrets.yml:/secrets.yml
- /path/to/services.yml:/services.yml
- /path/to/gcp_credentials.json:/gcp_credentials.json
Create and start the Docker container:
$ docker-compose up -d metrex
Swagger
Open the URL shown in the resulting output following application startup (default: http://127.0.0.1:5000) to access the Swagger UI.
Testing
Execute the following command from the top-level directory of the cloned repository:
$ python setup.py test
License
Distributed under the Apache, version 2.0 license.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.