Skip to main content

Simple Prometheus metrics exporter for Celery

Project description

https://img.shields.io/docker/automated/zerok/celery-prometheus-exporter.svg?maxAge=2592000

celery-prometheus-exporter is a little exporter for Celery related metrics in order to get picked up by Prometheus. As with other exporters like mongodb_exporter or node_exporter this has been implemented as a standalone-service to make reuse easier across different frameworks.

So far it provides access to the following metrics:

  • celery_tasks exposes the number of tasks currently known to the queue grouped by state (RECEIVED, STARTED, …).

  • celery_tasks_by_name exposes the number of tasks currently known to the queue grouped by name and state.

  • celery_workers exposes the number of currently probably alive workers

  • celery_task_latency exposes a histogram of task latency, i.e. the time until tasks are picked up by a worker

  • celery_tasks_runtime_seconds tracks the number of seconds tasks take until completed as histogram

How to use

There are multiple ways to install this. The obvious one is using pip install celery-prometheus-exporter and then using the celery-prometheus-exporter command:

$ celery-prometheus-exporter
Starting HTTPD on 0.0.0.0:8888

This package only depends on Celery directly, so you will have to install whatever other dependencies you will need for it to speak with your broker 🙂

Celery workers have to be configured to send task-related events: http://docs.celeryproject.org/en/latest/userguide/configuration.html#worker-send-task-events.

Running celery-prometheus-exporter with the --enable-events argument will periodically enable events on the workers. This is useful because it allows running celery workers with events disabled, until celery-prometheus-exporter is deployed, at which time events get enabled on the workers.

Alternatively, you can use the bundle Makefile and Dockerfile to generate a Docker image.

By default, the HTTPD will listen at 0.0.0.0:8888. If you want the HTTPD to listen to another port, use the --addr option or the environment variable DEFAULT_ADDR.

By default, this will expect the broker to be available through redis://redis:6379/0, although you can change via environment variable BROKER_URL. If you’re using AMQP or something else other than Redis, take a look at the Celery documentation and install the additioinal requirements 😊 Also use the --broker option to specify a different broker URL.

If you need to pass additional options to your broker’s transport use the --transport-options option. It tries to read a dict from a JSON object. E.g. to set your master name when using Redis Sentinel for broker discovery: --transport-options '{"master_name": "mymaster"}'

Use --tz to specify the timezone the Celery app is using. Otherwise the systems local time will be used.

By default, buckets for histograms are the same as default ones in the prometheus client: https://github.com/prometheus/client_python#histogram. It means they are intended to cover typical web/rpc requests from milliseconds to seconds, so you may want to customize them. It can be done via environment variable RUNTIME_HISTOGRAM_BUCKETS for tasks runtime and via environment variable LATENCY_HISTOGRAM_BUCKETS for tasks latency. Buckets should be passed as a list of float values separated by a comma. E.g. ".005, .05, 0.1, 1.0, 2.5".

Use --queue-list to specify the list of queues that will have its length monitored (Automatic Discovery of queues isn’t supported right now, see limitations/ caveats. You can use the QUEUE_LIST environment variable as well.

If you then look at the exposed metrics, you should see something like this:

$ http get http://localhost:8888/metrics | grep celery_
# HELP celery_workers Number of alive workers
# TYPE celery_workers gauge
celery_workers 1.0
# HELP celery_tasks Number of tasks per state
# TYPE celery_tasks gauge
celery_tasks{state="RECEIVED"} 3.0
celery_tasks{state="PENDING"} 0.0
celery_tasks{state="STARTED"} 1.0
celery_tasks{state="RETRY"} 2.0
celery_tasks{state="FAILURE"} 1.0
celery_tasks{state="REVOKED"} 0.0
celery_tasks{state="SUCCESS"} 8.0
# HELP celery_tasks_by_name Number of tasks per state
# TYPE celery_tasks_by_name gauge
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="RECEIVED"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="PENDING"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="STARTED"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="RETRY"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="FAILURE"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="REVOKED"} 0.0
celery_tasks_by_name{name="my_app.tasks.calculate_something",state="SUCCESS"} 1.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="RECEIVED"} 3.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="PENDING"} 0.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="STARTED"} 1.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="RETRY"} 2.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="FAILURE"} 1.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="REVOKED"} 0.0
celery_tasks_by_name{name="my_app.tasks.fetch_some_data",state="SUCCESS"} 7.0
# HELP celery_task_latency Seconds between a task is received and started.
# TYPE celery_task_latency histogram
celery_task_latency_bucket{le="0.005"} 2.0
celery_task_latency_bucket{le="0.01"} 3.0
celery_task_latency_bucket{le="0.025"} 4.0
celery_task_latency_bucket{le="0.05"} 4.0
celery_task_latency_bucket{le="0.075"} 5.0
celery_task_latency_bucket{le="0.1"} 5.0
celery_task_latency_bucket{le="0.25"} 5.0
celery_task_latency_bucket{le="0.5"} 5.0
celery_task_latency_bucket{le="0.75"} 5.0
celery_task_latency_bucket{le="1.0"} 5.0
celery_task_latency_bucket{le="2.5"} 8.0
celery_task_latency_bucket{le="5.0"} 11.0
celery_task_latency_bucket{le="7.5"} 11.0
celery_task_latency_bucket{le="10.0"} 11.0
celery_task_latency_bucket{le="+Inf"} 11.0
celery_task_latency_count 11.0
celery_task_latency_sum 16.478713035583496
celery_queue_length{queue_name="queue1"} 35.0
celery_queue_length{queue_name="queue2"} 0.0

Limitations

  • Among tons of other features celery-prometheus-exporter doesn’t support stats for multiple queues. As far as I can tell, only the routing key is exposed through the events API which might be enough to figure out the final queue, though.

  • This has only been tested with Redis so far.

  • At this point, you should specify the queues that will be monitored using an environment variable or an arg (–queue-list).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

celery-prometheus-exporter-1.7.0.tar.gz (11.4 kB view details)

Uploaded Source

Built Distribution

File details

Details for the file celery-prometheus-exporter-1.7.0.tar.gz.

File metadata

  • Download URL: celery-prometheus-exporter-1.7.0.tar.gz
  • Upload date:
  • Size: 11.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.19.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.7.3

File hashes

Hashes for celery-prometheus-exporter-1.7.0.tar.gz
Algorithm Hash digest
SHA256 8fc2d5909921c44f01c8c1b7d956d92e6966f2e14eec196bf60735e39a0e0991
MD5 7a191a0e26eb3d2eefe5e5e3462085dd
BLAKE2b-256 c53fc7ae53ad33c736c48067730d184f27d58cbd0e60031f321d5787534751d7

See more details on using hashes here.

File details

Details for the file celery_prometheus_exporter-1.7.0-py2-none-any.whl.

File metadata

  • Download URL: celery_prometheus_exporter-1.7.0-py2-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 2
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.19.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.7.3

File hashes

Hashes for celery_prometheus_exporter-1.7.0-py2-none-any.whl
Algorithm Hash digest
SHA256 a3ba0d3340b546ae82b36fef7645ccbc54c2b696fc3df05bb9ee28a402e710e1
MD5 aa842d5414f24c0f09b2de12b2b96955
BLAKE2b-256 bd5d71ff513923ed105303b4697b9f982fcfc16ca29b3b27557005a7d001a346

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page