
Scrape only relevant metrics in Prometheus, according to your Grafana dashboards


frigga

Scrape only relevant metrics in Prometheus, according to your Grafana dashboards; see the before and after snapshot.

This tool is extremely useful for Grafana Cloud customers, since pricing is based on the number of ingested DataSeries.

Requirements

Python 3.6.7+

Installation

$ pip install frigga

Getting Started

  1. Grafana - Import the frigga - Jobs Usage dashboard (ID: 12537) into Grafana and check your current number of DataSeries

  2. Grafana - Generate an API key with the Viewer role

  3. Get the list of metrics that are in use in your dashboards

    $ frigga gl # gl is grafana-list, or good luck :)
    
    Grafana url [http://localhost:3000]: http://my-grafana.grafana.net
    Grafana api key: (hidden)
    >> [LOG] Getting the list of words to ignore when scraping from Grafana
    ...
    >> [LOG] Found a total of 269 unique metrics to keep
    

    .metrics.json - automatically generated in pwd

    {
        "all_metrics": [
            "cadvisor_version_info",
            "container_cpu_usage_seconds_total",
            "container_last_seen",
            "container_memory_max_usage_bytes",
            ...
        ]
    }
    
  4. Edit your prometheus.yml file and add the following snippet to the bottom of the file; check the example in docker-swarm/prometheus-original.yml

     ---
     name: frigga
     exclude_jobs: []
    
  5. Use the .metrics.json file to apply the rules to your existing prometheus.yml

    $ frigga pa # pa is prometheus-apply, or pam-tada-dam
    
    Prom yaml path [docker-swarm/prometheus.yml]: /etc/prometheus/prometheus.yml
    Metrics json path [./.metrics.json]: /home/willywonka/.metrics.json
    >> [LOG] Reading documents from docker-swarm/prometheus.yml
    ...
    >> [LOG] Done! Now reload docker-swarm/prometheus.yml with 'docker exec $PROM_CONTAINER_NAME kill -HUP 1'
    
  6. As mentioned in the previous step, reload the prometheus.yml file. Here are two ways of doing it:

    • Killing it
      $ docker exec $PROM_CONTAINER_NAME kill -HUP 1
      
    • Sending a POST request to /-/reload - this requires Prometheus to be started with --web.enable-lifecycle; for an example, see docker-stack.yml
      $ curl -X POST http://localhost:9090/-/reload
      
  7. Make sure prometheus.yml was reloaded properly

    $ docker logs --tail 10 $PROM_CONTAINER_NAME
    
     level=info ts=2020-06-27T15:45:34.514Z caller=main.go:799 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
     level=info ts=2020-06-27T15:45:34.686Z caller=main.go:827 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
    
  8. Grafana - Now check the frigga - Jobs Usage dashboard; the numbers should be significantly lower (a reduction of 60% or more is possible)
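The rules that frigga injects into prometheus.yml aren't shown above. As an illustration only (standard Prometheus syntax with names taken from the .metrics.json sample, not necessarily frigga's exact output, and the target address is a placeholder), a metric whitelist is usually expressed as a `metric_relabel_configs` keep-rule inside each scrape job:

```yaml
scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]   # placeholder target
    # Drop every DataSeries whose metric name is not on the whitelist
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: cadvisor_version_info|container_cpu_usage_seconds_total|container_last_seen
        action: keep
```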

Test it locally

Requirements

  1. Docker
  2. docker-compose
  3. jq

Getting Started

  1. git clone this repository

  2. Deploy the services locally: Prometheus, Grafana, node-exporter and cadvisor

    $ bash docker-swarm/deploy_stack.sh
    
    Creating network frigga_net1
    ...
    >> Grafana - Generating API Key - for Viewer
    eyJrIjoiT29hNGxGZjAwT2hZcU1BSmpPRXhndXVwUUE4ZVNFcGQiLCJuIjoibG9jYWwiLCJpZCI6MX0=
    # Save this key ^^^
    
  3. Open your browser, navigate to http://localhost:3000

    • Username and password are admin:admin
    • You'll be prompted to update your password; you can keep using admin or hit Skip
  4. Go to Jobs Usage dashboard, you'll see that Prometheus is processing ~2800 DataSeries

  5. Let's change that! First, get all the metrics that are used in your dashboards

    $ frigga gl -gurl http://localhost:3000 -gkey $GRAFANA_API_KEY
    
    >> [LOG] Getting the list of words to ignore when scraping from Grafana
    ...
    >> [LOG] Found a total of 269 unique metrics to keep
    # Generated .metrics.json in pwd
    
  6. Apply the rules to prometheus.yml; keep the defaults

    $ frigga pa # prometheus-apply
    
    Prom yaml path [docker-swarm/prometheus.yml]:
    Metrics json path [./.metrics.json]:
    ...
    >> [LOG] Done! Now reload docker-swarm/prometheus.yml with 'docker exec $PROM_CONTAINER_NAME kill -HUP 1'
    
  7. Reload prometheus.yml in Prometheus

    $ bash docker-swarm/reload_prom_config.sh show
    
    >> Reloading prometheus.yml configuration file
    ...
    level=info ts=2020-06-27T16:25:17.656Z caller=main.go:827 msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml
    
  8. Go to Jobs Usage, you'll see that Prometheus is processing only ~1000 DataSeries (previously ~2800)

    • In case you don't see the change, don't forget to hit the refresh button
  9. Cleanup

    $ docker stack rm frigga
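Under the hood, `frigga pa` (step 6 above) has to turn the `.metrics.json` whitelist into something Prometheus can match on. A minimal sketch of that idea (hypothetical code, not frigga's actual implementation; the sample JSON is a made-up miniature of the real file) joins the metric names into a single keep-regex:

```python
import json

# Miniature stand-in for the .metrics.json file generated by `frigga gl`
sample = """
{
  "all_metrics": [
    "cadvisor_version_info",
    "container_cpu_usage_seconds_total",
    "container_last_seen"
  ]
}
"""
metrics = json.loads(sample)

# A Prometheus keep-rule matches the metric name (__name__) against one regex,
# so a whitelist is just the metric names joined with '|'
keep_regex = "|".join(metrics["all_metrics"])
print(keep_regex)
```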
    

Pros and Cons of this tool

Pros

  1. Grafana Cloud - the main reason for writing this tool; it lowers costs by minimizing the number of active DataSeries
  2. Saves disk-space on the machine running Prometheus
  3. Reduces network traffic when using remote_write
  4. Improves PromQL performance by querying fewer metrics
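Pro #3 refers to Prometheus's built-in remote_write feature: fewer scraped DataSeries means fewer samples shipped over the wire. A sketch of such a block in prometheus.yml (the URL and credentials are placeholders, not values from this project):

```yaml
# Ship the (now reduced) set of samples to a remote backend, e.g. Grafana Cloud
remote_write:
  - url: https://example.grafana.net/api/prom/push   # placeholder endpoint
    basic_auth:
      username: "123456"       # placeholder instance ID
      password: "<api-key>"    # placeholder API key
```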

Cons

  1. Applying the rules makes prometheus.yml less readable. Since it's not a file you edit on a daily basis, this is an acceptable trade-off
  2. Prometheus's memory usage increases slightly (around 30MB); not critical, but worth pointing out

Authors

Created and maintained by Meir Gabay

License

This project is licensed under the MIT License - see the LICENSE file for details

Download files


Source Distribution: frigga-1.0.2.tar.gz (8.1 kB)

Built Distribution: frigga-1.0.2-py3-none-any.whl (9.9 kB)
