Monitoring and Alert management tricks using Promtool and amtool

These days I am busy exploring one of the most critical parts of the cloud: monitoring. There are many ways to monitor a Kubernetes cluster; for metric data, the CNCF recommendation is the Prometheus stack, used together with Grafana and AlertManager.

Install Prometheus on your laptop

I've chosen the Prometheus for Windows amd64 option because my system runs Windows 7.
download link

Validating Prometheus alert rules

After creating an alert rule, how do I check whether its syntax is correct? There is a nice tool, promtool, that comes with the Prometheus installation. Let me demonstrate with a scenario where I've selected 7 important alert rules for my project.

set PROMETHEUS_HOME=C:\Users\pavan\myprom\prometheus-2.8.1.windows-amd64
set AM_HOME=C:\Users\pavan\myprom\alertmanager-0.16.1.windows-amd64
set PATH=%PROMETHEUS_HOME%;%AM_HOME%;%PATH%

After setting the PATH, you can validate the alerting rules with the following command:
promtool check rules my-mon_alertrules.yml
Checking my-mon_alertrules.yml
  SUCCESS: 7 rules found

My rules file defines 7 alerts; promtool checks the YAML syntax of the file and reports how many rules it found.
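For illustration, a rules file of this kind might look like the minimal sketch below. It contains a single alert rather than my 7 rules, and the group name, alert name, labels and thresholds are all hypothetical choices, not taken from my project.

```yaml
# my-mon_alertrules.yml (illustrative sketch, not the real project file)
groups:
  - name: example-alerts            # hypothetical group name
    rules:
      - alert: InstanceDown         # hypothetical alert name
        expr: up == 0               # 'up' is 0 when a scrape of the target fails
        for: 5m                     # condition must hold for 5 minutes before firing
        labels:
          severity: critical        # assumed label, used for routing in AlertManager
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

Running promtool check rules against a file like this should report one rule found, provided the YAML and the expression are valid.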

Check Prometheus Scraping from targets

Every target container exposes its metrics on a port published within the cluster, and the URL uses the context path /metrics. To check whether the metrics are active, run a simple 'curl' command against the exposed IP:Port/metrics from inside the container. The same should work from the Prometheus container as well. If it is working, you will see the corresponding metrics; otherwise, a connection refused error is returned.

curl localhost:10413/metrics

If the check fails, the cause could be a misconfiguration in the target container, or Prometheus itself may not be in a healthy condition.
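On the Prometheus side, the target checked with curl above would typically be listed in the scrape configuration. The fragment below is a sketch under assumptions: the job name and the scrape interval are hypothetical, and only the host, port and path come from the curl example.

```yaml
# prometheus.yml fragment (illustrative sketch)
scrape_configs:
  - job_name: my-app                  # hypothetical job name
    metrics_path: /metrics            # this is the default path, shown for clarity
    scrape_interval: 15s              # assumed interval
    static_configs:
      - targets: ['localhost:10413']  # host:port from the curl check above
```

The same promtool binary can validate this file with promtool check config prometheus.yml, which helps rule out a misconfiguration on the Prometheus side.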



Checking AlertManager configuration with amtool


To see the configuration that the running AlertManager is currently using, run the following command:
./amtool --alertmanager.url http://localhost:9501/alertmanager config
Note that the hostname and port depend on your own configuration. To list the silences that are currently defined, use the silence query:
./amtool --alertmanager.url http://localhost:7501/alertmanager silence query
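To give an idea of what the config command prints, here is a minimal sketch of an AlertManager configuration. Everything in it is an assumption for illustration: the receiver name, the timings and the webhook URL are hypothetical and are not my actual setup.

```yaml
# alertmanager.yml (illustrative sketch)
route:
  receiver: default-receiver        # hypothetical receiver name
  group_by: ['alertname']           # group notifications by alert name
  group_wait: 30s                   # wait before sending the first notification for a group
  group_interval: 5m                # wait before notifying about new alerts in a group
  repeat_interval: 4h               # wait before re-sending an unresolved notification
receivers:
  - name: default-receiver
    webhook_configs:
      - url: http://localhost:5001/ # hypothetical webhook endpoint
```

A file like this can also be validated offline with amtool check-config alertmanager.yml before it is loaded.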
