Deploy PrometheusAlert

git clone https://github.com/feiyu563/PrometheusAlert.git
cd PrometheusAlert/example/helm/prometheusalert
# Update config/app.conf to set the login user and database configuration
helm install -n monitoring .
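
If you are on Helm 3, helm install also expects a release name, e.g. helm install prometheusalert . -n monitoring. Once the install completes, it is worth checking that the pod is running and that the web console answers. The service name prometheus-alert-center below is an assumption; use whatever name the chart actually creates in your cluster (the console listens on port 8080 by default, as set in app.conf):

kubectl -n monitoring get pods | grep -i prometheusalert
# Forward the web console locally (default httpport in app.conf is 8080)
kubectl -n monitoring port-forward svc/prometheus-alert-center 8080:8080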

Create an enterprise WeChat group robot

In the Enterprise WeChat client, open the target group chat, right-click and choose "Add group robot", then create a robot. This produces a robot webhook address; record it for later use.
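
Before wiring the address into Alertmanager, you can confirm that the webhook works with a quick manual push. This uses the standard Enterprise WeChat group robot API; replace <robot-key> with the key from the address you just recorded:

curl -s 'https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=<robot-key>' \
  -H 'Content-Type: application/json' \
  -d '{"msgtype": "text", "text": {"content": "webhook connectivity test"}}'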

Prometheus integration configuration

Configure Alertmanager

Since I deploy Prometheus with https://github.com/prometheus-operator/kube-prometheus.git, you can modify the alertmanager-secret.yaml configuration file directly and add the webhook above under receivers:

apiVersion: v1
data: {}
kind: Secret
type: Opaque
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    global:
      resolve_timeout: "5m"
      # email
      smtp_smarthost: 'smtp.exmail.qq.com:465' # SMTP server address
      smtp_from: 'devops_alert@xxx.com' # Sender address
      smtp_auth_username: 'devops_alert@xxx.com' # SMTP account
      smtp_auth_password: '1DevSop#' # Authorization password
      smtp_require_tls: false # Do not enable TLS (enabled by default)
    inhibit_rules:
    - equal:
      - namespace
      - alertname
      - instance
      source_match:
        severity: critical
      target_match_re:
        severity: warning|info
    - equal:
      - namespace
      - alertname
      - instance
      source_match:
        severity: warning
      target_match_re:
        severity: info
    receivers:
    - name: default
      email_configs: # Email configuration
      - to: "xxxx@163.com" # Address that receives alert emails
    - name: jf-k8s-node
      webhook_configs:
      - url: "http://palert.con.sdi/prometheusalert?type=wx&tpl=jf-k8s-node&wxurl=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=0da1f944-9a4e-4221-8036-f322351f7bd8"
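
Note that the snippet above only shows receivers; Alertmanager also needs a route block inside alertmanager.yaml to decide which alerts go to which receiver. The original routing rules are not shown here, so the following is only a minimal sketch that assumes alerts carrying the app: node-alert label (set in the rule below) should go to the jf-k8s-node receiver; adjust the matchers and intervals to your environment:

route:
  receiver: default                 # fallback receiver
  group_by: ['namespace', 'alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
  - receiver: jf-k8s-node           # forward to the PrometheusAlert webhook above
    match:
      app: node-alert               # assumed routing label; change to match your rules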

Configure Prometheus rules

Here is an example of a disk alert rule:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: external-node.rules
  namespace: monitoring
spec:
  groups:
  - name: k8s-node.rules
    rules:
    - alert: DiskCapacity
      expr: 100 - (node_filesystem_avail_bytes{instance=~'39.100.*',fstype=~"ext4|xfs|fuse.glusterfs"} / node_filesystem_size_bytes{fstype=~"ext4|xfs|fuse.glusterfs"} * 100) > 96
      for: 1m
      labels:
        severity: critical
        app: node-alert
      annotations:
        summary: "{{$labels.mountpoint}} Disk partition usage is too high!"
        description: "{{$labels.mountpoint}} Disk partition usage is greater than 96% (current usage: {{$value}}%)"
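
Assuming the rule above is saved as external-node-rules.yaml (a file name chosen here for illustration), it can be applied and checked with kubectl; the Prometheus Operator picks up PrometheusRule objects whose labels match its rule selector (prometheus: k8s and role: alert-rules in kube-prometheus):

kubectl apply -f external-node-rules.yaml
kubectl -n monitoring get prometheusrule external-node.rules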

Custom message template

For the alert rules above, Alertmanager's built-in message template is used by default, and it is not very readable. Here we configure a PrometheusAlert alert template instead.

Following the official documentation, we create a new template, set the template type to Enterprise WeChat and the template source to Prometheus, with the following template content:

{{ $var := .externalURL}}{{ range $k,$v:=.alerts }}
{{if eq $v.status "resolved"}}
#### [Prometheus recovery information]({{$v.generatorURL}})
##### <font color="#FF0000">Alarm name</font>: [{{$v.labels.alertname}}]({{$var}})
##### <font color="#FF0000">Alarm level</font>: {{$v.labels.severity}}
##### <font color="#FF0000">Trigger time</font>: {{GetCSTtime $v.startsAt}}
##### <font color="#02b340">Recovery time</font>: {{GetCSTtime $v.endsAt}}
##### <font color="#FF0000">Fault instance</font>: {{$v.labels.instance}}
##### <font color="#FF0000">Alarm details</font>: {{$v.annotations.description}}
{{else}}
#### [Prometheus alarm information]({{$v.generatorURL}})
##### <font color="#FF0000">Alarm name</font>: [{{$v.labels.alertname}}]({{$var}})
##### <font color="#FF0000">Alarm level</font>: {{$v.labels.severity}}
##### <font color="#FF0000">Trigger time</font>: {{GetCSTtime $v.startsAt}}
##### <font color="#FF0000">Fault instance</font>: {{$v.labels.instance}}
##### <font color="#FF0000">Alarm details</font>: {{$v.annotations.description}}
{{end}}
{{ end }}
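
To check the template rendering without waiting for a real alert, you can post a hand-written Alertmanager webhook payload to the same PrometheusAlert URL that Alertmanager calls. The payload below is a minimal, made-up example in the standard Alertmanager webhook format; the host palert.con.sdi and template name jf-k8s-node come from the configuration above, and <robot-key> is the key of your group robot:

curl -s -X POST \
  'http://palert.con.sdi/prometheusalert?type=wx&tpl=jf-k8s-node&wxurl=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=<robot-key>' \
  -H 'Content-Type: application/json' \
  -d '{
        "status": "firing",
        "externalURL": "http://alertmanager.example.com",
        "alerts": [{
          "status": "firing",
          "labels": {"alertname": "DiskCapacity", "severity": "critical", "instance": "39.100.0.1", "mountpoint": "/data"},
          "annotations": {"description": "/data disk partition usage is greater than 96% (current usage: 97.2%)"},
          "startsAt": "2024-01-01T08:00:00Z",
          "endsAt": "0001-01-01T00:00:00Z",
          "generatorURL": "http://prometheus.example.com/graph"
        }]
      }'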

After a disk space alert fires, the rendered message is pushed to the Enterprise WeChat group.

At this point, the configuration for receiving Prometheus alerts with PrometheusAlert and forwarding them to the Enterprise WeChat group robot is complete.

In addition, PrometheusAlert really shows its strength when you have multiple clusters and multiple message sources.