Breeze in Practice Series: Modifying the Prometheus Configuration File

惬意小蜗牛
2021-07-19

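Browsing the Configuration page of the Prometheus Dashboard raises a question: where does this configuration come from?

The difference between prometheus-operator and a direct Prometheus deployment is that the operator wraps the Prometheus and Alertmanager server configuration, along with scrape configs and recording/alerting rules, into Kubernetes CRDs: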

[root@master01 ~]# kubectl get crd | grep  monitoring
alertmanagers.monitoring.coreos.com     2019-10-14T10:19:34Z
podmonitors.monitoring.coreos.com       2019-10-14T10:19:34Z
prometheuses.monitoring.coreos.com      2019-10-14T10:19:35Z
prometheusrules.monitoring.coreos.com   2019-10-14T10:19:35Z
servicemonitors.monitoring.coreos.com   2019-10-14T10:19:35Z

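When one of these custom resources is modified, the operator detects the change, generates a Prometheus configuration file, gzip-compresses it, and stores it as a Kubernetes Secret.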
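To make the storage format concrete, here is a minimal local sketch of the "gzip, then base64" encoding and its reverse. It does not touch a cluster; the function names `encode_config` and `decode_config` and the sample text are made up for illustration, and GNU `base64` and `gzip` are assumed.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the "gzip, then base64" round trip locally.
# encode_config / decode_config are hypothetical helper names.

# Plain text on stdin -> gzip -> single-line base64 on stdout.
encode_config() {
  gzip -c | base64 | tr -d '\n'
}

# Base64 on stdin -> gunzip -> plain text on stdout.
decode_config() {
  base64 -d | gzip -dc
}

# Illustrative sample, not a real generated configuration.
sample='global:
  scrape_interval: 30s'

encoded=$(printf '%s\n' "$sample" | encode_config)
decoded=$(printf '%s' "$encoded" | decode_config)

if [ "$decoded" = "$sample" ]; then
  echo "round trip OK"
fi
```

The value stored under `data` in a Secret has this shape: base64 text that, once decoded, is a gzip stream rather than readable YAML.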

So how do we retrieve and modify this file?

First, list the Secrets that exist in the monitoring namespace:

[root@master01 ~]# kubectl get secret -n monitoring
NAME                              TYPE                                  DATA   AGE
alertmanager-main                 Opaque                                1      295d
alertmanager-main-token-fpw52     kubernetes.io/service-account-token   3      295d
default-token-dbggv               kubernetes.io/service-account-token   3      295d
etcd-certs                        Opaque                                3      295d
grafana-datasources               Opaque                                1      295d
grafana-token-b2vkn               kubernetes.io/service-account-token   3      295d
istio.alertmanager-main           istio.io/key-and-cert                 3      295d
istio.default                     istio.io/key-and-cert                 3      295d
istio.grafana                     istio.io/key-and-cert                 3      295d
istio.kube-state-metrics          istio.io/key-and-cert                 3      295d
istio.node-exporter               istio.io/key-and-cert                 3      295d
istio.prometheus-adapter          istio.io/key-and-cert                 3      295d
istio.prometheus-k8s              istio.io/key-and-cert                 3      295d
istio.prometheus-operator         istio.io/key-and-cert                 3      295d
kube-state-metrics-token-24vd9    kubernetes.io/service-account-token   3      295d
node-exporter-token-m8gbn         kubernetes.io/service-account-token   3      295d
prometheus-adapter-token-gz56f    kubernetes.io/service-account-token   3      295d
prometheus-k8s                    Opaque                                1      295d
prometheus-k8s-token-bwq27        kubernetes.io/service-account-token   3      295d
prometheus-operator-token-bgds6   kubernetes.io/service-account-token   3      295d

Read the prometheus-k8s Secret as JSON. The part of interest is the value of the xxx.xxx.gz key under data:

[root@master01 ~]# kubectl get secret -n monitoring prometheus-k8s -o json
{
    "apiVersion": "v1",
    "data": {
        "prometheus.yaml.gz": "xxxxxxxxx"
    },
    "kind": "Secret",
    "metadata": {
        "annotations": {
            "generated": "true"
        },
        "creationTimestamp": "2019-10-14T10:19:55Z",
        "labels": {
            "managed-by": "prometheus-operator"
        },
        "name": "prometheus-k8s",
        "namespace": "monitoring",
        "ownerReferences": [
            {
                "apiVersion": "monitoring.coreos.com/v1",
                "blockOwnerDeletion": true,
                "controller": true,
                "kind": "Prometheus",
                "name": "k8s",
                "uid": "41a949b2-bc18-43d7-b87d-db6e8990c27f"
            }
        ],
        "resourceVersion": "1953",
        "selfLink": "/api/v1/namespaces/monitoring/secrets/prometheus-k8s",
        "uid": "beda93d3-b123-4085-acea-30225f899f5a"
    },
    "type": "Opaque"
}

Take the value of data."xxx.xxx.gz", base64-decode it, and decompress it with gzip to obtain the final configuration file:

[root@master01 tmp]# kubectl get secret -n monitoring prometheus-k8s -o json | jq -r '.data."prometheus.yaml.gz"' | base64 -d | gzip -d
or
[root@master01 tmp]# echo -n "<value of data.'xxx.xxx.gz'>" | base64 -d | gzip -d

global:
  evaluation_interval: 30s
  scrape_interval: 30s
  external_labels:
    prometheus: monitoring/k8s
    prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-k8s-rulefiles-0/*.yaml
scrape_configs:
- job_name: monitoring/alertmanager/0
  ......
- job_name: monitoring/coredns/0
  ......
- job_name: monitoring/etcd-k8s/0
  ......
- job_name: monitoring/grafana/0
  ......
- job_name: monitoring/kube-apiserver/0
  ......
- job_name: monitoring/kube-controller-manager/0
  ......
- job_name: monitoring/kube-scheduler/0
  ......
- job_name: monitoring/kube-state-metrics/0
  ......
- job_name: monitoring/kube-state-metrics/1
  ......
- job_name: monitoring/kubelet/0
  ......
- job_name: monitoring/kubelet/1
  ......
- job_name: monitoring/node-exporter/0
  ......
- job_name: monitoring/prometheus/0
  ......
- job_name: monitoring/prometheus-operator/0
  ......
alerting:
  alert_relabel_configs:
  - action: labeldrop
    regex: prometheus_replica
  alertmanagers:
  - path_prefix: /
    scheme: http
    kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - monitoring
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_service_name
      regex: alertmanager-main
    - action: keep
      source_labels:
      - __meta_kubernetes_endpoint_port_name
      regex: web

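The Alertmanager configuration can be retrieved the same way. Its value is not gzip-compressed, so base64 decoding is enough: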

[root@master01 ~]# kubectl get secret -n monitoring alertmanager-main -o json | jq -r '.data."alertmanager.yaml"' | base64 -d
"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "null"
"route":
  "group_by":
  - "job"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "null"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "null"
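The two retrieval commands above differ only in whether a gzip step follows the base64 decoding. A small sketch that picks the right pipeline from the key name could look like this; `decode_secret_value` is a hypothetical helper, and the `kubectl ... -o jsonpath` lines in the comments are an assumed alternative to `jq`, not something the steps above rely on.

```shell
#!/usr/bin/env bash
# Sketch: decode one Secret data value read from stdin.
# decode_secret_value is a hypothetical helper name.
decode_secret_value() {
  local key=$1
  case "$key" in
    *.gz) base64 -d | gzip -dc ;;  # e.g. prometheus.yaml.gz: base64, then gunzip
    *)    base64 -d ;;             # e.g. alertmanager.yaml: base64 only
  esac
}

# Local demonstration with made-up data (no cluster needed).
plain_value=$(printf 'receiver: "null"\n' | base64 | tr -d '\n')
gzip_value=$(printf 'scrape_interval: 30s\n' | gzip -c | base64 | tr -d '\n')

printf '%s' "$plain_value" | decode_secret_value alertmanager.yaml
printf '%s' "$gzip_value"  | decode_secret_value prometheus.yaml.gz

# Against a cluster, assuming the Secret and key names shown in this article:
#   kubectl get secret -n monitoring prometheus-k8s \
#     -o jsonpath='{.data.prometheus\.yaml\.gz}' | decode_secret_value prometheus.yaml.gz
#   kubectl get secret -n monitoring alertmanager-main \
#     -o jsonpath='{.data.alertmanager\.yaml}' | decode_secret_value alertmanager.yaml
```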

So how do we modify this file?

Let's start with the somewhat simpler case: modifying the alertmanager-main Secret.
1. Take the content of the alertmanager-main Secret retrieved above:

"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "null"
"route":
  "group_by":
  - "job"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "null"
  "repeat_interval": "12h"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "null"

2. Change it to the following and save it to the file alertmanage-main-upd-v1.yaml:

global:
  resolve_timeout: 5m
  smtp_auth_username: info@jz-ins.com
  smtp_auth_password: Dgn1JfL4oBfHTWPE
  smtp_from: info@jz-ins.com
  smtp_require_tls: false
  smtp_smarthost: smtp.qq.com:465
receivers:
- email_configs:
  - headers:
      Subject: '[ERROR] prometheus............'
    to: 517469812@qq.com,snail@jz-ins.com,dracula@jz-ins.com
  name: team-X-mails
- name: "null"
route:
  group_by:
  - alertname
  - cluster
  - service
  group_interval: 5m
  group_wait: 60s
  receiver: team-X-mails
  repeat_interval: 24h
  routes:
  - match:
      alertname: Watchdog
    receiver: "null"

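3. Base64-encode the adjusted alertmanage-main-upd-v1.yaml file on the command line and write the result to alertmanage-main-upd-v1.yaml.txt. The -w 0 option turns off GNU base64 line wrapping, so the result is a single line that can be pasted as the Secret value. Base64 encoding and decoding can also be done with an online tool such as https://www.sojson.com/base64.html

base64 -w 0 alertmanage-main-upd-v1.yaml > alertmanage-main-upd-v1.yaml.txt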

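4. Open alertmanage-main-upd-v1.yaml.txt and copy the encoded string. Run kubectl edit secrets -n monitoring alertmanager-main, replace the value of alertmanager.yaml under data with the copied string, then save and exit (Esc, then :wq). Once the update is done, open the Alertmanager UI to check that the configuration has taken effect.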
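The copy-and-paste in steps 3 and 4 can go wrong quietly, for example through a truncated string. The sketch below is one possible way to check the encoded file before using it, and to build a merge patch as a non-interactive alternative to kubectl edit. The helper names `verify_base64_file` and `build_secret_patch` are hypothetical, the demo file holds made-up content, and the `kubectl patch` line in the final comment is an assumption about how one might apply the result, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; nothing here contacts a cluster.

# Succeeds only if the base64 text in file $2 decodes to exactly the bytes of file $1.
verify_base64_file() {
  local original=$1 encoded=$2
  base64 -d < "$encoded" | cmp -s - "$original"
}

# Prints a JSON merge patch that sets data["alertmanager.yaml"] to the
# base64 encoding of the given file. Base64 output contains no characters
# that need JSON escaping, so plain printf is enough here.
build_secret_patch() {
  local file=$1 encoded
  encoded=$(base64 < "$file" | tr -d '\n')
  printf '{"data":{"alertmanager.yaml":"%s"}}' "$encoded"
}

# Demo with a throwaway file (made-up content, not a real configuration).
workdir=$(mktemp -d)
printf 'route:\n  receiver: "null"\n' > "$workdir/config.yaml"
base64 < "$workdir/config.yaml" > "$workdir/config.yaml.txt"

if verify_base64_file "$workdir/config.yaml" "$workdir/config.yaml.txt"; then
  echo "encoded file matches the original"
fi
build_secret_patch "$workdir/config.yaml"
echo
rm -rf "$workdir"

# Against a cluster, assuming the names used in this article:
#   kubectl patch secret alertmanager-main -n monitoring --type merge \
#     -p "$(build_secret_patch alertmanage-main-upd-v1.yaml)"
```

A patch avoids opening an editor, which makes the update repeatable from a script. The interactive kubectl edit in step 4 works just as well for a one-off change.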
