
Building a Kubernetes Monitoring Platform


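Preface

Kubernetes (k8s) is currently the most popular foundation for managing container clusters, and very much a product of today's microservice-driven internet era. For Kubernetes concepts, deployment, and hands-on practice, see the articles published earlier by this account:

Kubernetes architecture principles and core concepts

Deploying a Kubernetes cluster

Kubernetes in practice

This article continues the series with building a monitoring platform for Kubernetes. Anyone who has had to track down a production problem knows why monitoring matters. A well-built platform lets us check the resource usage of production servers at any time: CPU, memory, disk, network I/O, and so on. Beyond visualizing resources, we can also configure alerts. When usage exceeds a configured threshold, the responsible people are notified right away, by email for example, so they can add capacity or investigate the code early. That improves the availability and stability of the production environment, strengthens users' trust in the application, and reduces the odds of working overtime to chase problems.

The rest of this article builds a Kubernetes monitoring platform on Prometheus + Grafana. In this pairing, Prometheus plays roughly the role that Logstash (collection) plus Elasticsearch (storage) play in the ELK stack, and Grafana plays the role of Kibana.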

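Introduction to Prometheus

Official documentation: https://prometheus.io/docs/

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted it, and the project has a very active developer and user community. It is now a standalone open-source project, maintained independently of any company. To emphasize this and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.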

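Prometheus features

  • A multi-dimensional data model, with time series identified by a metric name and key/value pairs

  • PromQL, a flexible query language that takes advantage of this dimensionality

  • No reliance on distributed storage; single server nodes are autonomous

  • Time series are collected via a pull model over HTTP

  • Pushing time series is supported via an intermediary gateway

  • Targets are discovered via service discovery or static configuration

  • Multiple modes of graphing and dashboarding support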
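
To make the first feature a little more concrete, here is a minimal Python sketch that splits one sample line, in the text format exporters serve over HTTP, into its metric name, key/value labels, and value. The sample line is only an illustration shaped like node-exporter output, and the parser is deliberately simplified: it assumes no escaped quotes inside label values and no trailing timestamp.

```python
import re

# One sample line looks like:  metric_name{label="value",...} 123.4
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse_sample(line):
    """Split a sample line into (metric name, label dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = {
        m.group("key"): m.group("val")
        for m in LABEL_RE.finditer(match.group("labels") or "")
    }
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    # Illustrative sample only; real output depends on the exporter version.
    print(parse_sample('node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67'))
```

Each distinct combination of metric name and labels is its own time series, which is what PromQL later filters and aggregates on.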

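Prometheus ecosystem components

  • prometheus (the main server)

  • client libraries (client-side components for instrumenting applications)

  • pushgateway (the push gateway)

  • exporters (components that expose metrics from other systems)

  • alertmanager (the alerting component)

  • various support tools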

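Prometheus architecture

Prometheus scrapes metrics from instrumented jobs, either directly or, for short-lived jobs, via an intermediary push gateway. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. Grafana or other API consumers can be used to visualize the collected data.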

Introduction to Grafana

Official documentation: https://grafana.com/docs/

In short, Grafana's job here is to read data from Prometheus and visualize it as dashboards. The finished result looks like this:

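Deploying Prometheus

I don't recommend reading the official documentation from cover to cover at this point, since a programmer's time is usually very limited. For the same reason, this article does not explain what each configuration option does; just follow the steps one by one. Once the platform is running and you have tried it out, you can go back and read the official documentation in detail when you have time.

Step 1: On the k8s-master node, create a directory, for example /k8smonitor. All configuration files below are kept in this directory.

Step 2: Change into /k8smonitor and create node-exporter.yaml with the following content: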


   
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 9100
    nodePort: 31672
    protocol: TCP
  type: NodePort
  selector:
    k8s-app: node-exporter

Step 3: Create rbac-setup.yaml with the following content:


   
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system

Step 4: Create configmap.yaml with the following content:


   
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-nodes'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
    - job_name: 'kubernetes-services'
      kubernetes_sd_configs:
      - role: service
      metrics_path: /probe
      params:
        module: [http_2xx]
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name
    - job_name: 'kubernetes-ingresses'
      kubernetes_sd_configs:
      - role: ingress
      relabel_configs:
      - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_ingress_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_ingress_name]
        target_label: kubernetes_name
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
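
The relabel rules in this file are the least obvious part, so here is a rough Python sketch of what a single "replace" rule does. It assumes the behavior described in the Prometheus documentation: the source label values are joined with ";", the regex must match the whole joined string, and $1 / ${1} in the replacement refer to capture groups. The function name and the sample addresses are made up for illustration, and Python's re module only approximates the RE2 syntax that Prometheus actually uses.

```python
import re


def relabel_replace(source_values, regex, replacement, separator=";"):
    """Mimic one 'replace' relabel rule on already-looked-up label values.

    Returns the new target value, or None when the regex does not match
    (in that case the target label is left as it was).
    """
    joined = separator.join(source_values)
    # The regex is anchored at both ends, hence fullmatch.
    match = re.fullmatch(regex, joined)
    if match is None:
        return None
    # Translate $1 / ${1} references into Python's \g<1> syntax.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    return match.expand(template)


if __name__ == "__main__":
    # Pod address plus the prometheus.io/port annotation (hypothetical values).
    print(relabel_replace(["10.244.1.7:8080", "9102"],
                          r"([^:]+)(?::\d+)?;(\d+)", "$1:$2"))
    # Node name rewritten into the API server proxy path.
    print(relabel_replace(["k8s-node1"], r"(.+)",
                          "/api/v1/nodes/${1}/proxy/metrics"))
```

Read this way, the kubernetes-pods and kubernetes-service-endpoints jobs only rewrite the scrape address when a prometheus.io/port annotation is present; without it the regex does not match and the discovered address is kept.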

Step 5: Create prometheus.deploy.yaml with the following content:


   
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: prometheus-deployment
  name: prometheus
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - image: prom/prometheus:v2.0.0
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention=24h"
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - mountPath: "/prometheus"
          name: data
        - mountPath: "/etc/prometheus"
          name: config-volume
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 500m
            memory: 2500Mi
      serviceAccountName: prometheus
      volumes:
      - name: data
        emptyDir: {}
      - name: config-volume
        configMap:
          name: prometheus-config

Step 6: Create prometheus.svc.yaml with the following content:


   
---
kind: Service
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30006
  selector:
    app: prometheus

Step 7: On the k8s-master node, change into /k8smonitor and run the following commands in order to deploy Prometheus:

kubectl apply -f node-exporter.yaml
kubectl apply -f rbac-setup.yaml
kubectl apply -f configmap.yaml
kubectl apply -f prometheus.deploy.yaml
kubectl apply -f prometheus.svc.yaml

Step 8: On the k8s-master node, check that everything started with the following commands:

kubectl get pods -n kube-system | grep prometheus
kubectl get deploy -n kube-system | grep prometheus
kubectl get svc -n kube-system | grep prometheus
kubectl get DaemonSet -n kube-system | grep node-exporter

Deploying Grafana

Step 1: Still in the /k8smonitor directory on the k8s-master node, create grafana-deploy.yaml with the following content:


   
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-core
  namespace: kube-system
  labels:
    app: grafana
    component: core
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
        component: core
    spec:
      containers:
      - image: grafana/grafana:4.2.0
        name: grafana-core
        imagePullPolicy: IfNotPresent
        # env:
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi
        env:
          # The following env variables set up basic auth with the default admin user and admin password.
          - name: GF_AUTH_BASIC_ENABLED
            value: "true"
          - name: GF_AUTH_ANONYMOUS_ENABLED
            value: "false"
          # - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          #   value: Admin
          # does not really work, because of template variables in exported dashboards:
          # - name: GF_DASHBOARDS_JSON_ENABLED
          #   value: "true"
        readinessProbe:
          httpGet:
            path: /login
            port: 3000
          # initialDelaySeconds: 30
          # timeoutSeconds: 1
        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var
      volumes:
      - name: grafana-persistent-storage
        emptyDir: {}
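
The comment in this manifest about keeping request = limit refers to Kubernetes QoS classes. As a rough illustration, the Python sketch below applies the documented rules to the resource settings used in this article. The function name and the dict layout are my own, and quantities are compared as plain strings, so "100m" and "0.1" would wrongly count as different.

```python
def qos_class(containers):
    """Return the pod QoS class for a list of container resource specs.

    Each container is a dict with optional "requests" and "limits" dicts,
    mirroring the `resources` section of a pod spec.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit, so only the limit
            # must be present; an explicit request must equal it.
            if limit is None or requests.get(resource, limit) != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    grafana = [{"requests": {"cpu": "100m", "memory": "100Mi"},
                "limits": {"cpu": "100m", "memory": "100Mi"}}]
    prometheus = [{"requests": {"cpu": "100m", "memory": "100Mi"},
                   "limits": {"cpu": "500m", "memory": "2500Mi"}}]
    print(qos_class(grafana), qos_class(prometheus))
```

Under these rules the Grafana pod lands in the Guaranteed class, while the Prometheus pod from the earlier step, whose limits are higher than its requests, is Burstable.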

Step 2: Create grafana-svc.yaml with the following content:


   
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: kube-system
  labels:
    app: grafana
    component: core
spec:
  type: NodePort
  ports:
  - port: 3000
    targetPort: 3000
    nodePort: 30003
  selector:
    app: grafana
    component: core

Step 3: Create grafana-ing.yaml with the following content:


   
# Ingress is not served by apps/v1. The serviceName/servicePort backend
# below is the v1beta1 schema, which was removed in Kubernetes 1.22.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana
  namespace: kube-system
spec:
  rules:
  - host: k8s.grafana
    http:
      paths:
      - path: /
        backend:
          serviceName: grafana
          servicePort: 3000

Step 4: On the k8s-master node, change into /k8smonitor and run the following commands in order to deploy Grafana:

kubectl apply -f grafana-deploy.yaml
kubectl apply -f grafana-svc.yaml
kubectl apply -f grafana-ing.yaml

Step 5: On the k8s-master node, check that everything started with the following commands:

kubectl get pods -n kube-system | grep grafana
kubectl get deploy -n kube-system | grep grafana
kubectl get svc -n kube-system | grep grafana
kubectl get ing -n kube-system | grep grafana

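At this point, Prometheus and Grafana are both deployed!

Accessing Grafana from a browser

Step 1: Look up the port of the Service created from grafana-svc.yaml:

kubectl get svc -n kube-system

Step 2: Open Grafana in a browser:

http://10.68.212.104:30003/login

The initial username and password are both admin (admin/admin).

Step 3: Add a data source (Add datasource):

Step 4: Look up the cluster IP and port of Prometheus. Note that you must use the ClusterIP here, together with the Service port that forwards to the container (the port value in the svc definition):

In this example, CLUSTER-IP is 10.98.71.71 and the service port is 9090.

Step 5: Continue configuring the Prometheus data source:

Finally, click the Add button to add the data source and make sure the test passes.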
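
If the test does not pass, one thing worth checking is whether Prometheus answers on that address at all. The Python sketch below builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and summarizes the result of the built-in up metric. The function names are my own, and the address in the usage note is the example ClusterIP from this article, so substitute yours. A ClusterIP is normally only reachable from inside the cluster, so run this on a node or in a pod.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def targets_up(response_body):
    """Map each instance to True/False from the JSON result of an `up` query."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return {
        item["metric"].get("instance", "?"): item["value"][1] == "1"
        for item in payload["data"]["result"]
    }


def check_prometheus(base_url, timeout=5):
    """Query `up` on a live Prometheus server and return the summary."""
    with urlopen(build_query_url(base_url, "up"), timeout=timeout) as resp:
        return targets_up(resp.read())
```

For example, check_prometheus("http://10.98.71.71:9090") returns a dict of scrape targets and whether each one is currently up. If the call itself fails to connect, the problem is in the Service or the pod rather than in Grafana.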

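Step 6: Import a built-in dashboard template:

Step 7: Enter the ID of an online Prometheus dashboard template. Here we use the template with ID 315:

Step 8: Select the data source and click Import to visualize the data:

Step 9: All done. The result looks like this: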


   

Reposted from: https://blog.csdn.net/lzy_zhi_yuan/article/details/112001166