I. Requirement
Huawei Cloud CCE provides a Prometheus add-on that is easy to install, but it offers no corresponding add-on for Grafana. In some cases you may also want a customized installation. This article records how to use Helm to install Prometheus + Grafana on a CCE Kubernetes cluster.
II. Helm Environment Preparation
Helm can be installed on any host that can manage the CCE Kubernetes cluster. You can also refer to the official Huawei Cloud guide: https://support.huaweicloud.com/intl/en-us/usermanual-cce/cce_10_0144.html .
The installation commands are as follows:
# configure kubeconfig
[root@ccetest-87180 .kube]# mv /tmp/kubeconfig.json config
[root@ccetest-87180 .kube]# ll
total 8
-rw------- 1 root root 5759 Aug 29 22:26 config
[root@ccetest-87180 .kube]# kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
192.168.0.249   Ready    <none>   16m   v1.19.16-r1-CCE22.5.1
# install helm
wget https://get.helm.sh/helm-v3.3.0-linux-amd64.tar.gz
tar -xzvf helm-v3.3.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm version
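Before installing the binary, the downloaded tarball can be checked for corruption; a minimal sketch, assuming the SHA256 checksum file that the Helm project publishes alongside each release tarball:

```shell
# fetch the published SHA256 checksum and verify the downloaded tarball
wget https://get.helm.sh/helm-v3.3.0-linux-amd64.tar.gz.sha256sum
sha256sum -c helm-v3.3.0-linux-amd64.tar.gz.sha256sum
```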
III. Install the Prometheus Suite
The prometheus-community repository provides the kube-prometheus-stack chart, which bundles Prometheus, Grafana, and Alertmanager and can be installed easily with Helm.
[root@ccetest-87180 ~]# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
[root@ccetest-87180 ~]# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈ Happy Helming!⎈
[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack
NAME: prometheus
LAST DEPLOYED: Mon Aug 29 22:30:15 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
[root@ccetest-87180 ~]# kubectl get svc
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   1s
kubernetes                                ClusterIP   10.247.0.1       <none>        443/TCP                      21m
prometheus-grafana                        ClusterIP   10.247.193.101   <none>        80/TCP                       26s
prometheus-kube-prometheus-alertmanager   ClusterIP   10.247.105.191   <none>        9093/TCP                     26s
prometheus-kube-prometheus-operator       ClusterIP   10.247.198.72    <none>        443/TCP                      26s
prometheus-kube-prometheus-prometheus     ClusterIP   10.247.28.103    <none>        9090/TCP                     26s
prometheus-kube-state-metrics             ClusterIP   10.247.125.106   <none>        8080/TCP                     26s
prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     1s
prometheus-prometheus-node-exporter       ClusterIP   10.247.104.255   <none>        9100/TCP                     26s
# expose Grafana through a NodePort service
[root@ccetest-87180 ~]# kubectl expose deployment prometheus-grafana --target-port=3000 --type=NodePort --name=prometheus-grafana-ext
# or forward the port to the local host
kubectl port-forward deployment/prometheus-grafana 3000
# default Grafana credentials
username: admin
password: prom-operator
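If the admin password has been changed in the chart values, it can also be read back from the Kubernetes secret that the chart creates; a minimal sketch, assuming the release name prometheus (the secret name prometheus-grafana and key admin-password are the chart's defaults):

```shell
# read the Grafana admin password from the secret created by the chart;
# the secret follows the <release>-grafana naming convention
kubectl get secret prometheus-grafana \
  -o jsonpath='{.data.admin-password}' | base64 -d
```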
# edit the chart values as needed, then apply the changes
[root@ccetest-87180 ~]# vim values.yaml
[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml
Release "prometheus" has been upgraded. Happy Helming!
NAME: prometheus
LAST DEPLOYED: Tue Aug 30 01:16:25 2022
NAMESPACE: default
STATUS: deployed
REVISION: 2
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
This only installs the Prometheus suite. If you need other software, you can also install charts from the Helm repository; the commands are as follows:
helm repo add stable https://charts.helm.sh/stable
helm search repo software-name
# list install software and uninstall software
helm list
helm uninstall software-name
# list the repos
helm repo list
After the installation is complete, you can access Grafana at the IP and port that kubectl get svc reports for its service.
IV. Use EVS or SFS Turbo
With the default installation above, Prometheus, Grafana, and Alertmanager store their data in EmptyDir volumes, so when a container is rebuilt after a failure, the previous data is lost. For data persistence you can choose among three types of Huawei Cloud storage services: EVS, OBS, and SFS. However, OBS object storage is relatively slow and not well suited to this scenario, so it is excluded here.
Since the storage configuration is defined in the chart's values, you can first obtain the current defaults with the helm inspect values command, modify the relevant settings, and then install or upgrade through Helm.
[root@ccetest-87180 ~]# helm inspect values prometheus-community/kube-prometheus-stack > prometheus.values.yaml
[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack --values /root/prometheus.values.yaml
1. Storing data with EVS
Before deploying with prometheus.values.yaml, modify the storage definitions for the three services. For Prometheus, the change is as follows:
## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
##
#storageSpec: {}
storageSpec:
  ## Using PersistentVolumeClaim
  ##
  volumeClaimTemplate:
    spec:
      storageClassName: csi-disk
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 80Gi
    # selector: {}
When using EVS storage, you do not need to create the PV/PVC in advance; with this configuration, they are created automatically.
Update the installation (upgrade install):
[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml
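Grafana and Alertmanager can be persisted on EVS in the same values file; a hedged sketch, assuming the chart's grafana.persistence and alertmanager.alertmanagerSpec.storage keys (confirm the exact keys for your chart version with helm inspect values):

```yaml
grafana:
  persistence:
    enabled: true
    storageClassName: csi-disk
    size: 10Gi
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: csi-disk
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
```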
2. Storing data with SFS Turbo shares
To use SFS Turbo shared storage, you need to create the SFS Turbo share, PV, and PVC in advance, and then update the relevant content in the values file.
[root@ccetest-87180 ~]# cat pv-sfsturbo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-sfsturbo-example
  annotations:
    pv.kubernetes.io/provisioned-by: everest-csi-provisioner
spec:
  mountOptions:
  - hard
  - timeo=600
  - nolock
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 500Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0
    namespace: default
  csi:
    driver: sfsturbo.csi.everest.io
    fsType: nfs
    volumeAttributes:
      everest.io/share-export-location: 192.168.0.236:/
      storage.kubernetes.io/csiProvisionerIdentity: everest-csi-provisioner
    volumeHandle: 65aac901-875f-40fa-961f-40e50c5a46f8
  persistentVolumeReclaimPolicy: Retain
  storageClassName: csi-sfsturbo
[root@ccetest-87180 ~]# cat pvc-sfsturbo.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: everest-csi-provisioner
  name: prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 500Gi
  storageClassName: csi-sfsturbo
  volumeName: pv-sfsturbo-example
[root@ccetest-87180 ~]# kubectl create -f pv-sfsturbo.yaml -f pvc-sfsturbo.yaml
The corresponding content in the values file is:
## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
##
#storageSpec: {}
storageSpec:
  ## Using PersistentVolumeClaim
  ##
  volumeClaimTemplate:
    spec:
      storageClassName: csi-sfsturbo
      volumeName: pv-sfsturbo-example
      # must match the access mode of the pre-created ReadWriteMany PV/PVC
      accessModes: ["ReadWriteMany"]
      resources:
        requests:
          storage: 500Gi
    # selector: {}
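Grafana can reuse a pre-created SFS Turbo claim in a similar way; a hedged sketch, assuming the grafana sub-chart's persistence.existingClaim key and a hypothetical PVC named pvc-sfsturbo-grafana created like the one above:

```yaml
grafana:
  persistence:
    enabled: true
    existingClaim: pvc-sfsturbo-grafana   # hypothetical PVC; create it first
```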
Similarly, you can install or upgrade with the following commands:
[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack --values /root/values.yaml
or
[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml
For this part you can refer to the official documentation: https://support.huaweicloud.com/intl/zh-cn/usermanual-cce/cce_01_0272.html
The storageClassName value is important: each cloud vendor defines these classes differently. You can use the following commands to confirm the relevant information:
[root@ccetest-87180 ~]# kubectl get sc
[root@ccetest-87180 ~]# kubectl get pv
[root@ccetest-87180 ~]# kubectl get pvc