Huaweicloud CCE helm部署Prometheus(Deploying Prometheus on Huawei Cloud CCE with Helm)

一、需求(requirement)

华为云CCE提供了prometheus插件,可以很方便的进行安装,不过其没有提供grafana的安装插件。同时一些情况下,想通过自定义的方式安装,这篇内容就记录下如何使用 helm 在k8s CCE上进行prometheus + grafana的安装。

HUAWEI CLOUD CCE provides a Prometheus add-on that can be installed easily, but it does not provide a Grafana add-on. In some cases you may also want to install in a custom way, so this article records how to use Helm to install Prometheus + Grafana on a CCE Kubernetes cluster.

二、helm环境准备(helm environment preparation)

helm可以安装在任一台可以管理cce k8s集群的主机上。也可以参考huaweicloud的官方指导:https://support.huaweicloud.com/intl/en-us/usermanual-cce/cce_10_0144.html 。

Helm can be installed on any host that can manage the CCE Kubernetes cluster. You can also refer to the official Huawei Cloud guide: https://support.huaweicloud.com/intl/en-us/usermanual-cce/cce_10_0144.html

安装操作指令如下(The installation commands are as follows):

# configure kubeconfig
[root@ccetest-87180 .kube]# mv /tmp/kubeconfig.json config
[root@ccetest-87180 .kube]# ll
total 8
-rw------- 1 root root 5759 Aug 29 22:26 config
[root@ccetest-87180 .kube]# kubectl get nodes
NAME            STATUS   ROLES    AGE   VERSION
192.168.0.249   Ready    <none>   16m   v1.19.16-r1-CCE22.5.1
# install helm
wget https://get.helm.sh/helm-v3.3.0-linux-amd64.tar.gz
tar -xzvf helm-v3.3.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
helm version

三、安装Prometheus套件(Install the Prometheus suite)

prometheus-community提供了套件kube-prometheus-stack,包含prometheus、grafana、alertmanager软件。可以很方便的通过helm完成安装。

prometheus-community provides the kube-prometheus-stack chart, which bundles Prometheus, Grafana, and Alertmanager; it can be installed easily with Helm.

[root@ccetest-87180 ~]# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
[root@ccetest-87180 ~]# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈ Happy Helming!⎈
[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack
NAME: prometheus
LAST DEPLOYED: Mon Aug 29 22:30:15 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
[root@ccetest-87180 ~]# kubectl get svc
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   1s
kubernetes                                ClusterIP   10.247.0.1       <none>        443/TCP                      21m
prometheus-grafana                        ClusterIP   10.247.193.101   <none>        80/TCP                       26s
prometheus-kube-prometheus-alertmanager   ClusterIP   10.247.105.191   <none>        9093/TCP                     26s
prometheus-kube-prometheus-operator       ClusterIP   10.247.198.72    <none>        443/TCP                      26s
prometheus-kube-prometheus-prometheus     ClusterIP   10.247.28.103    <none>        9090/TCP                     26s
prometheus-kube-state-metrics             ClusterIP   10.247.125.106   <none>        8080/TCP                     26s
prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     1s
prometheus-prometheus-node-exporter       ClusterIP   10.247.104.255   <none>        9100/TCP                     26s
# expose Grafana via a NodePort Service
[root@ccetest-87180 ~]# kubectl expose deployment prometheus-grafana --target-port=3000 --type=NodePort --name=prometheus-grafana-ext
# or forward the port locally instead of creating a NodePort Service
[root@ccetest-87180 ~]# kubectl port-forward deployment/prometheus-grafana 3000
# default login -- username: admin, password: prom-operator
[root@ccetest-87180 ~]# vim values.yaml
[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml
Release "prometheus" has been upgraded. Happy Helming!
NAME: prometheus
LAST DEPLOYED: Tue Aug 30 01:16:25 2022
NAMESPACE: default
STATUS: deployed
REVISION: 2
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=prometheus"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
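The default Grafana login printed above is also stored in a Secret that the chart creates. A minimal sketch of reading it back (the Secret name `prometheus-grafana` and key `admin-password` are assumed from the release name used here; confirm with `kubectl get secret`):

```shell
# Read the Grafana admin password from the chart-created Secret
# (Secret name assumed from the release name "prometheus"):
#   kubectl get secret prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d
# Secret values are base64-encoded; the decode step works like this:
echo 'cHJvbS1vcGVyYXRvcg==' | base64 -d
```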

这里只是安装prometheus套件,如果需要安装其他软件,也可以使用helm仓库里的charts安装,命令如下(This only installs the Prometheus suite; other software can also be installed from charts in the Helm repositories, with commands such as the following):

helm repo add stable https://charts.helm.sh/stable
helm search repo software-name
# list installed releases and uninstall a release (helm uninstall takes the release name)
helm list
helm uninstall release-name
# list the repos
helm repo list

完成安装后,可以使用kubectl get svc查询到的grafana对应的IP + 端口进行访问(After the installation is complete, Grafana can be accessed at the IP + port shown for it by kubectl get svc)。
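Instead of running kubectl expose again after the Service is deleted, the NodePort can also be declared in values.yaml and applied with helm upgrade. A sketch, assuming the bundled grafana subchart's service.type key:

```yaml
# values.yaml fragment for kube-prometheus-stack:
# expose the bundled Grafana through a NodePort Service.
grafana:
  service:
    type: NodePort
```

Apply it with helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml.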

[Image: cce-prometheus]

四、使用EVS或SFS Turbo(Use EVS or SFS Turbo)

按照上面的默认安装,在k8s中prometheus、grafana、alertmanager的数据存储类型是EmptyDir,这样就导致容器出现异常重建时,之前的数据就不存在了。所以要做数据的持久化,可以选择使用Huaweicloud的EVS、OBS、SFS三个类型的存储服务,不过由于OBS对象存储速度较慢,不太适合该场景,这里就排除掉了。

With the default installation above, Prometheus, Grafana, and Alertmanager store their data in EmptyDir volumes, so when a container is rebuilt the previous data is lost. For data persistence you can choose among three types of Huawei Cloud storage services: EVS, OBS, and SFS. However, OBS object storage is relatively slow and not well suited to this scenario, so it is excluded here.

由于使用数据存储方式的定义是在values里定义的,所以可以先通过 helm inspect values 指令获取当前的默认配置,在修改里面的相关配置后,通过Helm进行安装或更新。

Since the storage configuration is defined in the chart's values, you can first dump the current defaults with the helm inspect values command, modify the relevant settings, and then install or upgrade with Helm.

[root@ccetest-87180 ~]# helm inspect values prometheus-community/kube-prometheus-stack > prometheus.values.yaml
[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack --values /root/prometheus.values.yaml

1. 使用EVS存储数据(Storing data with EVS)

在使用prometheus.values.yaml部署之前,可以修改其中的三个服务的存储部分的定义,具体修改内容如下(Before deploying with prometheus.values.yaml, you can modify the definition of the storage part of the three services. The specific modifications are as follows):

## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
##
# storageSpec: {}
storageSpec:
  ## Using PersistentVolumeClaim
  ##
  volumeClaimTemplate:
    spec:
      storageClassName: csi-disk
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 80Gi
#   selector: {}

在使用EVS存储时,不需要提前创建PV/PVC,使用该配置后,系统会自动创建。(When using EVS storage, you do not need to create the PV/PVC in advance; with this configuration, the system creates them automatically.)
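The storageSpec above persists only the Prometheus time-series data. Grafana and Alertmanager have their own keys in the same values file; the following is a sketch based on the chart's documented values (the csi-disk class and the 10Gi sizes are assumptions to adjust):

```yaml
# Persist Grafana settings and dashboards (grafana subchart values).
grafana:
  persistence:
    enabled: true
    storageClassName: csi-disk
    accessModes: ["ReadWriteOnce"]
    size: 10Gi

# Persist Alertmanager data; alertmanagerSpec.storage mirrors the
# structure of the Prometheus storageSpec shown above.
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: csi-disk
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
```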

更新安装(upgrade install)

[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml

2. 使用SFS Turbo共享存储数据(Store data using SFS Turbo shares)

使用SFS Turbo共享存储,需要提前创建好文件系统、PV、PVC,再在values文件中更新相关内容。(To use SFS Turbo shared storage, you need to create the file system, PV, and PVC in advance, and then update the relevant content in the values file.)

[root@ccetest-87180 ~]# cat pv-sfsturbo.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-sfsturbo-example
  annotations:
    pv.kubernetes.io/provisioned-by: everest-csi-provisioner
spec:
  mountOptions:
  - hard
  - timeo=600
  - nolock
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 500Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0
    namespace: default
  csi:
    driver: sfsturbo.csi.everest.io
    fsType: nfs
    volumeAttributes:
      everest.io/share-export-location: 192.168.0.236:/
      storage.kubernetes.io/csiProvisionerIdentity: everest-csi-provisioner
    volumeHandle: 65aac901-875f-40fa-961f-40e50c5a46f8
  persistentVolumeReclaimPolicy: Retain
  storageClassName: csi-sfsturbo
[root@ccetest-87180 ~]# cat pvc-sfsturbo.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: everest-csi-provisioner
  name: prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 500Gi
  storageClassName: csi-sfsturbo
  volumeName: pv-sfsturbo-example
[root@ccetest-87180 ~]# kubectl create -f pv-sfsturbo.yaml -f pvc-sfsturbo.yaml

对应的values文件中的内容为(The content in the corresponding values file is):

## Prometheus StorageSpec for persistent data
## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md
##
# storageSpec: {}
storageSpec:
  ## Using PersistentVolumeClaim
  ##
  volumeClaimTemplate:
    spec:
      storageClassName: csi-sfsturbo
      volumeName: pv-sfsturbo-example
      accessModes: ["ReadWriteMany"]
      resources:
        requests:
          storage: 500Gi
#   selector: {}

同样的可以使用如下命令安装或更新(Similarly, it can be installed or upgraded with the following commands):

[root@ccetest-87180 ~]# helm install prometheus prometheus-community/kube-prometheus-stack --values /root/values.yaml
or
[root@ccetest-87180 ~]# helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -f values.yaml

[Image: cloud-volume-sfs-cce]

这部分可以参考官方文档:https://support.huaweicloud.com/intl/zh-cn/usermanual-cce/cce_01_0272.html

For this part, you can refer to the official documentation: https://support.huaweicloud.com/intl/zh-cn/usermanual-cce/cce_01_0272.html

storageClassName的类型比较重要,每家云厂商关于这部分的定义是不同的,具体可以使用以下指令确认相关信息。(The storageClassName value matters: each cloud vendor defines its storage classes differently. The following commands can be used to confirm the relevant information.)

[root@ccetest-87180 ~]# kubectl get sc
[root@ccetest-87180 ~]# kubectl get pv
[root@ccetest-87180 ~]# kubectl get pvc
