Using Custom Metrics for HPA Autoscaling in K8s

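By default, k8s autoscaling relies on metrics-server. HPA based on custom metrics (custom HPA) is usually implemented with Prometheus and prometheus-adapter. Prometheus monitors application load and the cluster's own metrics, while Prometheus Adapter lets us take the metrics collected by Prometheus and use them to define scaling policies. All of these metrics are exposed through the API server.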
[Figure: prometheus-hpa]

1. Environment Preparation

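Here I again use Huawei CCE, which avoids installing k8s, Prometheus, and prometheus-adapter by hand (that is not complicated, but a few clicks is less work). Tick the Prometheus add-on during setup, and both Prometheus and prometheus-adapter are installed.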
[Figure: prometheus-adapter]

The test images used here are nginx and nginx-exporter; see the earlier post "K8S中使用Prometheus监控nginx指标" (Monitoring nginx metrics with Prometheus in K8s).

Confirm the setup with the following commands:

# Confirm that the nginx application is running
[root@testcce-68506-l3jp4 nginx]# kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE    IP             NODE            NOMINATED NODE   READINESS GATES
nginx-exporter-5dc4dcd94-6tdp7   2/2     Running   0          37m    172.16.0.135   192.168.0.211   <none>           <none>
# Confirm that the metrics endpoint returns data
[root@testcce-68506-l3jp4 nginx]# curl 172.16.0.135:9113/metrics
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 198  # this metric is present

The nginx_http_requests_total metric above is the one the HPA will use later.

Confirm that k8s supports custom metrics:

# kubectl get apiservices
# kubectl get apiservices v1beta1.custom.metrics.k8s.io
# kubectl api-resources
# kubectl api-resources|grep metrics.k8s.io

2. Custom Rules and HPA Configuration

Customizing the prometheus-adapter configuration

Export the current adapter-config rules:

kubectl -n monitoring get configmaps adapter-config -o yaml > rule.yaml

Edit the file and add the custom rule under rules:

apiVersion: v1
data:
  config.yaml: |-
    rules:
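    - seriesQuery: '{__name__=~"^http_requests_.*",kubernetes_pod_name!="",kubernetes_namespace!=""}'  # Prometheus regexes are fully anchored, so this only matches names starting with http_requests_; for the nginx metric, use the exact name nginx_http_requests_total here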
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name:
        matches: ^(.*)_total$
        as: "${1}_per_second"
      metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))
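
To make the name rule concrete, here is a minimal Python sketch of what the matches/as pair does to a series name, and of how an anchored seriesQuery pattern selects names. It is only an illustration: prometheus-adapter is written in Go and uses its own regex engine, and the helper names series_query_matches and adapter_rename are made up for this example.

```python
import re


def series_query_matches(pattern: str, metric_name: str) -> bool:
    """Mimic a Prometheus =~ matcher, which must match the whole name."""
    return re.fullmatch(pattern, metric_name) is not None


def adapter_rename(metric_name: str,
                   matches: str = r"^(.*)_total$",
                   as_template: str = r"\1_per_second") -> str:
    """Apply a name.matches / name.as pair (hypothetical helper).

    The adapter writes the capture group as ${1}; Python writes it as \\1.
    Names that do not match the pattern are returned unchanged.
    """
    return re.sub(matches, as_template, metric_name)


print(adapter_rename("nginx_http_requests_total"))
print(series_query_matches(r"^http_requests_.*", "http_requests_total"))
print(series_query_matches(r"^http_requests_.*", "nginx_http_requests_total"))
```

The last call returns False, which is why a pattern starting with ^http_requests_ does not select nginx_http_requests_total on its own.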

Before applying the file, delete the following four lines from it; otherwise the apply fails with the error: Operation cannot be fulfilled on configmaps "ads-central-configuration": the object has been modified; please apply your changes to the latest version and try again

creationTimestamp:
resourceVersion:
selfLink:
uid:

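List all custom metrics. In theory nginx_http_requests_per_second should appear, because matches/as renames the metric. Oddly, on CCE the suffix was not replaced: the name shown was nginx_http_requests, so only the first part of the regex seemed to apply (the cause is explained in the postscript):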

kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1"

The command above prints a lot of output. The following command queries a single metric and formats the result (piping to jq . or python -m json.tool both work):

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_http_requests" | python -m json.tool
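
If the response needs to be processed rather than just read, a short Python sketch like the one below can pull the per-pod values out of it. The SAMPLE payload is an assumption: it is hand-written to resemble a custom.metrics.k8s.io/v1beta1 MetricValueList, with a pod name from this post and made-up values, so check the field names against your own output before relying on it. The function name pod_values is also hypothetical.

```python
import json

# Hypothetical sample shaped like a MetricValueList response.
# The values are invented for illustration, not captured from the cluster.
SAMPLE = """
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {"kind": "Pod", "namespace": "default",
                          "name": "nginx-exporter-5dc4dcd94-6tdp7"},
      "metricName": "nginx_http_requests",
      "value": "88m"
    },
    {
      "describedObject": {"kind": "Pod", "namespace": "default",
                          "name": "nginx-exporter-5dc4dcd94-99vq5"},
      "metricName": "nginx_http_requests",
      "value": "90m"
    }
  ]
}
"""


def pod_values(raw: str) -> dict:
    """Map each pod name to the raw value string reported for it."""
    doc = json.loads(raw)
    return {
        item["describedObject"]["name"]: item["value"]
        for item in doc.get("items", [])
    }


for pod, value in pod_values(SAMPLE).items():
    print(pod, value)
```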

Configuring the HPA

[root@testcce-68506-l3jp4 nginx]# more hpa.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-custom-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-exporter
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: nginx_http_requests
      targetAverageValue: 10

After applying the HPA, use watch 'kubectl get hpa' or kubectl describe hpa nginx-custom-hpa to view the details.

[root@testcce-68506-l3jp4 nginx]# kubectl get hpa
NAME               REFERENCE                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-exporter   88m/10    2         5         2          3h39m

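Note the 88m in the TARGETS column above. 1000m represents 1, the same milli-unit notation used for CPU quotas, so 1000m here means one request per second and 88m means 0.088.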
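
The conversion can be checked with a few lines of Python. This is a minimal sketch that only handles the milli suffix and bare numbers seen in this post; the full Kubernetes quantity notation has more suffixes, and parse_metric_quantity is a name chosen for this example.

```python
from decimal import Decimal


def parse_metric_quantity(text: str) -> Decimal:
    """Convert a value such as '88m' or '10' to a plain number.

    Only the 'm' (milli) suffix and bare numbers are handled here.
    """
    if text.endswith("m"):
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


# Values as they appear in the kubectl get hpa output in this post.
for shown in ("88m", "25757m", "55741m", "10"):
    print(shown, "->", parse_metric_quantity(shown))
```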

Verification

Expose the service with kubectl expose deployment nginx-exporter --type=NodePort --name=nginx-nodeport --port=80, then access it as follows:

[root@testcce-68506-l3jp4 nginx]# kubectl describe svc nginx-nodeport
Name:                     nginx-nodeport
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=nginx-exporter
Type:                     NodePort
IP:                       10.247.52.252
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  31029/TCP
Endpoints:                172.16.0.130:80,172.16.0.131:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
[root@testcce-68506-l3jp4 nginx]# kubectl get nodes
NAME            STATUS   ROLES    AGE     VERSION
192.168.0.211   Ready    <none>   5h11m   v1.19.10-r0-CCE21.11.1.B005-21.11.1.B005
192.168.0.241   Ready    <none>   5h10m   v1.19.10-r0-CCE21.11.1.B005-21.11.1.B005
[root@testcce-68506-l3jp4 nginx]# curl 192.168.0.211:31029
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Next, use a simple while loop to generate load:

[root@testcce-68506-l3jp4 nginx]# while true;do curl 192.168.0.211:31029;done
# In another terminal, follow the changes with watch 'kubectl get hpa'
[root@testcce-68506-l3jp4 ~]# kubectl get hpa
NAME               REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-exporter   25757m/10   2         5         2          3h49m
[root@testcce-68506-l3jp4 ~]# kubectl get hpa
NAME               REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-exporter   55741m/10   2         5         4          3h49m
[root@testcce-68506-l3jp4 ~]# kubectl get hpa
NAME               REFERENCE                   TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
nginx-custom-hpa   Deployment/nginx-exporter   55741m/10   2         5         4          3h49m
[root@testcce-68506-l3jp4 ~]# kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
nginx-exporter-5dc4dcd94-6tdp7   2/2     Running   2          3h49m
nginx-exporter-5dc4dcd94-99vq5   2/2     Running   2          3h47m
nginx-exporter-5dc4dcd94-kzcnx   2/2     Running   0          35s
nginx-exporter-5dc4dcd94-lfwvp   2/2     Running   0          35s
nginx-exporter-5dc4dcd94-mmmsv   2/2     Running   0          20s
web-terminal-6f975b97d7-6qrrf    1/1     Running   1          5h9m

In the end, nginx-exporter grows to 5 pods and then stops, because maxReplicas is 5.
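
The numbers above line up with the replica formula in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the min/max bounds. The Python sketch below applies it to the values seen in this test. It is a simplification under stated assumptions: the 10% tolerance is the documented default, and the real controller also applies scale-up rate limits and readiness handling that this sketch leaves out, so it does not reproduce the intermediate step of 4 replicas.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, min_replicas: int = 2,
                     max_replicas: int = 5, tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation (rate limits are not modelled)."""
    ratio = current_value / target_value
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the bounds set in the HPA spec.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(2, 0.088, 10))   # idle: stays at minReplicas
print(desired_replicas(2, 25.757, 10))  # under load: capped at maxReplicas
print(desired_replicas(4, 55.741, 10))  # still capped at maxReplicas
```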

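Note that scale-down is not as fast: it only happens after 5 minutes (300 seconds). This is controlled by the behavior field (available in autoscaling/v2beta2 and later; the values below are the defaults):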

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 4
      periodSeconds: 15
    selectPolicy: Max
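
One way to picture the scale-down stabilization window: the controller keeps its recent replica recommendations and, when scaling down, acts on the highest one inside the window. The Python sketch below illustrates that idea only. The timestamps and the function name stabilized_scale_down are hypothetical, and the per-period policies (Percent, Pods) are not modelled.

```python
def stabilized_scale_down(recommendations, now, window_seconds=300):
    """Return the highest recommendation made within the window.

    recommendations: list of (timestamp_seconds, desired_replicas) tuples.
    Returns None if no recommendation falls inside the window.
    """
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    return max(recent) if recent else None


# Hypothetical history: load stops right after t=0, so later
# recommendations drop from 5 back to the minimum of 2.
history = [(0, 5), (60, 2), (120, 2), (240, 2), (300, 2), (315, 2)]

print(stabilized_scale_down(history, now=290))  # t=0 still in the window
print(stabilized_scale_down(history, now=315))  # t=0 has aged out
```

With a shorter window_seconds the old high recommendation ages out sooner, which is why lowering stabilizationWindowSeconds makes scale-down faster.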

To reclaim pods faster, adjust this field. For details, see the official documentation or the write-up on Zhihu.

Postscript:

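This resolves the problem from the prometheus-adapter configuration section above, where the regex rename did not take effect.
After reading a fair amount of material and testing repeatedly, I finally found that the custom-metrics-apiserver service must be restarted after every update to the rule file for the change to take effect.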

1. The rule can be configured here:

[Figure: cce-hpa-rule]

Update the rule configuration:

[Figure: update-configmap-rule]

Restart the apiserver so the change takes effect. Because it runs as a container, deleting the pod and letting k8s recreate it is enough:

[Figure: custom-metrics-apiserver]

Query the metrics again; the result is as follows:

➜  ~ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq |grep nginx|grep per
      "name": "pods/nginx_http_requests_per_second",
      "name": "namespaces/nginx_http_requests_per_second",
