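1. Introduction
The default scheduler that ships with Kubernetes (default-scheduler) places pods based on the resource requests they declare, not on how much of each node is actually in use. As a result, real usage across nodes can become unbalanced: some nodes run hot while others sit nearly idle, and resources are wasted. Is there a scheduler that takes actual usage into account?
There is, and it is the topic of this post: Crane-scheduler.
Crane-scheduler is a collection of scheduler plugins built on the Kubernetes scheduling framework, including:
- Dynamic scheduler: a load-aware scheduler plugin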
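To make the request-versus-usage difference concrete, here is a minimal Python sketch. It is not kube-scheduler code; the node names and numbers are made up for illustration. Both nodes carry the same CPU requests, so a request-based view cannot tell them apart, while a usage-based view clearly prefers one of them.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: float   # cores available on the node
    cpu_requested: float  # sum of pod CPU requests, in cores
    cpu_used: float       # CPU actually consumed, in cores


def requested_ratio(node: Node) -> float:
    """Share of capacity already claimed by pod requests."""
    return node.cpu_requested / node.cpu_capacity


def used_ratio(node: Node) -> float:
    """Share of capacity that is actually in use."""
    return node.cpu_used / node.cpu_capacity


# Hypothetical nodes: identical requests, very different real load.
nodes = [
    Node("node-a", cpu_capacity=8, cpu_requested=4, cpu_used=7.2),
    Node("node-b", cpu_capacity=8, cpu_requested=4, cpu_used=1.6),
]

# A request-based view sees both nodes as 50% allocated.
request_view = {n.name: requested_ratio(n) for n in nodes}

# A usage-based view picks the node with the lowest real utilization.
least_loaded = min(nodes, key=used_ratio)
```

In this toy setup `request_view` reports the same ratio for both nodes, whereas `least_loaded` is node-b. That gap is what a load-aware plugin is meant to close.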
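2. Prerequisites
Crane-scheduler's dynamic scheduling depends on Prometheus, so make sure Prometheus is installed in the Kubernetes cluster. If it is not, you can install it as follows: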
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus -n crane-system --set pushgateway.enabled=false --set alertmanager.enabled=false --set server.persistentVolume.enabled=false -f https://raw.githubusercontent.com/gocrane/helm-charts/main/integration/prometheus/override_values.yaml --create-namespace prometheus-community/prometheus
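If Prometheus is already installed in your cluster, create the following recording rules so that the expected aggregated metrics are available (PrometheusRule is a Prometheus Operator resource):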
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-record
spec:
  groups:
  - name: cpu_mem_usage_active
    interval: 30s
    rules:
    - record: cpu_usage_active
      expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
    - record: mem_usage_active
      expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
  - name: cpu-usage-5m
    interval: 5m
    rules:
    - record: cpu_usage_max_avg_1h
      expr: max_over_time(cpu_usage_avg_5m[1h])
    - record: cpu_usage_max_avg_1d
      expr: max_over_time(cpu_usage_avg_5m[1d])
  - name: cpu-usage-1m
    interval: 1m
    rules:
    - record: cpu_usage_avg_5m
      expr: avg_over_time(cpu_usage_active[5m])
  - name: mem-usage-5m
    interval: 5m
    rules:
    - record: mem_usage_max_avg_1h
      expr: max_over_time(mem_usage_avg_5m[1h])
    - record: mem_usage_max_avg_1d
      expr: max_over_time(mem_usage_avg_5m[1d])
  - name: mem-usage-1m
    interval: 1m
    rules:
    - record: mem_usage_avg_5m
      expr: avg_over_time(mem_usage_active[5m])
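Note: the Prometheus scrape interval must be shorter than 30 seconds. Otherwise the rules above (for example cpu_usage_active) may not produce data, because irate over a [30s] window needs at least two samples inside that window.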
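The recording rules build on each other: the instantaneous usage feeds a 5-minute average, and the 5-minute averages feed the 1-hour and 1-day maxima. The Python sketch below mirrors only that arithmetic, using made-up sample values. It does not talk to Prometheus, and the function names simply echo the PromQL functions they imitate.

```python
from statistics import mean


def cpu_usage_active(idle_rates):
    """100 - avg(idle rate per core) * 100, as in the cpu_usage_active rule."""
    return 100 - mean(idle_rates) * 100


def mem_usage_active(mem_available_bytes, mem_total_bytes):
    """100 * (1 - MemAvailable / MemTotal), as in the mem_usage_active rule."""
    return 100 * (1 - mem_available_bytes / mem_total_bytes)


def avg_over_time(samples):
    """Average of all samples in the window (cpu_usage_avg_5m style)."""
    return mean(samples)


def max_over_time(samples):
    """Largest sample in the window (cpu_usage_max_avg_1h style)."""
    return max(samples)


# Hypothetical per-core idle rates for one node (fractions of a second per second).
cpu_now = cpu_usage_active([0.75, 0.25])

# Hypothetical memory figures: 4 GiB available out of 16 GiB.
mem_now = mem_usage_active(4 * 1024**3, 16 * 1024**3)

# Hypothetical cpu_usage_active samples inside a 5-minute window.
cpu_avg_5m = avg_over_time([40.0, 50.0, 60.0])

# Hypothetical cpu_usage_avg_5m samples inside a 1-hour window.
cpu_max_avg_1h = max_over_time([50.0, 70.0, 30.0])
```

With these inputs the node is at 50% CPU and 75% memory right now, its 5-minute CPU average is 50%, and the highest 5-minute average within the hour is 70%.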
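3. Installing Crane-scheduler
There are two ways to install Crane-scheduler:
- Install Crane-scheduler as a second scheduler
- Replace the native Kube-scheduler with Crane-scheduler
3.1 Installing Crane-scheduler as a second scheduler
Adjust the prometheusAddr value to match your environment.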
helm repo add crane https://gocrane.github.io/helm-charts
helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="http://prometheus-server.crane-system.svc:8080" crane/scheduler
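Since the scheduler relies on the address passed as global.prometheusAddr, it can be worth checking that the address answers queries and that the recorded metrics exist. The following is a small Python sketch under a few assumptions: it uses the standard Prometheus HTTP API instant-query endpoint (/api/v1/query), the helper names are hypothetical, and the in-cluster address is only reachable from inside the cluster or through a port-forward.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(prometheus_addr: str, promql: str) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    base = prometheus_addr.rstrip("/")
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


def has_samples(response_body: dict) -> bool:
    """True when the query succeeded and returned at least one series."""
    if response_body.get("status") != "success":
        return False
    return len(response_body.get("data", {}).get("result", [])) > 0


def check_metric(prometheus_addr: str, promql: str, timeout: float = 5.0) -> bool:
    """Query Prometheus and report whether the metric currently has data."""
    url = instant_query_url(prometheus_addr, promql)
    with urlopen(url, timeout=timeout) as resp:
        return has_samples(json.load(resp))
```

For example, `check_metric("http://prometheus-server.crane-system.svc:8080", "cpu_usage_avg_5m")` should return True once the recording rules have been evaluated for a few minutes. A False result suggests the rules or the scrape interval need another look.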
To test Crane-scheduler, we use nginx as the test workload:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  annotations:
    configmap.reloader.stakater.com/reload: "nginx-html"
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      schedulerName: crane-scheduler # schedule this pod with crane-scheduler
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
Test result:
kubectl describe pod nginx-85d5bf75d4-5zhjp
Events:
  Type    Reason     Age   From             Message
  ----    ------     ----  ----             -------
  Normal  Scheduled  62s   crane-scheduler  Successfully assigned default/nginx-85d5bf75d4-5zhjp to 192.168.96.53
  Normal  Pulled     62s   kubelet          Container image "nginx:alpine" already present on machine
  Normal  Created    62s   kubelet          Created container nginx
  Normal  Started    61s   kubelet          Started container nginx
Result: the Scheduled event comes from crane-scheduler, so using Crane-scheduler as a second scheduler works.
3.2 Replacing the native Kube-scheduler with Crane-scheduler
Back up /etc/kubernetes/manifests/kube-scheduler.yaml:
cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
Modify the kube-scheduler configuration file (scheduler-config.yaml) to enable the Dynamic scheduler plugin and configure its arguments:
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
...
profiles:
- schedulerName: default-scheduler
  plugins:
    filter:
      enabled:
      - name: Dynamic
    score:
      enabled:
      - name: Dynamic
        weight: 3
  pluginConfig:
  - name: Dynamic
    args:
      policyConfigPath: /etc/kubernetes/policy.yaml
...
Create /etc/kubernetes/policy.yaml, which holds the scheduling policy used by the Dynamic plugin:
apiVersion: scheduler.policy.crane.io/v1alpha1
kind: DynamicSchedulerPolicy
spec:
  syncPolicy:
    ##cpu usage
    - name: cpu_usage_avg_5m
      period: 3m
    - name: cpu_usage_max_avg_1h
      period: 15m
    - name: cpu_usage_max_avg_1d
      period: 3h
    ##memory usage
    - name: mem_usage_avg_5m
      period: 3m
    - name: mem_usage_max_avg_1h
      period: 15m
    - name: mem_usage_max_avg_1d
      period: 3h
  predicate:
    ##cpu usage
    - name: cpu_usage_avg_5m
      maxLimitPecent: 0.65
    - name: cpu_usage_max_avg_1h
      maxLimitPecent: 0.75
    ##memory usage
    - name: mem_usage_avg_5m
      maxLimitPecent: 0.65
    - name: mem_usage_max_avg_1h
      maxLimitPecent: 0.75
  priority:
    ##cpu usage
    - name: cpu_usage_avg_5m
      weight: 0.2
    - name: cpu_usage_max_avg_1h
      weight: 0.3
    - name: cpu_usage_max_avg_1d
      weight: 0.5
    ##memory usage
    - name: mem_usage_avg_5m
      weight: 0.2
    - name: mem_usage_max_avg_1h
      weight: 0.3
    - name: mem_usage_max_avg_1d
      weight: 0.5
  hotValue:
    - timeRange: 5m
      count: 5
    - timeRange: 1m
      count: 2
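One way to read the policy above: predicate entries act as a filter (a node whose metric exceeds maxLimitPecent is rejected), priority entries produce a weighted score, and hotValue penalizes nodes that recently received many pods. The Python sketch below illustrates that reading only. It is not taken from the plugin's source. The helper names, the assumption that usage values are fractions between 0 and 1, the integer-division handling of hotValue, and the penalty of 10 points per hot value unit are all assumptions made for the example.

```python
MAX_NODE_SCORE = 100  # kube-scheduler scores nodes on a 0..100 scale
HOT_PENALTY = 10      # assumption: points subtracted per unit of hot value

# Values copied from the policy file above.
PREDICATE = {
    "cpu_usage_avg_5m": 0.65, "cpu_usage_max_avg_1h": 0.75,
    "mem_usage_avg_5m": 0.65, "mem_usage_max_avg_1h": 0.75,
}
PRIORITY = {
    "cpu_usage_avg_5m": 0.2, "cpu_usage_max_avg_1h": 0.3, "cpu_usage_max_avg_1d": 0.5,
    "mem_usage_avg_5m": 0.2, "mem_usage_max_avg_1h": 0.3, "mem_usage_max_avg_1d": 0.5,
}
HOT_VALUE = {"5m": 5, "1m": 2}  # timeRange -> count


def passes_predicate(usage: dict) -> bool:
    """Reject the node if any filtered metric is above its limit."""
    return all(usage[name] <= limit for name, limit in PREDICATE.items())


def priority_score(usage: dict) -> int:
    """Weighted average of free capacity, scaled to 0..MAX_NODE_SCORE."""
    weighted = sum((1 - usage[name]) * w for name, w in PRIORITY.items())
    return round(weighted / sum(PRIORITY.values()) * MAX_NODE_SCORE)


def hot_value(recent_pods: dict) -> int:
    """One unit per full `count` pods placed on the node within each timeRange."""
    return sum(recent_pods.get(rng, 0) // count for rng, count in HOT_VALUE.items())


def final_score(usage: dict, recent_pods: dict) -> int:
    """Priority score minus the hot value penalty, never below zero."""
    return max(0, priority_score(usage) - HOT_PENALTY * hot_value(recent_pods))


# Hypothetical node: every metric at 50% usage, 7 pods placed in the last 5 minutes.
usage = {name: 0.5 for name in PRIORITY}
score = final_score(usage, {"5m": 7, "1m": 1})
```

Under these assumptions the hypothetical node passes the filter, scores 50 on load alone, and ends at 40 after the hot value penalty. Raising cpu_usage_avg_5m to 0.70 would make `passes_predicate` return False, and the node would be filtered out.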
Edit kube-scheduler.yaml and replace the kube-scheduler image with the Crane-scheduler image:
...
image: docker.io/gocrane/crane-scheduler:0.0.23
...