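09-3. Deploying the metrics-server add-on
- Unless otherwise noted, all operations in this document are performed on the zhangjun-k8s01 node;
- The manifest YAML files of the add-ons bundled with Kubernetes pull images from the gcr.io Docker registry, which is blocked in mainland China, so the registry address has to be replaced manually with another one (this document does not replace it);
- The blocked images can be downloaded through the free gcr.io proxy provided by Microsoft China;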
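metrics-server discovers all nodes through kube-apiserver, then calls the kubelet APIs (over HTTPS) to obtain the CPU, memory, and other resource usage of each Node and Pod.
Starting with Kubernetes 1.12, the Kubernetes installation scripts no longer include Heapster, and starting with 1.13 support for Heapster was removed entirely; Heapster is no longer maintained.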
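The replacements are:
- CPU/memory HPA metrics used for autoscaling: metrics-server;
- General-purpose monitoring: a third-party monitoring system that can scrape Prometheus-format metrics, such as Prometheus Operator;
- Event transport: third-party tools that ship and archive Kubernetes events;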
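To make the first item above concrete, the sketch below applies the replica formula given in the Kubernetes HPA documentation, desired = ceil(current * observed / target), to a CPU utilization figure of the kind metrics-server supplies. The function name and the sample numbers are illustrative assumptions, and the tolerance band and min/max replica bounds applied by the real controller are deliberately left out.

```python
import math


def desired_replicas(current_replicas: int, observed: float, target: float) -> int:
    """Scale the replica count by the ratio of observed to target metric value.

    Simplified: no tolerance band, no minReplicas/maxReplicas clamping.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return math.ceil(current_replicas * (observed / target))


# Hypothetical numbers: 4 replicas averaging 80% CPU against a 50% target.
print(desired_replicas(4, 80, 50))
```

With these numbers the ratio is 1.6, so the sketch asks for more replicas than are currently running; a ratio below 1 would shrink the count instead.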
Kubernetes Dashboard does not yet support metrics-server (PR:). If metrics-server is used in place of Heapster, the dashboard cannot chart Pod memory and CPU usage, so a monitoring stack such as Prometheus and Grafana is needed to fill the gap.
Installing metrics-server
Clone the source from GitHub:
Modify the metrics-server-deployment.yaml file to add two command-line arguments to metrics-server:
$ diff metrics-server-deployment.yaml.orig metrics-server-deployment.yaml
32a33,35
> args:
> - --metric-resolution=30s
> - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
- --metric-resolution=30s: the interval at which metrics are collected from kubelet;
- --kubelet-preferred-address-types: prefer InternalIP when connecting to kubelet. Without this flag, kubelet is reached by node name by default, and calls to the kubelet API fail when the node name has no DNS record;
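The effect of the address-type ordering can be illustrated with a small sketch. It is not metrics-server's actual code, only a model of "try each preferred type in order and take the first address the node reports". The function name and the sample IP address are hypothetical; the `type`/`address` fields mirror the entries a Node object lists under `status.addresses`.

```python
from typing import Dict, List, Optional

# Order taken from the --kubelet-preferred-address-types value used above.
PREFERRED_TYPES = ["InternalIP", "Hostname", "InternalDNS", "ExternalDNS", "ExternalIP"]


def pick_kubelet_address(
    addresses: List[Dict[str, str]],
    preferred_types: List[str] = PREFERRED_TYPES,
) -> Optional[str]:
    """Return the first node address whose type appears earliest in preferred_types."""
    for wanted in preferred_types:
        for entry in addresses:
            if entry.get("type") == wanted and entry.get("address"):
                return entry["address"]
    return None  # the node reports none of the preferred address types


# Hypothetical node addresses; 10.0.0.11 is a made-up IP.
node_addresses = [
    {"type": "Hostname", "address": "zhangjun-k8s01"},
    {"type": "InternalIP", "address": "10.0.0.11"},
]
print(pick_kubelet_address(node_addresses))
```

Because InternalIP comes first in the list, the IP is chosen even though the node also reports a hostname, so no DNS lookup of the node name is needed.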
Deploy metrics-server:
$ kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-7cffff65bc-hkfr7 1/1 Running 0 56s
$ kubectl get svc -n kube-system metrics-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metrics-server ClusterIP 10.254.37.139 <none> 443/TCP 94s
metrics-server command-line arguments
- Through kube-apiserver:
https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/nodes
https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/pods
https://172.27.137.240:6443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
- Directly with kubectl:
kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw apis/metrics.k8s.io/v1beta1/pods
kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes/<node-name>
kubectl get --raw apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>
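When scripting against these endpoints it can help to build the paths in one place. The helper below is a minimal sketch with hypothetical function names; it assumes the standard Kubernetes REST layout, in which namespaced resources sit under a plural `namespaces/` segment. The result can be passed to `kubectl get --raw`.

```python
from typing import Optional

API_PREFIX = "/apis/metrics.k8s.io/v1beta1"


def node_metrics_path(node: Optional[str] = None) -> str:
    """Path for all nodes, or for a single node when a name is given."""
    return f"{API_PREFIX}/nodes" + (f"/{node}" if node else "")


def pod_metrics_path(namespace: Optional[str] = None, pod: Optional[str] = None) -> str:
    """Path for all pods, the pods of one namespace, or a single pod."""
    if pod and not namespace:
        raise ValueError("a pod name requires a namespace")
    if not namespace:
        return f"{API_PREFIX}/pods"
    base = f"{API_PREFIX}/namespaces/{namespace}/pods"
    return f"{base}/{pod}" if pod else base


print(node_metrics_path("zhangjun-k8s01"))
print(pod_metrics_path("kube-system"))
```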
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1" | jq .
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "metrics.k8s.io/v1beta1",
"resources": [
{
"name": "nodes",
"namespaced": false,
"kind": "NodeMetrics",
"verbs": [
"get",
"list"
]
},
{
"name": "pods",
"singularName": "",
"namespaced": true,
"kind": "PodMetrics",
"verbs": [
"get",
"list"
]
}
]
}
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
{
"kind": "NodeMetricsList",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
},
"items": [
{
"metadata": {
"name": "zhangjun-k8s01",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s01",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:52Z",
"window": "30s",
"usage": {
}
},
{
"metadata": {
"name": "zhangjun-k8s02",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s02",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:54Z",
"window": "30s",
"usage": {
"cpu": "253796835n",
"memory": "1028836Ki"
}
},
{
"metadata": {
"name": "zhangjun-k8s03",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/zhangjun-k8s03",
"creationTimestamp": "2019-05-26T10:55:10Z"
},
"timestamp": "2019-05-26T10:54:54Z",
"window": "30s",
"usage": {
"cpu": "280441339n",
"memory": "1072772Ki"
}
}
]
}
- The usage field returned by both /apis/metrics.k8s.io/v1beta1/nodes and /apis/metrics.k8s.io/v1beta1/pods contains CPU and Memory;
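The usage values are Kubernetes quantity strings: in the output above CPU carries an `n` suffix (nanocores, 1e-9 of a core) and memory a `Ki` suffix (kibibytes, 1024 bytes). The sketch below converts them to friendlier units. It handles only a common subset of suffixes, and the function names are illustrative rather than part of any Kubernetes client library.

```python
from decimal import Decimal

# Binary suffixes are checked first so that "Mi" is not mistaken for "M".
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL_SUFFIXES = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"), "T": Decimal("1e12"),
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '253796835n' or '1028836Ki' to base units."""
    for table in (BINARY_SUFFIXES, DECIMAL_SUFFIXES):
        for suffix, factor in table.items():
            if text.endswith(suffix):
                return Decimal(text[: -len(suffix)]) * factor
    return Decimal(text)  # plain number, no suffix


def cpu_millicores(text: str) -> Decimal:
    """CPU quantity expressed in millicores (1000 millicores = 1 core)."""
    return parse_quantity(text) * 1000


def memory_mib(text: str) -> Decimal:
    """Memory quantity expressed in MiB."""
    return parse_quantity(text) / 2**20


# Values reported for zhangjun-k8s02 in the output above.
print(cpu_millicores("253796835n"), memory_mib("1028836Ki"))
```

For the zhangjun-k8s02 sample this works out to roughly 254 millicores and about 1005 MiB.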
Using kubectl top to view cluster node resource usage
The kubectl top command obtains basic metrics for the cluster nodes from metrics-server.
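As a rough illustration of where that data comes from, the sketch below reads the same node metrics endpoint with `kubectl get --raw` and reduces the response to one row per node. It is not how kubectl top is implemented; the function names are hypothetical, and `fetch_node_metrics` assumes a working kubectl configuration on the machine where it runs.

```python
import json
import subprocess
from typing import Any, Dict, List, Tuple

NODES_PATH = "/apis/metrics.k8s.io/v1beta1/nodes"


def fetch_node_metrics() -> Dict[str, Any]:
    """Fetch the NodeMetricsList through kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "--raw", NODES_PATH],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def usage_rows(metrics: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (node name, cpu, memory) tuples sorted by node name.

    Missing usage fields are shown as '-' instead of raising an error.
    """
    rows = []
    for item in metrics.get("items", []):
        usage = item.get("usage", {})
        rows.append((item["metadata"]["name"], usage.get("cpu", "-"), usage.get("memory", "-")))
    return sorted(rows)


def render(rows: List[Tuple[str, str, str]]) -> str:
    """Format the rows as a fixed-width table with a header line."""
    lines = [f"{'NAME':<20}{'CPU':<16}{'MEMORY'}"]
    lines += [f"{name:<20}{cpu:<16}{mem}" for name, cpu, mem in rows]
    return "\n".join(lines)
```

On a machine with cluster access, `print(render(usage_rows(fetch_node_metrics())))` prints the raw quantity strings for each node.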
References
- metrics-server RBAC: https://github.com/kubernetes-incubator/metrics-server/issues/40
- metrics-server parameters:
- https://kubernetes.io/docs/tasks/debug-application-cluster/core-metrics-pipeline/