High Load

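    High load on a node leaves processes without enough CPU time slices to run. Typical symptoms are network timeouts, failed health checks, and unavailable services.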

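    Sometimes node load is high even though CPU 'us' (user) is low and CPU 'id' (idle) is high. Why? Usually file I/O has hit a performance bottleneck, which causes excessive I/O wait, raises the overall load of the node, and degrades the performance of other processes.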

    Use the top command to check the current load:

    top - 19:42:08 up 23:59, 2 users, load average: 34.64, 35.80, 35.76
    Tasks: 679 total, 1 running, 678 sleeping, 0 stopped, 0 zombie
    Cpu0 : 29.5%us, 3.7%sy, 0.0%ni, 48.7%id, 17.9%wa, 0.0%hi, 0.1%si, 0.0%st
    Cpu1 : 29.3%us, 3.7%sy, 0.0%ni, 48.9%id, 17.9%wa, 0.0%hi, 0.1%si, 0.0%st
    Cpu2 : 26.1%us, 3.1%sy, 0.0%ni, 64.4%id, 6.0%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu3 : 25.9%us, 3.1%sy, 0.0%ni, 65.5%id, 5.4%wa, 0.0%hi, 0.1%si, 0.0%st
    Cpu4 : 24.9%us, 3.0%sy, 0.0%ni, 66.8%id, 5.0%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu5 : 24.9%us, 2.9%sy, 0.0%ni, 67.0%id, 4.8%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu6 : 24.2%us, 2.7%sy, 0.0%ni, 68.3%id, 4.5%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu7 : 24.3%us, 2.6%sy, 0.0%ni, 68.5%id, 4.2%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu8 : 23.8%us, 2.6%sy, 0.0%ni, 69.2%id, 4.1%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu9 : 23.9%us, 2.5%sy, 0.0%ni, 69.3%id, 4.0%wa, 0.0%hi, 0.3%si, 0.0%st
    Cpu10 : 23.3%us, 2.4%sy, 0.0%ni, 68.7%id, 5.6%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu11 : 23.3%us, 2.4%sy, 0.0%ni, 69.2%id, 5.1%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu12 : 21.8%us, 2.4%sy, 0.0%ni, 60.2%id, 15.5%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu13 : 21.9%us, 2.4%sy, 0.0%ni, 60.6%id, 15.2%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu14 : 21.4%us, 2.3%sy, 0.0%ni, 72.6%id, 3.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu15 : 21.5%us, 2.2%sy, 0.0%ni, 73.2%id, 3.1%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu16 : 21.2%us, 2.2%sy, 0.0%ni, 73.6%id, 3.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu17 : 21.2%us, 2.1%sy, 0.0%ni, 73.8%id, 2.8%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu19 : 21.0%us, 2.1%sy, 0.0%ni, 74.4%id, 2.5%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu21 : 20.8%us, 2.0%sy, 0.0%ni, 73.9%id, 3.2%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu22 : 20.8%us, 2.0%sy, 0.0%ni, 74.4%id, 2.8%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu23 : 20.8%us, 1.9%sy, 0.0%ni, 74.4%id, 2.8%wa, 0.0%hi, 0.0%si, 0.0%st
    Mem: 32865032k total, 30209248k used, 2655784k free, 370748k buffers
    Swap: 8388604k total, 5440k used, 8383164k free, 7986552k cached
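
    To judge whether a load average like the one above is really high, it helps to divide it by the number of logical CPUs. On Linux the load average counts tasks that are runnable as well as tasks in uninterruptible sleep (typically waiting on disk I/O), which is why load can be high while the CPUs are mostly idle. The following is a minimal sketch; the helper name load_per_cpu is hypothetical, and the inputs are the 1-minute load from the top output and the CPU count reported as numcpu in the atop output of this example.

```python
def load_per_cpu(load_avg: float, cpu_count: int) -> float:
    """Return the load average normalized by the number of logical CPUs.

    A value above 1.0 means that, on average, more tasks were runnable or
    blocked in uninterruptible sleep than there are CPUs to serve them.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


# Values taken from this example: load average 34.64, 24 logical CPUs.
ratio = load_per_cpu(34.64, 24)
print(f"load per CPU: {ratio:.2f}")
```

    On a live node, os.getloadavg() and os.cpu_count() from the Python standard library can supply the two inputs instead of hard-coded values.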

    wa (I/O wait) is normally 0%. If it is frequently above 1%, the storage device is too slow to keep up with the CPU.
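
    The wa percentage can also be derived without top, from two samples of the aggregate "cpu" line in /proc/stat. The sketch below assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, then guest fields); the function names and the two sample lines are made up for illustration.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before: List[int], after: List[int]) -> float:
    """Share of CPU time spent in iowait between two samples, in percent."""
    # Only the first 8 fields are summed: guest time is already included
    # in user time, so adding it again would double count.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


# Made-up samples in /proc/stat format, taken some interval apart.
sample_1 = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0")
sample_2 = parse_cpu_line("cpu  200 0 100 1500 200 0 0 0 0 0")
print(f"iowait: {iowait_percent(sample_1, sample_2):.1f}%")
```

    On a real node, the two samples would come from reading the first line of /proc/stat twice, a second or more apart.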

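    Use atop to check the current disk I/O status. In this example disk sda is 100% busy and has clearly hit a performance bottleneck. Press 'd' to see which processes are using disk I/O: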

    ATOP - lemp 2017/01/23 19:42:46 --------- 2s elapsed
    PRC | sys 0.24s | user 1.99s | #proc 679 | #tslpu 54 | #zombie 0 | #exit 0 |
    CPU | sys 11% | user 101% | irq 1% | idle 2089% | wait 208% | curscal 63% |
    CPL | avg1 38.49 | avg5 36.48 | avg15 35.98 | csw 4654 | intr 6876 | numcpu 24 |
    MEM | tot 31.3G | free 2.2G | cache 7.6G | dirty 48.7M | buff 362.1M | slab 1.2G |
    SWP | tot 8.0G | free 8.0G | | | vmcom 23.9G | vmlim 23.7G |
    DSK | sda | busy 100% | read 2 | write 362 | MBw/s 2.28 | avio 5.49 ms |
    NET | transport | tcpi 1031 | tcpo 968 | udpi 0 | udpo 0 | tcpao 45 |
    NET | network | ipi 1031 | ipo 968 | ipfrw 0 | deliv 1031 | icmpo 0 |
    NET | eth0 1% | pcki 558 | pcko 508 | si 762 Kbps | so 1077 Kbps | erro 0 |
    NET | lo ---- | pcki 406 | pcko 406 | si 2273 Kbps | so 2273 Kbps | erro 0 |
    PID TID RDDSK WRDSK WCANCL DSK CMD 1/5
    9783 - 0K 468K 16K 40% mysqld
    1930 - 0K 212K 0K 18% flush-8:0
    5896 - 0K 152K 0K 13% nginx
    5909 - 0K 60K 0K 5% nginx
    5906 - 0K 36K 0K 3% nginx
    5907 - 16K 8K 0K 2% nginx
    5903 - 20K 0K 0K 2% nginx
    5901 - 0K 12K 0K 1% nginx
    5908 - 0K 8K 0K 1% nginx
    5894 - 0K 8K 0K 1% nginx
    5911 - 0K 8K 0K 1% nginx
    5900 - 0K 4K 4K 0% nginx
    5551 - 0K 4K 0K 0% php-fpm
    5913 - 0K 4K 0K 0% nginx
    5895 - 0K 4K 0K 0% nginx
    6133 - 0K 0K 0K 0% php-fpm
    5780 - 0K 0K 0K 0% php-fpm
    6675 - 0K 0K 0K 0% atop
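
    Per-process disk I/O like the RDDSK, WRDSK and WCANCL columns above can be approximated by hand from /proc/<pid>/io, which exposes cumulative read_bytes, write_bytes and cancelled_write_bytes counters on Linux. The sketch below is an illustration only, not how atop itself is implemented; the function names are hypothetical, and the sample counters are made up so that the delta matches the mysqld row (0K read, 468K written, 16K cancelled).

```python
from typing import Dict


def parse_proc_io(text: str) -> Dict[str, int]:
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            counters[key.strip()] = int(value)
    return counters


def disk_io_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Bytes read, written and cancelled between two samples of one process."""
    keys = ("read_bytes", "write_bytes", "cancelled_write_bytes")
    return {k: after.get(k, 0) - before.get(k, 0) for k in keys}


# Made-up samples in the /proc/<pid>/io format, taken 2 seconds apart.
SAMPLE_BEFORE = """rchar: 1000
wchar: 2000
syscr: 10
syscw: 20
read_bytes: 4096
write_bytes: 8192
cancelled_write_bytes: 0
"""
SAMPLE_AFTER = """rchar: 1500
wchar: 490000
syscr: 12
syscw: 60
read_bytes: 4096
write_bytes: 487424
cancelled_write_bytes: 16384
"""

delta = disk_io_delta(parse_proc_io(SAMPLE_BEFORE), parse_proc_io(SAMPLE_AFTER))
for name, value in delta.items():
    print(f"{name}: {value // 1024}K")
```

    Reading /proc/<pid>/io for processes owned by other users generally requires root privileges.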

    man iotop explains the iotop options that are useful here (-o, -P, -a):

    -o, --only
        Only show processes or threads actually doing I/O, instead of showing all processes or threads. This can be dynamically toggled by pressing o.
    -P, --processes
        Only show processes. Normally iotop shows all threads.
    -a, --accumulated
        Show accumulated I/O instead of bandwidth. In this mode, iotop shows the amount of I/O processes have done since iotop started.

    TODO: Optimization

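    For example, a database may be installed directly on the node without being managed by K8S. This is incorrect usage: deploying other processes on K8S nodes is not recommended.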