F. Deploying the docker component

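    Note:

    1. Unless stated otherwise, every operation in this document is run on the zhangjun-k8s01 node, which then distributes files and executes commands on the other nodes remotely;
    2. flannel must be installed first; see the attachment;

    See also 07-0.部署worker节点.md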

    Download and distribute the docker binaries

    Download the latest release package:

    Distribute the binaries to all worker nodes:

        # Run from the directory that contains the extracted docker/ directory.
        source /opt/k8s/bin/environment.sh
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            scp docker/* root@${node_ip}:/opt/k8s/bin/
            ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
        done

    Create and distribute the systemd unit file

        cd /opt/k8s/work
        cat > docker.service <<"EOF"
        [Unit]
        Description=Docker Application Container Engine
        Documentation=http://docs.docker.io

        [Service]
        WorkingDirectory=##DOCKER_DIR##
        Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
        EnvironmentFile=-/run/flannel/docker
        ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
        ExecReload=/bin/kill -s HUP $MAINPID
        Restart=on-failure
        RestartSec=5
        LimitNOFILE=infinity
        LimitNPROC=infinity
        LimitCORE=infinity
        Delegate=yes
        KillMode=process

        [Install]
        WantedBy=multi-user.target
        EOF
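    • EOF is wrapped in double quotes, so bash does not expand variables inside the here-document, such as $DOCKER_NETWORK_OPTIONS (these variables are substituted by systemd instead);
    • dockerd invokes other docker commands at runtime, such as docker-proxy, so the directory containing the docker binaries must be added to the PATH environment variable;
    • On startup, flanneld writes its network configuration to /run/flannel/docker. Before dockerd starts, it reads the DOCKER_NETWORK_OPTIONS environment variable from that file and uses it to set the docker0 bridge subnet;
    • If several EnvironmentFile options are specified, /run/flannel/docker must come last, so that docker0 uses the bip parameter generated by flanneld;
    • docker must run as the root user;
    • If the default policy of the iptables FORWARD chain is DROP, set it to ACCEPT manually:

        $ sudo iptables -P FORWARD ACCEPT

      Also add the following command to /etc/rc.local, so that the default policy of the FORWARD chain does not revert to DROP after the node reboots:

        /sbin/iptables -P FORWARD ACCEPT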
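Appending the rule to /etc/rc.local by hand can leave duplicate entries if the step is repeated. The following is a minimal sketch of an idempotent append. The helper name `ensure_line` and the temporary demo file are assumptions made for illustration and are not part of the original procedure.

```shell
# ensure_line FILE LINE: append LINE to FILE only if it is not already present.
# (Hypothetical helper, not part of the original procedure.)
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a temporary file instead of the real /etc/rc.local.
demo_file=$(mktemp)
ensure_line "$demo_file" "/sbin/iptables -P FORWARD ACCEPT"
ensure_line "$demo_file" "/sbin/iptables -P FORWARD ACCEPT"   # second call changes nothing
rule_count=$(grep -cxF -- "/sbin/iptables -P FORWARD ACCEPT" "$demo_file")
echo "rule appears ${rule_count} time(s)"
```

On a node, the equivalent call would be `ensure_line /etc/rc.local "/sbin/iptables -P FORWARD ACCEPT"`, run as root. On CentOS 7, /etc/rc.d/rc.local usually also needs its executable bit set before it is run at boot.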

    Distribute the systemd unit file to all worker machines:

        source /opt/k8s/bin/environment.sh
        # Replace the placeholder in the unit file with the real docker directory.
        sed -i -e "s|##DOCKER_DIR##|${DOCKER_DIR}|" docker.service
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            scp docker.service root@${node_ip}:/etc/systemd/system/
        done

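    Use registry mirrors hosted in China to speed up image pulls, and raise the number of concurrent downloads (dockerd must be restarted for the change to take effect):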
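The configuration file itself is not shown in this section. The sketch below generates a plausible docker-daemon.json from values that appear in the `docker info` output later in this document (registry mirrors, insecure registry, live restore, debug mode, data root). The `max-concurrent-downloads` value of 20, the `exec-root` path, and the fallback value of DOCKER_DIR are assumptions and should be adjusted to the actual environment.

```shell
# Fallback used only when environment.sh has not been sourced (assumption).
DOCKER_DIR="${DOCKER_DIR:-/data/k8s/docker}"
# Output file name matches the one distributed in the next step.
OUT="${OUT:-docker-daemon.json}"

# Unquoted EOF: the shell expands ${DOCKER_DIR} while writing the file.
cat > "$OUT" <<EOF
{
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn", "https://hub-mirror.c.163.com"],
    "insecure-registries": ["docker02:35000"],
    "max-concurrent-downloads": 20,
    "live-restore": true,
    "debug": true,
    "data-root": "${DOCKER_DIR}/data",
    "exec-root": "${DOCKER_DIR}/exec"
}
EOF
```

dockerd downloads three layers in parallel by default, so any larger `max-concurrent-downloads` value increases pull concurrency.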

    Distribute the docker configuration file to all worker nodes:

        cd /opt/k8s/work
        source /opt/k8s/bin/environment.sh
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            ssh root@${node_ip} "mkdir -p /etc/docker/ ${DOCKER_DIR}/{data,exec}"
            scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
        done

    Start the docker service

        source /opt/k8s/bin/environment.sh
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
        done

    Check the service status

        source /opt/k8s/bin/environment.sh
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            ssh root@${node_ip} "systemctl status docker|grep Active"
        done

    If the status is not active (running), check the logs on the affected node to find the cause:

        journalctl -u docker

    Check the flannel.1 and docker0 interfaces on each node:

        source /opt/k8s/bin/environment.sh
        for node_ip in "${NODE_IPS[@]}"
        do
            echo ">>> ${node_ip}"
            ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
        done

    Confirm that, on each worker node, the IP addresses of the docker0 bridge and the flannel.1 interface are in the same subnet (for example, 172.30.80.0/32 lies within 172.30.80.1/21):
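The containment in that example can be verified with shell arithmetic: mask both addresses with the prefix length and compare the results. This is a minimal sketch; the function names `ip_to_int` and `in_cidr` are hypothetical helpers, and the addresses are the ones quoted above.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_cidr IP ADDR/BITS: succeed if IP is inside the network that ADDR/BITS belongs to.
in_cidr() {
    ip=$1
    base=${2%/*}       # address part, e.g. 172.30.80.1
    bits=${2#*/}       # prefix length, e.g. 21
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$base") & mask )) ]
}

# flannel.1 address from the example, checked against the docker0 network.
if in_cidr 172.30.80.0 172.30.80.1/21; then
    echo "same subnet"
else
    echo "different subnets"
fi
```

A /21 prefix covers 172.30.80.0 through 172.30.87.255, so an address such as 172.30.88.1 would fail the same check.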

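    Note: if the services were installed in the wrong order, or the machine environment is complex and docker was installed before flanneld, the docker0 bridge and the flannel.1 interface on a worker node may end up in different subnets. In that case, stop the docker service, delete the docker0 interface manually, and start docker again: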

        systemctl stop docker
        ip link delete docker0
        systemctl start docker

    View docker status information

        $ ps -elfH|grep docker
        4 S root 116590 1 0 80 0 - 131420 futex_ 11:22 ? 00:00:01 /opt/k8s/bin/dockerd --bip=172.30.80.1/21 --ip-masq=false --mtu=1450
        4 S root 116668 116590 1 80 0 - 161643 futex_ 11:22 ? 00:00:03 containerd --config /data/k8s/docker/exec/containerd/containerd.toml --log-level debug

        $ docker info
        Containers: 0
         Running: 0
         Paused: 0
         Stopped: 0
        Images: 0
        Server Version: 18.09.6
        Storage Driver: overlay2
         Backing Filesystem: extfs
         Supports d_type: true
        Logging Driver: json-file
        Plugins:
         Volume: local
         Network: bridge host macvlan null overlay
         Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
        Swarm: inactive
        Runtimes: runc
        Default Runtime: runc
        Init Binary: docker-init
        containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
        runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
        init version: fec3683
        Security Options:
         apparmor
         seccomp
          Profile: default
        Kernel Version: 4.14.110-0.el7.4pd.x86_64
        Operating System: CentOS Linux 7 (Core)
        OSType: linux
        Architecture: x86_64
        CPUs: 8
        Total Memory: 15.64GiB
        Name: zhangjun-k8s01
        ID: VJYK:3T6T:EPHU:65SM:3OZD:DMNE:MT5J:O22I:TCG2:F3JR:MZ76:B3EF
        Docker Root Dir: /data/k8s/docker/data
        Debug Mode (client): false
        Debug Mode (server): true
         File Descriptors: 22
         Goroutines: 43
         System Time: 2019-05-26T11:26:21.2494815+08:00
         EventsListeners: 0
        Registry: https://index.docker.io/v1/
        Labels:
        Experimental: false
        Insecure Registries:
         docker02:35000
         127.0.0.0/8
        Registry Mirrors:
         https://docker.mirrors.ustc.edu.cn/
         https://hub-mirror.c.163.com/
        Live Restore Enabled: true
        Product License: Community Engine

        WARNING: No swap limit support

    Update the kubelet configuration and restart the service (on every node)

    Edit the kubelet systemd unit file (/etc/systemd/system/kubelet.service) and delete the following lines:

        --network-plugin=cni \\
        --cni-conf-dir=/etc/cni/net.d \\
        --container-runtime=remote \\
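Deleting these lines by hand on every node is error-prone, so the edit can be scripted with sed. The sketch below runs against a temporary sample file rather than the real unit. The function name `strip_cni_flags` and the sample unit contents (binary path, kubeconfig path, `--v=2`) are assumptions for illustration only.

```shell
# strip_cni_flags FILE: delete the three kubelet flags listed above from FILE,
# keeping a backup copy as FILE.bak. (Hypothetical helper.)
strip_cni_flags() {
    sed -i.bak \
        -e '/--network-plugin=cni/d' \
        -e '/--cni-conf-dir=/d' \
        -e '/--container-runtime=remote/d' \
        "$1"
}

# Hypothetical sample unit used only for this demonstration.
unit=$(mktemp)
cat > "$unit" <<'EOF'
[Service]
ExecStart=/opt/k8s/bin/kubelet \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --network-plugin=cni \
  --cni-conf-dir=/etc/cni/net.d \
  --container-runtime=remote \
  --v=2
EOF

strip_cni_flags "$unit"
cat "$unit"
```

If one of the deleted flags is the last argument of ExecStart, the preceding line is left with a trailing backslash and must be fixed manually. After editing a unit file in place, `systemctl daemon-reload` is needed before the restart picks up the change.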

    Then restart the kubelet service:

        systemctl restart kubelet