The original tutorial comes from github/opsnull; these notes build on it and record the problems I ran into during my own setup.

Kubernetes worker nodes run the following components:

  • docker
  • kubelet
  • kube-proxy
  • flanneld
  • kube-nginx

Add three worker nodes:

  • slave-34 192.168.1.34
  • slave-35 192.168.1.35
  • slave-36 192.168.1.36

Re-run the "System initialization and global variables" steps.

On slave-31, copy the generated CA certificate, private key, and config file to the /etc/kubernetes/cert directory on every worker node:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert"
    scp ca*.pem ca-config.json root@${node_ip}:/etc/kubernetes/cert
  done
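
The same per-node loop recurs in most of the steps below. As a minimal sketch, it could be wrapped in a small helper that runs a given command once per node. The names for_each_node, make_cert_dir and WORKER_IPS are hypothetical and not part of the original tutorial; the IPs are the three workers listed above.

```shell
# Hypothetical helper: run a command once for every worker IP.
# WORKER_IPS is an assumed variable holding the three new workers from this post.
WORKER_IPS="192.168.1.34 192.168.1.35 192.168.1.36"

for_each_node() {
  # "$@" is the command to run; the node IP is appended as its last argument.
  for node_ip in ${WORKER_IPS}; do
    echo ">>> ${node_ip}"
    "$@" "${node_ip}" || return 1
  done
}

# Example callback: create the certificate directory on one node.
make_cert_dir() {
  ssh "root@$1" "mkdir -p /etc/kubernetes/cert"
}
```

Calling `for_each_node make_cert_dir` would then replace the ssh line of the loop above. The helper stops at the first node where the command fails, which the original loops do not do.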

Distribute the flannel binaries to the worker nodes:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    scp flannel/{flanneld,mk-docker-opts.sh} root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
  done

Distribute the generated flanneld certificate and private key to all worker nodes:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p /etc/flanneld/cert"
    scp flanneld*.pem root@${node_ip}:/etc/flanneld/cert
  done

Distribute the flanneld systemd unit file to all worker nodes:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    scp flanneld.service root@${node_ip}:/etc/systemd/system/
  done

Start the flanneld service:

source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
  done

Check the startup result:

source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status flanneld|grep Active"
  done
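
The loop above only prints the Active line and leaves the reading to you. A small sketch of how that line could be checked automatically; the function name is_running is hypothetical.

```shell
# Hypothetical helper: succeed only if a `systemctl status` "Active:" line
# reports the unit as running.
is_running() {
  case "$1" in
    *"Active: active (running)"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Inside the loop it could be used as `is_running "$(ssh root@${node_ip} 'systemctl status flanneld | grep Active')" || echo "flanneld is not running on ${node_ip}"`.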

After deploying flannel on each node, check that a flannel interface was created (its name may be flannel0, flannel.0, flannel.1, etc.):

source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.31 192.168.1.32 192.168.1.33 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh ${node_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet"
  done

From each node, ping every flannel interface IP and make sure they are all reachable:

source /opt/k8s/bin/environment.sh
for node_ip in 192.168.1.31 192.168.1.32 192.168.1.33 192.168.1.34 192.168.1.35 192.168.1.36
  do
    echo ">>> ${node_ip}"
    ssh ${node_ip} "ping -c 1 172.30.192.0"
    ssh ${node_ip} "ping -c 1 172.30.96.0"
    ssh ${node_ip} "ping -c 1 172.30.184.0"
    ssh ${node_ip} "ping -c 1 172.30.168.0"
    ssh ${node_ip} "ping -c 1 172.30.136.0"
    ssh ${node_ip} "ping -c 1 172.30.72.0"
  done
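
The ping targets above are copied by hand from the previous step and differ for every cluster. As a sketch, the address could instead be pulled out of the `ip addr show` output; the function name inet_addr is hypothetical.

```shell
# Hypothetical helper: print the IPv4 address (without prefix length)
# from `ip addr show <iface>` output read on stdin.
inet_addr() {
  awk '$1 == "inet" { split($2, a, "/"); print a[1]; exit }'
}
```

For example, `ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1" | inet_addr` prints just the flannel.1 address of that node, which can then be collected into the list of ping targets.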

Deploy the docker component

docker runs and manages containers; kubelet interacts with it through the Container Runtime Interface (CRI).

Download and distribute the docker binaries

Download the latest release package from the docker download page (these notes use 18.09.6):

cd /opt/k8s/work
wget https://download.docker.com/linux/static/stable/x86_64/docker-18.09.6.tgz
tar -xvf docker-18.09.6.tgz

Distribute the binaries to all worker nodes:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker/*  root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
  done

Create and distribute the systemd unit file

cd /opt/k8s/work
cat > docker.service <<"EOF"
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
WorkingDirectory=##DOCKER_DIR##
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
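  • EOF is wrapped in double quotes, so bash does not expand variables inside the here-document, such as $DOCKER_NETWORK_OPTIONS (systemd is the one that substitutes these environment variables);
  • dockerd invokes other docker commands at runtime, such as docker-proxy, so the directory containing the docker commands must be added to the PATH environment variable;
  • When flanneld starts, it writes the network configuration to /run/flannel/docker; before dockerd starts, the DOCKER_NETWORK_OPTIONS environment variable is read from that file (via EnvironmentFile) and used to set the docker0 bridge subnet;
  • If multiple EnvironmentFile options are specified, /run/flannel/docker must come last (so that docker0 uses the bip parameter generated by flanneld);
  • docker must run as the root user;
  • Starting with version 1.13, docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging Pod IPs on other Nodes fail. If this happens, set the policy to ACCEPT manually: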
sudo iptables -P FORWARD ACCEPT

Also add the following command to /etc/rc.local so that the default policy of the iptables FORWARD chain does not revert to DROP after the node reboots:

/sbin/iptables -P FORWARD ACCEPT
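
To tell whether a node is actually affected, the current default policy can be read from `iptables -S FORWARD`, whose policy line has the form `-P FORWARD <policy>`. A minimal sketch; the function name forward_policy is hypothetical.

```shell
# Hypothetical helper: read `iptables -S FORWARD` output on stdin and print
# the chain's default policy (taken from the "-P FORWARD <policy>" line).
forward_policy() {
  awk '$1 == "-P" && $2 == "FORWARD" { print $3; exit }'
}
```

On a node this could be used as `[ "$(iptables -S FORWARD | forward_policy)" = "ACCEPT" ] || iptables -P FORWARD ACCEPT`, so the policy is only changed when it is not already ACCEPT.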

Distribute the systemd unit file to all worker machines:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
sed -i -e "s|##DOCKER_DIR##|${DOCKER_DIR}|" docker.service
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    scp docker.service root@${node_ip}:/etc/systemd/system/
  done
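
Note that `sed -i` rewrites docker.service in place, so the ##DOCKER_DIR## placeholder is gone after the first run and a later change of DOCKER_DIR has no effect. One possible alternative, sketched here with a hypothetical function name render_unit, is to keep the template untouched and render it to a separate file:

```shell
# Hypothetical helper: replace the ##DOCKER_DIR## placeholder on stdin
# with the directory given as $1 and write the result to stdout.
# Assumes the directory contains no "|" or "&", which sed would treat specially.
render_unit() {
  sed -e "s|##DOCKER_DIR##|$1|"
}
```

For example, `render_unit "${DOCKER_DIR}" < docker.service.template > docker.service`, where docker.service.template is an assumed name for the file created with the here-document above.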

Configure and distribute the docker config file

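Use registry mirrors located in China to speed up image pulls, and raise the number of concurrent downloads (dockerd must be restarted for the changes to take effect):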

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > docker-daemon.json <<EOF
{
    "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"],
    "insecure-registries": ["docker02:35000"],
    "max-concurrent-downloads": 20,
    "live-restore": true,
    "max-concurrent-uploads": 10,
    "debug": true,
    "data-root": "${DOCKER_DIR}/data",
    "exec-root": "${DOCKER_DIR}/exec",
    "log-opts": {
      "max-size": "100m",
      "max-file": "5"
    }
}
EOF
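
Unlike docker.service, this here-document uses an unquoted EOF, so bash expands ${DOCKER_DIR} while writing the file. A small sketch of the difference between the two forms; the DOCKER_DIR value is only illustrative, the real one comes from environment.sh.

```shell
# Illustrative value only; the real DOCKER_DIR comes from environment.sh.
DOCKER_DIR=/data/k8s/docker

# Unquoted delimiter: the shell expands ${DOCKER_DIR} while writing the text.
expanded="$(cat <<EOF
"data-root": "${DOCKER_DIR}/data"
EOF
)"

# Quoted delimiter: the text is kept literally, as in docker.service.
literal="$(cat <<"EOF"
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
EOF
)"

echo "${expanded}"
echo "${literal}"
```

The first echo shows the path already filled in, while the second still contains the text $DOCKER_NETWORK_OPTIONS, which is left for systemd to substitute.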

Distribute the docker config file to all worker nodes:

cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "mkdir -p  /etc/docker/ ${DOCKER_DIR}/{data,exec}"
    scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
  done

Start the docker service:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
  done

Check the service status:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl status docker|grep Active"
  done

Make sure the status is active (running); otherwise check the logs to find the cause:

journalctl -u docker

Check the docker0 bridge:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
  do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
  done

Confirm that on each worker node the docker0 bridge and the flannel.1 interface have IPs in the same subnet (in the output below, 172.30.168.0/32 falls within 172.30.168.1/21):


>>> 192.168.1.34
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN 
    link/ether da:58:42:64:96:4d brd ff:ff:ff:ff:ff:ff
    inet 172.30.168.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
    link/ether 02:42:cf:23:c3:d4 brd ff:ff:ff:ff:ff:ff
    inet 172.30.168.1/21 brd 172.30.175.255 scope global docker0
       valid_lft forever preferred_lft forever
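
The "same subnet" check can also be done arithmetically rather than by eye. The following is a sketch only; the function names ip_to_int and ip_in_cidr are hypothetical, and it assumes well-formed IPv4 addresses without leading zeros in any octet.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body is a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 lies inside the network that $2 (address/prefix) belongs to.
ip_in_cidr() {
  prefix="${2#*/}"
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

With the values from the output above, `ip_in_cidr 172.30.168.0 172.30.168.1/21` succeeds, because a /21 prefix keeps the top five bits of the third octet and both addresses share them.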

View docker status information:

ps -elfH|grep docker
[root@ _84_ /opt/k8s/work]# ps -elfH|grep docker
0 S root      20596   1892  0  80   0 - 28165 -      17:09 pts/0    00:00:00         grep --color=auto docker
4 S root      20257      1  0  80   0 - 97530 futex_ 17:07 ?        00:00:00   /opt/k8s/bin/dockerd --bip=172.30.168.1/21 --ip-masq=false --mtu=1450
4 S root      20263  20257  0  80   0 - 93227 futex_ 17:07 ?        00:00:00     containerd --config /data/k8s/docker/exec/containerd/containerd.toml --log-level debug
docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 18.09.6
Storage Driver: vfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 977.9MiB
Name: slave-34
ID: 7COV:LIMW:WCAL:OKNO:AOMH:VDTT:BYYT:ETDJ:ZE75:HC7S:IBGJ:VWEA
Docker Root Dir: /data/k8s/docker/data
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 23
 Goroutines: 43
 System Time: 2019-08-04T17:10:21.262056663+08:00
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 docker02:35000
 127.0.0.0/8
Registry Mirrors:
 https://docker.mirrors.ustc.edu.cn/
 https://hub-mirror.c.163.com/
Live Restore Enabled: true
Product License: Community Engine
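
The docker info output is a convenient place to confirm that the settings from daemon.json took effect, for example Docker Root Dir and Live Restore Enabled. A sketch of extracting a single field; the function name info_field is hypothetical.

```shell
# Hypothetical helper: print the value of one "Key: value" line from
# `docker info` output read on stdin. Leading spaces before the key are ignored.
info_field() {
  awk -F': ' -v key="$1" '{ k = $1; sub(/^ +/, "", k) } k == key { print $2; exit }'
}
```

For instance, `docker info | info_field "Docker Root Dir"` should print the data-root configured above, which is /data/k8s/docker/data in this setup.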
Last modified: August 5, 2019, 05:08 PM