k8s组件和网络插件挂掉，演示已有的pod是否正常运行

环境

03 master ,05 06是node

[root@mcwk8s03 mcwtest]# kubectl get nodes -o wide

NAME       STATUS   ROLES    AGE    VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME

mcwk8s05   Ready    <none>   581d   v1.15.12   10.0.0.35     <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64   docker://20.10.21

mcwk8s06   Ready    <none>   581d   v1.15.12   10.0.0.36     <none>        CentOS Linux 7 (Core)   3.10.0-693.el7.x86_64   docker://20.10.21

[root@mcwk8s03 mcwtest]#

[root@mcwk8s03 mcwtest]# kubectl get svc

NAME          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE

kubernetes    ClusterIP   10.2.0.1     <none>        443/TCP          584d

mcwtest-svc   NodePort    10.2.0.155   <none>        2024:33958/TCP   26h

nginx         ClusterIP   None         <none>        80/TCP           414d

[root@mcwk8s03 mcwtest]# kubectl get svc|grep mcwtest

mcwtest-svc   NodePort    10.2.0.155   <none>        2024:33958/TCP   26h

[root@mcwk8s03 mcwtest]# kubectl get deploy|grep mcwtest

mcwtest-deploy     1/1     1            1           26h

[root@mcwk8s03 mcwtest]# kubectl get pod|grep mcwtest

mcwtest-deploy-6465665557-g9zjd     1/1     Running            1          25h

[root@mcwk8s03 mcwtest]# kubectl get pod -o wide|grep mcwtest

mcwtest-deploy-6465665557-g9zjd     1/1     Running            1          25h    172.17.89.10   mcwk8s05   <none>           <none>

[root@mcwk8s03 mcwtest]#

停止服务之前，可以正常用nodeip: nodeport访问

[root@mcwk8s03 mcwtest]# kubectl get svc

NAME          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE

kubernetes    ClusterIP   10.2.0.1     <none>        443/TCP          584d

mcwtest-svc   NodePort    10.2.0.155   <none>        2024:33958/TCP   26h

nginx         ClusterIP   None         <none>        80/TCP           414d

[root@mcwk8s03 mcwtest]# date

Thu Jun  6 00:50:34 CST 2024

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 16:50:42 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 16:50:46 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]#

【】停止master上的组件

[root@mcwk8s03 /]# systemctl stop kube-apiserver.service

[root@mcwk8s03 /]# systemctl status kube-apiserver.service

● kube-apiserver.service - Kubernetes API Server

   Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)

   Active: inactive (dead) since Thu 2024-06-06 00:54:51 CST; 14s ago

     Docs: https://github.com/kubernetes/kubernetes

  Process: 19837 ExecStart=/opt/kubernetes/bin/kube-apiserver $KUBE_APISERVER_OPTS (code=exited, status=0/SUCCESS)

 Main PID: 19837 (code=exited, status=0/SUCCESS)

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.395965   19837 wrap.go:47] GET /apis/rbac.authorization.k8s.io/v1/clus...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396002   19837 wrap.go:47] GET /apis/storage.k8s.io/v1/volumeattachmen...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396033   19837 wrap.go:47] GET /apis/admissionregistration.k8s.io/v1be...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396047   19837 wrap.go:47] GET /apis/rbac.authorization.k8s.io/v1/role...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396068   19837 wrap.go:47] GET /api/v1/nodes?resourceVersion=4830599&t...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396083   19837 wrap.go:47] GET /api/v1/secrets?resourceVersion=4710803...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.396108   19837 wrap.go:47] GET /api/v1/namespaces?resourceVersion=4710...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: I0606 00:54:51.405097   19837 wrap.go:47] GET /api/v1/namespaces/default/endpoints/ku...3:5887]

Jun 06 00:54:51 mcwk8s03 kube-apiserver[19837]: E0606 00:54:51.408262   19837 controller.go:179] no master IPs were listed in storage...service

Jun 06 00:54:51 mcwk8s03 systemd[1]: Stopped Kubernetes API Server.

Hint: Some lines were ellipsized, use -l to show in full.

[root@mcwk8s03 /]#

执行命令已经有问题了

[root@mcwk8s03 /]# kubectl get svc

The connection to the server localhost:8080 was refused - did you specify the right host or port?

[root@mcwk8s03 /]# kubectl get nodes

The connection to the server localhost:8080 was refused - did you specify the right host or port?

[root@mcwk8s03 /]#

/var/log/message报错

Jun  6 00:58:11 mcwk8s03 kube-scheduler: E0606 00:58:11.720321  123920 reflector.go:125] k8s.io/client-go/informers/factory.go:133: 
Failed to list *v1.ReplicaSet: Get http://127.0.0.1:8080/apis/apps/v1/replicasets?limit=500&resourceVersion=0: dial tcp 127.0.0.1:8080: connect: connection refused

nodeip:nodeport的容器没有受到影响，还在运行

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:01:02 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]# date

Thu Jun  6 01:01:12 CST 2024

[root@mcwk8s03 mcwtest]#

还能正常访问

停掉schedule和controller-manager，pod可以正常提供服务

[root@mcwk8s03 mcwtest]# systemctl stop kube-scheduler.service

[root@mcwk8s03 mcwtest]# systemctl stop kube-controller-manager.service

[root@mcwk8s03 mcwtest]# date

Thu Jun  6 01:03:23 CST 2024

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:03:26 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]#

master没有kubelet和kube proxy

[root@mcwk8s03 mcwtest]# ps -ef|grep proxy

root     125098   1429  0 01:05 pts/0    00:00:00 grep --color=auto proxy

[root@mcwk8s03 mcwtest]# ps -ef|grep let

root     125106   1429  0 01:05 pts/0    00:00:00 grep --color=auto let

[root@mcwk8s03 mcwtest]#

【3】停掉node上的组件

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:06:41 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:06:44 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]#

既然master不影响已有pod的正常使用，那么先把apiserver的启动一下，方便看环境

我们master上只启动了apiserver组件的，启动之后就可以看到，pod等信息正常显示

[root@mcwk8s03 mcwtest]# systemctl start apiserver

Failed to start apiserver.service: Unit not found.

[root@mcwk8s03 mcwtest]# systemctl start kube-apiserver.service

[root@mcwk8s03 mcwtest]# kubectl get svc

NAME          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE

kubernetes    ClusterIP   10.2.0.1     <none>        443/TCP          584d

mcwtest-svc   NodePort    10.2.0.155   <none>        2024:33958/TCP   26h

nginx         ClusterIP   None         <none>        80/TCP           414d

[root@mcwk8s03 mcwtest]# kubectl get nodes

NAME       STATUS   ROLES    AGE    VERSION

mcwk8s05   Ready    <none>   581d   v1.15.12

mcwk8s06   Ready    <none>   581d   v1.15.12

[root@mcwk8s03 mcwtest]#

查看组件状态，也就是这三个，不影响已有pod的nodeip:nodeport方式的访问。

[root@mcwk8s03 mcwtest]# kubectl get cs

NAME                 STATUS      MESSAGE                                                                                     ERROR

scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused

controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused

etcd-1               Healthy     {"health":"true"}

etcd-2               Healthy     {"health":"true"}

etcd-0               Healthy     {"health":"true"}

[root@mcwk8s03 mcwtest]#

我们master看下，找个clusterIP

[root@mcwk8s03 mcwtest]# kubectl get svc

NAME          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE

kubernetes    ClusterIP   10.2.0.1     <none>        443/TCP          584d

mcwtest-svc   NodePort    10.2.0.155   <none>        2024:33958/TCP   26h

nginx         ClusterIP   None         <none>        80/TCP           414d

[root@mcwk8s03 mcwtest]#

然后去node上访问一下，也是可以正常访问的。为啥不在master请求clusterIP：port，这是因为没有ipvsadm规则，master没有部署kubeproxy这些node上的服务把

[root@mcwk8s05 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:12:34 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]#

停掉05和06上的kubelet服务

[root@mcwk8s05 ~]# systemctl stop kubelet.service

[root@mcwk8s05 ~]# 

[root@mcwk8s06 ~]# systemctl stop kubelet.service

[root@mcwk8s06 ~]#

容器的nodeip:nodeport访问还是正常的，clusterIP：port访问也正常

[root@mcwk8s05 ~]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:19:13 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:19:17 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:19:21 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]#

下面开始停止kubeproxy服务,先停止06节点的

[root@mcwk8s06 ~]# systemctl stop kube-proxy.service

[root@mcwk8s06 ~]#

06节点的nodeip:nodeport依然可以访问这个服务

[root@mcwk8s05 ~]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:23:41 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:23:44 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]#

可以看到，06上路由和ipvs规则都还在

[root@mcwk8s06 ~]# ipvsadm -Ln|grep -C 2 10.0.0.36

  -> 172.17.9.2:9100              Masq    1      0          0

  -> 172.17.89.2:9100             Masq    1      0          0

TCP  10.0.0.36:31672 rr

  -> 172.17.9.2:9100              Masq    1      0          0

  -> 172.17.89.2:9100             Masq    1      0          0

TCP  10.0.0.36:33958 rr

  -> 172.17.89.10:20000           Masq    1      0          1

TCP  10.0.0.36:46735 rr

  -> 172.17.89.13:3000            Masq    1      0          0

TCP  10.2.0.1:443 rr

--

TCP  172.17.9.1:46735 rr

  -> 172.17.89.13:3000            Masq    1      0          0

TCP  10.0.0.36:30001 rr

  -> 172.17.89.5:8443             Masq    1      0          0

TCP  10.0.0.36:30003 rr

  -> 172.17.89.4:9090             Masq    1      0          0

TCP  10.2.0.155:2024 rr

[root@mcwk8s06 ~]#

[root@mcwk8s06 ~]# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         10.0.0.254      0.0.0.0         UG    100    0        0 eth0

10.0.0.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0

172.17.9.0      0.0.0.0         255.255.255.0   U     0      0        0 docker0

172.17.83.0     172.17.83.0     255.255.255.0   UG    0      0        0 flannel.1

172.17.89.0     172.17.89.0     255.255.255.0   UG    0      0        0 flannel.1

[root@mcwk8s06 ~]#

pod在05机器上，把05的kube-proxy关掉，已经有的pod，也是不影响使用，目前两个node的都停掉了

[root@mcwk8s05 /]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:06 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:34 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:37 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]#

[root@mcwk8s05 /]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:06 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:34 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:26:37 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 /]#

【4】停掉flannel服务。下面有个重要的分析容器通信的过程案例

停止前查看服务情况，有去两个宿主机容器网段的路由。并且nodeip:nodeport是通的。nodeip:nodeport。master上没有ipvsadm规则，因此clusterIP：port就没法找到转到对应的容器IP：容器port，但是nodeip:nodeport，不需要ipvs规则找转发的后端，直接通过nodeip通信，访问nodeport，然后到了指定的nodeip:nodeport之后，再根据ipvsadm转发规则，转发给对应的容器IP：容器Port。如果这个容器是在刚刚请求的nodeip主机上，那么直接通过docker0通信找到容器IP：容器port；如果不在这个机器，那么再通过容器跨宿主机通信的方式，再进行找到对应的宿主机，然后找到宿主机上对应的容器IP。

[root@mcwk8s03 mcwtest]# systemctl status flanneld.service

● flanneld.service - Flanneld overlay address etcd agent

   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)

   Active: active (running) since Tue 2024-06-04 23:28:50 CST; 1 day 2h ago

 Main PID: 11892 (flanneld)

   Memory: 13.9M

   CGroup: /system.slice/flanneld.service

           └─11892 /opt/kubernetes/bin/flanneld --ip-masq --etcd-endpoints=https://10.0.0.33:2379,https://10.0.0.35:2379,https://10.0.0.36:2...

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

[root@mcwk8s03 mcwtest]#

[root@mcwk8s03 mcwtest]#

[root@mcwk8s03 mcwtest]# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         10.0.0.254      0.0.0.0         UG    100    0        0 eth0

10.0.0.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0

172.17.9.0      172.17.9.0      255.255.255.0   UG    0      0        0 flannel.1

172.17.83.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0

172.17.89.0     172.17.89.0     255.255.255.0   UG    0      0        0 flannel.1

[root@mcwk8s03 mcwtest]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.83.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::b083:33ff:fe7b:fd37  prefixlen 64  scopeid 0x20<link>

        ether b2:83:33:7b:fd:37  txqueuelen 0  (Ethernet)

        RX packets 11  bytes 924 (924.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 11  bytes 924 (924.0 B)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s03 mcwtest]# ifconfig docker

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500

        inet 172.17.83.1  netmask 255.255.255.0  broadcast 172.17.83.255

        ether 02:42:e9:a4:51:4f  txqueuelen 0  (Ethernet)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 0  bytes 0 (0.0 B)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s03 mcwtest]# curl 10.2.0.155:2024

^C

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:30:23 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]#

我们停止03master上的flannel，发现还是正常用nodeip:nodeport访问已有的容器服务的，路由还在，网卡网段还没变化。

[root@mcwk8s03 mcwtest]# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         10.0.0.254      0.0.0.0         UG    100    0        0 eth0

10.0.0.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0

172.17.9.0      172.17.9.0      255.255.255.0   UG    0      0        0 flannel.1

172.17.83.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0

172.17.89.0     172.17.89.0     255.255.255.0   UG    0      0        0 flannel.1

[root@mcwk8s03 mcwtest]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.83.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::b083:33ff:fe7b:fd37  prefixlen 64  scopeid 0x20<link>

        ether b2:83:33:7b:fd:37  txqueuelen 0  (Ethernet)

        RX packets 11  bytes 924 (924.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 11  bytes 924 (924.0 B)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s03 mcwtest]# ifconfig docker

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500

        inet 172.17.83.1  netmask 255.255.255.0  broadcast 172.17.83.255

        ether 02:42:e9:a4:51:4f  txqueuelen 0  (Ethernet)

        RX packets 0  bytes 0 (0.0 B)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 0  bytes 0 (0.0 B)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s03 mcwtest]#

下面我们查看下05 node的信息，准备关闭这个节点的flannel服务，正常访问pod服务

[root@mcwk8s05 ~]# systemctl status flanneld.service

● flanneld.service - Flanneld overlay address etcd agent

   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; disabled; vendor preset: disabled)

   Active: active (running) since Wed 2024-06-05 00:50:25 CST; 24h ago

  Process: 4201 ExecStartPost=/opt/kubernetes/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/subnet.env (code=exited, status=0/SUCCESS)

 Main PID: 4184 (flanneld)

   Memory: 16.2M

   CGroup: /system.slice/flanneld.service

           └─4184 /opt/kubernetes/bin/flanneld --ip-masq --etcd-endpoints=https://10.0.0.33:2379,https://10.0.0.35:2379,https://10.0.0.36:2379 -etcd-cafile=/opt/etcd/ssl/ca.pem -etcd-cert...

Jun 05 23:50:25 mcwk8s05 flanneld[4184]: I0605 23:50:25.459783    4184 main.go:388] Lease renewed, new expiration: 2024-06-06 15:50:25.409920414 +0000 UTC

Jun 05 23:50:25 mcwk8s05 flanneld[4184]: I0605 23:50:25.459933    4184 main.go:396] Waiting for 22h59m59.949994764s to renew lease

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

[root@mcwk8s05 ~]# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         10.0.0.254      0.0.0.0         UG    100    0        0 eth0

10.0.0.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0

172.17.9.0      172.17.9.0      255.255.255.0   UG    0      0        0 flannel.1

172.17.83.0     172.17.83.0     255.255.255.0   UG    0      0        0 flannel.1

172.17.89.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0

[root@mcwk8s05 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:46:55 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:47:00 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:47:04 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::3470:76ff:feea:39b8  prefixlen 64  scopeid 0x20<link>

        ether 36:70:76:ea:39:b8  txqueuelen 0  (Ethernet)

        RX packets 78  bytes 5468 (5.3 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 86  bytes 7144 (6.9 KiB)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]# ifconfig docker

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.1  netmask 255.255.255.0  broadcast 172.17.89.255

        inet6 fe80::42:18ff:fee1:e8fc  prefixlen 64  scopeid 0x20<link>

        ether 02:42:18:e1:e8:fc  txqueuelen 0  (Ethernet)

        RX packets 1212277  bytes 505391601 (481.9 MiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 1374361  bytes 1966293845 (1.8 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]#

停掉两个node上的flannel服务

[root@mcwk8s05 ~]# systemctl stop flanneld.service

[root@mcwk8s05 ~]# 

[root@mcwk8s06 ~]# systemctl stop flanneld.service

[root@mcwk8s06 ~]#

查看停掉所有flannel之后，05 node的信息。可以看的，依然可以访问容器服务，通过nodeip:nodeport或者是clusterip:port 。至此，k8s组件和网络插件的停止，目前来看对原有的pod是没有受到影响的。也就是，pod创建好之后，正常提供服务不会依赖k8s组件，网络组件等，也就是只有pod创建的时候，会依赖这些组件创建容器，创建好容器之后，就能正常使用，不依赖组件。

[root@mcwk8s05 ~]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:49:28 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl  -I 10.0.0.36:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:49:33 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 17:49:50 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s05 ~]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::3470:76ff:feea:39b8  prefixlen 64  scopeid 0x20<link>

        ether 36:70:76:ea:39:b8  txqueuelen 0  (Ethernet)

        RX packets 83  bytes 5816 (5.6 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 91  bytes 7576 (7.3 KiB)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]# ifconfig docker

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.1  netmask 255.255.255.0  broadcast 172.17.89.255

        inet6 fe80::42:18ff:fee1:e8fc  prefixlen 64  scopeid 0x20<link>

        ether 02:42:18:e1:e8:fc  txqueuelen 0  (Ethernet)

        RX packets 1213264  bytes 505536182 (482.1 MiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 1375186  bytes 1967017624 (1.8 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]#

此时在master上查看etcd中网络信息，现在将flannel网络服务启动，会不会改变该宿主机的容器网段，从而对现有的容器产生影响，导致网络问题呢。

[root@mcwk8s03 mcwtest]# etcdctl ls /

/coreos.com

[root@mcwk8s03 mcwtest]# etcdctl ls /coreos.com

/coreos.com/network

[root@mcwk8s03 mcwtest]# etcdctl ls /coreos.com/network

/coreos.com/network/config

/coreos.com/network/subnets

[root@mcwk8s03 mcwtest]# etcdctl ls /coreos.com/network/subnets/

/coreos.com/network/subnets/172.17.83.0-24

/coreos.com/network/subnets/172.17.9.0-24

/coreos.com/network/subnets/172.17.89.0-24

[root@mcwk8s03 mcwtest]# etcdctl ls /coreos.com/network/config

/coreos.com/network/config

[root@mcwk8s03 mcwtest]# etcdctl get /coreos.com/network/config

{ "Network": "172.17.0.0/16", "Backend": {"Type": "vxlan"}}

[root@mcwk8s03 mcwtest]#

网络服务启动之后，并没有改变宿主机容器网段

[root@mcwk8s05 ~]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::3470:76ff:feea:39b8  prefixlen 64  scopeid 0x20<link>

        ether 36:70:76:ea:39:b8  txqueuelen 0  (Ethernet)

        RX packets 83  bytes 5816 (5.6 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 91  bytes 7576 (7.3 KiB)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]# ifconfig docker

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.1  netmask 255.255.255.0  broadcast 172.17.89.255

        inet6 fe80::42:18ff:fee1:e8fc  prefixlen 64  scopeid 0x20<link>

        ether 02:42:18:e1:e8:fc  txqueuelen 0  (Ethernet)

        RX packets 1213264  bytes 505536182 (482.1 MiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 1375186  bytes 1967017624 (1.8 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]#

[root@mcwk8s05 ~]# systemctl status flanneld.service

● flanneld.service - Flanneld overlay address etcd agent

   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; disabled; vendor preset: disabled)

   Active: inactive (dead)

Jun 05 23:50:25 mcwk8s05 flanneld[4184]: I0605 23:50:25.459783    4184 main.go:388] Lease renewed, new expiration: 2024-06-06 15:50:25.409920414 +0000 UTC

Jun 05 23:50:25 mcwk8s05 flanneld[4184]: I0605 23:50:25.459933    4184 main.go:396] Waiting for 22h59m59.949994764s to renew lease

Jun 06 01:48:25 mcwk8s05 systemd[1]: Stopping Flanneld overlay address etcd agent...

Jun 06 01:48:25 mcwk8s05 flanneld[4184]: I0606 01:48:25.715882    4184 main.go:404] Stopped monitoring lease

Jun 06 01:48:25 mcwk8s05 flanneld[4184]: I0606 01:48:25.716035    4184 main.go:322] Waiting for all goroutines to exit

Jun 06 01:48:25 mcwk8s05 flanneld[4184]: I0606 01:48:25.741648    4184 main.go:337] shutdownHandler sent cancel signal...

Jun 06 01:48:25 mcwk8s05 flanneld[4184]: I0606 01:48:25.759081    4184 main.go:325] Exiting cleanly...

Jun 06 01:48:25 mcwk8s05 systemd[1]: Stopped Flanneld overlay address etcd agent.

[root@mcwk8s05 ~]# systemctl start flanneld.service

[root@mcwk8s05 ~]# ifconfig flannel.1

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.0  netmask 255.255.255.255  broadcast 0.0.0.0

        inet6 fe80::3470:76ff:feea:39b8  prefixlen 64  scopeid 0x20<link>

        ether 36:70:76:ea:39:b8  txqueuelen 0  (Ethernet)

        RX packets 83  bytes 5816 (5.6 KiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 91  bytes 7576 (7.3 KiB)

        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]# ifconfig docker

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450

        inet 172.17.89.1  netmask 255.255.255.0  broadcast 172.17.89.255

        inet6 fe80::42:18ff:fee1:e8fc  prefixlen 64  scopeid 0x20<link>

        ether 02:42:18:e1:e8:fc  txqueuelen 0  (Ethernet)

        RX packets 1216321  bytes 505986272 (482.5 MiB)

        RX errors 0  dropped 0  overruns 0  frame 0

        TX packets 1377747  bytes 1969464945 (1.8 GiB)

        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@mcwk8s05 ~]# route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         10.0.0.254      0.0.0.0         UG    100    0        0 eth0

10.0.0.0        0.0.0.0         255.255.255.0   U     100    0        0 eth0

172.17.9.0      172.17.9.0      255.255.255.0   UG    0      0        0 flannel.1

172.17.83.0     172.17.83.0     255.255.255.0   UG    0      0        0 flannel.1

172.17.89.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0

[root@mcwk8s05 ~]#

并且在其它节点，依然可以正常访问到容器服务

[root@mcwk8s03 mcwtest]# curl  -I 10.0.0.35:33958

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 18:01:00 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s03 mcwtest]#

[root@mcwk8s06 ~]# systemctl stop flanneld.service

[root@mcwk8s06 ~]# curl -I 10.2.0.155:2024

HTTP/1.0 200 OK

Server: SimpleHTTP/0.6 Python/2.7.18

Date: Wed, 05 Jun 2024 18:02:02 GMT

Content-type: text/html; charset=ANSI_X3.4-1968

Content-Length: 816

[root@mcwk8s06 ~]#

综上可知：

容器创建销毁等等，需要用到k8s组件，网络插件等等。但是已经创建的容器，网络已经在宿主机上存在了，把K8S组件和网络插件停止，不影响同宿主机容器间通信，以及不影响这些容器跨主机通信。也就是已有的容器服务，clusterip：port和nodeip:nodeport等方式去访问容器应用，还是正常提供服务可以访问到的。并且我这里停止flannel之后，然后再启动，这个宿主机的容器网段还是原来的，容器网关也是原来的没有变动，etcd保存的网段也是没有变动。所以重新启动网络插件，没有使得已有的容器网段发生改变，因此有个疑问，之前为什么重启flannel服务，会让宿主机的容器网段发生改变呢，从而也要重启容器，使得所在宿主机的容器网段对应上呢