1. Architecture

OS: CentOS 7.6
Kernel: 3.10.0-957.el7.x86_64
Kubernetes: v1.14.1
Docker-ce: 18.09.5
Recommended hardware: 4 cores, 8 GB RAM
Keepalived keeps the apiserver IP (the VIP) highly available
Haproxy load-balances the apiservers

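2. Nodes

The test environment is six virtual machines. Kubernetes is installed with kubeadm, and etcd runs as a stacked member on each master (a static Pod created by kubeadm). kubelet and docker are managed by systemd, the network add-on is flannel, the masters are highly available, and RBAC is enabled. The masters do not run workload pods. Steps such as distributing certificates are run from node-01, which has SSH key login to every node in the cluster. The environment is as follows:

hostname  ip            components                                                          memory  cpu
node-01   172.19.8.111  kube-apiserver, kube-controller-manager, etcd, haproxy, keepalived  8G      4c
node-02   172.19.8.112  kube-apiserver, kube-controller-manager, etcd, haproxy, keepalived  8G      4c
node-03   172.19.8.113  kube-apiserver, kube-controller-manager, etcd                       8G      4c
node-04   172.19.8.114  node                                                                8G      4c
node-05   172.19.8.115  node                                                                8G      4c
node-06   172.19.8.116  node                                                                8G      4c
VIP       172.19.8.250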

3. Pre-deployment preparation

3.1 Disable the firewall and SELinux

[root@node-01 ~]# sed -ri 's#(SELINUX=).*#\1disabled#' /etc/selinux/config
[root@node-01 ~]# setenforce 0
[root@node-01 ~]# systemctl disable firewalld
[root@node-01 ~]# systemctl stop firewalld

3.2 Disable swap

[root@node-01 ~]# swapoff -a
Note: also edit /etc/fstab and comment out the swap entry, so that swap stays off after a reboot.
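The /etc/fstab edit can also be scripted. The following is a minimal sketch rather than the author's procedure: `disable_swap_entries` is a hypothetical helper name, and it assumes swap entries are the lines whose third whitespace-separated field is `swap`. The file path is an argument so the function can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out active swap entries in an fstab-style file.
disable_swap_entries() {
  local fstab="$1" tmp
  tmp="$(mktemp)" || return 1
  # Prefix '#' to non-comment lines whose filesystem type (field 3) is "swap";
  # every other line is passed through unchanged.
  awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }' \
    "$fstab" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through cat so the original file keeps its owner and mode.
  cat "$tmp" > "$fstab"
  rm -f "$tmp"
}

# Example (as root, after taking a backup):
#   cp /etc/fstab /etc/fstab.bak && disable_swap_entries /etc/fstab
```

Running it twice is harmless, because lines that are already commented out are skipped.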

3.3 Add host records

[root@node-01 ~]# cat >>/etc/hosts<<EOF
172.19.8.111 node-01
172.19.8.112 node-02
172.19.8.113 node-03
172.19.8.114 node-04
172.19.8.115 node-05
172.19.8.116 node-06
EOF
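Because the heredoc above appends with `>>`, running it a second time duplicates every entry. A minimal sketch of a rerunnable variant follows; `add_host_entry` and `add_cluster_hosts` are hypothetical helper names, and the hosts file path is a parameter so the functions can be tried on a scratch file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "ip name" to a hosts file only if that exact
# entry is not already present, so the step can be rerun safely.
add_host_entry() {
  local hosts_file="$1" ip="$2" name="$3"
  # -x matches whole lines, -F treats the entry as a fixed string.
  grep -qxF "$ip $name" "$hosts_file" 2>/dev/null ||
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Add node-01..node-06 (172.19.8.111..116, the addresses used in this article).
add_cluster_hosts() {
  local hosts_file="$1" i
  for i in 1 2 3 4 5 6; do
    add_host_entry "$hosts_file" "172.19.8.$((110 + i))" "node-0$i"
  done
}

# Example (as root): add_cluster_hosts /etc/hosts
```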

3.4 Set up SSH so that node-01 can log in to the other servers without a password

[root@node-01 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:uckCmzy46SfU6Lq9jRbugn0U8vQsr5H+PtfGBsvrfCA root@node-01
The key's randomart image is:
+---[RSA 2048]----+
| |
| |
| |
| . o . |
| *.+ S |
| +o==E.oo |
|.=.oBo.o+* |
|o.**oooo+ * |
|oBO=++o++= |
+----[SHA256]-----+

Distribute node-01's public key for passwordless login to the other servers


[root@node-01 ~]# for n in `seq -w 01 06`;do ssh-copy-id node-$n;done

3.5 Configure kernel parameters. If they have not taken effect, reboot the server; otherwise the initialization later will report an error.

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
sysctl --system

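Error handling: the error below means the bridge network settings do not exist yet. The /proc/sys/net/bridge keys only appear once the br_netfilter kernel module is loaded, which happens after docker is installed and started, or after running modprobe br_netfilter by hand.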

[root@node-01 ~]# sysctl -p /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
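To see in advance which keys would fail like this, the conf file can be compared against /proc/sys. This is a minimal sketch; `missing_sysctl_keys` is a hypothetical helper name, and the second argument (the root to check against) exists only so the function can be exercised outside a real /proc/sys.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every key from a sysctl conf file that has no
# matching entry under the given root (default /proc/sys).
missing_sysctl_keys() {
  local conf="$1" root="${2:-/proc/sys}" key path
  # Drop comments, keep only the key left of '=', trim, and skip blank lines.
  sed -e 's/#.*//' -e 's/[[:space:]]*=.*//' -e 's/^[[:space:]]*//' \
      -e '/^$/d' "$conf" |
  while read -r key; do
    # A sysctl key maps to a path by turning dots into slashes.
    path="$root/$(printf '%s' "$key" | tr . /)"
    [ -e "$path" ] || printf '%s\n' "$key"
  done
}

# Example: missing_sysctl_keys /etc/sysctl.d/k8s.conf
# If net.bridge.* keys are listed, load the module first: modprobe br_netfilter
```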

3.6 If kube-proxy uses ipvs mode, load the ipvs modules

cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
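The `lsmod | grep` at the end of the command above prints whatever matched, so a module that failed to load is easy to overlook. A minimal sketch that lists only the missing ones follows; `missing_ipvs_modules` is a hypothetical helper name, and the module list is the one from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsmod` output on stdin and print each required
# module that is not loaded.
missing_ipvs_modules() {
  local loaded mod
  # First column of lsmod is the module name; line 1 is the header.
  loaded="$(awk 'NR > 1 { print $1 }')"
  for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
    printf '%s\n' "$loaded" | grep -qx "$mod" || printf '%s\n' "$mod"
  done
}

# Example: lsmod | missing_ipvs_modules
```

Empty output means all five modules are present.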

3.7 Add the yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

The Google repository may be unreachable from mainland China; the Aliyun mirror can be used instead

$ cat << EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
wget http://mirrors.aliyun.com/repo/Centos-7.repo -O /etc/yum.repos.d/CentOS-Base.repo
wget http://mirrors.aliyun.com/repo/epel-7.repo -O /etc/yum.repos.d/epel.repo

The steps above must be run on every node.

4. Deploy keepalived and haproxy

4.1 Install keepalived and haproxy on node-01 and node-02

$ yum install -y keepalived haproxy

4.2 Configure keepalived

node-01 configuration

[root@node-01 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        995958026@qq.com
    }
    notification_email_from keepalived@ptmind.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node-01
}
vrrp_script check_apiserver {
    script "/workspace/crontab/check_apiserver"
    interval 5
    weight -20
    fall 3
    rise 1
}
vrrp_instance VIP_250 {
    state MASTER
    interface eth0
    virtual_router_id 250
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 890iop
    }
    track_script {
        check_apiserver
    }
    virtual_ipaddress {
        172.19.8.250
    }
}

Health-check script

$ cat /workspace/crontab/check_apiserver
#!/bin/bash
curl 127.0.0.1:8080 &>/dev/null
if [ $? -eq 0 ];then
exit 0
else
#systemctl stop keepalived
exit 1
fi
$ chmod 755 /workspace/crontab/check_apiserver
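The script above probes the apiserver's insecure port 8080. kubeadm typically starts the apiserver with that port disabled (`--insecure-port=0`), in which case the probe always fails. A minimal sketch of a variant that probes the secure port follows. It rests on an assumption that should be verified on your own cluster: that https://127.0.0.1:6443/healthz answers unauthenticated requests with the body `ok`. The URL parameter is hypothetical and only there to make the function easy to try.

```shell
#!/usr/bin/env bash
# Hypothetical variant of check_apiserver: succeed only if the health
# endpoint returns the body "ok" within 3 seconds.
check_apiserver() {
  local url="${1:-https://127.0.0.1:6443/healthz}" body
  # -k skips certificate verification (assumption: acceptable for a
  # loopback probe); -s silences progress; -f fails on HTTP errors.
  body="$(curl -ksf --max-time 3 "$url" 2>/dev/null)" || return 1
  [ "$body" = "ok" ]
}

# As a keepalived check script, the file would end with:
#   check_apiserver; exit $?
```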

node-02 configuration

[root@node-02 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        435002493@qq.com
    }
    notification_email_from keepalived@ptmind.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node-02
}
vrrp_instance VI_250 {
    state BACKUP
    interface eth0
    virtual_router_id 250
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 890iop
    }
    virtual_ipaddress {
        172.19.8.250
    }
}
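The two configurations work together through their priorities: node-01 starts at 100, node-02 at 90, and node-01's track_script has weight -20. With a negative weight, keepalived lowers the priority by that amount while the check is failing. The arithmetic can be sketched as follows; `effective_priority` is a hypothetical helper used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: effective VRRP priority for a negative-weight
# track_script. script_ok is 1 when the check passes, 0 when it has failed.
effective_priority() {
  local base="$1" weight="$2" script_ok="$3"
  if [ "$script_ok" -eq 1 ]; then
    echo "$base"
  else
    # weight is negative here, so this subtracts from the base priority.
    echo $(( base + weight ))
  fi
}

node01_failed="$(effective_priority 100 -20 0)"   # node-01 with apiserver down
node02=90                                          # node-02 has no track_script
if [ "$node01_failed" -lt "$node02" ]; then
  echo "VIP moves to node-02 ($node01_failed < $node02)"
fi
```

If the weight were smaller in magnitude than the 10-point gap between the two priorities, a failed check on node-01 would not move the VIP.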

4.3 Configure haproxy

The haproxy configuration is identical on node-01 and node-02. haproxy listens on port 8443, reached through the VIP 172.19.8.250, because it runs on the same servers as the k8s apiserver and both cannot use port 6443.
[root@node-01 ~]# cat /etc/haproxy/haproxy.cfg
global
    chroot /var/lib/haproxy
    daemon
    group haproxy
    user haproxy
    # log warning
    pidfile /var/lib/haproxy.pid
    maxconn 20000
    spread-checks 3
    nbproc 8

defaults
    log global
    mode tcp
    retries 3
    option redispatch

listen https-apiserver
    bind 0.0.0.0:8443
    mode tcp
    balance roundrobin
    timeout server 900s
    timeout connect 15s

    server apiserver01 172.19.8.111:6443 check port 6443 inter 5000 fall 5
    server apiserver02 172.19.8.112:6443 check port 6443 inter 5000 fall 5
    server apiserver03 172.19.8.113:6443 check port 6443 inter 5000 fall 5
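Before starting haproxy it is worth confirming that all three apiservers are really declared as backends, since a `server` line that gets merged into the previous line is silently lost. A minimal sketch follows; `list_backends` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "name address" for every server line in an
# haproxy configuration file.
list_backends() {
  awk '$1 == "server" { print $2, $3 }' "$1"
}

# Example: confirm each apiserver address is the one you expect, then let
# haproxy validate the whole file before (re)starting it:
#   list_backends /etc/haproxy/haproxy.cfg
#   haproxy -c -f /etc/haproxy/haproxy.cfg
```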

4.4 Start the services

systemctl enable keepalived && systemctl start keepalived
systemctl enable haproxy && systemctl start haproxy

5 Install docker

kubeadm has requirements on the docker version, so install a version that matches kubeadm. This article uses docker-ce.

yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce
# /etc/docker may not exist before docker has been started once
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Restart Docker
systemctl daemon-reload
systemctl restart docker
systemctl enable docker

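6 Install kubectl and kubeadm

kubelet is pinned to the same version here; otherwise yum pulls in the newest kubelet as a dependency of kubeadm.

yum -y install kubelet-1.14.1 kubeadm-1.14.1 kubectl-1.14.1 --disableexcludes=kubernetes

Enable kubelet at boot

systemctl enable kubelet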

7 Configuration

7.1 Edit the initialization configuration

Use kubeadm config print init-defaults > kubeadm-init.yaml to print the default configuration, then modify it for your environment.

[root@node-01 ~]# kubeadm config print init-defaults > kubeadm-init.yaml
[root@node-01 ~]# cat kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta1
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.19.8.111
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: node-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: k8s-test
controlPlaneEndpoint: "172.19.8.250:8443"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.14.0
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: 10.245.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"

The last document above (KubeProxyConfiguration) switches kube-proxy to ipvs mode. The default is iptables mode; if you use iptables, that document can be omitted.
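The podSubnet and serviceSubnet chosen in this file should not overlap each other or the host network (172.19.8.0/24 here). A minimal sketch of an overlap check in plain bash follows; `ip_to_int` and `cidrs_overlap` are hypothetical helper names, and the check handles IPv4 only.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: return 0 if two IPv4 CIDR blocks overlap, 1 otherwise.
cidrs_overlap() {
  local ip1="${1%/*}" len1="${1#*/}" ip2="${2%/*}" len2="${2#*/}" len mask
  # Two blocks overlap exactly when they agree on the shorter prefix.
  len=$(( len1 < len2 ? len1 : len2 ))
  mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip1") & mask )) -eq $(( $(ip_to_int "$ip2") & mask )) ]
}

pod_subnet="10.244.0.0/16"       # networking.podSubnet from kubeadm-init.yaml
service_subnet="10.245.0.0/16"   # networking.serviceSubnet
if cidrs_overlap "$pod_subnet" "$service_subnet"; then
  echo "overlap: pick different ranges"
else
  echo "no overlap"
fi
```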

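About kube-proxy

In k8s, a group of pods providing the same service can be abstracted as a service. The service gives clients a single entry point, and each service has a virtual IP address (ClusterIP) and a port for clients to connect to.
kube-proxy runs on every node and implements the Service feature: it lets client pods inside the cluster reach a service, and lets hosts outside the cluster reach it through NodePort and similar mechanisms.
kube-proxy uses iptables mode by default and implements service load balancing through iptables rules on each node. As the number of services grows, iptables mode slows down noticeably because rules are matched linearly and updated as a whole.
IPVS is the core component of LVS and is a layer-4 load balancer. IPVS has the following characteristics:
Like iptables it is based on Netfilter, but it uses hash tables;
Supports TCP, UDP and SCTP, over IPv4 and IPv6;
Supports multiple load-balancing algorithms: rr, wrr, lc, wlc, sh, dh, lblc…
Supports session persistence;
LVS consists of two parts:
ipvs (ip virtual server): code running in kernel space that implements the scheduling. It is the core of the load balancing.
ipvsadm: runs in user space and writes the rules for the ipvs kernel framework, defining which address is the cluster service and which are the real back-end servers. Cluster services can be created with the ipvsadm command.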

7.2 Pre-pull the images

[root@node-01 ~]# kubeadm config images pull --config kubeadm-init.yaml
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.14.0
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.14.0
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.14.0
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.14.0
[config/images] Pulled k8s.gcr.io/pause:3.1
[config/images] Pulled k8s.gcr.io/etcd:3.3.10
[config/images] Pulled k8s.gcr.io/coredns:1.3.1

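7.2.1 If k8s.gcr.io is unreachable from your network (for example from mainland China), the pull may fail. In that case pull the images from a domestic mirror by hand and then retag them

Get the list of required images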

[root@node-01 ~]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.14.1
k8s.gcr.io/kube-controller-manager:v1.14.1
k8s.gcr.io/kube-scheduler:v1.14.1
k8s.gcr.io/kube-proxy:v1.14.1
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1

Pull the images from the Aliyun mirror and retag them with their k8s.gcr.io names

#!/bin/bash
images=(
kube-apiserver:v1.14.1
kube-controller-manager:v1.14.1
kube-scheduler:v1.14.1
kube-proxy:v1.14.1
pause:3.1
etcd:3.3.10
coredns:1.3.1
kubernetes-dashboard-amd64:v1.10.1
heapster-influxdb-amd64:v1.3.3
heapster-amd64:v1.4.2
)
for imageName in "${images[@]}" ; do
    docker pull "registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName"
    docker tag "registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName" "k8s.gcr.io/$imageName"
done

The images must be pulled on every node.
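Instead of hardcoding the image list, the names can be taken from kubeadm itself. This is a minimal sketch, assuming the mirror carries every image under the same name and tag (the script above makes the same assumption); `mirror_name` and `pull_via_mirror` are hypothetical helper names.

```shell
#!/usr/bin/env bash
MIRROR="registry.cn-hangzhou.aliyuncs.com/google_containers"

# Hypothetical helper: map k8s.gcr.io/<name>:<tag> to the same image on the mirror.
mirror_name() {
  printf '%s/%s\n' "$MIRROR" "${1#k8s.gcr.io/}"
}

# Pull every image kubeadm needs through the mirror and retag it.
# Only defined here; call it on each node.
pull_via_mirror() {
  local image
  kubeadm config images list --config kubeadm-init.yaml | while read -r image; do
    docker pull "$(mirror_name "$image")" &&
      docker tag "$(mirror_name "$image")" "$image"
  done
}
```

Driving the loop from `kubeadm config images list` keeps the tags in step with the kubernetesVersion set in kubeadm-init.yaml.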

7.3 Initialization

Error: the kernel parameters were changed earlier but have not taken effect; a reboot is needed

[root@node-01 ~]# kubeadm init --config kubeadm-init.yaml
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Fix:

echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables

Or reboot the server.

Initialize again

[root@node-01 ~]# kubeadm init --config kubeadm-init.yaml
[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [node-01 localhost] and IPs [172.19.8.111 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [node-01 localhost] and IPs [172.19.8.111 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [node-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.245.0.1 172.19.8.111 172.19.8.250]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 13.502727 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node node-01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node node-01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 172.19.8.250:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:89accff8b4514d49be4b88906c50fdab4ba8a211788da7252b880c925af77671 \
    --experimental-control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.19.8.250:8443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:89accff8b4514d49be4b88906c50fdab4ba8a211788da7252b880c925af77671

An error you may encounter:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Error analysis: this case is hard to diagnose. There is no explicit error message, and the system log gives few clues. Some possible causes:
 1. The image pull failed. If k8s.gcr.io cannot be reached, switch to the Aliyun mirror by setting imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers in kubeadm-init.yaml.
 2. Check whether the containers started normally.
 3. The configured VIP is unreachable, so the apiserver cannot be contacted. Check the firewall configuration. This was the cause of my own initialization failure.
 4. After a failure, clear the initialization state with kubeadm reset, stop docker and restart the firewall. If etcd is external, the state of the previous cluster will still be visible and the etcd data must be deleted, for example with etcdctl del "" --prefix
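For cause 3, reachability of the VIP and of each apiserver can be checked from any node before rerunning kubeadm init. This is a minimal sketch; `port_open` is a hypothetical helper name, and it relies on bash's built-in /dev/tcp redirection together with coreutils' `timeout`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 3 seconds, non-zero otherwise.
port_open() {
  local host="$1" port="$2"
  # host and port are passed as $0 and $1 of the inner shell to avoid
  # interpolating them into the command string.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example: check the VIP through haproxy, then each apiserver directly.
#   port_open 172.19.8.250 8443 || echo "VIP 172.19.8.250:8443 unreachable"
#   for ip in 172.19.8.111 172.19.8.112 172.19.8.113; do
#     port_open "$ip" 6443 || echo "apiserver $ip:6443 unreachable"
#   done
```

Before the first kubeadm init nothing listens on 6443 yet, so only the check against the VIP port 8443 is meaningful at that stage, and it shows whether haproxy accepts connections on the VIP.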
 
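kubeadm init mainly performs the following steps:
[init]: initializes with the specified version.
[preflight]: runs the pre-initialization checks and downloads the required Docker images.
[kubelet-start]: generates the kubelet configuration file "/var/lib/kubelet/config.yaml". kubelet cannot start without this file, so kubelet actually fails to start before initialization.
[certs]: generates the certificates used by Kubernetes and stores them in /etc/kubernetes/pki.
[kubeconfig]: generates the kubeconfig files in /etc/kubernetes. The components use them to communicate with each other.
[control-plane]: installs the Master components from the YAML files in /etc/kubernetes/manifests.
[etcd]: installs etcd from /etc/kubernetes/manifests/etcd.yaml.
[wait-control-plane]: waits for the Master components deployed by control-plane to start.
[apiclient]: checks the status of the Master components.
[upload-config]: stores the configuration in a ConfigMap.
[kubelet]: configures kubelet through a ConfigMap.
[patchnode]: records the CRI socket information on the Node object as an annotation.
[mark-control-plane]: labels the current node with the master role and adds the NoSchedule taint, so that by default no Pods are scheduled on the Master node.
[bootstrap-token]: generates the token. Note it down; it is needed later when adding nodes with kubeadm join.
[addons]: installs the add-ons CoreDNS and kube-proxy.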

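7.4 Prepare the kubeconfig file for kubectl

kubectl looks for a config file in the .kube directory under the home directory of the user running it. Here the admin.conf generated in the [kubeconfig] step of the initialization is copied to .kube/config.
This file records the address of the API Server, so subsequent kubectl commands can connect to the API Server directly.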
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

7.5 Check component status

[root@node-01 ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
[root@node-01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
node-01 NotReady master 37m v1.14.1
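There is only one node so far. Its role is master and its status is NotReady, because no network plugin is installed yet.

7.6 Add the other master nodes

Copy the certificate files from node-01 to the other master nodes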
USER=root
CONTROL_PLANE_IPS="node-02 node-03"
for host in ${CONTROL_PLANE_IPS}; do
ssh "${USER}"@$host "mkdir -p /etc/kubernetes/pki/etcd"
scp /etc/kubernetes/pki/ca.* "${USER}"@$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/sa.* "${USER}"@$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/front-proxy-ca.* "${USER}"@$host:/etc/kubernetes/pki/
scp /etc/kubernetes/pki/etcd/ca.* "${USER}"@$host:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/admin.conf "${USER}"@$host:/etc/kubernetes/
done
Run the following on the other masters. Note the --experimental-control-plane flag, and use the exact command printed by your own kubeadm init.
[root@node-02 ~]# kubeadm join 172.19.8.250:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:30d13676940237d9c4f0c5c05e67cbeb58cc031f97e3515df27174e6cb777f60 \
--experimental-control-plane
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [node-02 localhost] and IPs [172.19.8.112 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [node-02 localhost] and IPs [172.19.8.112 127.0.0.1 ::1]
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [node-02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.245.0.1 172.19.8.112 172.19.8.250]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node node-02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node node-02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Note: the token has a limited lifetime. If the old token has expired, create a new one with kubeadm token create --print-join-command.
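When a join command has to be assembled by hand, both of its arguments can be checked or recomputed. This is a minimal sketch: `is_valid_token` and `ca_cert_hash` are hypothetical helper names, the token pattern is the documented bootstrap-token shape, and the hash pipeline assumes an RSA CA key at the default kubeadm path.

```shell
#!/usr/bin/env bash
# Return 0 if the argument has the bootstrap-token shape
# "<6 chars>.<16 chars>" using only lowercase letters and digits.
is_valid_token() {
  [[ "$1" =~ ^[a-z0-9]{6}\.[a-z0-9]{16}$ ]]
}

# Recompute the value for --discovery-token-ca-cert-hash from the cluster CA
# (assumes an RSA CA key; the default path is the one kubeadm writes).
ca_cert_hash() {
  local ca="${1:-/etc/kubernetes/pki/ca.crt}"
  openssl x509 -pubkey -in "$ca" |
    openssl rsa -pubin -outform der 2>/dev/null |
    openssl dgst -sha256 -hex | sed 's/^.* //'
}

# Example, on a master:
#   is_valid_token abcdef.0123456789abcdef && echo "sha256:$(ca_cert_hash)"
```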

Prepare the kubeconfig file for kubectl on node-02, in the same way as in 7.4:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@node-02 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
node-01 NotReady master 90m v1.14.1
node-02 NotReady master 36s v1.14.1

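7.7 Deploy the worker nodes

Run the following on node-04, node-05 and node-06. Note that the --experimental-control-plane flag is absent here, and use the exact command printed by your own kubeadm init.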

kubeadm join 172.19.8.250:8443 --token abcdef.0123456789abcdef     --discovery-token-ca-cert-hash sha256:89accff8b4514d49be4b88906c50fdab4ba8a211788da7252b880c925af77671

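7.8 Deploy the flannel network plugin

The masters are NotReady because no network plugin is installed yet, so the connection between the nodes and the masters is not fully working. Popular Kubernetes network plugins include Flannel, Calico, Canal and Weave; flannel is used here.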

[root@node-01 ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

This runs the flannel daemonset on every node.

7.9 Check node status; it takes a few seconds to change

[root@node-01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
node-01 Ready master 163m v1.14.1
node-02 Ready master 74m v1.14.1
node-03 Ready master 68m v1.14.1
node-04 Ready <none> 66m v1.14.1
node-05 Ready <none> 40m v1.14.1
node-06 Ready <none> 62m v1.14.1

Check the pods

[root@node-01 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-5hwwz 0/1 ContainerCreating 0 165m
coredns-fb8b8dccf-r6z4q 0/1 ContainerCreating 0 165m
etcd-node-01 1/1 Running 0 163m
etcd-node-02 1/1 Running 0 75m
etcd-node-03 1/1 Running 0 70m
kube-apiserver-node-01 1/1 Running 0 163m
kube-apiserver-node-02 1/1 Running 0 75m
kube-apiserver-node-03 1/1 Running 0 70m
kube-controller-manager-node-01 1/1 Running 1 163m
kube-controller-manager-node-02 1/1 Running 0 75m
kube-controller-manager-node-03 1/1 Running 0 70m
kube-flannel-ds-amd64-2p8cd 0/1 CrashLoopBackOff 3 110s
kube-flannel-ds-amd64-9rjm9 0/1 CrashLoopBackOff 3 110s
kube-flannel-ds-amd64-bvhdn 0/1 Error 4 110s
kube-flannel-ds-amd64-l7bzb 0/1 CrashLoopBackOff 3 110s
kube-flannel-ds-amd64-qb5h6 0/1 CrashLoopBackOff 3 110s
kube-flannel-ds-amd64-w2jvq 0/1 Error 4 110s
kube-proxy-57vgk 1/1 Running 0 63m
kube-proxy-gkz7g 1/1 Running 0 70m
kube-proxy-h2kcg 1/1 Running 0 67m
kube-proxy-lc5bj 1/1 Running 0 41m
kube-proxy-rmxjs 1/1 Running 0 165m
kube-proxy-wlfrx 1/1 Running 0 75m
kube-scheduler-node-01 1/1 Running 1 164m
kube-scheduler-node-02 1/1 Running 0 75m
kube-scheduler-node-03 1/1 Running 0 70m
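Note the errors above: the kube-flannel-ds pods are failing because networking.podSubnet was not configured in kubeadm-init.yaml. To reconfigure, run kubeadm reset on every node, run kubeadm init again, and redistribute the certificates.
Check again after the fix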
[root@node-01 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-6qsvj 1/1 Running 0 23m
coredns-fb8b8dccf-tvm9c 1/1 Running 0 23m
etcd-node-01 1/1 Running 0 22m
etcd-node-02 1/1 Running 0 10m
etcd-node-03 1/1 Running 0 10m
kube-apiserver-node-01 1/1 Running 0 22m
kube-apiserver-node-02 1/1 Running 0 10m
kube-apiserver-node-03 1/1 Running 0 8m55s
kube-controller-manager-node-01 1/1 Running 1 22m
kube-controller-manager-node-02 1/1 Running 0 10m
kube-controller-manager-node-03 1/1 Running 0 9m5s
kube-flannel-ds-amd64-49f8b 1/1 Running 0 6m41s
kube-flannel-ds-amd64-8vhc8 1/1 Running 0 6m41s
kube-flannel-ds-amd64-fhh85 1/1 Running 0 6m41s
kube-flannel-ds-amd64-hg27k 1/1 Running 0 6m41s
kube-flannel-ds-amd64-m6wxf 1/1 Running 0 6m41s
kube-flannel-ds-amd64-qqpnp 1/1 Running 0 6m41s
kube-proxy-6jhqr 1/1 Running 0 23m
kube-proxy-frsd8 1/1 Running 0 7m9s
kube-proxy-fstbk 1/1 Running 0 7m20s
kube-proxy-pk9qf 1/1 Running 0 10m
kube-proxy-pshmk 1/1 Running 0 10m
kube-proxy-tpbcm 1/1 Running 0 7m2s
kube-scheduler-node-01 1/1 Running 1 22m
kube-scheduler-node-02 1/1 Running 0 10m
kube-scheduler-node-03 1/1 Running 0 9m
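With this many pods, a filter that shows only the ones that are not Running makes problems such as the flannel failure easier to spot. A minimal sketch follows; `not_running` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pod -n <namespace>` (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pod` output on stdin and print the
# name and status of every pod whose STATUS column is not "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Example:
#   kubectl get pod -n kube-system | not_running
```

Empty output means every pod in the namespace is Running. With --all-namespaces the columns shift by one, so the field numbers would need adjusting.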

This completes the deployment of k8s with kubeadm.


A brief introduction to the calico network plugin

kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
wget https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

calico.yaml must be modified here. The file sets the pod network to "192.168.0.0/16", and this value has to match the one in kubeadm-init.yaml. In this article kubeadm-init.yaml sets podSubnet: "10.244.0.0/16", so calico.yaml needs to be changed accordingly.
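The change can be made with a single substitution. This is a minimal sketch, assuming the default pool appears in the manifest literally as 192.168.0.0/16; `set_calico_pool` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace calico's default pool 192.168.0.0/16 in a
# manifest with the given CIDR, keeping a .bak copy of the original.
set_calico_pool() {
  local manifest="$1" cidr="$2"
  # '|' is the sed delimiter because the CIDR itself contains '/'.
  sed -i.bak "s|192\.168\.0\.0/16|$cidr|g" "$manifest"
}

# Example: set_calico_pool calico.yaml 10.244.0.0/16
```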

Then run

kubectl apply -f calico.yaml
After the network plugin is installed, the status of the coredns pods shows whether it is working:
kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY     STATUS              RESTARTS   AGE
kube-system   calico-node-lxz4c                      0/2       ContainerCreating   0          4m
kube-system   coredns-78fcdf6894-7xwn7               0/1       Pending             0          5m
kube-system   coredns-78fcdf6894-c2pq8               0/1       Pending             0          5m
kube-system   etcd-iz948lz3o7sz                      1/1       Running             0          5m
kube-system   kube-apiserver-iz948lz3o7sz            1/1       Running             0          5m
kube-system   kube-controller-manager-iz948lz3o7sz   1/1       Running             0          5m
kube-system   kube-proxy-wcj2r                       1/1       Running             0          5m
kube-system   kube-scheduler-iz948lz3o7sz            1/1       Running             0          4m
 
# Note: coredns takes some time to start and is Pending at first
Wait for the coredns pods to become Running.
