Kubernetes: Cluster Backup and Restore
- [root@master ~]# ETCDCTL_API=3 etcdctl snapshot save /backup/k8s1/snapshot.db
- Snapshot saved at /backup/k8s1/snapshot.db
- [root@master ~]# du -h /backup/k8s1/snapshot.db
- 1.6M /backup/k8s1/snapshot.db
3: Copy the SSL files from the kubernetes directory
- [root@master ~]# cp /etc/kubernetes/ssl/* /backup/k8s1/
- [root@master ~]# ll /backup/k8s1/
- total 1628
- -rw-r--r--. 1 root root 1675 Dec 10 21:21 admin-key.pem
- -rw-r--r--. 1 root root 1391 Dec 10 21:21 admin.pem
- -rw-r--r--. 1 root root 997 Dec 10 21:21 aggregator-proxy.csr
- -rw-r--r--. 1 root root 219 Dec 10 21:21 aggregator-proxy-csr.json
- -rw-------. 1 root root 1675 Dec 10 21:21 aggregator-proxy-key.pem
- -rw-r--r--. 1 root root 1383 Dec 10 21:21 aggregator-proxy.pem
- -rw-r--r--. 1 root root 294 Dec 10 21:21 ca-config.json
- -rw-r--r--. 1 root root 1675 Dec 10 21:21 ca-key.pem
- -rw-r--r--. 1 root root 1350 Dec 10 21:21 ca.pem
- -rw-r--r--. 1 root root 1082 Dec 10 21:21 kubelet.csr
- -rw-r--r--. 1 root root 283 Dec 10 21:21 kubelet-csr.json
- -rw-------. 1 root root 1675 Dec 10 21:21 kubelet-key.pem
- -rw-r--r--. 1 root root 1452 Dec 10 21:21 kubelet.pem
- -rw-r--r--. 1 root root 1273 Dec 10 21:21 kubernetes.csr
- -rw-r--r--. 1 root root 488 Dec 10 21:21 kubernetes-csr.json
- -rw-------. 1 root root 1679 Dec 10 21:21 kubernetes-key.pem
- -rw-r--r--. 1 root root 1639 Dec 10 21:21 kubernetes.pem
- -rw-r--r--. 1 root root 1593376 Dec 10 21:32 snapshot.db
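The snapshot and certificate-copy steps above can be wrapped into one small script. What follows is only a minimal sketch, not the author's tooling: the helper names `backup_dir_name` and `backup_cluster` and the timestamped directory layout are assumptions made for illustration, while the `etcdctl snapshot save` call and the `/etc/kubernetes/ssl` path are the ones used above.

```shell
#!/bin/sh
# Sketch: snapshot etcd and copy the cluster certificates into one directory.
# backup_dir_name and backup_cluster are hypothetical helper names.

# Build a timestamped directory name under a base path.
# $1 = base directory, $2 = timestamp (defaults to now, YYYYmmdd-HHMMSS)
backup_dir_name() {
    base=${1%/}
    stamp=${2:-$(date +%Y%m%d-%H%M%S)}
    printf '%s/%s\n' "$base" "$stamp"
}

# Save an etcd v3 snapshot and copy the SSL files next to it.
# $1 = target directory, $2 = certificate directory (default /etc/kubernetes/ssl)
backup_cluster() {
    target=$1
    ssl_dir=${2:-/etc/kubernetes/ssl}
    mkdir -p "$target" || return 1
    # ETCDCTL_API=3 selects the v3 API, which provides the snapshot command.
    ETCDCTL_API=3 etcdctl snapshot save "$target/snapshot.db" || return 1
    cp "$ssl_dir"/* "$target"/ || return 1
    printf 'backup written to %s\n' "$target"
}
```

A possible invocation on the master would be `backup_cluster "$(backup_dir_name /backup)"`. Keeping each backup in its own timestamped directory avoids overwriting an earlier snapshot.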
4: Simulate a cluster crash by running the clean.yml cleanup playbook
[root@master ~]# cd /etc/ansible/
[root@master ansible]# ansible-playbook .clean.yml
- [root@master ansible]# ansible-playbook .prepare.yml
- [root@master ansible]# ansible-playbook .etcd.yml
- [root@master ansible]# ansible-playbook .docker.yml
- [root@master ansible]# ansible-playbook .kube-master.yml
- [root@master ansible]# ansible-playbook .kube-node.yml
3: Stop the etcd service
[root@master ansible]# ansible etcd -m service -a 'name=etcd state=stopped'
4: Clear the data
- [root@master ansible]# ansible etcd -m file -a 'name=/var/lib/etcd/member/ state=absent'
- [DEPRECATION WARNING]: The TRANSFORM_INVALID_GROUP_CHARS settings is set to allow bad characters in group names by default, this will change, but still be user
- configurable on deprecation. This feature will be removed in version 2.10. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
- [WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
- 192.168.1.203 | CHANGED => {
- "ansible_facts": {
- "discovered_interpreter_python": "/usr/bin/python"
- },
- "changed": true,
- "path": "/var/lib/etcd/member/",
- "state": "absent"
- }
- 192.168.1.202 | CHANGED => {
- "ansible_facts": {
- "discovered_interpreter_python": "/usr/bin/python"
- },
- "changed": true,
- "path": "/var/lib/etcd/member/",
- "state": "absent"
- }
- 192.168.1.200 | CHANGED => {
- "ansible_facts": {
- "discovered_interpreter_python": "/usr/bin/python"
- },
- "changed": true,
- "path": "/var/lib/etcd/member/",
- "state": "absent"
- }
4: Sync the backed-up etcd data files to every etcd node
- [root@master ansible]# for i in 202 203; do rsync -av /backup/k8s1 192.168.1.$i:/backup/; done
- sending incremental file list
- created directory /backup
- k8s1/
- k8s1/admin-key.pem
- k8s1/admin.pem
- k8s1/aggregator-proxy-csr.json
- k8s1/aggregator-proxy-key.pem
- k8s1/aggregator-proxy.csr
- k8s1/aggregator-proxy.pem
- k8s1/ca-config.json
- k8s1/ca-key.pem
- k8s1/ca.pem
- k8s1/kubelet-csr.json
- k8s1/kubelet-key.pem
- k8s1/kubelet.csr
- k8s1/kubelet.pem
- k8s1/kubernetes-csr.json
- k8s1/kubernetes-key.pem
- k8s1/kubernetes.csr
- k8s1/kubernetes.pem
- k8s1/snapshot.db
- sent ,, bytes received bytes ,239.60 bytes/sec
- total size is ,, speedup is 1.00
- sending incremental file list
- created directory /backup
- k8s1/
- k8s1/admin-key.pem
- k8s1/admin.pem
- k8s1/aggregator-proxy-csr.json
- k8s1/aggregator-proxy-key.pem
- k8s1/aggregator-proxy.csr
- k8s1/aggregator-proxy.pem
- k8s1/ca-config.json
- k8s1/ca-key.pem
- k8s1/ca.pem
- k8s1/kubelet-csr.json
- k8s1/kubelet-key.pem
- k8s1/kubelet.csr
- k8s1/kubelet.pem
- k8s1/kubernetes-csr.json
- k8s1/kubernetes-key.pem
- k8s1/kubernetes.csr
- k8s1/kubernetes.pem
- k8s1/snapshot.db
- sent ,, bytes received bytes ,,066.00 bytes/sec
- total size is ,, speedup is 1.00
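5: Run the following data restore on each etcd node, then restart etcd
## Note: in /etc/systemd/system/etcd.service, find --initial-cluster etcd1=https://xxxx:2380,etcd2=https://xxxx:2380,etcd3=https://xxxx:2380 and use that value for --initial-cluster in the restore command. Set --name to the current node's etcd node name, and set --initial-advertise-peer-urls to the current node's IP:2380.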
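Since only --name and --initial-advertise-peer-urls differ between nodes, the per-node command can be derived from the single --initial-cluster value. The sketch below illustrates that idea under stated assumptions: `peer_url_for` and `restore_cmd` are hypothetical helper names, and the command is printed rather than executed so it can be reviewed before running it on the node.

```shell
#!/bin/sh
# Sketch: derive the per-node restore arguments from one --initial-cluster value.
# peer_url_for and restore_cmd are hypothetical helper names.

# Print the peer URL that belongs to a node name.
# $1 = node name (e.g. etcd2), $2 = --initial-cluster value
peer_url_for() {
    printf '%s\n' "$2" | tr ',' '\n' | while IFS= read -r entry; do
        case $entry in
            "$1="*) printf '%s\n' "${entry#*=}" ;;
        esac
    done
}

# Print (not run) the restore command for one node.
# $1 = node name, $2 = --initial-cluster value, $3 = cluster token
restore_cmd() {
    url=$(peer_url_for "$1" "$2")
    [ -n "$url" ] || { echo "node $1 not found in cluster list" >&2; return 1; }
    printf 'ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --name %s --initial-cluster %s --initial-cluster-token %s --initial-advertise-peer-urls %s\n' \
        "$1" "$2" "$3" "$url"
}
```

For example, `restore_cmd etcd2 "$CLUSTER" etcd-cluster-0`, with `CLUSTER` holding the value copied from etcd.service, prints the command to run on the etcd2 node. Deriving the peer URL from the node name removes one place where a copy-paste slip between nodes can occur.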
① [On the deploy node]
- [root@master ansible]# cd /backup/k8s1/
- [root@master k8s1]# ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --name etcd1 --initial-cluster etcd1=https://192.168.1.200:2380,etcd2=https://192.168.1.202:2380,etcd3=https://192.168.1.203:2380 --initial-cluster-token etcd-cluster-0 --initial-advertise-peer-urls https://192.168.1.200:2380
- -- ::50.037127 I | mvcc: restore compact to
- -- ::50.052409 I | etcdserver/membership: added member 12229714d8728d0e [https://192.168.1.200:2380] to cluster b8ef796b710cde7d
- -- ::50.052451 I | etcdserver/membership: added member 552fb05951af50c9 [https://192.168.1.203:2380] to cluster b8ef796b710cde7d
- -- ::50.052474 I | etcdserver/membership: added member 8b4f4a6559bf7c2c [https://192.168.1.202:2380] to cluster b8ef796b710cde7d
After the step above, a [node-name].etcd directory is created in the current directory on that node.
- [root@master k8s1]# tree etcd1.etcd/
- etcd1.etcd/
- └── member
- ├── snap
- │ ├── -.snap
- │ └── db
- └── wal
- └── -.wal
- [root@master k8s1]# cp -r etcd1.etcd/member /var/lib/etcd/
- [root@master k8s1]# systemctl restart etcd
② [On the etcd2 node]
- [root@node1 ~]# cd /backup/k8s1/
- [root@node1 k8s1]# ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --name etcd2 --initial-cluster etcd1=https://192.168.1.200:2380,etcd2=https://192.168.1.202:2380,etcd3=https://192.168.1.203:2380 --initial-cluster-token etcd-cluster-0 --initial-advertise-peer-urls https://192.168.1.202:2380
- -- ::35.175032 I | mvcc: restore compact to
- -- ::35.232386 I | etcdserver/membership: added member 12229714d8728d0e [https://192.168.1.200:2380] to cluster b8ef796b710cde7d
- -- ::35.232507 I | etcdserver/membership: added member 552fb05951af50c9 [https://192.168.1.203:2380] to cluster b8ef796b710cde7d
- -- ::35.232541 I | etcdserver/membership: added member 8b4f4a6559bf7c2c [https://192.168.1.202:2380] to cluster b8ef796b710cde7d
- [root@node1 k8s1]# tree etcd2.etcd/
- etcd2.etcd/
- └── member
- ├── snap
- │ ├── -.snap
- │ └── db
- └── wal
- └── -.wal
- [root@node1 k8s1]# cp -r etcd2.etcd/member /var/lib/etcd/
- [root@node1 k8s1]# systemctl restart etcd
③ [On the etcd3 node]
- [root@node2 ~]# cd /backup/k8s1/
- [root@node2 k8s1]# ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --name etcd3 --initial-cluster etcd1=https://192.168.1.200:2380,etcd2=https://192.168.1.202:2380,etcd3=https://192.168.1.203:2380 --initial-cluster-token etcd-cluster-0 --initial-advertise-peer-urls https://192.168.1.203:2380
- -- ::55.943364 I | mvcc: restore compact to
- -- ::55.988674 I | etcdserver/membership: added member 12229714d8728d0e [https://192.168.1.200:2380] to cluster b8ef796b710cde7d
- -- ::55.988726 I | etcdserver/membership: added member 552fb05951af50c9 [https://192.168.1.203:2380] to cluster b8ef796b710cde7d
- -- ::55.988754 I | etcdserver/membership: added member 8b4f4a6559bf7c2c [https://192.168.1.202:2380] to cluster b8ef796b710cde7d
- [root@node2 k8s1]# tree etcd3.etcd/
- etcd3.etcd/
- └── member
- ├── snap
- │ ├── -.snap
- │ └── db
- └── wal
- └── -.wal
- [root@node2 k8s1]# cp -r etcd3.etcd/member /var/lib/etcd/
- [root@node2 k8s1]# systemctl restart etcd
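Once etcd has been restarted on all three nodes, it can help to confirm that the members are healthy before moving on. The following is a hedged sketch rather than part of the original procedure: `client_endpoints` and `health_cmd` are hypothetical helper names, client port 2379 is assumed (the etcd default; verify it against etcd.service), and the certificate paths are left as arguments because they depend on the deployment. The command is printed, not executed.

```shell
#!/bin/sh
# Sketch: turn the peer list into client endpoints and print a health-check command.
# client_endpoints and health_cmd are hypothetical helper names.

# Convert "name=https://ip:2380,..." into "https://ip:2379,...".
# $1 = --initial-cluster value, $2 = client port (assumed 2379)
client_endpoints() {
    port=${2:-2379}
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed -e 's/^[^=]*=//' -e "s/:[0-9]*\$/:$port/" \
        | paste -s -d ',' -
}

# Print (not run) an etcdctl health check over TLS.
# $1 = endpoints, $2 = CA certificate, $3 = client certificate, $4 = client key
health_cmd() {
    printf 'ETCDCTL_API=3 etcdctl --endpoints=%s --cacert=%s --cert=%s --key=%s endpoint health\n' \
        "$1" "$2" "$3" "$4"
}
```

Running the printed command should report each endpoint separately, which makes it easy to spot a node whose data directory was restored incorrectly.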
6: Rebuild the network from the deploy node
[root@master ansible]# cd /etc/ansible/
[root@master ansible]# ansible-playbook tools/change_k8s_network.yml
7: Check that the pods and services were restored
- [root@master ansible]# kubectl get svc
- NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
- kubernetes ClusterIP 10.68.0.1 <none> /TCP 5d5h
- nginx ClusterIP 10.68.241.175 <none> /TCP 5d4h
- tomcat ClusterIP 10.68.235.35 <none> /TCP 76m
- [root@master ansible]# kubectl get pods
- NAME READY STATUS RESTARTS AGE
- nginx-7c45b84548-4998z / Running 5d4h
- tomcat-8fc9f5995-9kl5b / Running 77m
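With more than a handful of pods, scanning the STATUS column by eye becomes error-prone. As a rough sketch (the helper name `not_running_pods` is hypothetical, and it assumes the single-namespace column layout NAME READY STATUS RESTARTS AGE shown above), the check can be done by filtering the `kubectl get pods` output:

```shell
#!/bin/sh
# Sketch: list pods that are not in a healthy state after the restore.
# not_running_pods is a hypothetical helper name.

# Read `kubectl get pods` output on stdin and print the names of pods whose
# STATUS column is neither Running nor Completed.
# Exit status is 1 if any such pod is found, 0 otherwise.
not_running_pods() {
    awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1; bad = 1 }
         END { exit bad }'
}
```

Usage would be `kubectl get pods | not_running_pods`. Output from `kubectl get pods --all-namespaces` has an extra leading NAMESPACE column, so the field numbers would need adjusting for that form.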
III. Automatic backup and automatic restore
1: One-command backup
[root@master ansible]# ansible-playbook /etc/ansible/.backup.yml
2: Simulate a failure
[root@master ansible]# ansible-playbook /etc/ansible/.clean.yml
Edit /etc/ansible/roles/cluster-restore/defaults/main.yml to specify which etcd snapshot backup to restore; if the file is left unchanged, the most recent backup is used.
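Before deciding whether to override the default, it is useful to know which snapshot is actually the newest. The snippet below is a small sketch for that check only; `latest_snapshot` is a hypothetical helper name, the backup directory is passed as an argument because its location is not stated here, and it does not read or modify the role's defaults file.

```shell
#!/bin/sh
# Sketch: find the most recently modified snapshot in a backup directory.
# latest_snapshot is a hypothetical helper name.

# Print the newest *.db file in a directory (by modification time).
# Prints nothing if the directory has no *.db files.
# $1 = backup directory
latest_snapshot() {
    ls -t "$1"/*.db 2>/dev/null | head -n 1
}
```

This relies on file modification times, so it assumes the snapshot files have not been touched or copied without preserving timestamps since they were written.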
3: Run the automatic restore
[root@master ansible]# ansible-playbook /etc/ansible/.restore.yml
[root@master ansible]# ansible-playbook /etc/ansible/tools/change_k8s_network.yml