OpenStack VM Cold/Live Migration: Practice and Workflow Analysis
NOTE: This article covers only the native OpenStack Libvirt driver (QEMU-KVM hypervisor).
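Use cases for VM migration
- When a compute node fails and has to be repaired, migrate the VMs on it elsewhere.
- When a compute node undergoes hardware upgrades or maintenance, migrate the VMs on it elsewhere.
- When the load on a compute node exceeds a safety threshold, migrate some of its VMs elsewhere.
- Balance load across the compute nodes in a Region or AZ.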
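Types of VM data to migrate
Depending on how the VM boots and which disk types it uses, the data falls into these groups:
Boot from image
- Local system disk file: Root file
- Local data disk files: Ephemeral file, Swap file
- Local log and configuration files: console.log, disk.info
- Shared data disks: Volumes
- Memory data: RAM
- Other: Config Drive
Boot from volume
- Shared system disk: Root Volume
- Local data disk files: Ephemeral file, Swap file
- Local log and configuration files: console.log, disk.info
- Shared data disks: Volumes
- Memory data: RAM
- Other: Config Drive
NOTE: For more on OpenStack VM disk file types and storage modes, see the article "OpenStack 虚拟机的磁盘文件类型与存储方式" (OpenStack VM disk file types and storage modes).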
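Storage scenarios for VM migration
File storage
Typical file storage is NAS (Network Attached Storage) and NFS (Network File System). Multiple compute nodes can share instances_path over NFS to hold disk files, and Cinder also supports an NFS backend that places volumes in a shared directory. In this shared-storage scenario, only the VM's memory data needs to be migrated.
Block storage
Typical block storage is a SAN (Storage Area Network). Block storage attaches block devices to application servers over protocols such as iSCSI and FC, where the file system layer takes them over; data is stored in volumes as blocks. Likewise, Nova, Glance, and Cinder can all use a block storage backend. The VM then uses shared system and data disks, and only the Ephemeral file, Swap file, console.log, disk.info, and similar disk and configuration files stay local. If the VM uses no ephemeral or swap disks, only memory data needs to be migrated here as well.
From one angle, file storage and block storage can both be grouped under shared storage. Strictly speaking, though, pure block storage is not shared storage, because a block device cannot be written from several mount points at once without support from a higher layer such as Cinder Multi-Attach. As a result, migration handles the two differently: with block storage, the block device is detached and then re-attached at the destination; with file storage, the destination node simply accesses the mounted directory.
It is worth noting that NAS and SAN, as traditional storage solutions, provide file storage and block storage respectively. Ceph, as a unified storage solution, provides file, block, and object storage, and is now widely used in production.
Non-shared storage
In the non-shared storage scenario, VM disk files are stored locally, so both the memory data and the local disk files must be migrated. The disk files are copied at the data-block level, which is called block migration. This scenario is clearly unfriendly to live migration, because a long copy raises the risk of data loss (e.g. from network problems during the copy).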
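Migration types
The two familiar types are cold migration and live migration. They are easy to tell apart: the difference is whether the VM has to be shut down during the migration.
Cold migration: shut down the VM, then migrate its data. Only system disk and data disk data need to be migrated, not memory data; block migration is used.
- Pros: simple to operate and flexible; the VM produces no dynamic data, so the risk of data loss is low.
- Cons: the workload running on the VM is interrupted.
Live migration: also called dynamic or online migration, it is transparent to the user. The VM is not shut down and the workload is not interrupted, but the process is more complex.
- Pros: transparent to users, smooth, and helps keep the workload highly available.
- Cons: complex to operate, with a higher risk of data loss.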
Non-live migration, also known as cold migration or simply migration. The instance is shut down, then moved to another hypervisor and restarted. The instance recognizes that it was rebooted, and the application running on the instance is disrupted.
Live migration. The instance keeps running throughout the migration. This is useful when it is not possible or desirable to stop the application running on the instance. Live migrations can be classified further by the way they treat instance storage:
- Shared storage-based live migration. The instance has ephemeral disks that are located on storage shared between the source and destination hosts.
- Block live migration, or simply block migration. The instance has ephemeral disks that are not shared between the source and destination hosts. Block migration is incompatible with read-only devices such as CD-ROMs and Configuration Drive (config_drive).
- Volume-backed live migration. Instances use volumes rather than ephemeral disks.
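Migration methods
Depending on the type of VM data and the storage scenario, the migration methods are:
Cold migration
- Local disk files: scp-based block migration
- Block storage volumes: re-attach on the destination
- File storage directories: direct access through the local file system
Live migration
- Local disk files: scp-based block migration, or hypervisor-based block migration (depending on whether the hypervisor block-migrates data other than memory)
- Block storage volumes: re-attach on the destination
- File storage directories: direct access through the local file system
- Memory data: hypervisor-based migration (the hypervisor copies the memory to the destination)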
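The mapping above can be restated as a small lookup. This is only an illustrative sketch: `DataKind` and `plan_migration` are hypothetical names that do not exist in Nova, and the method strings simply paraphrase the two lists.

```python
from enum import Enum
from typing import Dict, Iterable


class DataKind(Enum):
    """Kinds of VM data named in the lists above (hypothetical enum)."""
    LOCAL_DISK_FILE = "local disk file"
    BLOCK_VOLUME = "block storage volume"
    FILE_STORAGE_DIR = "file storage directory"
    MEMORY = "memory data"


def plan_migration(live: bool, data_kinds: Iterable[DataKind]) -> Dict[DataKind, str]:
    """Return the migration method for each kind of data the VM has."""
    methods = {
        DataKind.LOCAL_DISK_FILE: (
            "scp-based or hypervisor-based block migration"
            if live
            else "scp-based block migration"
        ),
        DataKind.BLOCK_VOLUME: "re-attach on the destination",
        DataKind.FILE_STORAGE_DIR: "direct access through the local file system",
    }
    if live:
        # Memory only exists to migrate while the VM keeps running.
        methods[DataKind.MEMORY] = "hypervisor-based migration"
    # In a cold migration the VM is shut down, so memory is silently skipped.
    return {kind: methods[kind] for kind in data_kinds if kind in methods}
```

For a cold migration, `plan_migration(False, [DataKind.LOCAL_DISK_FILE, DataKind.MEMORY])` returns an entry for the local disk file only, which matches the statement that memory data is not migrated when the VM is shut down.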
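Performing a VM cold migration
Migration scenario:
- Boot from image
- Nova has no storage backend attached; VM disk files are stored locally
Step 1. Make sure the nova user can log in over SSH without a password between the source and destination compute nodes. nova-compute.service runs as the nova user by default, and it uses ssh and scp to create directories and copy data between the two nodes. Otherwise the following error appears: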
2019-03-15 03:33:21.428 10639 ERROR oslo_messaging.rpc.server ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
2019-03-15 03:33:21.428 10639 ERROR oslo_messaging.rpc.server Command: ssh -o BatchMode=yes 172.17.1.16 mkdir -p /var/lib/nova/instances/1365380a-a532-4811-8784-57f507acac46
2019-03-15 03:33:21.428 10639 ERROR oslo_messaging.rpc.server Exit code: 255
2019-03-15 03:33:21.428 10639 ERROR oslo_messaging.rpc.server Stdout: u''
2019-03-15 03:33:21.428 10639 ERROR oslo_messaging.rpc.server Stderr: u'Permission denied (publickey,gssapi-keyex,gssapi-with-mic).\r\n'
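Before retrying the migration, the passwordless login can be probed with the same `BatchMode=yes` option that appears in the failing command. The sketch below is an assumption about how one might script that check; `build_ssh_check` and `can_ssh_without_password` are hypothetical helper names, not part of Nova.

```python
import subprocess
from typing import List


def build_ssh_check(host: str) -> List[str]:
    """Build a non-interactive SSH probe similar to the command in the log."""
    # BatchMode=yes makes ssh fail immediately instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, "true"]


def can_ssh_without_password(host: str, timeout: float = 10.0) -> bool:
    """Return True if the current user can log in to host without a prompt."""
    try:
        result = subprocess.run(
            build_ssh_check(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing or the host did not answer in time.
        return False
    return result.returncode == 0
```

The result is only meaningful when the script runs as the same user as nova-compute.service (the nova user here) on the source node, with the destination node's address as `host`.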
Step 2. Disable SELinux, or run the following command so that the SSH authorized_keys file can be accessed:
chcon -R -t ssh_home_t /var/lib/nova/.ssh/authorized_keys
NOTE: SELinux-related messages are logged to /var/log/audit/audit.log.
Step 3. Run the migration
[stack@undercloud (overcloudrc) ~]$ openstack server create --image c9debff2-cd87-4688-b712-87a2948461ce --flavor a0dd32df-8c1b-47ed-9b7c-88612a5dd78d --nic net-id=11d8d379-dcd9-46ff-9cd1-25d2737affb4 tst-block-migrate-vm
+--------------------------------------+-----------------------------------------------+
| Field | Value |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | |
| OS-EXT-SRV-ATTR:host | None |
| OS-EXT-SRV-ATTR:hypervisor_hostname | None |
| OS-EXT-SRV-ATTR:instance_name | |
| OS-EXT-STS:power_state | NOSTATE |
| OS-EXT-STS:task_state | scheduling |
| OS-EXT-STS:vm_state | building |
| OS-SRV-USG:launched_at | None |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | WswHKEmcPnV3 |
| config_drive | |
| created | 2019-03-15T09:32:17Z |
| flavor | 1U2G (a0dd32df-8c1b-47ed-9b7c-88612a5dd78d) |
| hostId | |
| id | 80996760-0c30-4e2a-847a-b9d882182df5 |
| image | cirros (c9debff2-cd87-4688-b712-87a2948461ce) |
| key_name | None |
| name | tst-block-migrate-vm |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | a6c78435075246f3aa5ab946b87086c5 |
| properties | |
| security_groups | [{u'name': u'default'}] |
| status | BUILD |
| updated | 2019-03-15T09:32:17Z |
| user_id | 4fe574569664493bbd660abfe762a630 |
+--------------------------------------+-----------------------------------------------+
[stack@undercloud (overcloudrc) ~]$ openstack server show tst-block-migrate-vm
+--------------------------------------+----------------------------------------------------------+
| Field | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | ovs |
| OS-EXT-SRV-ATTR:host | overcloud-ovscompute-1.localdomain |
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-ovscompute-1.localdomain |
| OS-EXT-SRV-ATTR:instance_name | instance-0000008e |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2019-03-15T09:32:31.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | net1=10.0.1.14 |
| config_drive | |
| created | 2019-03-15T09:32:17Z |
| flavor | 1U2G (a0dd32df-8c1b-47ed-9b7c-88612a5dd78d) |
| hostId | 9f1230901ddf3fe0e1a41e1c650a784c122b791f89fdf66a40cff3d6 |
| id | 80996760-0c30-4e2a-847a-b9d882182df5 |
| image | cirros (c9debff2-cd87-4688-b712-87a2948461ce) |
| key_name | None |
| name | tst-block-migrate-vm |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | a6c78435075246f3aa5ab946b87086c5 |
| properties | |
| security_groups | [{u'name': u'default'}] |
| status | ACTIVE |
| updated | 2019-03-15T09:32:32Z |
| user_id | 4fe574569664493bbd660abfe762a630 |
+--------------------------------------+----------------------------------------------------------+
[stack@undercloud (overcloudrc) ~]$ openstack server migrate --block-migration --wait tst-block-migrate-vm
Complete
[stack@undercloud (overcloudrc) ~]$ openstack server show tst-block-migrate-vm
+--------------------------------------+----------------------------------------------------------+
| Field | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | ovs |
| OS-EXT-SRV-ATTR:host | overcloud-ovscompute-0.localdomain |
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-ovscompute-0.localdomain |
| OS-EXT-SRV-ATTR:instance_name | instance-0000008e |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2019-03-15T09:33:52.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | net1=10.0.1.14 |
| config_drive | |
| created | 2019-03-15T09:32:17Z |
| flavor | 1U2G (a0dd32df-8c1b-47ed-9b7c-88612a5dd78d) |
| hostId | 0f2ec590cd73fe0e9522f1ba715dae7a7d4b884e15aa8254defe85d0 |
| id | 80996760-0c30-4e2a-847a-b9d882182df5 |
| image | cirros (c9debff2-cd87-4688-b712-87a2948461ce) |
| key_name | None |
| name | tst-block-migrate-vm |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | a6c78435075246f3aa5ab946b87086c5 |
| properties | |
| security_groups | [{u'name': u'default'}] |
| status | ACTIVE |
| updated | 2019-03-15T09:34:53Z |
| user_id | 4fe574569664493bbd660abfe762a630 |
+--------------------------------------+----------------------------------------------------------+
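NOTE: The migration above did not explicitly select a destination compute node; nova-scheduler.service handled the scheduling.
Cold migration log analysis
As the scenario above implies, Nova copies this VM's disk files using block migration, and this shows in the operation logs.
Source host log analysis:
# Enter the migration logic.
Starting migrate_disk_and_power_off
# Try to create a tmp file on the destination compute node to determine whether the source and destination compute nodes share storage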
Creating file /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/8ac1bb9977bc4b4b948c4c8fdad9f1f6.tmp on remote host 172.17.1.16 create_file
# Creating the tmp file failed because the VM directory does not exist, which means shared storage is not in use
'ssh -o BatchMode=yes 172.17.1.16 touch /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/8ac1bb9977bc4b4b948c4c8fdad9f1f6.tmp' failed. Not Retrying.
# Create the VM directory on the destination compute node
Creating directory /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5 on remote host 172.17.1.16
# Power off the VM
Shutting down instance
Instance shutdown successfully after 35 seconds.
# Destroy the VM instance at the hypervisor layer
Instance destroyed successfully.
# Rename the VM directory
mv /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5 /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5_resize
# Copy the VM disk file and configuration file to the destination host
scp -r /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5_resize/disk 172.17.1.16:/var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/disk
scp -r /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5_resize/disk.info 172.17.1.16:/var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/disk.info
# At the Nova level, the VM has stopped
VM Stopped (Lifecycle Event)
# Delete the VM directory on the source host for good
rm -rf /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5_resize
# Remove the VM's network devices
Unplugging vif VIFBridge(active=True,address=fa:16:3e:d0:f6:a4,bridge_name='qbr15c7b577-89',has_traffic_filtering=True,id=15c7b577-89f5-46f6-8111-5f4e0c8ebaa1,network=Network(11d8d379-dcd9-46ff-9cd1-25d2737affb4),plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=False,vif_name='tap15c7b577-89')
brctl delif qbr15c7b577-89 qvb15c7b577-89
ip link set qbr15c7b577-89 down
brctl delbr qbr15c7b577-89
ovs-vsctl --timeout=120 -- --if-exists del-port br-int qvo15c7b577-89
ip link delete qvo15c7b577-89
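The shared-storage check at the top of the source host log (create a tmp file through the destination's path, then see whether the source can see it) can be sketched locally. This is a simplified illustration, not Nova's code: `is_shared_storage` is a hypothetical name, and both directories are local paths here, whereas the log shows Nova running `touch` over SSH on the destination host.

```python
import os
import tempfile
import uuid


def is_shared_storage(source_dir: str, dest_dir: str) -> bool:
    """Create a marker file via dest_dir and check whether source_dir sees it."""
    marker = uuid.uuid4().hex + ".tmp"
    dest_path = os.path.join(dest_dir, marker)
    try:
        # Stand-in for the remote "touch" step shown in the log.
        with open(dest_path, "w"):
            pass
    except OSError:
        # The directory is missing on the destination, so storage is not shared.
        return False
    try:
        return os.path.exists(os.path.join(source_dir, marker))
    finally:
        # Always clean up the marker file.
        os.remove(dest_path)


with tempfile.TemporaryDirectory() as shared, tempfile.TemporaryDirectory() as other:
    print(is_shared_storage(shared, shared))  # True: both paths are the same directory
    print(is_shared_storage(shared, other))   # False: the marker is not visible
```

In the log above the `touch` fails because the instance directory does not exist on the destination, which corresponds to the `OSError` branch.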
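Destination host log analysis:
# At the Nova level, create the VM and claim its resources in advance
Claim successful
# Migration in progress
Migrating
# Update the binding:host_id of the port behind the VM's vNIC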
Updating port 15c7b577-89f5-46f6-8111-5f4e0c8ebaa1 with attributes {'binding:host_id': u'overcloud-ovscompute-0.localdomain'}
# Create the VM image
Creating image
# Check whether the VM disk file can be resized
Checking if we can resize image /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/disk.
Cannot resize image /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/disk
# Make sure the VM's console.log file exists
Ensure instance console log exists: /var/lib/nova/instances/80996760-0c30-4e2a-847a-b9d882182df5/console.log
# Assemble the guest OS XML definition
End _get_guest_xml
# Add the VM's virtual network devices
Plugging vif VIFBridge(active=False,address=fa:16:3e:d0:f6:a4,bridge_name='qbr15c7b577-89',has_traffic_filtering=True,id=15c7b577-89f5-46f6-8111-5f4e0c8ebaa1,network=Network(11d8d379-dcd9-46ff-9cd1-25d2737affb4),plugin='ovs',port_profile=VIFPortProfileBase,preserve_on_delete=False,vif_name='tap15c7b577-89')
brctl addbr qbr15c7b577-89
brctl setfd qbr15c7b577-89 0
brctl stp qbr15c7b577-89 off
brctl setageing qbr15c7b577-89 0
tee /sys/class/net/qbr15c7b577-89/bridge/multicast_snooping
tee /proc/sys/net/ipv6/conf/qbr15c7b577-89/disable_ipv6
ip link add qvb15c7b577-89 type veth peer name qvo15c7b577-89
ip link set qvb15c7b577-89 up
ip link set qvb15c7b577-89 promisc on
ip link set qvb15c7b577-89 mtu 1450
ip link set qvo15c7b577-89 up
ip link set qvo15c7b577-89 promisc on
ip link set qvo15c7b577-89 mtu 1450
ip link set qbr15c7b577-89 up
brctl addif qbr15c7b577-89 qvb15c7b577-89
ovs-vsctl -- --may-exist add-br br-int -- set Bridge br-int datapath_type=system
ovs-vsctl --timeout=120 -- --if-exists del-port qvo15c7b577-89 -- add-port br-int qvo15c7b577-89 -- set Interface qvo15c7b577-89 external-ids:iface-id=15c7b577-89f5-46f6-8111-5f4e0c8ebaa1 external-ids:iface-status=active external-ids:attached-mac=fa:16:3e:d0:f6:a4 external-ids:vm-uuid=80996760-0c30-4e2a-847a-b9d882182df
ip link set qvo15c7b577-89 mtu 1450
# At the Nova level, the VM has started
Instance running successfully.
VM Started (Lifecycle Event)
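NOTE: Although the source host and destination host logs are listed separately above, the two nova-compute.service instances actually interleave their work. The destination host does not wait for the source host to finish before it starts its part of the migration.
Performing a VM live migration
Migration scenario:
- Boot from image
- Nova has no storage backend attached; VM disk files are stored locally
- Cinder uses the LVM backend
- Multiple block devices
- Multiple ports
- NUMA affinity
- CPU pinning
As the scenario implies, the VM's local disk files and memory data are both migrated by Libvirt live migration; the shared block devices are migrated by re-attaching them on the destination; and for the ports, Neutron handles creating and deleting the virtual network devices.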
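For OpenStack VM live migration to work properly, several prerequisites must be met:
- The source and destination nodes must have the same CPU type.
- The source and destination nodes must run the same Libvirt version.
- The source and destination nodes must be able to resolve each other's hostname (DNS).
- The VM may use a Config Drive to hold metadata or user data. In block migration mode, the Config Drive is migrated to the destination node as well. Because Libvirt currently only supports migrating vfat Config Drives, nova.conf must explicitly specify that a vfat Config Drive is created when a VM boots, so that the format stays consistent.
Configure Libvirt to transfer data over SSH: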
[libvirt]
...
live_migration_uri=qemu+ssh://nova_migration@%s/system?keyfile=/etc/nova/migration/identity
- qemu+ssh: use the SSH protocol
- nova_migration: the user that runs ssh
- %s: the compute node hostname, e.g. nova_migration@cpu01
- keyfile: the SSH private key used for secure communication
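To make the role of the `%s` placeholder concrete, the sketch below substitutes a destination hostname into the template and splits the result into its parts. `build_live_migration_uri` is a hypothetical helper written for illustration, and `cpu01` is the example hostname from the text; how Nova performs the substitution internally is not shown here.

```python
from urllib.parse import urlsplit


def build_live_migration_uri(template: str, dest_host: str) -> str:
    """Substitute the destination hostname for the single %s placeholder."""
    if template.count("%s") != 1:
        raise ValueError("template must contain exactly one %s placeholder")
    return template % dest_host


TEMPLATE = "qemu+ssh://nova_migration@%s/system?keyfile=/etc/nova/migration/identity"

uri = build_live_migration_uri(TEMPLATE, "cpu01")
parts = urlsplit(uri)
print(uri)
print(parts.scheme, parts.username, parts.hostname, parts.query)
```

The split parts line up with the four items explained above: the scheme is `qemu+ssh`, the user is `nova_migration`, the host is the substituted hostname, and the query carries `keyfile`.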
Alternatively, data can be transferred over TCP:
live_migration_uri=qemu+tcp://nova_migration@%s/system
If TCP is used, the Libvirt TCP remote listener must also be enabled on both the source and destination nodes:
# /etc/libvirt/libvirtd.conf
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
# /etc/init/libvirt-bin.conf
# options passed to libvirtd, add "-l" to listen on tcp
env libvirtd_opts="-d -l"
# /etc/default/libvirt-bin
libvirtd_opts="-d -l"
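After restarting libvirtd with these settings, one quick way to confirm that the listener is reachable from the other node is a plain TCP connection test. The sketch below assumes libvirtd listens on its default unencrypted TCP port, 16509; `tcp_port_open` is a hypothetical helper, and a successful connection only shows that something is listening, not that migration will succeed.

```python
import socket

# Assumption: libvirtd's default port for unencrypted TCP connections.
LIBVIRT_TCP_PORT = 16509


def tcp_port_open(host: str, port: int = LIBVIRT_TCP_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the hostname could not be resolved.
        return False
```

Running `tcp_port_open("overcloud-ovscompute-1.localdomain")` from the source node would also exercise the hostname resolution prerequisite listed earlier.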
NOTE: Block migration can also be used for live migration, but it is not the recommended approach.
Block live migration requires copying disks from the source to the destination host. It takes more time and puts more load on the network. Shared-storage and volume-backed live migration does not copy disks.
[stack@undercloud (overcloudrc) ~]$ openstack server show VM1
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | ovs |
| OS-EXT-SRV-ATTR:host | overcloud-ovscompute-0.localdomain |
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-ovscompute-0.localdomain |
| OS-EXT-SRV-ATTR:instance_name | instance-000000a0 |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2019-03-19T08:04:50.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | net1=10.0.1.17, 10.0.1.8, 10.0.1.16, 10.0.1.10, 10.0.1.18, 10.0.1.19 |
| config_drive | |
| created | 2019-03-19T08:04:04Z |
| flavor | Flavor1 (2ff09ec5-19e4-40b9-a52e-6026652c0788) |
| hostId | 0f2ec590cd73fe0e9522f1ba715dae7a7d4b884e15aa8254defe85d0 |
| id | a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37 |
| image | CentOS-7-x86_64-GenericCloud (0aff2888-47f8-4133-928a-9c54414b3afb) |
| key_name | stack |
| name | VM1 |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | a6c78435075246f3aa5ab946b87086c5 |
| properties | |
| security_groups | [{u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}] |
| status | ACTIVE |
| updated | 2019-03-19T08:04:50Z |
| user_id | 4fe574569664493bbd660abfe762a630 |
+--------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+
[stack@undercloud (overcloudrc) ~]$ openstack server add volume VM1 volume1
[stack@undercloud (overcloudrc) ~]$ openstack server add volume VM1 volume2
[stack@undercloud (overcloudrc) ~]$ openstack server add volume VM1 volume3
[stack@undercloud (overcloudrc) ~]$ openstack server add volume VM1 volume4
[stack@undercloud (overcloudrc) ~]$ openstack server add volume VM1 volume5
[stack@undercloud (overcloudrc) ~]$ openstack server migrate --block-migration --live overcloud-ovscompute-1.localdomain --wait VM1
Complete
[stack@undercloud (overcloudrc) ~]$ openstack server show VM1
+--------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | ovs |
| OS-EXT-SRV-ATTR:host | overcloud-ovscompute-1.localdomain |
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-ovscompute-1.localdomain |
| OS-EXT-SRV-ATTR:instance_name | instance-000000a0 |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2019-03-19T08:04:50.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | net1=10.0.1.17, 10.0.1.8, 10.0.1.16, 10.0.1.10, 10.0.1.18, 10.0.1.19 |
| config_drive | |
| created | 2019-03-19T08:04:04Z |
| flavor | Flavor1 (2ff09ec5-19e4-40b9-a52e-6026652c0788) |
| hostId | 9f1230901ddf3fe0e1a41e1c650a784c122b791f89fdf66a40cff3d6 |
| id | a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37 |
| image | CentOS-7-x86_64-GenericCloud (0aff2888-47f8-4133-928a-9c54414b3afb) |
| key_name | stack |
| name | VM1 |
| os-extended-volumes:volumes_attached | [{u'id': u'afbe0783-50b8-4036-b59a-69b94dbdb630'}, {u'id': u'27fc8950-6e98-4ba7-9366-907e8fd2a90a'}, {u'id': u'df8b33a8-6d8c-4e0e-a742-869fec4ff923'}, {u'id': u'534bb675-4d8c-4380-8bd2-4aeaedbcda40'}, {u'id': u'623a513a-2cca- |
| | 47e5-9426-71a154cbe0c0'}] |
| progress | 0 |
| project_id | a6c78435075246f3aa5ab946b87086c5 |
| properties | |
| security_groups | [{u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}, {u'name': u'default'}] |
| status | ACTIVE |
| updated | 2019-03-19T08:18:39Z |
| user_id | 4fe574569664493bbd660abfe762a630 |
+--------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
NUMA affinity and CPU pinning before and after the migration:
# Source host
[root@overcloud-ovscompute-0 nova]# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8
node 0 size: 4095 MB
node 0 free: 1273 MB
node 1 cpus: 9 10 11 12 13 14 15
node 1 size: 4096 MB
node 1 free: 2410 MB
node distances:
node 0 1
0: 10 20
1: 20 10
[root@overcloud-ovscompute-0 nova]# virsh list
Id Name State
----------------------------------------------------
11 instance-000000a0 running
[root@overcloud-ovscompute-0 nova]# virsh vcpuinfo instance-000000a0
VCPU: 0
CPU: 0
State: running
CPU time: 50.1s
CPU Affinity: y---------------
VCPU: 1
CPU: 1
State: running
CPU time: 26.8s
CPU Affinity: -y--------------
[root@overcloud-ovscompute-0 nova]# virsh vcpupin instance-000000a0
VCPU: CPU Affinity
----------------------------------
0: 0
1: 1
# Destination host
[root@overcloud-ovscompute-1 nova]# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8
node 0 size: 4095 MB
node 0 free: 1420 MB
node 1 cpus: 9 10 11 12 13 14 15
node 1 size: 4096 MB
node 1 free: 2270 MB
node distances:
node 0 1
0: 10 20
1: 20 10
[root@overcloud-ovscompute-1 nova]# virsh vcpuinfo instance-000000a0
VCPU: 0
CPU: 0
State: running
CPU time: 2.9s
CPU Affinity: y---------------
VCPU: 1
CPU: 1
State: running
CPU time: 1.2s
CPU Affinity: -y--------------
[root@overcloud-ovscompute-1 nova]# virsh vcpupin instance-000000a0
VCPU: CPU Affinity
----------------------------------
0: 0
1: 1
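To compare pinning on the two hosts without reading the masks by eye, the `virsh vcpuinfo` output can be parsed into a mapping from vCPU to host CPUs. This is a minimal sketch that assumes the output format shown above; `parse_vcpu_affinity` is a hypothetical helper, and the sample text is copied from the source host listing.

```python
from typing import Dict, List, Optional


def parse_vcpu_affinity(vcpuinfo: str) -> Dict[int, List[int]]:
    """Map each vCPU number to the host CPUs marked 'y' in its affinity mask."""
    pins: Dict[int, List[int]] = {}
    current: Optional[int] = None
    for line in vcpuinfo.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "VCPU":
            current = int(value)
        elif key == "CPU Affinity" and current is not None:
            # Position i in the mask corresponds to host CPU i.
            pins[current] = [i for i, flag in enumerate(value) if flag == "y"]
    return pins


SAMPLE = """\
VCPU: 0
CPU: 0
State: running
CPU time: 50.1s
CPU Affinity: y---------------
VCPU: 1
CPU: 1
State: running
CPU time: 26.8s
CPU Affinity: -y--------------
"""

print(parse_vcpu_affinity(SAMPLE))  # {0: [0], 1: [1]}
```

Parsing the destination host output the same way gives the same mapping, which is what the two listings above show: vCPU 0 stays on host CPU 0 and vCPU 1 on host CPU 1 after the migration.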
Live migration log analysis
Source host log analysis:
# Detect whether shared storage is in use by creating a tmpfile
Check if temp file /var/lib/nova/instances/tmpZ0Bj8s exists to indicate shared storage is being used for migration. Exists? False
# Start the live migration
Starting monitoring of live migration _live_migration /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:6566
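# Poll libvirtd for the live migration status, log the progress, and adjust the downtime (maximum allowed pause) as the migration runs
# Because the VM's data keeps changing, the amount of data finally migrated is usually larger than data_gb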
Current None elapsed 0 steps [(0, 46), (300, 47), (600, 48), (900, 51), (1200, 57), (1500, 66), (1800, 84), (2100, 117), (2400, 179), (2700, 291), (3000, 500)] update_downtime /usr/lib/python2.7/site-packages/nova/virt/libvirt/migration.py:348
Increasing downtime to 46 ms after 0 sec elapsed time
Migration running for 0 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0)
# At the Nova level, the VM is paused
VM Paused (Lifecycle Event)
# VM data migration completed
Migration operation has completed
Migration operation thread has finished
# Detach the VM's shared block devices
calling os-brick to detach iSCSI Volume disconnect_volume
iscsiadm -m node -T iqn.2010-10.org.openstack:volume-afbe0783-50b8-4036-b59a-69b94dbdb630 -p 172.17.3.18:3260 --op delete
Checking to see if SCSI volumes sdc have been removed.
SCSI volumes sdc have been removed.
# At the Nova level, the VM has stopped
VM Stopped (Lifecycle Event)
# Unplug the VM's network
Unplugging vif VIFBridge
# Delete the VM's local disk files
mv /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37 /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37_del
Deleting instance files /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37_del
Deletion of /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37_del complete
# Migration completed
Migrating instance to overcloud-ovscompute-1.localdomain finished successfully.
Live migration monitoring is all done
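The `steps` list in the `update_downtime` log line pairs an elapsed time in seconds with a maximum downtime in milliseconds, and the downtime grows roughly exponentially towards 500 ms. The sketch below reproduces those pairs. It is an illustration fitted to the logged values, not a copy of Nova's code: the function name and the three parameter values (500 ms maximum, 10 steps, 300 s between steps) are assumptions read off the log.

```python
from typing import Iterator, Tuple


def downtime_steps(
    max_downtime_ms: int, steps: int, delay_s: int
) -> Iterator[Tuple[int, int]]:
    """Yield (elapsed seconds, allowed downtime in ms) pairs that grow exponentially."""
    # Start a little above zero so the first step already allows a short pause.
    offset = max_downtime_ms / float(steps + 1)
    # Choose the base so that the last step reaches max_downtime_ms.
    base = (max_downtime_ms - offset) ** (1 / float(steps))
    for i in range(steps + 1):
        yield (delay_s * i, int(offset + base ** i))


schedule = list(downtime_steps(max_downtime_ms=500, steps=10, delay_s=300))
print(schedule[:4])  # [(0, 46), (300, 47), (600, 48), (900, 51)]
```

The early steps allow only a very short pause, so the migration first tries to converge with almost no downtime and relaxes the limit only if it keeps running. The final entry may come out as 499 or 500 depending on floating-point rounding; the log shows 500.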
Destination host log analysis:
# Create a tmpfile to detect whether shared storage is in use
Creating tmpfile /var/lib/nova/instances/tmpZ0Bj8s to notify to other compute nodes that they should mount the same storage.
# Create the VM's local disk files
Creating instance directory: /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37
touch -c /var/lib/nova/instances/_base/ff34147b1062cd454ae2a8959f069e2e18691ec9
qemu-img create -f qcow2 -o backing_file=/var/lib/nova/instances/_base/ff34147b1062cd454ae2a8959f069e2e18691ec9 /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37/disk
Creating disk.info with the contents: {u'/var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37/disk': u'qcow2'}
Checking to make sure images and backing files are present before live migration.
# Check whether the disk file can be resized
Checking if we can resize image /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37/disk. size=10737418240
qemu-img resize /var/lib/nova/instances/a2855dfd-c6e5-4cbf-9fdf-4b083cc8ec37/disk 10737418240
# Attach the VM's shared block devices
Connecting volumes before live migration.
Calling os-brick to attach iSCSI Volume connect_volume /usr/lib/python2.7/site-packages/nova/virt/libvirt/volume/iscsi.py:63
Trying to connect to iSCSI portal 172.17.3.18:3260
iscsiadm -m node -T iqn.2010-10.org.openstack:volume-afbe0783-50b8-4036-b59a-69b94dbdb630 -p 172.17.3.18:3260
Attached iSCSI volume {'path': u'/dev/sda', 'scsi_wwn': '360014052de14ef00f124a939740ba645', 'type': 'block'}
# Plug in the VM's network
Plugging VIFs before live migration.
# Update the port information
Port 35f7ede8-2a78-44b6-8c65-108e6f1080aa updated with migration profile {'migrating_to': 'overcloud-ovscompute-1.localdomain'} successfully
# At the Nova level, the VM has started
VM Started (Lifecycle Event)
VM Resumed (Lifecycle Event)
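The logs above show that the key steps of live migration are carried out at the hypervisor layer; Nova only wraps the hypervisor's live migration capability and handles scheduling and management. In this example, Libvirt live migration transfers both the VM's local disk files and its memory data to the destination host over SSH.
References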
https://developers.redhat.com/blog/2015/03/24/live-migrating-qemu-kvm-virtual-machines/
https://www.cnblogs.com/sammyliu/p/4572287.html
https://docs.openstack.org/nova/pike/admin/configuring-migrations.html
https://docs.openstack.org/nova/pike/admin/live-migration-usage.html
https://blog.csdn.net/lemontree1945/article/details/79901874
https://www.ibm.com/developerworks/cn/linux/l-cn-mgrtvm1/index.html
https://blog.csdn.net/hawkerou/article/details/53482268