Carrier-Grade Mirantis OpenStack (the Mirantis NFV Initiative), Part 1: Single Root I/O Virtualization (SR-IOV)
The Mirantis NFV initiative aims to create an NFV ecosystem for OpenStack, with validated hardware at the bottom; a hardened, configuration-optimized OpenStack platform in the middle; and validated VNFs and other NFV software and application components at the top. As the pure-play OpenStack company, we know that OpenStack is the best way to create an NFV infrastructure (NFVi), but we also know that our NFV clients, both telcos and enterprises, need more than just the OpenStack platform. They need a complete NFVi solution that answers the whole stack of architectural challenges presented by NFV (in compute, networking, storage, availability, scale and performance) and that reliably provides the network functions, orchestration and management functionality carriers need.
To provide this solution, Mirantis is integrating and optimizing OpenStack itself, and working with an ever-growing number of partners. In this article, we’ll talk about one important innovation that will help turn OpenStack into NFVi: Single Root I/O Virtualization, or SR-IOV.
SR-IOV is a PCI Special Interest Group (PCI-SIG) specification for virtualizing network interfaces. It represents each physical resource as a configurable entity (a PF, or Physical Function) and creates multiple virtual interfaces (VFs, or Virtual Functions) with limited configurability on top of it. Doing so requires support from the system BIOS and, conventionally, from the host OS or hypervisor as well. Among other benefits, SR-IOV makes it possible to run a very large number of network-traffic-handling VMs per compute node without increasing the number of physical NICs or ports. It also provides a means of pushing the processing for this traffic down into the hardware layer, offloading the hypervisor and significantly improving both throughput and the determinism of network performance. That’s why it’s an NFV must-have.
We first talked about SR-IOV at the OpenStack Summit in Vancouver, in a session with an unofficial title that might as well have been “Run, Forrest, run!” because the main idea of SR-IOV is to get data to VMs more quickly. Now, we’re going to look at actually using SR-IOV with Mirantis OpenStack.
SR-IOV can be complicated. Note: On Intel NICs, the PF cannot support promiscuous mode when SR-IOV is enabled, so it cannot be used for L2 bridging. Because of this, you shouldn’t enable SR-IOV on interfaces that have standard Fuel networks assigned to them. (One way to get around this problem is to use nova host aggregates and different flavours for normal and SR-IOV enabled instances, but a full treatment is out of scope for this article; if you’d like to hear more about it, let us know in the comments, and we’ll do a separate blog post.)
You should note that SR-IOV has a couple of limitations in the Kilo release of OpenStack. Most notably, instance migration with SR-IOV attached ports is not supported. Also, iptables-based filtering is not usable with SR-IOV NICs, because SR-IOV bypasses the normal network stack, so security groups cannot be used with SR-IOV enabled ports (though you still can use security groups for normal ports).
So now that we know what we’re talking about, let’s look at how to enable and use SR-IOV. While you can use Fuel to deploy a Mirantis OpenStack cloud that includes all of the pieces for SR-IOV, SR-IOV itself still needs to be configured separately.
Enabling SR-IOV
To enable SR-IOV, you need to configure it on compute and controller nodes. Let’s start with the compute nodes.
Configure SR-IOV on Compute nodes
To enable SR-IOV, perform the following steps only on Compute nodes that will be used for running instances with SR-IOV virtual NICs:
- Ensure that your compute nodes are capable of PCI passthrough and SR-IOV. Your hardware must provide VT-d and SR-IOV capabilities, and these extensions may need to be enabled in the BIOS. VT-d options are usually configured in the “Chipset Configuration/North Bridge/IIO Configuration” section of the BIOS, while SR-IOV support is configured in “PCIe/PCI/PnP Configuration.”
If your system supports VT-d, you should see messages related to DMAR in the dmesg output:
# grep -i dmar /var/log/dmesg
[ 0.000000] ACPI: DMAR 0000000079d31860 000140 (v01 ALASKA A M I 00000001 INTL 20091013)
[ 0.061993] dmar: Host address width 46
[ 0.061996] dmar: DRHD base: 0x000000fbffc000 flags: 0x0
[ 0.062004] dmar: IOMMU 0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 0.062007] dmar: DRHD base: 0x000000c7ffc000 flags: 0x1
[ 0.062012] dmar: IOMMU 1: reg_base_addr c7ffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 0.062014] dmar: RMRR base: 0x0000007bc94000 end: 0x0000007bca2fff
This is just an example, of course; your output may differ.
If your system supports SR-IOV you should see SR-IOV capability section for each NIC PF, and the total VFs should be non-zero:
lspci -vvv | grep -i "initial vf"
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 01
Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 00
Initial VFs: 8, Total VFs: 8, Number of VFs: 0, Function Dependency Link: 01
- Check that VT-d is enabled in the kernel using this command:
# grep -i "iommu.*enabled" /var/log/dmesg
If you don’t see a response similar to:
[0.000000] Intel-IOMMU: enabled
then it’s not yet enabled. Enable it by editing /etc/default/grub and appending intel_iommu=on to GRUB_CMDLINE_LINUX, for example:
GRUB_CMDLINE_LINUX=" console=ttyS0,9600 console=tty0 net.ifnames=0 biosdevname=0 rootdelay=90 nomodeset root=UUID=d2b06335-bf6d-44b8-a0a4-a54224bdc7f8 intel_iommu=on"
Next, update grub and reboot to get the changes to take effect:
# update-grub
# reboot
Then repeat the check. For new environments you may want to add these kernel parameters before deploying, so that they are applied to all nodes in the environment. You can do that from the Fuel interface in the “Kernel Parameters” section of the “Settings” tab.
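The VT-d check above can be wrapped in a small function so it is easy to repeat after each reboot. This is only a sketch: the function name iommu_enabled is hypothetical, and the optional log-path argument is an assumption added so the check can be pointed at a saved copy of the kernel log.

```shell
#!/bin/sh
# Hypothetical helper (not part of any OpenStack or Fuel tooling):
# succeed if the given kernel log contains an "iommu ... enabled" line.
# $1 = log file to inspect (assumption: defaults to /var/log/dmesg,
#      the path used in the manual check).
iommu_enabled() {
    log="${1:-/var/log/dmesg}"
    # Same case-insensitive pattern as the manual grep; -q only sets
    # the exit status and prints nothing.
    grep -qi "iommu.*enabled" "$log" 2>/dev/null
}
```

On a compute node you might call it as: iommu_enabled && echo "VT-d is enabled in the kernel".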
NOTE: If you have an AMD motherboard, you need to check for ‘AMD-Vi’ in the output of the dmesg command and pass the options “iommu=pt iommu=1” to the kernel (but we haven’t yet tested that).
- Enable the number of virtual functions required on the SR-IOV interface. NOTE: Do not set the number of VFs higher than required, since this might degrade performance. Depending on the kernel and NIC driver version, you might get more queues on each PF with fewer VFs (usually, fewer than 32).
First, enable the interface:
ip link set eth1 up
Next, from the command-line, get the maximum number of functions that could potentially be enabled for your NIC:
cat /sys/class/net/eth1/device/sriov_totalvfs
Then finally, enable the desired number of virtual functions for your NIC:
echo 31 > /sys/class/net/eth1/device/sriov_numvfs
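The two sysfs steps above can be combined so that a request larger than the NIC's limit is rejected before anything is written. The following is a minimal sketch, not an official tool: the function name set_numvfs and its third argument (an alternative sysfs root, added only so the function can be exercised without real hardware) are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: enable N virtual functions on a NIC, refusing
# values above what the device reports in sriov_totalvfs.
# $1 = interface name (e.g. eth1)
# $2 = requested number of VFs
# $3 = sysfs root (assumption: defaults to /sys/class/net)
set_numvfs() {
    dev_dir="${3:-/sys/class/net}/$1/device"
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$2" -gt "$total" ]; then
        echo "requested $2 VFs, but $1 supports at most $total" >&2
        return 1
    fi
    # The kernel refuses to change a non-zero VF count directly,
    # so reset to 0 before writing the new value.
    echo 0 > "$dev_dir/sriov_numvfs"
    echo "$2" > "$dev_dir/sriov_numvfs"
}
```

For the example in this article, the call would be: set_numvfs eth1 31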
NOTE: These settings aren’t saved across reboots. To persist them, add both commands to /etc/rc.local (above the final “exit 0” line, if your rc.local has one):
ip link set eth1 up
echo 31 > /sys/class/net/eth1/device/sriov_numvfs
- Check to make sure that SR-IOV is enabled:
# ip link show eth1 |grep vf
vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
vf 1 MAC c2:cd:57:9b:6c:7d, spoof checking on, link-state auto
...
If you don’t see ‘link-state auto’ in the output, then your installation will require an SR-IOV agent. You can install it and start it in the background like so:
# apt-get install neutron-plugin-sriov-agent
# nohup neutron-sriov-nic-agent --debug --log-file /tmp/sriov_agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf_sriov.ini &
- Edit /etc/nova/nova.conf:
pci_passthrough_whitelist={"devname": "eth1", "physical_network":"physnet2"}
- Edit /etc/neutron/plugins/ml2/ml2_conf_sriov.ini:
[sriov_nic]
physical_device_mappings = physnet2:eth1
- Restart the compute service:
# restart nova-compute
- Get the vendor’s product id; you’ll need it to configure SR-IOV on the controller nodes.
NOTE: This is just an example of the output. Actual value may differ on your hardware.
# lspci -nn|grep -e "Ethernet.*Virtual"
06:10.1 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
06:10.3 Ethernet controller [0200]: Intel Corporation 82599 Ethernet Controller Virtual Function [8086:10ed] (rev 01)
...
Write down the vendor’s product id: the vendor:device pair in the last set of square brackets (8086:10ed in this example).
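If a node has many VFs, picking the ID out by eye gets tedious. As a rough sketch, the filter below reads lspci -nn output and prints each distinct vendor:device pair once. The function name vf_pci_ids is hypothetical, and the pattern assumes the ID appears as a bracketed pair of four hex digits, as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -nn` output on stdin and print the
# unique vendor:device IDs of Ethernet virtual functions, one per line.
# The ID is the [xxxx:xxxx] group; the earlier [0200] is the device
# class and does not match because it has no colon.
vf_pci_ids() {
    grep -e "Ethernet.*Virtual" |
        sed -n 's/.*\[\([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)\].*/\1/p' |
        sort -u
}
```

Typical use on a compute node would be: lspci -nn | vf_pci_ids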
Configure SR-IOV on the Controller nodes
- Edit /etc/neutron/plugins/ml2/ml2_conf.ini. Change the mechanism_drivers line to include sriovnicswitch:
mechanism_drivers = openvswitch,l2population,sriovnicswitch
Then add a new section at the end of the file, using the vendor’s product id from the previous step as the value for supported_pci_vendor_devs:
[ml2_sriov]
supported_pci_vendor_devs = 8086:10ed
- Edit /etc/nova/nova.conf:
[DEFAULT]
scheduler_default_filters=DifferentHostFilter,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,PciPassthroughFilter
- Restart services:
restart neutron-server
restart nova-api
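A mistyped driver name in ml2_conf.ini is easy to miss, so a quick check after editing the controller configuration can be useful. The function below is a sketch only; the name has_sriov_driver is hypothetical, and it simply checks that sriovnicswitch appears as an exact entry in the mechanism_drivers list of the file you pass in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the ML2 config file ($1) lists the
# sriovnicswitch mechanism driver.
# Steps: take the mechanism_drivers line, strip the key and any spaces,
# split the value on commas, and look for an exact match.
has_sriov_driver() {
    grep -E '^[[:space:]]*mechanism_drivers[[:space:]]*=' "$1" 2>/dev/null |
        sed 's/^[^=]*=//' |
        tr -d ' ' |
        tr ',' '\n' |
        grep -qx 'sriovnicswitch'
}
```

For example: has_sriov_driver /etc/neutron/plugins/ml2/ml2_conf.ini || echo "sriovnicswitch is missing"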
Using SR-IOV
Now you’re ready to actually use SR-IOV.
- A recommended practice for using SR-IOV is to create a separate host aggregate for SR-IOV enabled computes.
nova aggregate-create sriov
nova aggregate-set-metadata sriov sriov=true
nova aggregate-create normal
nova aggregate-set-metadata normal sriov=false
… and add some hosts to them:
nova aggregate-add-host sriov node-9.domain.tld
nova aggregate-add-host normal node-10.domain.tld
- Create a new flavor for VMs that require SR-IOV support:
nova flavor-create m1.small.sriov auto 2048 20 2
nova flavor-key m1.small.sriov set aggregate_instance_extra_specs:sriov=true
Note that the scheduler honors these aggregate_instance_extra_specs keys only when AggregateInstanceExtraSpecsFilter is among its enabled filters.
You should update all other flavours so they will start only on hosts without SR-IOV support:
openstack flavor list -f csv | grep -v sriov | cut -f1 -d, | tail -n +2 | \
xargs -I% nova flavor-key % set aggregate_instance_extra_specs:sriov=false
To use the SR-IOV port you need to create an instance with ports that use the vnic-type “direct”. For now, you’ll need to do this via the command line. Because the default Cirros image does not have the Intel NIC drivers included, we’ll use an Ubuntu cloud image to test SR-IOV.
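Before running the bulk flavor update shown earlier in this step, it can help to preview which flavors it would touch. The sketch below performs only the selection part of that pipeline and changes nothing. The function name non_sriov_flavor_ids is hypothetical, and it assumes the CSV layout used above: a header row first, with the flavor ID in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read `openstack flavor list -f csv` output on
# stdin and print the IDs of flavors whose row does not mention "sriov".
# Assumes a header row and the ID in the first CSV column.
non_sriov_flavor_ids() {
    tail -n +2 |        # skip the CSV header
        grep -v sriov | # drop SR-IOV flavors
        cut -f1 -d, |   # keep the ID column
        tr -d '"'       # strip CSV quoting
}
```

A dry run would look like: openstack flavor list -f csv | non_sriov_flavor_ids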
- Prepare the Ubuntu cloud image:
# glance image-create --name trusty --disk-format raw --container-format bare \
--is-public True \
--location https://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img
You can only log in to this instance by using an SSH public key, so let’s go ahead and create a keypair. You can do this from the Horizon interface, but we’ll do it from the command line, like so:
# nova keypair-add key1 > key1.pem
# chmod 600 key1.pem
- Create a port for the instance:
# neutron port-create net04 --binding:vnic-type direct --device_owner nova-compute --name sriov-port1
- Spawn the instance:
# port_id=`neutron port-list | grep sriov-port1 | awk '{print $2}'`
# nova boot --flavor m1.small.sriov --image trusty --key_name key1 \
--nic port-id=$port_id sriov-vm1
- Get the instance’s IP address:
# nova list | grep sriov-vm1 | awk '{print $12}'
net04=192.168.111.5
- Connect to the instance to check that everything is up and running:
Find the controllers that host a DHCP namespace with access to the instance:
# neutron dhcp-agent-list-hosting-net -f csv -c host net04 --quote none | tail -n+2
node-7.domain.tld
node-9.domain.tld
Connect to the instance (run this command on one of the controllers found in the previous step):
# ip netns exec `ip netns show|grep qdhcp-$(neutron net-list | grep 'net04 ' | awk '{print$2}')` ssh -i key1.pem ubuntu@192.168.111.5
And that should be it!
Troubleshooting
Sometimes something goes wrong. Here are some common problems and solutions.
- If you see errors in /var/log/nova/nova-compute.log on the compute host:
libvirtError: internal error: missing IFLA_VF_INFO in netlink response
… you should install a newer version of libnl3.
- If you see:
libvirtError: unsupported configuration: host doesn't support passthrough of host PCI devices
… in /var/log/nova/nova-compute.log, it means that VT-d is not supported or not enabled.
- If you see:
NovaException: Unexpected vif_type=binding_failed
You should enable the SR-IOV agent, or if you’ve already done so, check that it’s running:
# neutron agent-list | grep sriov-nic-agent
| dfa4edcf-63c1-4af7-a291-ec139a16f346 | NIC Switch agent | node-16.domain.tld | :-) | True | neutron-sriov-nic-agent |
Otherwise, examine the log file /tmp/sriov_agent for clues to what else might be wrong.
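The agent check above relies on reading the “alive” column by eye. As a sketch, the filter below turns it into an exit status that a script can test. The function name sriov_agent_alive is hypothetical, and it assumes the column layout of the sample output above, where the alive marker (":-)") is the fourth column of the table.

```shell
#!/bin/sh
# Hypothetical helper: read `neutron agent-list` table output on stdin
# and succeed only if at least one neutron-sriov-nic-agent row is alive.
# With '|' as the separator, $1 is the empty text before the first bar,
# so the alive column (fourth in the table) is $5.
sriov_agent_alive() {
    awk -F'|' '
        /neutron-sriov-nic-agent/ {
            gsub(/ /, "", $5)          # trim padding around the marker
            if ($5 == ":-)") ok = 1
        }
        END { exit !ok }
    '
}
```

For example: neutron agent-list | sriov_agent_alive || echo "SR-IOV agent is not running"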
Conclusion
For now, configuring Mirantis OpenStack for SR-IOV is still relatively complex, which makes it error-prone and potentially challenging to do on large clusters. During the Mitaka cycle, we’ll be making improvements to current configurations, doing deeper testing, and working on automating configuration and deployment of SR-IOV via Fuel.
http://dev-vpierre-plugindev.pantheon.io/carrier-grade-mirantis-openstack-the-mirantis-nfv-initiative-part-1-single-root-io-virtualization-sr-iov/