virtio guest side implementation: PCI, virtio device, virtio net and virtqueue
With the publication of the OASIS virtio specification version 1.0, virtio took another big step from a de-facto standard toward an official standard for virtual I/O devices in paravirtualized environments.
virtio device
For virtio, device emulation is done in the hypervisor/host. Virtio devices are commonly implemented as PCI devices, though virtio over memory-mapped I/O (MMIO) and channel I/O are also seen.
Take QEMU as an example: it emulates the control plane of a virtio PCI device (device status, feature bits and device configuration space), while the virtqueue backend data plane has three implementation options so far:
- Virtio backend running inside QEMU
virtqueue notification and actual data access are done directly by QEMU. virtio.c contains the major implementation.
- Virtio backend inside host kernel, vhost
QEMU helps set up the kick/irq eventfds, and vhost uses them to communicate directly with the driver in the guest for virtqueue notification. The Linux kernel module vhost sends/receives data with the guest via the virtqueue without exiting to host user space. vhost-worker is the kernel thread that handles the notifications and data buffers; what enables it to access the whole QEMU/guest address space is that QEMU issues the VHOST_SET_OWNER ioctl, which saves the mm context of the QEMU process in vhost_dev, and the vhost-worker thread then takes on that mm context. Check the use_mm() kernel function, and see the ioctl sketch after this list.
- Virtio backend running in separate userspace process
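For the vhost option above, here is a minimal user-space sketch (not actual QEMU code) of how a QEMU-like process would wire one virtqueue to vhost-net through the standard linux/vhost.h ioctls; error handling is omitted and the queue index is illustrative.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/eventfd.h>
#include <linux/vhost.h>

/* Hypothetical helper: hand one virtqueue of /dev/vhost-net to the kernel. */
static void setup_vhost_queue(void)
{
    int vhost_fd = open("/dev/vhost-net", O_RDWR);

    /* Bind the caller's mm to this vhost instance; the vhost-worker thread
     * later adopts that mm (see use_mm()) so it can access guest memory. */
    ioctl(vhost_fd, VHOST_SET_OWNER);

    int kick_fd = eventfd(0, EFD_CLOEXEC);  /* guest -> host notification */
    int call_fd = eventfd(0, EFD_CLOEXEC);  /* host -> guest interrupt */

    struct vhost_vring_file kick = { .index = 0, .fd = kick_fd };
    struct vhost_vring_file call = { .index = 0, .fd = call_fd };
    ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);
    ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);
}
```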
virtio device driver
The guest OS implements the drivers for the virtio devices presented by the underlying hypervisor. Here we walk through the virtio PCI network card driver in the Linux kernel as an example.
Virtio PCI
As for a regular PCI device, the virtio PCI driver fills data and callback functions into the standard pci_driver structure; the most important parts are the pci_device_id table and the probe function. All virtio devices use the vendor ID 0x1af4. virtio_pci_probe() will be called when a virtio PCI device with that vendor ID is detected on the PCI bus. The probe function allocates a virtio_pci_device structure which holds the necessary information such as the pci_dev pointer, MSI-X configuration and virtqueue list. A virtio_device structure is also embedded in virtio_pci_device; “struct virtio_config_ops virtio_pci_config_ops” provides a set of callback functions for device status, features, configuration space and virtqueues, and it is linked into the virtio_device.
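The registration itself is small; the sketch below is abridged from the mainline virtio-pci driver (field names and file layout vary across kernel versions), showing the id_table that matches vendor ID 0x1af4 and the pci_driver that routes matches to virtio_pci_probe().

```c
#include <linux/pci.h>
#include <linux/module.h>

/* Abridged sketch of the virtio-pci registration: a single id_table entry
 * matches every virtio PCI device, since they all share vendor ID 0x1af4. */
static const struct pci_device_id virtio_pci_id_table[] = {
	{ PCI_DEVICE(0x1af4, PCI_ANY_ID) },
	{ 0 }
};
MODULE_DEVICE_TABLE(pci, virtio_pci_id_table);

static struct pci_driver virtio_pci_driver = {
	.name     = "virtio-pci",
	.id_table = virtio_pci_id_table,
	.probe    = virtio_pci_probe,   /* allocates virtio_pci_device, wires virtio_pci_config_ops */
	.remove   = virtio_pci_remove,
};
module_pci_driver(virtio_pci_driver);
```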
At the end of virtio PCI probe processing, the PCI subsystem vendor and device IDs are assigned to the new virtio_device as its vendor and device IDs, and the newly created virtio_device, which lives inside the virtio_pci_device structure, is registered on the virtio bus: “int register_virtio_device(struct virtio_device *dev)”.
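A hedged sketch of that last probe step (the helper name is hypothetical; struct and function names follow the upstream driver, error paths are trimmed):

```c
/* Copy the PCI subsystem IDs into the embedded virtio_device and hand the
 * device to the virtio core, which matches it against registered virtio_drivers. */
static int virtio_pci_finish_probe(struct virtio_pci_device *vp_dev,
				   struct pci_dev *pci_dev)
{
	vp_dev->vdev.id.vendor = pci_dev->subsystem_vendor;
	vp_dev->vdev.id.device = pci_dev->subsystem_device;

	return register_virtio_device(&vp_dev->vdev);
}
```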
Virtio netdev
virtio_net_driver is a Linux kernel driver module registered with virtio_bus. Note that virtio_bus itself is registered during kernel initialization via core_initcall(virtio_init); that code is built into the kernel and is not a loadable module.
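For reference, an abridged sketch of that registration, modeled on drivers/net/virtio_net.c (fields trimmed; exact contents differ by kernel version):

```c
#include <linux/module.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>
#include <linux/virtio_ids.h>

/* Match any virtio device whose device ID is VIRTIO_ID_NET. */
static const struct virtio_device_id id_table[] = {
	{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
	{ 0 },
};

static struct virtio_driver virtio_net_driver = {
	.driver.name  = KBUILD_MODNAME,
	.driver.owner = THIS_MODULE,
	.id_table     = id_table,
	.probe        = virtnet_probe,   /* called when a VIRTIO_ID_NET device shows up */
	.remove       = virtnet_remove,
};
module_virtio_driver(virtio_net_driver);
```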
The virtio_driver probe function virtnet_probe() is called once a virtio device with device ID VIRTIO_ID_NET has been detected. To work as a network device, a net_device structure is created via alloc_etherdev_mq() and various network features are checked. Within the probe function, the TX/RX virtqueues are created and initialized; the find_vqs() callback in the virtio_pci_config_ops of the virtio_device is called during this process. virtnet_info serves as the private data of the virtio net_device, linking net_device and virtio_device together.
The net_device_ops table virtnet_netdev is also installed on the new net device, which is then ready to function as a network interface card for sending and receiving data.
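An abridged view of that ops table (only a few callbacks shown; the real table in drivers/net/virtio_net.c wires up many more):

```c
#include <linux/netdevice.h>

/* Abridged sketch of the virtio-net net_device_ops. */
static const struct net_device_ops virtnet_netdev = {
	.ndo_open            = virtnet_open,
	.ndo_stop            = virtnet_close,
	.ndo_start_xmit      = start_xmit,             /* TX entry point, see below */
	.ndo_set_mac_address = virtnet_set_mac_address,
};
```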
Virtio netdev RX/TX
At least two virtqueues are created for a virtio net_device interface, one for TX and one for RX. If the host supports the flow steering feature VIRTIO_NET_F_MQ, more than one pair of queues may be created. Support for VIRTIO_NET_F_CTRL_VQ adds another virtqueue which the driver uses to send commands that manipulate device features which would not easily map into the PCI configuration space.
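As a rough illustration (this helper is hypothetical, not upstream code), the total virtqueue count works out as follows:

```c
#include <linux/virtio.h>
#include <linux/virtio_config.h>
#include <linux/virtio_net.h>

/* One RX/TX virtqueue pair per queue pair, plus an optional control queue. */
static unsigned int virtnet_vq_count(struct virtio_device *vdev,
				     unsigned int max_queue_pairs)
{
	bool has_cvq = virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ);

	/* max_queue_pairs > 1 only when VIRTIO_NET_F_MQ was negotiated. */
	return max_queue_pairs * 2 + (has_cvq ? 1 : 0);
}
```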
In the RX direction, virtnet_receive() is called by virtnet_poll() or virtnet_busy_poll(), depending on the kernel CONFIG_NET_RX_BUSY_POLL option. virtqueue_get_buf() is the next-layer function, which fetches the data from the host. As a special requirement for a virtio driver, it needs to add empty buffers for the host to put data in. try_fill_recv() serves that purpose; virtqueue_add_inbuf() is eventually called to expose the input buffers to the other end.
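A simplified sketch of that refill idea (not the exact try_fill_recv() body; page-sized buffers are an assumption for illustration):

```c
#include <linux/virtio.h>
#include <linux/scatterlist.h>
#include <linux/gfp.h>

/* Post empty receive buffers so the host has somewhere to write packets. */
static bool fill_rx_ring(struct virtqueue *rvq, gfp_t gfp)
{
	bool added = false;

	while (rvq->num_free) {
		void *buf = (void *)__get_free_page(gfp);
		struct scatterlist sg;

		if (!buf)
			break;

		sg_init_one(&sg, buf, PAGE_SIZE);
		/* Expose one device-writable ("in") buffer; buf doubles as the token
		 * that virtqueue_get_buf() will return once the host fills it. */
		if (virtqueue_add_inbuf(rvq, &sg, 1, buf, gfp) < 0) {
			free_page((unsigned long)buf);
			break;
		}
		added = true;
	}

	if (added)
		virtqueue_kick(rvq);   /* notify the host that buffers are available */
	return added;
}
```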
For the virtnet device driver, TX processing starts with the callback start_xmit() in virtnet_netdev. It first frees any pending old buffers via free_old_xmit_skbs(), then goes into xmit_skb(), which calls virtqueue_add_outbuf() located in virtio_ring.c. virtqueue_get_buf() is also called by free_old_xmit_skbs() to fetch used buffers and release their virtqueue ring descriptors back to the descriptor free list.
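A simplified sketch of the reclaim step (the function name is hypothetical; it mirrors what free_old_xmit_skbs() does):

```c
#include <linux/virtio.h>
#include <linux/skbuff.h>

/* Pull back buffers the host has consumed from the used ring and free the
 * corresponding skbs; detaching them returns the descriptors to the free list. */
static void reclaim_tx_buffers(struct virtqueue *svq)
{
	struct sk_buff *skb;
	unsigned int len;

	while ((skb = virtqueue_get_buf(svq, &len)) != NULL)
		dev_kfree_skb_any(skb);
}
```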
Virtqueue
The virtqueue is the fundamental building block of virtio; it is the mechanism for bulk data transport on virtio devices. From the virtqueue's point of view, the virtio netdev RX and TX queues are virtually the same. Each virtqueue consists of three parts: the descriptor table, the available ring and the used ring.
Each descriptor refers to a buffer that the driver uses with the virtio device; its addr field holds a guest-physical address.
The available ring stores indices of descriptor-table entries; through it the guest informs the host/hypervisor that the buffers those descriptors point to are available for use. From the perspective of the guest virtio netdev TX queue, the buffer is filled with data to be processed by the host; for the guest virtio netdev RX queue, the buffer is empty and is to be filled by the host. virtqueue_add(), which is called by both virtqueue_add_outbuf() and virtqueue_add_inbuf(), operates on the available ring. The guest writes the available ring data structure; the host only reads it.
The used ring is where the virtio device (host) returns buffers once it is done with them. It is only written to by the device and read by the driver (guest). For both virtio netdev RX and TX queues, detach_buf(), called by virtqueue_get_buf(), takes the descriptors indicated by the used ring and puts them back on the descriptor table free list.
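For reference, here is the split-virtqueue layout described above, following the virtio 1.0 specification (plain stdint types are used for brevity; the kernel's own definitions live in uapi/linux/virtio_ring.h):

```c
#include <stdint.h>

struct vring_desc {            /* one entry of the descriptor table */
	uint64_t addr;         /* guest-physical address of the buffer */
	uint32_t len;          /* buffer length in bytes */
	uint16_t flags;        /* e.g. NEXT (chained), WRITE (device-writable) */
	uint16_t next;         /* index of the next descriptor in a chain */
};

struct vring_avail {           /* written by the driver, read by the device */
	uint16_t flags;
	uint16_t idx;          /* where the driver will place the next entry */
	uint16_t ring[];       /* indices into the descriptor table */
};

struct vring_used_elem {
	uint32_t id;           /* head of the descriptor chain that was consumed */
	uint32_t len;          /* bytes the device wrote into the buffer */
};

struct vring_used {            /* written by the device, read by the driver */
	uint16_t flags;
	uint16_t idx;
	struct vring_used_elem ring[];
};
```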
Refer to Virtual I/O Device (VIRTIO) Version 1.0 and virtio: Towards a De-Facto Standard For Virtual I/O Devices for authoritative information about virtio.
https://jipanyang.wordpress.com/2014/10/27/virtio-guest-side-implementation-pci-virtio-device-virtio-net-and-virtqueue/