[Translation] Vulkan Tutorial (23): Staging buffer

Staging buffer

Introduction

The vertex buffer we have right now works correctly, but the memory type that allows us to access it from the CPU may not be the optimal memory type for the graphics card itself to read from. The optimal memory has the VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT flag and is usually not accessible by the CPU on dedicated graphics cards. In this chapter we're going to create two vertex buffers: a staging buffer in CPU-accessible memory to upload the data from the vertex array to, and the final vertex buffer in device-local memory. We'll then use a buffer copy command to move the data from the staging buffer to the actual vertex buffer.

Transfer queue

The buffer copy command requires a queue family that supports transfer operations, which is indicated using VK_QUEUE_TRANSFER_BIT. The good news is that any queue family with VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT capabilities already implicitly supports VK_QUEUE_TRANSFER_BIT operations. The implementation is not required to explicitly list it in queueFlags in those cases.

If you like a challenge, then you can still try to use a different queue family specifically for transfer operations. It will require you to make the following modifications to your program:

  • Modify QueueFamilyIndices and findQueueFamilies to explicitly look for a queue family with the VK_QUEUE_TRANSFER_BIT, but not the VK_QUEUE_GRAPHICS_BIT.
  • Modify createLogicalDevice to request a handle to the transfer queue.
  • Create a second command pool for command buffers that are submitted on the transfer queue family.
  • Change the sharingMode of resources to be VK_SHARING_MODE_CONCURRENT and specify both the graphics and transfer queue families.
  • Submit any transfer commands like vkCmdCopyBuffer (which we'll be using in this chapter) to the transfer queue instead of the graphics queue.

It's a bit of work, but it'll teach you a lot about how resources are shared between queue families.
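The first modification in the list, looking for a family that has the transfer bit but not the graphics bit, can be sketched without the Vulkan SDK. The helper name findDedicatedTransferFamily is hypothetical, and the input vector stands in for the queueFlags values that vkGetPhysicalDeviceQueueFamilyProperties would report. The constants repeat the VkQueueFlagBits values so the sketch compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit values as defined for VkQueueFlagBits in the Vulkan headers,
// repeated here so the sketch compiles without the SDK.
constexpr uint32_t QUEUE_GRAPHICS_BIT = 0x1;
constexpr uint32_t QUEUE_COMPUTE_BIT  = 0x2;
constexpr uint32_t QUEUE_TRANSFER_BIT = 0x4;

// Hypothetical helper: returns the index of the first queue family that
// explicitly lists transfer support but not graphics, or -1 if none exists.
// queueFlags[i] stands in for VkQueueFamilyProperties::queueFlags of family i.
int findDedicatedTransferFamily(const std::vector<uint32_t>& queueFlags) {
    for (std::size_t i = 0; i < queueFlags.size(); ++i) {
        const bool transfer = (queueFlags[i] & QUEUE_TRANSFER_BIT) != 0;
        const bool graphics = (queueFlags[i] & QUEUE_GRAPHICS_BIT) != 0;
        if (transfer && !graphics) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

If the helper returns -1, falling back to the graphics family is always possible, because graphics families support transfer operations implicitly.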

Abstracting buffer creation

Because we're going to create multiple buffers in this chapter, it's a good idea to move buffer creation to a helper function. Create a new function createBuffer and move the code in createVertexBuffer (except mapping) to it.

void createBuffer(VkDeviceSize size, VkBufferUsageFlags usage, VkMemoryPropertyFlags properties, VkBuffer& buffer, VkDeviceMemory& bufferMemory) {
    VkBufferCreateInfo bufferInfo = {};
    bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufferInfo.size = size;
    bufferInfo.usage = usage;
    bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

    if (vkCreateBuffer(device, &bufferInfo, nullptr, &buffer) != VK_SUCCESS) {
        throw std::runtime_error("failed to create buffer!");
    }

    // Query the size and memory type requirements of the new buffer
    VkMemoryRequirements memRequirements;
    vkGetBufferMemoryRequirements(device, buffer, &memRequirements);

    VkMemoryAllocateInfo allocInfo = {};
    allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize = memRequirements.size;
    allocInfo.memoryTypeIndex = findMemoryType(memRequirements.memoryTypeBits, properties);

    if (vkAllocateMemory(device, &allocInfo, nullptr, &bufferMemory) != VK_SUCCESS) {
        throw std::runtime_error("failed to allocate buffer memory!");
    }

    // Bind at offset 0: the buffer owns the whole allocation
    vkBindBufferMemory(device, buffer, bufferMemory, 0);
}

Make sure to add parameters for the buffer size, memory properties and usage so that we can use this function to create many different types of buffers. The last two parameters are output variables to write the handles to.
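createBuffer passes the requested properties on to findMemoryType, so it may help to see how that selection behaves for the two property combinations used in this chapter. The following is a minimal sketch under stated assumptions: findMemoryTypeSketch is a hypothetical stand-in that takes the memory type flags as a plain vector instead of querying the physical device, and the constants repeat the VkMemoryPropertyFlagBits values.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Values match VkMemoryPropertyFlagBits in the Vulkan headers.
constexpr uint32_t MEMORY_DEVICE_LOCAL_BIT  = 0x1;
constexpr uint32_t MEMORY_HOST_VISIBLE_BIT  = 0x2;
constexpr uint32_t MEMORY_HOST_COHERENT_BIT = 0x4;

// typeFilter: bit i set means memory type i is acceptable for the resource
// (the memoryTypeBits field of VkMemoryRequirements).
// memoryTypes: property flags of each memory type, a mock of what
// vkGetPhysicalDeviceMemoryProperties reports (at most 32 entries).
uint32_t findMemoryTypeSketch(uint32_t typeFilter, uint32_t properties,
                              const std::vector<uint32_t>& memoryTypes) {
    for (uint32_t i = 0; i < memoryTypes.size(); ++i) {
        const bool allowed = (typeFilter & (1u << i)) != 0;
        // Every requested property must be present, extra ones are fine.
        const bool hasAllProperties = (memoryTypes[i] & properties) == properties;
        if (allowed && hasAllProperties) {
            return i;
        }
    }
    throw std::runtime_error("failed to find suitable memory type!");
}
```

On a mock dedicated GPU with one device-local type and one host-visible, host-coherent type, a request for host-visible memory selects the second type and a request for device-local memory selects the first.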

You can now remove the buffer creation and memory allocation code from createVertexBuffer and just call createBuffer instead:

void createVertexBuffer() {
    VkDeviceSize bufferSize = sizeof(vertices[0]) * vertices.size();
    createBuffer(bufferSize, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, vertexBuffer, vertexBufferMemory);

    void* data;
    vkMapMemory(device, vertexBufferMemory, 0, bufferSize, 0, &data);
    memcpy(data, vertices.data(), (size_t) bufferSize);
    vkUnmapMemory(device, vertexBufferMemory);
}

Run your program to make sure that the vertex buffer still works properly.
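The size computation and the memcpy in createVertexBuffer can be tried in isolation. This is only a sketch: the Vertex layout is an assumed example (2D position plus RGB color), and a plain byte vector stands in for the pointer that vkMapMemory returns.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed vertex layout for illustration: 2D position followed by RGB color.
struct Vertex {
    float pos[2];
    float color[3];
};

// Byte size of the vertex data, computed the same way as bufferSize
// in createVertexBuffer: size of one element times the element count.
std::size_t vertexDataSize(const std::vector<Vertex>& vertices) {
    return sizeof(Vertex) * vertices.size();
}

// Copies the vertices into `mapped`, a byte vector that stands in for
// the host-visible memory exposed by vkMapMemory.
void uploadVertices(const std::vector<Vertex>& vertices,
                    std::vector<unsigned char>& mapped) {
    mapped.resize(vertexDataSize(vertices));
    if (!vertices.empty()) {
        std::memcpy(mapped.data(), vertices.data(), mapped.size());
    }
}
```

The same byte count is later reused as the size of the copy region, which is why the staging buffer and the final vertex buffer are created with an identical bufferSize.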

Using a staging buffer

We're now going to change createVertexBuffer to only use a host visible buffer as a temporary buffer and use a device local one as the actual vertex buffer.

void createVertexBuffer() {
    VkDeviceSize bufferSize = sizeof(vertices[0]) * vertices.size();

    // Host-visible staging buffer that the CPU can map and write to
    VkBuffer stagingBuffer;
    VkDeviceMemory stagingBufferMemory;
    createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer, stagingBufferMemory);

    void* data;
    vkMapMemory(device, stagingBufferMemory, 0, bufferSize, 0, &data);
    memcpy(data, vertices.data(), (size_t) bufferSize);
    vkUnmapMemory(device, stagingBufferMemory);

    // Device-local vertex buffer that will receive the data through a copy command
    createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, vertexBuffer, vertexBufferMemory);
}

We're now using a new stagingBuffer with stagingBufferMemory for mapping and copying the vertex data. In this chapter we're going to use two new buffer usage flags:

  • VK_BUFFER_USAGE_TRANSFER_SRC_BIT: Buffer can be used as source in a memory transfer operation.
  • VK_BUFFER_USAGE_TRANSFER_DST_BIT: Buffer can be used as destination in a memory transfer operation.

The vertexBuffer is now allocated from a memory type that is device local, which generally means that we're not able to use vkMapMemory on it. However, we can copy data from the stagingBuffer to the vertexBuffer. We have to indicate that we intend to do that by specifying the transfer source flag for the stagingBuffer and the transfer destination flag for the vertexBuffer, along with the vertex buffer usage flag.
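The usage flags are plain bit masks, so the requirement described above can be expressed as a small check. The helper canCopyBetween is hypothetical and only illustrates the rule. The constants repeat the VkBufferUsageFlagBits values so that the sketch compiles without the SDK.

```cpp
#include <cstdint>

// Values match VkBufferUsageFlagBits in the Vulkan headers.
constexpr uint32_t BUFFER_USAGE_TRANSFER_SRC_BIT  = 0x01;
constexpr uint32_t BUFFER_USAGE_TRANSFER_DST_BIT  = 0x02;
constexpr uint32_t BUFFER_USAGE_VERTEX_BUFFER_BIT = 0x80;

// Hypothetical helper: a copy is only allowed when the source buffer was
// created with the transfer source flag and the destination buffer with
// the transfer destination flag. Other usage bits do not matter here.
constexpr bool canCopyBetween(uint32_t srcUsage, uint32_t dstUsage) {
    return (srcUsage & BUFFER_USAGE_TRANSFER_SRC_BIT) != 0 &&
           (dstUsage & BUFFER_USAGE_TRANSFER_DST_BIT) != 0;
}
```

With the flags from createVertexBuffer, the staging buffer qualifies as a source and the vertex buffer as a destination, whereas a vertex buffer created with only the vertex buffer flag would not be a valid copy target.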

We're now going to write a function to copy the contents from one buffer to another, called copyBuffer.

void copyBuffer(VkBuffer srcBuffer, VkBuffer dstBuffer, VkDeviceSize size) {

}

Memory transfer operations are executed using command buffers, just like drawing commands. Therefore we must first allocate a temporary command buffer. You may wish to create a separate command pool for these kinds of short-lived buffers, because the implementation may be able to apply memory allocation optimizations. You should use the VK_COMMAND_POOL_CREATE_TRANSIENT_BIT flag when creating the command pool in that case.

void copyBuffer(VkBuffer srcBuffer, VkBuffer dstBuffer, VkDeviceSize size) {
    VkCommandBufferAllocateInfo allocInfo = {};
    allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
    allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
    allocInfo.commandPool = commandPool;
    allocInfo.commandBufferCount = 1;

    VkCommandBuffer commandBuffer;
    vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer);
}

And immediately start recording the command buffer:

VkCommandBufferBeginInfo beginInfo = {};
beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;

vkBeginCommandBuffer(commandBuffer, &beginInfo);

The VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT flag that we used for the drawing command buffers is not necessary here, because we're only going to use the command buffer once and won't return from the function until the copy operation has finished executing. It's good practice to tell the driver about our intent using VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT.

VkBufferCopy copyRegion = {};
copyRegion.srcOffset = 0; // Optional
copyRegion.dstOffset = 0; // Optional
copyRegion.size = size;
vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1, &copyRegion);

Contents of buffers are transferred using the vkCmdCopyBuffer command. It takes the source and destination buffers as arguments, followed by the number of regions and an array of regions to copy. The regions are defined in VkBufferCopy structs and consist of a source buffer offset, destination buffer offset and size. It is not possible to specify VK_WHOLE_SIZE here, unlike the vkMapMemory command.
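Because the size must be given explicitly, a region has to be non-empty and fit inside both buffers. A minimal sketch of that bounds check follows. BufferCopyRegion mirrors the three fields of VkBufferCopy, and regionFits is a hypothetical helper, not part of the Vulkan API.

```cpp
#include <cstdint>

// Mirrors the three fields of VkBufferCopy
// (VkDeviceSize is a 64-bit unsigned integer).
struct BufferCopyRegion {
    uint64_t srcOffset;
    uint64_t dstOffset;
    uint64_t size;
};

// Hypothetical check: true if the region is non-empty and lies entirely
// inside both the source and the destination buffer. The subtractions are
// guarded by the offset checks, so the arithmetic cannot wrap around.
bool regionFits(const BufferCopyRegion& region,
                uint64_t srcBufferSize, uint64_t dstBufferSize) {
    if (region.size == 0) {
        return false;
    }
    if (region.srcOffset >= srcBufferSize || region.dstOffset >= dstBufferSize) {
        return false;
    }
    return region.size <= srcBufferSize - region.srcOffset &&
           region.size <= dstBufferSize - region.dstOffset;
}
```

In this chapter both buffers are created with the same bufferSize and the offsets are 0, so the single region covering the whole buffer passes this check. A size such as the all-ones value used by VK_WHOLE_SIZE would be rejected.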

vkEndCommandBuffer(commandBuffer);

This command buffer only contains the copy command, so we can stop recording right after that. Now execute the command buffer to complete the transfer:

VkSubmitInfo submitInfo = {};
submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &commandBuffer;

vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
vkQueueWaitIdle(graphicsQueue);

Unlike the draw commands, there are no events we need to wait on this time. We just want to execute the transfer on the buffers immediately. There are again two possible ways to wait for this transfer to complete. We could use a fence and wait with vkWaitForFences, or simply wait for the queue used for the transfer (here the graphics queue) to become idle with vkQueueWaitIdle. A fence would allow you to schedule multiple transfers simultaneously and wait for all of them to complete, instead of executing one at a time. That may give the driver more opportunities to optimize.

vkFreeCommandBuffers(device, commandPool, 1, &commandBuffer);

Don't forget to clean up the command buffer used for the transfer operation.

We can now call copyBuffer from the createVertexBuffer function to move the vertex data to the device local buffer:

createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, vertexBuffer, vertexBufferMemory);

copyBuffer(stagingBuffer, vertexBuffer, bufferSize);

After copying the data from the staging buffer to the device buffer, we should clean up the staging buffer:

    ...

    copyBuffer(stagingBuffer, vertexBuffer, bufferSize);

    vkDestroyBuffer(device, stagingBuffer, nullptr);
    vkFreeMemory(device, stagingBufferMemory, nullptr);
}

Run your program to verify that you're seeing the familiar triangle again. The improvement may not be visible right now, but the vertex data is now being loaded from high-performance memory. This will matter once we start rendering more complex geometry.

Conclusion

It should be noted that in a real world application, you're not supposed to actually call vkAllocateMemory for every individual buffer. The maximum number of simultaneous memory allocations is limited by the maxMemoryAllocationCount physical device limit, which may be as low as 4096 even on high end hardware like an NVIDIA GTX 1080. The right way to allocate memory for a large number of objects at the same time is to create a custom allocator that splits up a single allocation among many different objects by using the offset parameters that we've seen in many functions.
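To make the idea of splitting one allocation by offsets concrete, here is a minimal sketch of a linear sub-allocator. It is an illustration under stated assumptions rather than a production design: the class name LinearSubAllocator is hypothetical, it only hands out offsets and never frees them, and the real memory block would still come from a single vkAllocateMemory call. The returned offsets are the kind of value passed as the last argument of vkBindBufferMemory.

```cpp
#include <cstdint>
#include <stdexcept>

// Rounds `offset` up to the next multiple of `alignment`. The alignment must
// be a power of two, as the alignment in VkMemoryRequirements is.
constexpr uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical linear sub-allocator over one large memory block.
// It tracks offsets only and does not support freeing individual ranges.
class LinearSubAllocator {
public:
    explicit LinearSubAllocator(uint64_t capacity) : capacity_(capacity) {}

    // Returns the offset of a range of `size` bytes aligned to `alignment`,
    // or throws if the block has no room left.
    uint64_t allocate(uint64_t size, uint64_t alignment) {
        const uint64_t start = alignUp(next_, alignment);
        if (start > capacity_ || size > capacity_ - start) {
            throw std::runtime_error("sub-allocator out of memory");
        }
        next_ = start + size;
        return start;
    }

private:
    uint64_t capacity_;
    uint64_t next_ = 0;
};
```

A real allocator also has to keep resources in memory types that match their requirements and reclaim freed ranges, which is the work a library such as VulkanMemoryAllocator takes on.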

You can either implement such an allocator yourself, or use the VulkanMemoryAllocator library provided by the GPUOpen initiative. However, for this tutorial it's okay to use a separate allocation for every resource, because we won't come close to hitting any of these limits for now.

