

WebGPU Study (5): A Survey of Key Techniques in Modern Graphics APIs and Their Support in WebGPU





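  • Add a uniform buffer object (ubo for short) that carries the product of the model, view and projection matrices (the mvp matrix for short) and is updated every frame
  • Set up the vertices
  • Enable face culling
  • Enable depth testing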


Adding a uniform buffer object


In WebGL 1, we pass each gameObject's uniform variables (such as diffuseMap, diffuse color and model matrix) to the shader through functions such as uniform1i and uniform4fv.


If gameObject1 and gameObject3 use the same shader1 and have the same diffuse color, only one of those diffuse colors needs to be passed. In WebGL 1, however, we usually pass both, which incurs redundant overhead.

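WebGPU passes uniform variables through uniform buffer objects. A uniform buffer is a global buffer: we set its values once and then, before each draw, select the data range to use (via offset and size), so the same data can be reused. If a uniform value changes, only the corresponding data in the uniform buffer needs to be modified.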

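In WebGPU, we can put the model matrices of all gameObjects into one ubo, the view and projection matrices of all cameras into another ubo, the data of each kind of material (phong material, pbr material, etc.), such as diffuse color and specular color, into its own ubo, and the data of each kind of light (direction light, point light, etc.), such as light color and light position, into its own ubo. This effectively reduces the cost of transferring uniform variables.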
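
To make the offset/size idea concrete, here is a minimal sketch that packs several model matrices into one CPU-side array destined for a single uniform buffer and computes the (offset, size) range each draw would select. The names packMatrices and UniformSlice are hypothetical, and the 256-byte offset alignment is an assumption (it matches WebGPU's default minUniformBufferOffsetAlignment as far as I know); check the limits of your own device.

```typescript
// Hypothetical helper: one mat4 (16 floats) per object, each slice starting
// on an aligned byte offset inside a single shared array.
interface UniformSlice {
  offset: number; // byte offset of this object's data in the buffer
  size: number;   // byte size of the data the shader actually reads
}

const MAT4_BYTES = 16 * Float32Array.BYTES_PER_ELEMENT; // 64
const OFFSET_ALIGNMENT = 256; // assumed alignment for uniform buffer offsets

function packMatrices(matrices: Float32Array[]): { data: Float32Array; slices: UniformSlice[] } {
  // Round the per-object stride up to the alignment.
  const stride = Math.ceil(MAT4_BYTES / OFFSET_ALIGNMENT) * OFFSET_ALIGNMENT;
  const data = new Float32Array((stride * matrices.length) / 4);
  const slices = matrices.map((m, i) => {
    data.set(m, (stride * i) / 4); // copy the matrix to its slot
    return { offset: stride * i, size: MAT4_BYTES };
  });
  return { data, slices };
}

const identity = new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]);
const packed = packMatrices([identity, identity, identity]);
```

Each draw would then bind the range described by its slice, and only the slices whose values changed would need to be uploaded again.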




The uniform block below, which corresponds to a ubo, declares its layout as std140:

    layout (std140) uniform ExampleBlock
    {
        float value;
        vec3 vector;
        mat4 matrix;
        float values[3];
        bool boolean;
        int integer;
    };


    layout (std140) uniform ExampleBlock
    {
                         // base alignment  // aligned offset
        float value;     // 4               // 0
        vec3 vector;     // 16              // 16  (must be multiple of 16 so 4->16)
        mat4 matrix;     // 16              // 32  (column 0)
                         // 16              // 48  (column 1)
                         // 16              // 64  (column 2)
                         // 16              // 80  (column 3)
        float values[3]; // 16              // 96  (values[0])
                         // 16              // 112 (values[1])
                         // 16              // 128 (values[2])
        bool boolean;    // 4               // 144
        int integer;     // 4               // 148
    };
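
The aligned offsets in the annotated block can be reproduced mechanically. The sketch below implements only the subset of the std140 rules that ExampleBlock needs (scalars, vec3/vec4, mat4 and arrays of those); std140Offsets, Member and RULES are hypothetical names, and nested structs and other matrix sizes are deliberately left out.

```typescript
// Subset of std140: scalars align to 4 bytes, vec3/vec4 to 16, a mat4 is four
// vec4 columns, and every array element is padded to a multiple of 16 bytes.
type Std140Type = "float" | "int" | "bool" | "vec3" | "vec4" | "mat4";

interface Member {
  name: string;
  type: Std140Type;
  arrayLength?: number; // set only for array members
}

const RULES: Record<Std140Type, { align: number; size: number }> = {
  float: { align: 4, size: 4 },
  int: { align: 4, size: 4 },
  bool: { align: 4, size: 4 },
  vec3: { align: 16, size: 12 },
  vec4: { align: 16, size: 16 },
  mat4: { align: 16, size: 64 },
};

function std140Offsets(members: Member[]): Record<string, number> {
  const offsets: Record<string, number> = {};
  let cursor = 0;
  for (const m of members) {
    const rule = RULES[m.type];
    const count = m.arrayLength;
    const isArray = count !== undefined;
    const align = isArray ? Math.max(16, rule.align) : rule.align;
    const elementSize = isArray ? Math.ceil(rule.size / 16) * 16 : rule.size;
    cursor = Math.ceil(cursor / align) * align; // round up to the alignment
    offsets[m.name] = cursor;
    cursor += elementSize * (count === undefined ? 1 : count);
  }
  return offsets;
}

const exampleBlock = std140Offsets([
  { name: "value", type: "float" },
  { name: "vector", type: "vec3" },
  { name: "matrix", type: "mat4" },
  { name: "values", type: "float", arrayLength: 3 },
  { name: "boolean", type: "bool" },
  { name: "integer", type: "int" },
]);
```

Here exampleBlock maps each member name to its byte offset, which is the number to use when writing that member into the CPU-side copy of the buffer.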








  • Define the uniform block in the vertex shader


    const vertexShaderGLSL = `#version 450
    layout(set = 0, binding = 0) uniform Uniforms {
        mat4 modelViewProjectionMatrix;
    } uniforms;
    ...
    void main() {
        gl_Position = uniforms.modelViewProjectionMatrix * position;
        fragColor = color;
    }
    `;



  • Create uniformsBindGroupLayout


    const uniformsBindGroupLayout = device.createBindGroupLayout({
        bindings: [{
            binding: 0,
            // 1 is the value of GPUShaderStage.VERTEX
            visibility: 1,
            type: "uniform-buffer"
        }]
    });

binding corresponds to the binding of the uniform block in the vertex shader; that is, the first element of the bindings array corresponds to the uniform block whose binding is 0.


  • Create the uniform buffer


    const uniformBufferSize = 4 * 16; // BYTES_PER_ELEMENT(4) * matrix length(4 * 4 = 16)
    const uniformBuffer = device.createBuffer({
        size: uniformBufferSize,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
    });

  • Create the uniform bind group


    const uniformBindGroup = device.createBindGroup({
        layout: uniformsBindGroupLayout,
        bindings: [{
            binding: 0,
            resource: {
                buffer: uniformBuffer,
            },
        }],
    });

Here too, binding corresponds to the binding of the uniform block in the vertex shader; that is, the first element of the bindings array corresponds to the uniform block whose binding is 0.

  • Update the mvp matrix data in the uniform buffer every frame


    // The camera is fixed, so the projection matrix only needs to be computed once
    const aspect = Math.abs(canvas.width / canvas.height);
    let projectionMatrix = mat4.create();
    mat4.perspective(projectionMatrix, (2 * Math.PI) / 5, aspect, 1, 100.0);
    ...
    // Compute the mvp matrix
    function getTransformationMatrix() {
        let viewMatrix = mat4.create();
        mat4.translate(viewMatrix, viewMatrix, vec3.fromValues(0, 0, -5));
        let now = Date.now() / 1000;
        mat4.rotate(viewMatrix, viewMatrix, 1, vec3.fromValues(Math.sin(now), Math.cos(now), 0));
        let modelViewProjectionMatrix = mat4.create();
        mat4.multiply(modelViewProjectionMatrix, projectionMatrix, viewMatrix);
        return modelViewProjectionMatrix;
    }
    ...
    return function frame() {
        // Update the uniform buffer with setSubData (analyzed later)
        uniformBuffer.setSubData(0, getTransformationMatrix());
        ...
    }

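
For readers who want to see what the matrix calls above compute, here is a dependency-free sketch of an OpenGL-style perspective matrix and a column-major matrix product. It mirrors, to my understanding, what mat4.perspective and mat4.multiply do in gl-matrix, but the helper names (perspective, translation, multiply, transformPoint) are hypothetical, the aspect ratio is assumed to be 1, and the per-frame rotation is left out.

```typescript
type Mat4 = Float32Array; // 16 floats in column-major order

// OpenGL-style perspective projection (clip-space z ends up in [-w, w]).
function perspective(fovy: number, aspect: number, near: number, far: number): Mat4 {
  const f = 1 / Math.tan(fovy / 2);
  const nf = 1 / (near - far);
  const out = new Float32Array(16);
  out[0] = f / aspect;
  out[5] = f;
  out[10] = (far + near) * nf;
  out[11] = -1;
  out[14] = 2 * far * near * nf;
  return out;
}

function translation(x: number, y: number, z: number): Mat4 {
  const out = new Float32Array(16);
  out[0] = out[5] = out[10] = out[15] = 1;
  out[12] = x;
  out[13] = y;
  out[14] = z;
  return out;
}

// Returns a * b; element (row, col) is stored at index col * 4 + row.
function multiply(a: Mat4, b: Mat4): Mat4 {
  const out = new Float32Array(16);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) sum += a[k * 4 + row] * b[col * 4 + k];
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

// Transforms the point (x, y, z, 1) and returns clip-space coordinates.
function transformPoint(m: Mat4, p: [number, number, number]): number[] {
  const v = [p[0], p[1], p[2], 1];
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let k = 0; k < 4; k++) out[row] += m[k * 4 + row] * v[k];
  }
  return out;
}

const projection = perspective((2 * Math.PI) / 5, 1, 1, 100.0);
const view = translation(0, 0, -5);
const mvp = multiply(projection, view);
const clip = transformPoint(mvp, [0, 0, 4]); // a point exactly on the near plane
const ndcZ = clip[2] / clip[3];
```

A point one unit in front of the camera sits on the near plane, so its normalized depth comes out at the near end of the range, which is a quick sanity check for the multiplication order projection * view.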
  • Set the bind group before drawing


    return function frame() {
        ...
        // "0" corresponds to "set = 0" of the uniform block in the vertex shader
        passEncoder.setBindGroup(0, uniformBindGroup);
        passEncoder.draw(36, 1, 0, 0);
        ...
    }

A closer look at "updating the uniform buffer"

This sample uses setSubData to update the uniform buffer:

    return function frame() {
        uniformBuffer.setSubData(0, getTransformationMatrix());
        ...
    }

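In WebGPU Study (5): A Survey of Key Techniques in Modern Graphics APIs and Their Support in WebGPU -> Approaching zero driver overhead -> persistent map buffer, we mentioned that WebGPU currently offers two ways for the CPU to transfer data to the GPU, that is, to update the contents of a GPUBuffer:

1. Call setSubData on the GPUBuffer
2. Use the persistent map buffer technique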



    function setBufferDataByPersistentMapBuffer(device, commandEncoder, uniformBufferSize, uniformBuffer, mvpMatricesData) {
        const [srcBuffer, arrayBuffer] = device.createBufferMapped({
            size: uniformBufferSize,
            usage: GPUBufferUsage.COPY_SRC
        });
        new Float32Array(arrayBuffer).set(mvpMatricesData);
        srcBuffer.unmap();
        commandEncoder.copyBufferToBuffer(srcBuffer, 0, uniformBuffer, 0, uniformBufferSize);
        const commandBuffer = commandEncoder.finish();
        const queue = device.defaultQueue;
        queue.submit([commandBuffer]);
        srcBuffer.destroy();
    }
    return function frame() {
        //uniformBuffer.setSubData(0, getTransformationMatrix());
        ...
        const commandEncoder = device.createCommandEncoder({});
        setBufferDataByPersistentMapBuffer(device, commandEncoder, uniformBufferSize, uniformBuffer, getTransformationMatrix());
        ...
    }

To compare performance, I ran a benchmark: I created a ubo containing 160000 mat4s, updated the uniform buffer with each of the two methods, and compared their JS profiles:



Using persistent map buffer (calling the setBufferDataByPersistentMapBuffer function): [JS profile screenshot not preserved]


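The profiles show that the two methods perform about the same. However, because persistent map buffer should be faster in principle (the CPU and GPU share one buffer, so no copy is needed), it should be preferred.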
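
As a rough illustration of the benchmark payload described above, the sketch below builds a CPU-side array holding 160000 mat4s and derives the buffer size from it. createBenchmarkData is a hypothetical helper, and filling every entry with an identity matrix is an assumption; the actual matrices used in the benchmark are not given in this post.

```typescript
const MATRIX_COUNT = 160000;
const FLOATS_PER_MAT4 = 16;

// Hypothetical helper: fills `count` column-major identity matrices back to back.
function createBenchmarkData(count: number): Float32Array {
  const data = new Float32Array(count * FLOATS_PER_MAT4);
  for (let i = 0; i < count; i++) {
    const base = i * FLOATS_PER_MAT4;
    data[base] = 1;      // m00
    data[base + 5] = 1;  // m11
    data[base + 10] = 1; // m22
    data[base + 15] = 1; // m33
  }
  return data;
}

const benchmarkData = createBenchmarkData(MATRIX_COUNT);
// Size in bytes to request for the uniform buffer and for each upload.
const benchmarkBufferSize = benchmarkData.byteLength;
```

Either update path would then be called once per frame with benchmarkData and benchmarkBufferSize in place of the single mvp matrix.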

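In addition, the WebGPU community is still discussing how to optimize buffer data updates (for example, someone has proposed adding a GPUUploadBuffer pass), so this area is worth continuing to follow.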


Advanced-GLSL->Uniform buffer objects


  • Pass the vertices' position and color data to the vertex shader's attributes (GLSL 4.5 uses "in" to denote an attribute)


    const vertexShaderGLSL = `#version 450
    ...
    layout(location = 0) in vec4 position;
    layout(location = 1) in vec4 color;
    layout(location = 0) out vec4 fragColor;
    void main() {
        gl_Position = uniforms.modelViewProjectionMatrix * position;
        fragColor = color;
    }
    `;
    const fragmentShaderGLSL = `#version 450
    layout(location = 0) in vec4 fragColor;
    layout(location = 0) out vec4 outColor;
    void main() {
        outColor = fragColor;
    }
    `;

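The vertex shader assigns color to fragColor (GLSL 4.5 uses "out" for what WebGL 1 calls a varying variable). The fragment shader then receives fragColor and assigns it to outColor, so each fragment takes the (interpolated) color of its corresponding vertices.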

  • Create the vertices buffer and set the cube's vertex data


cube.ts:

    // Each vertex contains position, color and uv data
    // This sample does not use the uv data
    export const cubeVertexArray = new Float32Array([
        // float4 position, float4 color, float2 uv,
        1, -1, 1, 1, 1, 0, 1, 1, 1, 1,
        -1, -1, 1, 1, 0, 0, 1, 1, 0, 1,
        -1, -1, -1, 1, 0, 0, 0, 1, 0, 0,
        1, -1, -1, 1, 1, 0, 0, 1, 1, 0,
        1, -1, 1, 1, 1, 0, 1, 1, 1, 1,
        -1, -1, -1, 1, 0, 0, 0, 1, 0, 0,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, -1, 1, 1, 1, 0, 1, 1, 0, 1,
        1, -1, -1, 1, 1, 0, 0, 1, 0, 0,
        1, 1, -1, 1, 1, 1, 0, 1, 1, 0,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, -1, -1, 1, 1, 0, 0, 1, 0, 0,
        -1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 0, 1,
        1, 1, -1, 1, 1, 1, 0, 1, 0, 0,
        -1, 1, -1, 1, 0, 1, 0, 1, 1, 0,
        -1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
        1, 1, -1, 1, 1, 1, 0, 1, 0, 0,
        -1, -1, 1, 1, 0, 0, 1, 1, 1, 1,
        -1, 1, 1, 1, 0, 1, 1, 1, 0, 1,
        -1, 1, -1, 1, 0, 1, 0, 1, 0, 0,
        -1, -1, -1, 1, 0, 0, 0, 1, 1, 0,
        -1, -1, 1, 1, 0, 0, 1, 1, 1, 1,
        -1, 1, -1, 1, 0, 1, 0, 1, 0, 0,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        -1, 1, 1, 1, 0, 1, 1, 1, 0, 1,
        -1, -1, 1, 1, 0, 0, 1, 1, 0, 0,
        -1, -1, 1, 1, 0, 0, 1, 1, 0, 0,
        1, -1, 1, 1, 1, 0, 1, 1, 1, 0,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, -1, -1, 1, 1, 0, 0, 1, 1, 1,
        -1, -1, -1, 1, 0, 0, 0, 1, 0, 1,
        -1, 1, -1, 1, 0, 1, 0, 1, 0, 0,
        1, 1, -1, 1, 1, 1, 0, 1, 1, 0,
        1, -1, -1, 1, 1, 0, 0, 1, 1, 1,
        -1, 1, -1, 1, 0, 1, 0, 1, 0, 0,
    ]);

rotatingCube.ts:

    const verticesBuffer = device.createBuffer({
        size: cubeVertexArray.byteLength,
        usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST
    });
    verticesBuffer.setSubData(0, cubeVertexArray);


  • When creating the render pipeline, specify the vertex shader's attributes


cube.ts:

    export const cubeVertexSize = 4 * 10; // Byte size of one cube vertex.
    export const cubePositionOffset = 0;
    export const cubeColorOffset = 4 * 4; // Byte offset of cube vertex color attribute.

rotatingCube.ts:

    const pipeline = device.createRenderPipeline({
        ...
        vertexState: {
            vertexBuffers: [{
                arrayStride: cubeVertexSize,
                attributes: [{
                    // position
                    shaderLocation: 0,
                    offset: cubePositionOffset,
                    format: "float4"
                }, {
                    // color
                    shaderLocation: 1,
                    offset: cubeColorOffset,
                    format: "float4"
                }]
            }],
        },
        ...
    });

  • In the render pass, draw specifies a vertex count of 36


    return function frame() {
        ...
        const passEncoder = commandEncoder.beginRenderPass(renderPassDescriptor);
        ...
        passEncoder.draw(36, 1, 0, 0);
        passEncoder.endPass();
        ...
    }
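
The stride and offsets passed to the pipeline describe how each vertex is laid out in the interleaved array. The sketch below reads one vertex back out on the CPU using the same byte values; readFloat4 and the two-vertex sample array are hypothetical and only illustrate the arithmetic, they are not part of the sample.

```typescript
const FLOAT_BYTES = Float32Array.BYTES_PER_ELEMENT; // 4
const cubeVertexSize = 4 * 10; // bytes per vertex: float4 position + float4 color + float2 uv
const cubePositionOffset = 0;
const cubeColorOffset = 4 * 4; // color starts after the four position floats

// Two vertices in the same interleaved layout as cubeVertexArray.
const sampleVertices = new Float32Array([
  1, -1, 1, 1, 1, 0, 1, 1, 1, 1,
  -1, -1, 1, 1, 0, 0, 1, 1, 0, 1,
]);

// Hypothetical helper: reads a float4 attribute of one vertex.
function readFloat4(data: Float32Array, vertexIndex: number, byteOffset: number): number[] {
  const start = (vertexIndex * cubeVertexSize + byteOffset) / FLOAT_BYTES;
  return Array.from(data.subarray(start, start + 4));
}

const vertexCount = sampleVertices.byteLength / cubeVertexSize;
const position1 = readFloat4(sampleVertices, 1, cubePositionOffset);
const color1 = readFloat4(sampleVertices, 1, cubeColorOffset);
```

Applying the same division to the full cubeVertexArray (360 floats, 1440 bytes) gives 36, which is the vertex count passed to draw.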



    const pipeline = device.createRenderPipeline({
        ...
        rasterizationState: {
            cullMode: 'back',
        },
        ...
    });


    enum GPUFrontFace {
        "ccw",
        "cw"
    };
    enum GPUCullMode {
        "none",
        "front",
        "back"
    };
    ...
    dictionary GPURasterizationStateDescriptor {
        GPUFrontFace frontFace = "ccw";
        GPUCullMode cullMode = "none";
        ...
    };
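
To show how frontFace and cullMode interact, here is a small CPU-side sketch that classifies a 2D triangle by its winding order and decides whether it would be culled. It assumes coordinates with the y axis pointing up (as in normalized device coordinates); signedArea2 and isCulled are hypothetical names, and zero-area triangles are not treated specially.

```typescript
type Vec2 = [number, number];
type FrontFace = "ccw" | "cw";
type CullMode = "none" | "front" | "back";

// Twice the signed area of the triangle; positive means counter-clockwise
// when the y axis points up.
function signedArea2(a: Vec2, b: Vec2, c: Vec2): number {
  return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]);
}

function isCulled(tri: [Vec2, Vec2, Vec2], frontFace: FrontFace, cullMode: CullMode): boolean {
  if (cullMode === "none") return false;
  const isCcw = signedArea2(tri[0], tri[1], tri[2]) > 0;
  const isFront = frontFace === "ccw" ? isCcw : !isCcw;
  return cullMode === "front" ? isFront : !isFront;
}

const ccwTriangle: [Vec2, Vec2, Vec2] = [[0, 0], [1, 0], [0, 1]];
const cwTriangle: [Vec2, Vec2, Vec2] = [[0, 0], [0, 1], [1, 0]];

// With the default frontFace "ccw" and cullMode "back" as in the pipeline above:
const ccwCulled = isCulled(ccwTriangle, "ccw", "back");
const cwCulled = isCulled(cwTriangle, "ccw", "back");
```

With these settings the counter-clockwise triangle is kept and the clockwise one is discarded, which is why the cube's faces must be wound consistently.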






Investigation: Rasterization State



  • When creating the render pipeline, set depthStencilState


    const pipeline = device.createRenderPipeline({
        ...
        depthStencilState: {
            // Enable depth testing
            depthWriteEnabled: true,
            // Use "less" as the comparison function (explained later)
            depthCompare: "less",
            // depth24plus: at least 24 bits of depth; stencil8: 8 bits of stencil
            format: "depth24plus-stencil8",
        },
        ...
    });

  • Create the depth texture (note that its size->depth is 1) and set its view as render pass -> depthStencilAttachment -> attachment


    const depthTexture = device.createTexture({
        size: {
            width: canvas.width,
            height: canvas.height,
            depth: 1
        },
        format: "depth24plus-stencil8",
        usage: GPUTextureUsage.OUTPUT_ATTACHMENT
    });
    const renderPassDescriptor: GPURenderPassDescriptor = {
        ...
        depthStencilAttachment: {
            attachment: depthTexture.createView(),
            depthLoadValue: 1.0,
            depthStoreOp: "store",
            ...
        }
    };


    dictionary GPURenderPassDepthStencilAttachmentDescriptor {
        required GPUTextureView attachment;
        required (GPULoadOp or float) depthLoadValue;
        required GPUStoreOp depthStoreOp;
        ...
    };

depthLoadValue and depthStoreOp work like the loadOp and storeOp of colorAttachment discussed in WebGPU Study (2): Studying the "Draw a Triangle" Sample -> Analyzing the render pass. Here is the relevant code:

    const pipeline = device.createRenderPipeline({
        ...
        depthStencilState: {
            ...
            depthCompare: "less",
            ...
        },
        ...
    });
    ...
    const renderPassDescriptor: GPURenderPassDescriptor = {
        ...
        depthStencilAttachment: {
            ...
            depthLoadValue: 1.0,
            depthStoreOp: "store",
            ...
        }
    };
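
The combination configured above (clear the depth buffer to 1.0, compare with "less", write on success) can be imitated on the CPU. The following is a minimal sketch under those assumptions; createDepthBuffer and depthTestLess are hypothetical helpers, not WebGPU calls.

```typescript
// Every pixel starts at the clear value, mirroring depthLoadValue: 1.0.
function createDepthBuffer(width: number, height: number, depthLoadValue: number): Float32Array {
  return new Float32Array(width * height).fill(depthLoadValue);
}

// Mirrors depthCompare: "less" with depthWriteEnabled: true. The fragment
// passes only if it is strictly closer than the stored depth, and a passing
// fragment overwrites the stored depth.
function depthTestLess(
  buffer: Float32Array,
  width: number,
  x: number,
  y: number,
  fragmentDepth: number
): boolean {
  const index = y * width + x;
  if (fragmentDepth < buffer[index]) {
    buffer[index] = fragmentDepth;
    return true;
  }
  return false;
}

const depthBuffer = createDepthBuffer(4, 4, 1.0);
const firstPasses = depthTestLess(depthBuffer, 4, 1, 1, 0.5);   // closer than the clear value
const fartherPasses = depthTestLess(depthBuffer, 4, 1, 1, 0.7); // behind the stored 0.5
const closerPasses = depthTestLess(depthBuffer, 4, 1, 1, 0.3);  // in front of the stored 0.5
const atClearValue = depthTestLess(depthBuffer, 4, 0, 0, 1.0);  // equal is not "less"
```

Clearing to 1.0 (the farthest value) lets the first fragment drawn at a pixel pass, while later fragments are kept only when they are nearer.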



Depth testing




webgpu-samples GitHub Repo



