[Repost] cbuffer and tbuffer
http://blog.chinaunix.net/uid-20235103-id-2578297.html
A new feature in Shader Model 4: packing data this way can give better performance. The original text is reproduced below:
Shader Constants (DirectX HLSL)
In shader model 4, shader constants are stored in one or more buffer resources in memory. They can be organized into two types of buffers: constant buffers (cbuffers) and texture buffers (tbuffers). Constant buffers are optimized for constant-variable usage, which is characterized by lower-latency access and more frequent update from the CPU. For this reason, additional size, layout, and access restrictions apply to these resources. Texture buffers are accessed like textures and perform better for arbitrarily indexed data. Regardless of which type of resource you use, there is no limit to the number of constant buffers or texture buffers an application can create.
Declaring a constant buffer or a texture buffer looks very much like a structure declaration in C, with the addition of the register and packoffset keywords for manually assigning registers or packing data.
BufferType [Name] [: register(b#)] { VariableDeclaration [: packoffset(c#.xyzw)]; ... };
Parameters
BufferType
[in] The buffer type.
BufferType | Description |
---|---|
cbuffer | constant buffer |
tbuffer | texture buffer |
Name
[in] Optional, ASCII string containing a unique buffer name.
register(b#)
[in] Optional keyword, used to manually pack constant data. Constants can be packed in a register only in a constant buffer, where the starting register is given by the register number (#).
VariableDeclaration
[in] Variable declaration, similar to a structure member declaration. This can be any HLSL type or effect object (except a texture or a sampler object).
packoffset(c#.xyzw)
[in] Optional keyword, used to manually pack constant data. Constants can be packed in any constant buffer, where the register number is given by (#). Sub-component packing (using xyzw swizzling) is available for constants whose size fits within a single register (they must not cross a register boundary). For instance, a float4 could not be packed in a single register starting with the y component, as it would not fit in a four-component register.
Remarks
Constant buffers reduce the bandwidth required to update shader constants by allowing shader constants to be grouped together and committed at the same time rather than making individual calls to commit each constant separately.
A constant buffer is a specialized buffer resource that is accessed like a buffer. Each constant buffer can hold up to 4096 vectors; each vector contains up to four 32-bit values. You can bind up to 14 constant buffers per pipeline stage (2 additional slots are reserved for internal use).
A texture buffer is a specialized buffer resource that is accessed like a texture. Texture access (as compared with buffer access) can have better performance for arbitrarily indexed data. You can bind up to 128 texture buffers per pipeline stage.
A buffer resource is designed to minimize the overhead of setting shader constants. The effect framework (see ID3D10Effect Interface) will manage updating constant and texture buffers, or you can use the Direct3D API to update buffers (see Copying and Accessing Resource Data (Direct3D 10) for information). An application can also copy data from another buffer (such as a render target or a stream-output target) into a constant buffer.
For additional information on using constant buffers in a D3D10 application see Resource Types (Direct3D 10) and Creating Buffer Resources (Direct3D 10).
For additional information on using constant buffers in a D3D11 application see Introduction to Buffers in Direct3D 11 and How to: Create a Constant Buffer.
A constant buffer does not require a view to be bound to the pipeline. A texture buffer, however, requires a view and must be bound to a texture slot (or must be bound with SetTextureBuffer when using an effect).
There are two ways to pack constants data: using the register (DirectX HLSL) and packoffset (DirectX HLSL) keywords.
Differences between Direct3D 9 and Direct3D 10 and 11: Unlike the auto-allocation of constants in Direct3D 9, which did not perform packing and instead assigned each variable to a set of float4 registers, HLSL constant variables follow packing rules in Direct3D 10 and 11.
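To make the packing rule concrete, here is a minimal C++ sketch, assuming only the basic rule that variables are packed into four-component (16-byte) registers and that a variable may not straddle a register boundary. The helper name PackOffsets is hypothetical, and the sketch covers scalars and vectors only; matrices, arrays, and structures follow additional rules that it does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One register component is 32 bits; one register holds four components.
constexpr std::size_t kComponentBytes = 4;
constexpr std::size_t kRegisterBytes = 16;

// Hypothetical helper: given the component counts (1 to 4) of consecutive
// scalar or vector variables, return the byte offset assigned to each one.
// A variable that would cross a register boundary is moved to the start of
// the next register.
std::vector<std::size_t> PackOffsets(const std::vector<std::size_t>& componentCounts)
{
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t count : componentCounts)
    {
        assert(count >= 1 && count <= 4);
        const std::size_t bytes = count * kComponentBytes;
        const std::size_t usedInRegister = cursor % kRegisterBytes;
        if (usedInRegister + bytes > kRegisterBytes)
        {
            // Skip the unused tail of the current register.
            cursor += kRegisterBytes - usedInRegister;
        }
        offsets.push_back(cursor);
        cursor += bytes;
    }
    return offsets;
}
```

For example, a float3 followed by a float shares one register, while a float followed by a float4 leaves three components unused because the float4 has to start on the next register.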
Organizing constant buffers
The best way to efficiently use constant buffers is to organize shader variables into constant buffers based on their frequency of update. This allows an application to minimize the bandwidth required for updating shader constants. For example, a shader might declare two constant buffers and organize the data in each based on their frequency of update: data that needs to be updated on a per-object basis (like a world matrix) is grouped into a constant buffer which could be updated for each object. This is separate from data that characterizes a scene and is therefore likely to be updated much less often (when the scene changes).
cbuffer myObject
{
    float4x4 matWorld;
    float3 vObjectPosition;
    int arrayIndex;
}
cbuffer myScene
{
    float3 vSunPosition;
    float4x4 matView;
}
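On the application side, each of these buffers is usually filled from a structure whose memory layout matches the shader declaration. The following is a sketch of such mirror structures, assuming the default packing rules (a matrix starts on a new register, a float3 and a following int share one register). The struct names and the padding member are hypothetical, and matrix row/column ordering is not addressed here; the layout should be checked against the compiled shader before relying on it.

```cpp
#include <cstddef>
#include <cstdint>

// CPU-side mirror of "cbuffer myObject" (updated per object).
struct alignas(16) MyObjectConstants
{
    float        matWorld[16];        // float4x4, registers c0..c3
    float        vObjectPosition[3];  // float3, c4.xyz
    std::int32_t arrayIndex;          // int, c4.w
};

// CPU-side mirror of "cbuffer myScene" (updated when the scene changes).
struct alignas(16) MySceneConstants
{
    float vSunPosition[3];  // float3, c0.xyz
    float padding0;         // c0.w is unused: the matrix starts at c1
    float matView[16];      // float4x4, registers c1..c4
};

// Compile-time checks that the C++ layout matches the assumed register layout.
static_assert(offsetof(MyObjectConstants, vObjectPosition) == 64, "c4.x");
static_assert(offsetof(MyObjectConstants, arrayIndex) == 76, "c4.w");
static_assert(sizeof(MyObjectConstants) == 80, "five registers");
static_assert(offsetof(MySceneConstants, matView) == 16, "c1.x");
static_assert(sizeof(MySceneConstants) == 80, "five registers");
```

Keeping the per-object and per-scene data in separate structures mirrors the split by update frequency described above: only the small per-object block has to be rewritten for every draw.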
Default constant buffers
There are two default constant buffers available, $Global and $Param. Variables which are placed in the global scope are added implicitly to the $Global cbuffer, using the same packing method as is used for cbuffers. Uniform parameters in the parameter list of a function appear in the $Param constant buffer when a shader is compiled outside of the effects framework. When compiled inside the effects framework, all uniforms must resolve to variables defined in the global scope.
Examples
Here is an example from Skinning10 Sample that is a texture buffer made up of an array of matrices.
tbuffer tbAnimMatrices
{
    matrix g_mTexBoneWorld[MAX_BONE_MATRICES];
};
This example declaration manually assigns a constant buffer to start at a particular register, and also packs particular elements by subcomponents.
cbuffer MyBuffer : register(b3)
{
    float4 Element1 : packoffset(c0);
    float1 Element2 : packoffset(c1);
    float1 Element3 : packoffset(c1.y);
}
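The packoffset values in this declaration translate directly into byte offsets inside the buffer: each register c# is 16 bytes, and each of the x, y, z, w components is 4 bytes. A small sketch of that arithmetic follows; the helper names PackOffsetBytes and FitsInRegister are made up for illustration and are not part of any Direct3D API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: index (0..3) of a register component name.
std::size_t ComponentIndex(char component)
{
    switch (component)
    {
    case 'x': return 0;
    case 'y': return 1;
    case 'z': return 2;
    case 'w': return 3;
    default:
        assert(false && "component must be x, y, z or w");
        return 0;
    }
}

// Byte offset of packoffset(c<reg>.<component>) from the start of the buffer.
std::size_t PackOffsetBytes(std::size_t reg, char component = 'x')
{
    return reg * 16 + ComponentIndex(component) * 4;
}

// True when a variable with the given number of components, starting at
// the given component, stays inside a single four-component register.
bool FitsInRegister(char component, std::size_t componentCount)
{
    return ComponentIndex(component) + componentCount <= 4;
}
```

With these helpers, Element1 sits at byte 0, Element2 at byte 16, and Element3 at byte 20, and a float4 starting at the y component is rejected, as the packoffset description above says.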
Related Topics Shader Model 4
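Additionally:
In the DirectX 10 SDK samples, shaders are mostly organized with the Effect framework. In some cases, however, an engine needs to generate or manage shaders, samplers, textures and so on by itself, and the Effect framework is then not flexible enough.
The SDK's "HLSLWithoutFX10 Sample" demonstrates how to work without the Effect framework, but it leaves some questions unanswered, mainly about passing data between the shader and the application. The data to pass is mainly constant buffers, sampler states, and textures (resources). After consulting some references, experimenting, and getting help from Exjoy, I put together the ways of managing this data passing without the Effect framework. There are two main approaches:
1 The simplest and most direct approach is to bind data by register name.
First, passing constants.
A word first about the constant buffer newly introduced in DirectX 10. In DX10, constants are stored in constant buffers. Each constant buffer consists of 4096 constant registers, and each pipeline stage has 16 constant buffer slots (14 of them available to the application, as the reference text above notes). Constants can therefore be organized by how often they are updated, which can improve performance. Constant buffers come in two kinds: cbuffer and tbuffer. Note that a tbuffer is not used to store textures; the name means that its data can be accessed the way a texture is, which performs better for indexed data.
Here is an example.
The shader contains the following definition: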
cbuffer MyBuffer : register(b3)
{
float4 Element1 : packoffset(c0);
float1 Element2 : packoffset(c1);
float1 Element3 : packoffset(c1.y);
}
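register(bN): b denotes a constant buffer, and N is the slot number (0-15, of which the application can bind 0-13 according to the limit quoted above).
That is, MyBuffer is placed in b3.
The application uses it as follows: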
g_pd3dDevice->VSSetConstantBuffers( 3, 1, pBuffers );
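The first parameter is the starting slot for the buffers being passed. The functions PSSetConstantBuffers and GSSetConstantBuffers work the same way.
Textures are similar. The syntax is register(tN), where t denotes a texture and N is the input slot (0-127).
Example, in the PS:
Texture2D txDiffuse : register(t3);
In the application:
g_pd3dDevice->PSSetShaderResources( 3, 1, texViewArray );
Samplers use the syntax register(sN), where s denotes a sampler and N is the input slot (0-15).
Example, in the PS: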
SamplerState samLinear2 : register(s4)
{
Filter = MIN_MAG_MIP_LINEAR;
AddressU = Wrap;
AddressV = Wrap;
};
The function used in the application is ID3D10Device::PSSetSamplers().
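When an engine manages these bindings itself, it can help to validate register names before using the slot number in a Set call. Below is a minimal sketch of such a check. The names RegisterClass, RegisterBinding, and ParseRegister are hypothetical. The limits of 14 constant buffers and 128 textures come from the reference text above; the limit of 16 samplers is taken from the Direct3D 10 constant D3D10_COMMONSHADER_SAMPLER_SLOT_COUNT rather than from the text.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Register classes used in this article: b (constant buffer), t (texture), s (sampler).
enum class RegisterClass { ConstantBuffer, Texture, Sampler };

struct RegisterBinding
{
    RegisterClass cls;
    unsigned slot;
};

// Parses a register name such as "b3", "t3" or "s4".
// Returns false if the text is malformed or the slot is out of range.
bool ParseRegister(const std::string& text, RegisterBinding& out)
{
    if (text.size() < 2)
        return false;

    RegisterClass cls;
    unsigned limit = 0;  // number of slots the application can bind
    switch (text[0])
    {
    case 'b': cls = RegisterClass::ConstantBuffer; limit = 14; break;
    case 't': cls = RegisterClass::Texture; limit = 128; break;
    case 's': cls = RegisterClass::Sampler; limit = 16; break;
    default: return false;
    }

    unsigned slot = 0;
    for (std::size_t i = 1; i < text.size(); ++i)
    {
        if (!std::isdigit(static_cast<unsigned char>(text[i])))
            return false;
        slot = slot * 10 + static_cast<unsigned>(text[i] - '0');
        if (slot >= limit)  // also keeps slot small, so it cannot overflow
            return false;
    }

    out.cls = cls;
    out.slot = slot;
    return true;
}
```

The parsed slot is the value that would be passed as the first argument of VSSetConstantBuffers, PSSetShaderResources, or PSSetSamplers in the examples above.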
2 Using the shader reflection system
This approach passes data by variable name.
For example, suppose the PS contains the following definitions:
Texture2D txDiffuse;
SamplerState samLinear
{
Filter = MIN_MAG_MIP_LINEAR;
AddressU = Wrap;
AddressV = Wrap;
};
cbuffer pscb0
{
float4 color;
};
(1) Create an ID3D10ShaderReflection object. Through this object you can retrieve information from an already compiled shader.
hr = D3D10ReflectShader( (void*) pPSBuf->GetBufferPointer(), pPSBuf->GetBufferSize(),&pIShaderReflection );
(2) Call GetDesc. The BoundResources member of the resulting D3D10_SHADER_DESC is the number of resources bound to the current shader. Resources here include constant buffers, textures, and samplers; in this example BoundResources is 3.
D3D10_SHADER_DESC desc;
if( pIShaderReflection )
{
pIShaderReflection->GetDesc( &desc );
}
(3) Use GetResourceBindingDesc to get the binding information for each resource.
D3D10_SHADER_INPUT_BIND_DESC resourceBindingDesc0;
D3D10_SHADER_INPUT_BIND_DESC resourceBindingDesc1;
D3D10_SHADER_INPUT_BIND_DESC resourceBindingDesc2;
if( pIShaderReflection )
{
pIShaderReflection->GetResourceBindingDesc(0, &resourceBindingDesc0);
pIShaderReflection->GetResourceBindingDesc(1, &resourceBindingDesc1);
pIShaderReflection->GetResourceBindingDesc(2, &resourceBindingDesc2 );
}
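The main members of the D3D10_SHADER_INPUT_BIND_DESC structure are:
LPCSTR Name: the name of the bound resource.
D3D10_SHADER_INPUT_TYPE Type: the type of the bound resource.
D3D10_SHADER_INPUT_TYPE is an enumeration: D3D10_SIT_CBUFFER, D3D10_SIT_TBUFFER, D3D10_SIT_TEXTURE, D3D10_SIT_SAMPLER.
Note that D3D10_SIT_CBUFFER and D3D10_SIT_TBUFFER both refer to constant buffers.
UINT BindPoint: the slot the resource is bound to. This is the value we need.
The result here is: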
resourceBindingDesc0 samLinear
resourceBindingDesc1 txDiffuse
resourceBindingDesc2 pscb0
(4) Bind the resources according to the information obtained in (3). We want to bind the texture, so we use resourceBindingDesc1:
const char* texname1 = "txDiffuse";
if( strcmp( texname1, resourceBindingDesc1.Name ) == 0 )
{
// Set the texture for the PS
g_pd3dDevice->PSSetShaderResources( resourceBindingDesc1.BindPoint, 1, texViewArray );
}
Constant buffers and samplers are handled in the same way.
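The lookup in step (4) generalizes to a search by name over all reflected bindings. The sketch below shows the idea using a stand-in structure so that it compiles without the Direct3D headers: BindDesc and InputType are hypothetical types that mirror only the Name, Type, and BindPoint members of D3D10_SHADER_INPUT_BIND_DESC. In a real application the list would presumably be filled by calling GetResourceBindingDesc for each index from 0 to BoundResources - 1.

```cpp
#include <cstring>
#include <vector>

// Stand-in for D3D10_SHADER_INPUT_TYPE (assumption, for illustration only).
enum class InputType { CBuffer, TBuffer, Texture, Sampler };

// Stand-in for the Name, Type and BindPoint members of
// D3D10_SHADER_INPUT_BIND_DESC.
struct BindDesc
{
    const char* Name;
    InputType   Type;
    unsigned    BindPoint;
};

// Returns the slot of the resource with the given name and type,
// or -1 if the shader does not use such a resource.
int FindBindPoint(const std::vector<BindDesc>& descs, const char* name, InputType type)
{
    for (const BindDesc& desc : descs)
    {
        // strcmp returns 0 when the two strings are equal.
        if (desc.Type == type && std::strcmp(desc.Name, name) == 0)
            return static_cast<int>(desc.BindPoint);
    }
    return -1;
}
```

The returned slot is what would be passed as the first argument to PSSetShaderResources, PSSetSamplers, or PSSetConstantBuffers, depending on the resource type. Checking for -1 matters because the compiler may remove resources the shader never uses.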