Intel® Threading Building Blocks (Intel® TBB) Developer Guide: Parallelizing Data Flow and Dependence Graphs
https://www.threadingbuildingblocks.org/docs/help/index.htm
Parallelizing Data Flow and Dependency Graphs
In addition to loop parallelism, the Intel® Threading Building Blocks (Intel® TBB) library also supports graph parallelism. It's possible to create graphs that are highly scalable, but it is also possible to create graphs that are completely sequential.
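Summary: besides loop parallelism, TBB also supports graph parallelism. A graph can be highly scalable, but it can just as easily end up executing completely sequentially.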
Using graph parallelism, computations are represented by nodes and the communication channels between these computations are represented by edges. When a node in the graph receives a message, a task is spawned to execute its body object on the incoming message. Messages flow through the graph across the edges that connect the nodes. The following sections present two examples of applications that can be expressed as graphs. For more information on tasks, see the See Also section below.
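Summary: in graph parallelism, computations are nodes and the communication channels between them are edges. When a node receives a message, a task is executed. Messages flow through the graph along the edges that connect the nodes. Two examples follow.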
The following figure shows a streaming or data flow application where a sequence of values is processed as each value passes through the nodes in the graph. In this example, the sequence is created by a function F. For each value in the sequence, G squares the value and H cubes the value. J then takes each of the squared and cubed values and adds them to a global sum. After all values in the sequence are completely processed, sum is equal to the sum of the sequence of squares and cubes from 1 to 10. In a streaming or data flow graph, the values actually flow across the edges; the output of one node becomes the input of its successor(s).
The figure below shows a streaming or data flow application.
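As a quick check on the value the example should end with, and assuming F emits exactly the integers 1 through 10 as the text states, the final sum can be worked out by hand with the standard closed forms for sums of squares and cubes:

```latex
\sum_{i=1}^{10} i^2 + \sum_{i=1}^{10} i^3
  = \frac{10 \cdot 11 \cdot 21}{6} + \left(\frac{10 \cdot 11}{2}\right)^2
  = 385 + 3025
  = 3410
```

Any correct execution of the graph, however its tasks are scheduled, should leave sum at this value.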

The following graphic shows a different form of graph application. In this example, a dependence graph is used to establish a partial ordering among the steps for making a peanut butter and jelly sandwich. In this partial ordering, you must first get the bread before spreading the peanut butter or jelly on the bread. You must spread on the peanut butter before you put away the peanut butter jar, and likewise spread on the jelly before you put away the jelly jar. And, you need to spread on both the peanut butter and jelly before putting the two slices of bread together. This is a partial ordering because, for example, it doesn't matter if you spread on the peanut butter first or the jelly first. It also doesn't matter if you finish making the sandwich before putting away the jars.
The figure below shows another kind of graph application: a dependence graph that expresses the order in which the steps of a task must run.

While it can be inferred that resources, such as the bread, or the jelly jar, are shared between ordered steps, it is not explicit in the graph. Instead, only the required ordering of steps is explicit in a dependence graph. For example, you must "Put jelly on 1 slice" before you "Put away jelly jar".
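The partial ordering can be made concrete with a small sketch in plain standard C++. This is not the TBB flow graph API; it only models the sandwich steps as a directed graph and uses Kahn's algorithm to produce one legal order. The step labels, other than the two quoted above, and the names `kSteps`, `kEdges`, `one_legal_order`, and `respects_dependencies` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Steps of the sandwich example; a step's index is its node id.
const std::vector<std::string> kSteps = {
    "Get bread",                     // 0
    "Put peanut butter on 1 slice",  // 1
    "Put jelly on 1 slice",          // 2
    "Put away peanut butter jar",    // 3
    "Put away jelly jar",            // 4
    "Put slices together",           // 5
};

// Each pair (a, b) means "step a must finish before step b starts".
const std::vector<std::pair<int, int>> kEdges = {
    {0, 1}, {0, 2}, {1, 3}, {2, 4}, {1, 5}, {2, 5},
};

// Kahn's algorithm: repeatedly emit a step whose prerequisites are all done.
std::vector<int> one_legal_order() {
    std::vector<int> pending(kSteps.size(), 0);  // unfinished prerequisites
    std::vector<std::vector<int>> successors(kSteps.size());
    for (const auto& e : kEdges) {
        successors[e.first].push_back(e.second);
        ++pending[e.second];
    }
    std::queue<int> ready;
    for (std::size_t i = 0; i < pending.size(); ++i)
        if (pending[i] == 0) ready.push(static_cast<int>(i));

    std::vector<int> order;
    while (!ready.empty()) {
        int step = ready.front();
        ready.pop();
        order.push_back(step);
        for (int next : successors[step])
            if (--pending[next] == 0) ready.push(next);
    }
    assert(order.size() == kSteps.size());  // fails only if there is a cycle
    return order;
}

// True if every step appears in `order` and every edge is respected.
bool respects_dependencies(const std::vector<int>& order) {
    std::vector<int> position(kSteps.size(), -1);
    for (std::size_t i = 0; i < order.size(); ++i)
        position[order[i]] = static_cast<int>(i);
    for (const auto& e : kEdges) {
        if (position[e.first] < 0 || position[e.second] < 0) return false;
        if (position[e.first] >= position[e.second]) return false;
    }
    return true;
}
```

Several different orders pass `respects_dependencies`, for example spreading the jelly before the peanut butter, or joining the slices before putting the jars away. That freedom is what a runtime can exploit to run independent steps in parallel.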
The flow graph interface in the Intel TBB library allows you to express data flow and dependence graphs such as these, as well as more complicated graphs that include cycles, conditionals, buffering and more. If you express your application using the flow graph interface, the runtime library spawns tasks to exploit the parallelism that is present in the graph. For example, in the first example above, perhaps two different values might be squared in parallel, or the same value might be squared and cubed in parallel. Likewise in the second example, the peanut butter might be spread on one slice of bread in parallel with the jelly being spread on the other slice. The interface expresses what is legal to execute in parallel, but allows the runtime library to choose at runtime what will be executed in parallel.
Summary: TBB lets you express data flow and dependence graphs, as well as more complex graphs that include cycles, conditionals, and buffering.
The support for graph parallelism is contained within the namespace tbb::flow and is defined in the flow_graph.h header file.
See Also
Basic Flow Graph Concepts
Flow Graph Basics: Graph Object
Conceptually a flow graph is a collection of nodes and edges. Each node belongs to exactly one graph and edges are made only between nodes in the same graph. In the flow graph interface, a graph object represents this collection of nodes and edges, and is used for invoking whole graph operations such as waiting for all tasks related to the graph to complete, resetting the state of all nodes in the graph, and canceling the execution of all nodes in the graph.
The code below creates a graph object and then waits for all tasks spawned by the graph to complete. The call to wait_for_all in this example returns immediately since this is a trivial graph with no nodes or edges, and therefore no tasks are spawned.
- graph g;
- g.wait_for_all();
Flow Graph Basics: Nodes
A node is a class that inherits from tbb::flow::graph_node and also typically inherits from tbb::flow::sender<T>, tbb::flow::receiver<T>, or both. A node performs some operation, usually on an incoming message, and may generate zero or more output messages. Some nodes require more than one input message or generate more than one output message.
Summary: nodes are where the computation happens.
While it is possible to define your own node types by inheriting from graph_node, sender and receiver, it is more typical that predefined node types are used to construct a graph. The list of predefined nodes is available from the See Also section below.
A function_node is a predefined type available in flow_graph.h and represents a simple function with one input and one output. The constructor for a function_node takes three arguments:
- template< typename Body> function_node(graph &g, size_t concurrency, Body body)
Parameter | Description |
---|---|
Body | Type of the body object. |
g | The graph the node belongs to. |
concurrency | The concurrency limit for the node. You can use the concurrency limit to control how many invocations of the node are allowed to proceed concurrently, from 1 (serial) to an unlimited number. |
body | User-defined function object, or lambda expression, that is applied to the incoming message to generate the outgoing message. |
Below is code for creating a simple graph that contains a single function_node. In this example, a node n is constructed that belongs to graph g, and has a second argument of 1, which allows at most 1 invocation of the node to occur concurrently. The body is a lambda expression that prints each value v that it receives, spins for v seconds, prints the value again, and then returns v unmodified. The code for the function spin_for is not provided.
- graph g;
- function_node< int, int > n( g, 1, []( int v ) -> int {
- cout << v;
- spin_for( v );
- cout << v;
- return v;
- } );
After the node is constructed in the example above, you can pass messages to it, either by connecting it to other nodes using edges or by invoking its function try_put. Using edges is described in the next section.
- n.try_put( 1 );
- n.try_put( 2 );
- n.try_put( 3 );
You can then wait for the messages to be processed by calling wait_for_all on the graph object:
- g.wait_for_all();
In the above example code, the function_node n was created with a concurrency limit of 1. When it receives the message sequence 1, 2 and 3, the node n will spawn a task to apply the body to the first input, 1. When that task is complete, it will then spawn another task to apply the body to 2. And likewise, the node will wait for that task to complete before spawning a third task to apply the body to 3. The calls to try_put do not block until a task is spawned; if a node cannot immediately spawn a task to process the message, the message will be buffered in the node. When it is legal, based on concurrency limits, a task will be spawned to process the next buffered message.
In the above graph, each message is processed sequentially. If, however, you construct the node with a different concurrency limit, parallelism can be achieved:
- function_node< int, int > n( g, tbb::flow::unlimited, []( int v ) -> int {
- cout << v;
- spin_for( v );
- cout << v;
- return v;
- } );
You can use unlimited as the concurrency limit to instruct the library to spawn a task as soon as a message arrives, regardless of how many other tasks have been spawned. You can also use any specific value, such as 4 or 8, to limit concurrency to at most 4 or 8, respectively. It is important to remember that spawning a task does not mean creating a thread. So while a graph may spawn many tasks, only the number of threads available in the library's thread pool will be used to execute these tasks.
Suppose you use unlimited in the function_node constructor instead and call try_put on the node:
- n.try_put( 1 );
- n.try_put( 2 );
- n.try_put( 3 );
- g.wait_for_all();
The library spawns three tasks, each one applying n's lambda expression to one of the messages. If you have a sufficient number of threads available on your system, then all three invocations of the body will occur in parallel. If, however, you have only one thread in the system, they execute sequentially.