TVM Pass IR如何使用

TVM Pass IR如何使用

随着Relay / tir中优化遍数的增加，执行并手动维护其依赖关系变得很棘手。引入了一个基础结构来管理优化过程，并应用于TVM堆栈中IR的不同层。

Relay / tir程序的优化可以以各种粒度应用，即分别使用tvm.relay.transform.FunctionPass/ tvm.tir.transform.PrimFuncPass和的功能级别和模块级别tvm.transform.ModulePass。或者，用户可以依靠在tvm.transform.Sequential中继/ tir程序上应用一系列pass，其中pass之间的依赖性可以通过pass下文解决。有关这些pass的每种类型的更多详细信息，请参阅pass基础结构。

本文主要说明开发人员如何使用pass infra进行特定的优化，创建用于Relay程序的优化管道。同样的方法也可以用于tir。

import numpy as np

import tvm

from tvm import te

import tvm.relay as relay

创建一个示例Relay中继程序

首先，创建一个简单的Relay程序。该程序将用于示例的各种优化。类似地，用户可以编写一个tir基本函数并应用tirpass。

def example():

shape = (1, 64, 54, 54)

c_data = np.empty(shape).astype("float32")

c = relay.const(c_data)

weight = relay.var("weight", shape=(64, 64, 3, 3))

x = relay.var("x", relay.TensorType((1, 64, 56, 56), "float32"))

conv = relay.nn.conv2d(x, weight)

y = relay.add(c, c)

y = relay.multiply(y, relay.const(2, "float32"))

y = relay.add(conv, y)

z = relay.add(y, c)

z1 = relay.add(y, c)

z2 = relay.add(z, z1)

return relay.Function([x, weight], z2)

让为conv2d op注册布局更改，以便可以在示例中应用布局更改通道。alter layout pass如何工作不在本文的讨论范围之内。

@relay.op.register_alter_op_layout("nn.conv2d", level=101)

def alter_conv2d(attrs, inputs, tinfos, out_type):

data, weight = inputs

new_attrs = dict(attrs)

new_attrs["data_layout"] = "NCHW16c"

return relay.nn.conv2d(data, weight, **new_attrs)

优化程序

现在要优化程序。Relay中继具有许多优化功能。将选择其中一些以应用于此示例程序。

有多种方法可以优化中继程序。下面将为每个示例提供示例。

手动应用优化pass

# Let's first create a relay Module which contains one or multiple Relay

# functions for optimization.

f = example()

mod = tvm.IRModule.from_expr(f)

# Now we can apply constant folding on the module.

# fold_const here is a callback that doesn't take any parameters.

fold_const = relay.transform.FoldConstant()

# Then, we can invoke the pass on the given module. Note that the constant

# folding pass works at the function-level. That being said, each function in

# the module will be applied with the optimization. Users don't need to iterate

# through individual functions manually to apply this pass.

mod = fold_const(mod)

# We can see from the updated program that the constants are folded.

print(mod)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

可以以类似方式应用更多优化。例如，可以消除z和z1使用的通用表达式。

mod = relay.transform.EliminateCommonSubexpr()(mod)

print(mod)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

一些优化（例如融合）也是参数化的。例如，选择级别0不允许将算子融合在一起。用户可以传递 fuse_opt_level来启用此功能。

mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)

# We can observe that the optimized module contains functions that only have

# a signle primitive op.

print(mod)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%1 = %0(%x, %weight) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = fn (%p01: Tensor[(1, 64, 54, 54), float32], %p11: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p01, %p11) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3 = %2(%1, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%4 = fn (%p02: Tensor[(1, 64, 54, 54), float32], %p12: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p02, %p12) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%5 = %4(%3, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%6 = fn (%p03: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p03, %p03) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%6(%5) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

使用序列来应用pass序列

如上所述，应用pass实际上是乏味的，并且可能需要用户更好地了解依赖性。例如，融合目前不适用于let绑定。如果relay.transform.ToANormalForm()在融合之前应用算子，将无法将融合在一起，因为此过程会为每个表达式生成let绑定，以规范化Relay程序。

Relaytvm.transform.Sequential通过指定每个遍历，将打包为整体来缓解开发人员显式处理这些问题的麻烦。例如，可以使用以下序列样式应用相同遍历。tvm.transform.Sequential，torch.nn.sequential 和mxnet.gluon.block类似。例如，torch.nn.sequential用于包含一系列PyTorch模块，这些模块将被添加，以构建网络，着重于网络层。取而代之的是tvm.transform.Sequential，下面的过程中的基础工作于优化过程。

# Now let's execute some passes through :py:class:`tvm.transform.Sequential`

f = example()

mod = tvm.IRModule.from_expr(f)

# Glob the interested passes.

seq = tvm.transform.Sequential(

[

relay.transform.FoldConstant(),

relay.transform.EliminateCommonSubexpr(),

relay.transform.FuseOps(fuse_opt_level=2),

]

)

mod1 = seq(mod)

print(mod1)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

从转换后的Relay程序中，可以看到仍然有两个相同的加法运算。这是因为EliminateCommonSubexpr 未实际执行。默认情况下，只有优化级别小于或等于2的过程才被执行 tvm.transform.Sequential。下面的pass提供了一个配置界面，供用户自定义要执行的优化级别。

with tvm.transform.PassContext(opt_level=3):

mod2 = seq(mod)

print(mod2)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

可以看到仅保留了两个相同的加法之一。

In addition, users can selectively disable some passes using the disabled_pass config, which is similar to the -fno-xxx option used the general purpose compilers, such as Clang and GCC. For example, we can disable EliminateCommonSubexpr as following. The printed module will again show two identical addition operations.

with tvm.transform.PassContext(opt_level=3, disabled_pass=["EliminateCommonSubexpr"]):

mod3 = seq(mod)

print(mod3)

Out:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

The passes applied so far are target independent. The pass infra also provides a means to make pass target-aware. For example, the layout alteration pass falls in such category.

with tvm.transform.PassContext(opt_level=3):

mod4 = seq(mod)

print(mod4)

seq1 = tvm.transform.Sequential([relay.transform.AlterOpLayout()])

with tvm.transform.PassContext(opt_level=3):

with tvm.target.Target("llvm"):

mod5 = seq1(mod)

print(mod5)

Out:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%x, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

%1 = nn.conv2d(%0, %weight, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%2 = add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = multiply(%2, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%4 = layout_transform(%3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%5 = add(%1, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%6 = layout_transform(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%7 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%8 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%9 = add(%7, %8) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

layout_transform(%9, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Implement a Pass Using Python Decorator

下一个示例说明了如何使用Python装饰器，通过传递基础流程来编排定制的优化管道。此功能极大地简化了pass的实施。例如，用户可以简单地定义一个修饰的类，进行功能级别的优化，如以下示例所示。transform_function包装一个类，以用c的倍数替换所有常量。稍后，当调用自定义过程时，将访问给定模块中的每个函数，并且将替换函数中的每个常量。

@relay.transform.function_pass(opt_level=1)

class CustomPipeline:

"""Simple test function to replace one argument to another."""

def __init__(self, multiplier):

self.multiplier = multiplier

# This function can define a pass.

def transform_function(self, func, mod, ctx):

obj = self

class ReplaceConstant(tvm.relay.ExprMutator):

def visit_constant(self, c):

return relay.multiply(obj.multiplier, c)

return ReplaceConstant().visit(func)

f = example()

mod = tvm.IRModule.from_expr(f)

custom_pass = CustomPipeline(multiplier=relay.const(3, "float32"))

assert custom_pass.info.name == "CustomPipeline"

mod3 = custom_pass(mod)

print(mod3)

输出：

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = multiply(3f /* ty=float32 */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = multiply(3f /* ty=float32 */, 2f /* ty=float32 */) /* ty=float32 */;

%4 = multiply(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%5 = add(%0, %4) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%6 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%7 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%6, %7) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

调试pass

TVM为用户提供了一种即插即用式的调试通道，该通道在通过特殊通道（PrintIR）来转储整个模块的IR之后，将IR打印出来。序列传递示例的略微修改版本，可能类似于以下内容，以启用IR转储以进行FoldConstant优化。

f = example()

mod = tvm.IRModule.from_expr(f)

seq = tvm.transform.Sequential(

[

relay.transform.FoldConstant(),

tvm.transform.PrintIR(),

relay.transform.EliminateCommonSubexpr(),

relay.transform.FuseOps(),

relay.transform.AlterOpLayout(),

]

)

# By inserting the ``PrintIR`` pass after ``FoldConstant``, the pass infra will

# dump out the module IR when ``FoldConstant`` is done. Users can plug in this

# pass after any pass they want to debug for viewing the optimization effect.

# There is a more flexible debugging mechanism also exposed by the build configuration

# object. One can pass a tracing function which can be used to execute arbitrary code

# before and/or after each pass. A tracing function will receive a :py::class:`tvm.IRModule`,

# a :py:class:`tvm.transform.PassInfo` object,

# and a boolean indicating whether you are executing before, or after a pass.

# An example is below.

def print_ir(mod, info, is_before):

"""Print the name of the pass, the IR, only before passes execute."""

if is_before:

print("Running pass: {}", info)

print(mod)

with tvm.transform.PassContext(opt_level=3, trace=print_ir):

with tvm.target.Target("llvm"):

# Perform the optimizations.

mod = seq(mod)

print(mod)

print("done")

输出：

Running pass: {} The meta data of the pass: pass name: FoldConstantopt_level: 2required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

%1 = add(meta[relay.Constant][0], meta[relay.Constant][0]);

%2 = multiply(%1, 2f);

%3 = add(%0, %2);

%4 = add(%3, meta[relay.Constant][0]);

%5 = add(%3, meta[relay.Constant][0]);

add(%4, %5)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() {

add(meta[relay.Constant][0], meta[relay.Constant][0])

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0)

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

let %x1 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

let %x2 = %x1(%x);

%x2

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() {

multiply(meta[relay.Constant][0], 2f)

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

multiply(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1)

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */)

}

Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

let %x1 = 2f /* ty=float32 */;

let %x2 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

let %x3 = %x2(%x, %x1);

%x3

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

%1 = add(%0, meta[relay.Constant][0]);

%2 = add(%1, meta[relay.Constant][1]);

%3 = add(%1, meta[relay.Constant][1]);

add(%2, %3)

}

Running pass: {} The meta data of the pass: pass name: PrintIRopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: EliminateCommonSubexpropt_level: 3required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]);

%1 = add(%0, %p2);

%2 = add(%1, %p3);

add(%2, %2)

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: AlterOpLayoutopt_level: 3required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c");

%1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c");

%2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c");

%3 = add(%1, %2);

%4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c");

%5 = add(%3, %4);

%6 = add(%5, %5);

layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW")

};

%7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

%1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%3 = add(%1, %2) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%5 = add(%3, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%6 = add(%5, %5) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

done

概括

本文介绍了如何使用pass基础，更加方便地在TVM中编写和调用pass。还讨论了调用pass的不同方法。使用tvm.transform.Sequential，可以极大地帮助用户简化处理多个优化过程及其依赖项的工作。另外，提供了一个示例来说明如何使用PrintIR和跟踪调试过程。

TVM Pass IR如何使用的更多相关文章

如何使用TVM Pass红外线
如何使用TVM Pass红外线随着Relay / tir中优化遍数的增加,执行并手动维护其依赖关系变得很棘手.引入了一个基础结构来管理优化过程,将其应用于TVM堆栈中IR的不同层. Relay / ...
rac one node在线relocation
1.查看数据库运行状态 $ srvctl status database -d rone Instance rone_2 is running on node rone2 Online relocat ...
转://ORA-00603,ORA-27501,ORA-27300,ORA-27301,ORA-27302故障案例一则
背景介绍: 这是一套windows的rac系统.数据库后台日志报ORA-00474:SMON process terminated with error.接着报ORA-00603,ORA-27501, ...
oracle 错误实例分析(ORA-01078)
01,问题描述心血来潮想看一下启动数据库的alert log.然后把数据库给关闭了,同时也在监听日志文件下面可谓是详细的描述了整个关机过程,也看到了无数的error [root@node1 ...
【体系结构】有关Oracle SCN知识点的整理
[体系结构]有关Oracle SCN知识点的整理 1 BLOG文档结构图 BLOG_Oracle_lhr_Oracle SCN的一点研究.pdf 2 前言部分 2.1 导读和注意事项各位技 ...
nbu还原集群数据库异常问题
集群数据库软件均已安装完毕,现在想从NBU上还原数据库,但在还原控制文件报错 [oracle@oracle-db1 ~]$ rman target / Recovery Manager: Releas ...
oracle数据库启动报错，不能启动ASM实例
数据库rac启动时报错,日志例如以下,后来使用 Sat Jun 7 06:02:11 2014 GATHER_STATS_JOB encountered errors. Check the tra ...
TVM:一个端到端的用于开发深度学习负载以适应多种硬件平台的IR栈
TVM:一个端到端的用于开发深度学习负载以适应多种硬件平台的IR栈本文对TVM的论文进行了翻译整理深度学习如今无处不在且必不可少.这次创新部分得益于可扩展的深度学习系统,比如 TensorFlo ...
用TVM在硬件平台上部署深度学习工作负载的端到端 IR 堆栈
用TVM在硬件平台上部署深度学习工作负载的端到端 IR 堆栈深度学习已变得无处不在,不可或缺.这场革命的一部分是由可扩展的深度学习系统推动的,如滕索弗洛.MXNet.咖啡和皮托奇.大多数现有系统针对 ...

随机推荐

1.7.8- HTML合并单元格
跨行与跨列
hdu3338 最大流
题意: 给你一个N*M的网格,上面有的有一些数字,要求填充数字,满足的规则是这样: 答案不唯一,只要满足和的关系就可以,还有就是只能用1--9之间的数字填充,而且每一行或一列可 ...
Android内核的编译和调试
本文博客地址:http://blog.csdn.net/qq1084283172/article/details/70500488 一.Android内核源码的选择 Android手机设备内核源码的调 ...
基于C++简单Windows API的socket编程（阻塞模式）
1. 概述:简单的基于Windows API的socket点对点聊天程序,为了方便初学者,本文代码均采用阻塞原理编写. 2. 代码样例 Server.cpp(服务端) #include <cst ...
开启Android应用调试选项的工具XDebug的介绍
本文博客地址:https://blog.csdn.net/QQ1084283172/article/details/81187769 最近这段时间比较郁闷,就不分享和学习比较复杂的Android逆向技 ...
Windowsw核心编程第13章 Windows内存结构
第1 3章 Wi n d o w s的内存结构 13.1 进程的虚拟地址空间每个进程都被赋予它自己的虚拟地址空间.对于 3 2位进程来说,这个地址空间是4 G B,因为3 2位指针可以拥有从0 x ...
声明：songzijian这个域名已经被抢注。大家别上了。不是我了。
声明:songzijian这个域名已经被抢注.大家别上了.不是我了.
【python】Leetcode每日一题-反转链表 II
[python]Leetcode每日一题-反转链表 II [题目描述] 给你单链表的头节点 head 和两个整数 left 和 right ,其中 left <= right .请你反转从位置 ...
简单聊聊VisualStudio的断点调试
这节聊聊如何使用VisualStudio进行断点调试. 在debug过程中,我们有时需要查看程序在运行到某一行代码时,上下文中的变量或者一些其他的数据是什么样的,我们就要设置断点(Breakpoint ...
Linux性能调优命令之free
功能说明 free 命令显示系统使用和空闲的内存情况,包括物理内存.交互区内存(swap)和内核缓冲区内存.共享内存将被忽略语法 free [参数] 参数 -b : 以Byte为单位显示内存使用情况 ...

TVM Pass IR如何使用

TVM Pass IR如何使用的更多相关文章

随机推荐

热门专题