How to Use the TVM Pass Infra

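As the number of optimization passes in Relay/tir grows, executing them and maintaining their dependencies by hand becomes intractable. An infrastructure was therefore introduced to manage optimization passes, and it applies to the different layers of IR in the TVM stack.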

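Optimizations of a Relay/tir program can be applied at various granularities: at function level with tvm.relay.transform.FunctionPass or tvm.tir.transform.PrimFuncPass, and at module level with tvm.transform.ModulePass. Alternatively, users can rely on tvm.transform.Sequential to apply a sequence of passes to a Relay/tir program, in which case the dependencies between passes can be resolved by the pass infra. For more details about each type of pass, refer to the pass infrastructure documentation.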
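To make the two granularities concrete, here is a toy sketch in plain Python. It does not use TVM: `function_pass`, `module_pass`, and the dictionary-based "module" are hypothetical stand-ins invented for illustration, and a function body is just a list of op names.

```python
# Toy illustration only: this is not TVM's implementation.
# A "module" maps function names to bodies; a body is a list of op names.

def function_pass(transform):
    """Lift a per-function transform into a pass over a whole module."""
    def run(module):
        # The infra visits every function, so the user never loops manually.
        return {name: transform(body) for name, body in module.items()}
    return run


def module_pass(transform):
    """A module-level pass sees the whole module and may add or remove functions."""
    def run(module):
        return transform(dict(module))
    return run


# Function-level: drop no-op instructions inside each function.
remove_nops = function_pass(lambda body: [op for op in body if op != "nop"])


# Module-level: delete functions that nothing calls ("main" is the only root).
def drop_unused(module):
    called = {op[len("call:"):] for body in module.values()
              for op in body if op.startswith("call:")}
    return {name: body for name, body in module.items()
            if name == "main" or name in called}


remove_dead_functions = module_pass(drop_unused)

toy_module = {
    "main": ["nop", "call:helper", "add"],
    "helper": ["mul", "nop"],
    "unused": ["sub"],
}
result = remove_dead_functions(remove_nops(toy_module))
print(result)
```

The function-level pass only ever sees one body at a time, while the module-level pass needs the whole module to decide which functions are reachable.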

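This article mainly demonstrates how developers can use the pass infra to perform a specific optimization and to build an optimization pipeline for a Relay program. The same approach can be used for tir as well.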

import numpy as np

import tvm

from tvm import te

import tvm.relay as relay

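Create an Example Relay Program

First, create a simple Relay program. This program is used by the various optimizations in the examples that follow. Similarly, users can write a tir primitive function and apply tir passes to it.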

def example():
    shape = (1, 64, 54, 54)
    c_data = np.empty(shape).astype("float32")
    c = relay.const(c_data)
    weight = relay.var("weight", shape=(64, 64, 3, 3))
    x = relay.var("x", relay.TensorType((1, 64, 56, 56), "float32"))
    conv = relay.nn.conv2d(x, weight)
    y = relay.add(c, c)
    y = relay.multiply(y, relay.const(2, "float32"))
    y = relay.add(conv, y)
    z = relay.add(y, c)
    z1 = relay.add(y, c)
    z2 = relay.add(z, z1)
    return relay.Function([x, weight], z2)
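The constant `c` is given shape (1, 64, 54, 54) so that it can be added to the convolution result. A quick sanity check of that shape in plain Python, assuming the convolution uses stride 1, no padding, and no dilation (the helper names below are hypothetical and not part of TVM):

```python
def conv2d_out_size(in_size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis for a standard convolution."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


def conv2d_out_shape(data_shape, weight_shape):
    """NCHW data with OIHW weights, using stride 1 and zero padding."""
    n, _, h, w = data_shape
    out_channels, _, kh, kw = weight_shape
    return (n, out_channels, conv2d_out_size(h, kh), conv2d_out_size(w, kw))


out_shape = conv2d_out_shape((1, 64, 56, 56), (64, 64, 3, 3))
print(out_shape)  # (1, 64, 54, 54)
```

A 3x3 kernel without padding trims one pixel from each border, which is why 56 becomes 54.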

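Next, register a layout alteration for the conv2d op so that the layout alteration pass can be applied in this example. How the alter layout pass works is beyond the scope of this article.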

@relay.op.register_alter_op_layout("nn.conv2d", level=101)
def alter_conv2d(attrs, inputs, tinfos, out_type):
    data, weight = inputs
    new_attrs = dict(attrs)
    new_attrs["data_layout"] = "NCHW16c"
    return relay.nn.conv2d(data, weight, **new_attrs)

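Optimize the Program

Now we want to optimize the program. Relay offers a host of optimizations. We select a few of them to apply to this example program.

There are multiple ways to optimize a Relay program. An example of each is given below.

Manually Apply Optimization Passes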

# Let's first create a relay Module which contains one or multiple Relay

# functions for optimization.

f = example()

mod = tvm.IRModule.from_expr(f)

# Now we can apply constant folding on the module.

# fold_const here is a callback that doesn't take any parameters.

fold_const = relay.transform.FoldConstant()

# Then, we can invoke the pass on the given module. Note that the constant

# folding pass works at the function-level. That being said, each function in

# the module will be applied with the optimization. Users don't need to iterate

# through individual functions manually to apply this pass.

mod = fold_const(mod)

# We can see from the updated program that the constants are folded.

print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}
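The printed module contains two constants because `(c + c) * 2` was evaluated at compile time into a single new constant, while `c` itself is still used by the later additions. The idea can be sketched on a tiny tuple-based expression tree. This is a toy folder written for illustration, not TVM's FoldConstant, and scalars stand in for the tensors:

```python
# Toy constant folder over tuple-based expressions; not TVM's FoldConstant.
# Node forms: ("const", value), ("var", name), ("add", lhs, rhs), ("mul", lhs, rhs)

def fold(expr):
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    lhs, rhs = fold(expr[1]), fold(expr[2])
    if lhs[0] == "const" and rhs[0] == "const":
        # Both operands are known at compile time, so evaluate now.
        value = lhs[1] + rhs[1] if kind == "add" else lhs[1] * rhs[1]
        return ("const", value)
    return (kind, lhs, rhs)


c = ("const", 1.5)
conv = ("var", "conv")  # stands in for the conv2d result, unknown at compile time

# Mirrors the structure of example(): y = conv + (c + c) * 2
y = ("add", conv, ("mul", ("add", c, c), ("const", 2.0)))
folded = fold(y)
print(folded)  # ('add', ('var', 'conv'), ('const', 6.0))
```

The addition involving `conv` survives because one operand is only known at run time.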

More optimizations can be applied in a similar manner. For instance, we can eliminate the common subexpression shared by z and z1.

mod = relay.transform.EliminateCommonSubexpr()(mod)

print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}
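In the output above, the two identical `add(%1, ...)` bindings have collapsed into one, and the final addition now reads `add(%2, %2)`. A minimal sketch of the underlying idea on a straight-line program follows. It is a toy algorithm with a hypothetical `(name, op, operands)` representation, not the one TVM uses:

```python
# Toy common-subexpression elimination over a straight-line program.
# Each binding is (result_name, op, operand_names); not TVM's algorithm.

def eliminate_common_subexpr(bindings):
    seen = {}      # (op, operands) -> name of the first binding that computed it
    rename = {}    # eliminated name -> surviving name
    kept = []
    for name, op, operands in bindings:
        # Rewrite operands first so uses of eliminated names point at survivors.
        operands = tuple(rename.get(x, x) for x in operands)
        key = (op, operands)
        if key in seen:
            rename[name] = seen[key]      # reuse the earlier result
        else:
            seen[key] = name
            kept.append((name, op, operands))
    return kept


program = [
    ("y", "add", ("conv", "c0")),
    ("z", "add", ("y", "c1")),
    ("z1", "add", ("y", "c1")),   # same computation as z
    ("z2", "add", ("z", "z1")),
]
optimized = eliminate_common_subexpr(program)
for binding in optimized:
    print(binding)
```

The sketch compares operands in order, so it does not treat `add(a, b)` and `add(b, a)` as the same expression.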

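Some optimizations, such as fusion, are parameterized as well. For example, opt level 0 does not allow operators to be fused together. Users can pass fuse_opt_level to set the fusion level.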

mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)

# We can observe that the optimized module contains functions that only have

# a single primitive op.

print(mod)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%1 = %0(%x, %weight) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = fn (%p01: Tensor[(1, 64, 54, 54), float32], %p11: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p01, %p11) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3 = %2(%1, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%4 = fn (%p02: Tensor[(1, 64, 54, 54), float32], %p12: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p02, %p12) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%5 = %4(%3, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%6 = fn (%p03: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p03, %p03) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%6(%5) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

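Use Sequential to Apply a Sequence of Passes

Applying passes one by one as above is tedious, and it may require users to understand the dependencies between them. For example, fusion currently does not work well on let bindings. If relay.transform.ToANormalForm() is applied before fusion, operators that could otherwise be fused will not be, because this pass generates a let binding for each expression to canonicalize the Relay program.

Relay therefore provides tvm.transform.Sequential to spare developers from handling these issues explicitly: users specify the passes they need and pack them as a whole to execute. For example, the same passes can be applied in the sequential style shown below. tvm.transform.Sequential is similar to torch.nn.Sequential and mxnet.gluon.Block. torch.nn.Sequential, for instance, holds a sequence of PyTorch modules that are added to build a network, so it focuses on network layers. tvm.transform.Sequential in the pass infra works on optimization passes instead.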

# Now let's execute some passes through :py:class:`tvm.transform.Sequential`

f = example()

mod = tvm.IRModule.from_expr(f)

# Gather the passes of interest.
seq = tvm.transform.Sequential(
    [
        relay.transform.FoldConstant(),
        relay.transform.EliminateCommonSubexpr(),
        relay.transform.FuseOps(fuse_opt_level=2),
    ]
)

mod1 = seq(mod)

print(mod1)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

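In the transformed Relay program there are still two identical addition operations. This is because EliminateCommonSubexpr was not actually performed: by default, tvm.transform.Sequential only executes passes whose optimization level is less than or equal to 2. The pass infra provides a configuration interface that lets users choose the optimization level to execute.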

with tvm.transform.PassContext(opt_level=3):
    mod2 = seq(mod)
print(mod2)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Now only one of the two identical additions is kept.

In addition, users can selectively disable some passes using the disabled_pass config, which is similar to the -fno-xxx options of general-purpose compilers such as Clang and GCC. For example, we can disable EliminateCommonSubexpr as follows. The printed module again shows two identical addition operations.

with tvm.transform.PassContext(opt_level=3, disabled_pass=["EliminateCommonSubexpr"]):
    mod3 = seq(mod)
print(mod3)

Out:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%4 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%4(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}
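The three runs above differ only in which passes the context lets through. A toy model of that filtering, in plain Python, is sketched below. `ToyPass`, `ToyContext`, and `run_sequential` are hypothetical names rather than TVM APIs, and the opt levels assigned to the toy passes mirror the ones reported in the tracing output later in this article (FoldConstant 2, EliminateCommonSubexpr 3, FuseOps 1).

```python
# Toy model of how a sequential pass list can be filtered by a context.
# ToyPass, ToyContext and run_sequential are hypothetical names, not TVM APIs.
from collections import namedtuple

ToyPass = namedtuple("ToyPass", ["name", "opt_level", "fn"])
ToyContext = namedtuple("ToyContext", ["opt_level", "disabled_pass"])


def run_sequential(passes, program, ctx=ToyContext(opt_level=2, disabled_pass=())):
    executed = []
    for p in passes:
        if p.opt_level > ctx.opt_level or p.name in ctx.disabled_pass:
            continue  # skipped: level too high for this context, or disabled
        program = p.fn(program)
        executed.append(p.name)
    return program, executed


identity = lambda program: program
passes = [
    ToyPass("FoldConstant", 2, identity),
    ToyPass("EliminateCommonSubexpr", 3, identity),
    ToyPass("FuseOps", 1, identity),
]

_, default_run = run_sequential(passes, "prog")
_, level3_run = run_sequential(passes, "prog", ToyContext(3, ()))
_, disabled_run = run_sequential(
    passes, "prog", ToyContext(3, ("EliminateCommonSubexpr",)))
print(default_run)
print(level3_run)
print(disabled_run)
```

With the default level of 2 the level-3 pass is silently skipped, raising the level to 3 enables it, and naming it in the disabled list turns it off again.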

The passes applied so far are target independent. The pass infra also provides a means to make a pass target-aware. The layout alteration pass, for example, falls into this category.

with tvm.transform.PassContext(opt_level=3):
    mod4 = seq(mod)
print(mod4)

seq1 = tvm.transform.Sequential([relay.transform.AlterOpLayout()])
with tvm.transform.PassContext(opt_level=3):
    with tvm.target.Target("llvm"):
        mod5 = seq1(mod)
print(mod5)

Out:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%x, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

%1 = nn.conv2d(%0, %weight, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%2 = add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = multiply(%2, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%4 = layout_transform(%3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%5 = add(%1, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%6 = layout_transform(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%7 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%8 = add(%5, %6) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%9 = add(%7, %8) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

layout_transform(%9, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

}
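The types in the second module show what the `NCHW16c` layout does to shapes: the channel axis is split into an outer channel axis and an inner block of 16, so (1, 64, 56, 56) becomes (1, 4, 56, 56, 16). The arithmetic can be checked with a small sketch, whose helper names are hypothetical and not TVM functions:

```python
def pack_nchw_to_nchwc(shape, block):
    """Shape of an NCHW tensor after splitting channels into blocks of `block`."""
    n, c, h, w = shape
    if c % block != 0:
        raise ValueError("channel count must be divisible by the block size")
    return (n, c // block, h, w, block)


def unpack_nchwc_to_nchw(shape):
    """Inverse of pack_nchw_to_nchwc."""
    n, c_outer, h, w, block = shape
    return (n, c_outer * block, h, w)


packed = pack_nchw_to_nchwc((1, 64, 56, 56), 16)
print(packed)                          # (1, 4, 56, 56, 16)
print(unpack_nchwc_to_nchw(packed))    # (1, 64, 56, 56)
```

This matches the `layout_transform` calls in the printed IR: inputs are packed before the convolution, and the final result is unpacked back to `NCHW`.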

Implement a Pass Using Python Decorator

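The next example illustrates how to orchestrate a customized optimization pipeline through the pass infra using a Python decorator. This functionality greatly eases the implementation of passes. For example, users can simply define a decorated class that performs a function-level optimization, as shown below. Its transform_function method replaces every constant c with the product of a multiplier and c. Later, when the custom pass is invoked, each function in the given module is visited and every constant in the function is replaced.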

@relay.transform.function_pass(opt_level=1)
class CustomPipeline:
    """Simple pass that replaces every constant c with multiplier * c."""

    def __init__(self, multiplier):
        self.multiplier = multiplier

    # This function can define a pass.
    def transform_function(self, func, mod, ctx):
        obj = self

        class ReplaceConstant(tvm.relay.ExprMutator):
            def visit_constant(self, c):
                return relay.multiply(obj.multiplier, c)

        return ReplaceConstant().visit(func)


f = example()
mod = tvm.IRModule.from_expr(f)
custom_pass = CustomPipeline(multiplier=relay.const(3, "float32"))
assert custom_pass.info.name == "CustomPipeline"
mod3 = custom_pass(mod)
print(mod3)

Output:

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = multiply(3f /* ty=float32 */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = multiply(3f /* ty=float32 */, 2f /* ty=float32 */) /* ty=float32 */;

%4 = multiply(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%5 = add(%0, %4) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%6 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%7 = add(%5, %1) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%6, %7) /* ty=Tensor[(1, 64, 54, 54), float32] */

}
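In the output, every constant (including the scalar `2f`) is now wrapped in a `multiply(3f, ...)`. The rewriting pattern, a bottom-up visitor with one overridden hook, can be sketched without TVM. `ToyMutator` and `ScaleConstants` are hypothetical classes written for illustration, operating on tuple-based expressions:

```python
# Toy bottom-up rewriter in the spirit of an expression mutator.
# Node forms: ("const", value), ("var", name), (op_name, lhs, rhs)

class ToyMutator:
    def visit(self, expr):
        kind = expr[0]
        if kind == "const":
            return self.visit_constant(expr)
        if kind == "var":
            return expr
        # Rebuild the call with rewritten operands.
        return (kind, self.visit(expr[1]), self.visit(expr[2]))

    def visit_constant(self, const):
        return const  # default: leave constants untouched


class ScaleConstants(ToyMutator):
    def __init__(self, multiplier):
        self.multiplier = multiplier

    def visit_constant(self, const):
        # Wrap, rather than evaluate, so the rewrite stays visible in the tree.
        return ("multiply", ("const", self.multiplier), const)


expr = ("add", ("var", "conv"), ("multiply", ("const", 1.0), ("const", 2.0)))
rewritten = ScaleConstants(3.0).visit(expr)
print(rewritten)
```

The newly created multiplier constant is not visited again, so the rewrite terminates after wrapping each original constant once.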

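Debug a Pass

TVM provides a plug-and-play debugging pass, PrintIR, which dumps the IR of the whole module once the pass placed before it has finished. A slightly modified version of the sequential example, shown below, enables IR dumping for the FoldConstant optimization.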

f = example()

mod = tvm.IRModule.from_expr(f)

seq = tvm.transform.Sequential(
    [
        relay.transform.FoldConstant(),
        tvm.transform.PrintIR(),
        relay.transform.EliminateCommonSubexpr(),
        relay.transform.FuseOps(),
        relay.transform.AlterOpLayout(),
    ]
)

# By inserting the ``PrintIR`` pass after ``FoldConstant``, the pass infra will
# dump out the module IR when ``FoldConstant`` is done. Users can plug in this
# pass after any pass they want to debug for viewing the optimization effect.
#
# There is a more flexible debugging mechanism also exposed by the build configuration
# object. One can pass a tracing function which can be used to execute arbitrary code
# before and/or after each pass. A tracing function will receive a :py:class:`tvm.IRModule`,
# a :py:class:`tvm.transform.PassInfo` object,
# and a boolean indicating whether you are executing before, or after a pass.
# An example is below.
def print_ir(mod, info, is_before):
    """Print the name of the pass, the IR, only before passes execute."""
    if is_before:
        # print() receives two arguments here, so the "{}" placeholder is
        # printed literally, followed by the pass info.
        print("Running pass: {}", info)
        print(mod)


with tvm.transform.PassContext(opt_level=3, trace=print_ir):
    with tvm.target.Target("llvm"):
        # Perform the optimizations.
        mod = seq(mod)
print(mod)

print("done")

Output:

Running pass: {} The meta data of the pass: pass name: FoldConstantopt_level: 2required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

%1 = add(meta[relay.Constant][0], meta[relay.Constant][0]);

%2 = multiply(%1, 2f);

%3 = add(%0, %2);

%4 = add(%3, meta[relay.Constant][0]);

%5 = add(%3, meta[relay.Constant][0]);

add(%4, %5)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() {

add(meta[relay.Constant][0], meta[relay.Constant][0])

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

add(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0)

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

let %x1 = fn (%p0: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

add(%p0, %p0) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

let %x2 = %x1(%x);

%x2

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() {

multiply(meta[relay.Constant][0], 2f)

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

multiply(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1)

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */)

}

Running pass: {} The meta data of the pass: pass name: ToANormalFormopt_level: 1required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

%0 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%0(meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, 2f /* ty=float32 */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main() -> Tensor[(1, 64, 54, 54), float32] {

let %x = meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */;

let %x1 = 2f /* ty=float32 */;

let %x2 = fn (%p0: Tensor[(1, 64, 54, 54), float32], %p1: float32, Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

multiply(%p0, %p1) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

let %x3 = %x2(%x, %x1);

%x3

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]);

%1 = add(%0, meta[relay.Constant][0]);

%2 = add(%1, meta[relay.Constant][1]);

%3 = add(%1, meta[relay.Constant][1]);

add(%2, %3)

}

Running pass: {} The meta data of the pass: pass name: PrintIRopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: EliminateCommonSubexpropt_level: 3required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%3 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %3) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: FuseOpsopt_level: 1required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%x, %weight, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]);

%1 = add(%0, %p2);

%2 = add(%1, %p3);

add(%2, %2)

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: AlterOpLayoutopt_level: 3required passes: [

InferType, ]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%3 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = nn.conv2d(%p0, %p1, padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%1 = add(%0, %p2) /* ty=Tensor[(1, 64, 54, 54), float32] */;

%2 = add(%1, %p3) /* ty=Tensor[(1, 64, 54, 54), float32] */;

add(%2, %2) /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%3(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

Running pass: {} The meta data of the pass: pass name: InferTypeopt_level: 0required passes: [

]

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c");

%1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c");

%2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c");

%3 = add(%1, %2);

%4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c");

%5 = add(%3, %4);

%6 = add(%5, %5);

layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW")

};

%7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */)

}

def @main(%x: Tensor[(1, 64, 56, 56), float32], %weight: Tensor[(64, 64, 3, 3), float32]) -> Tensor[(1, 64, 54, 54), float32] {

%7 = fn (%p0: Tensor[(1, 64, 56, 56), float32], %p1: Tensor[(64, 64, 3, 3), float32], %p2: Tensor[(1, 64, 54, 54), float32], %p3: Tensor[(1, 64, 54, 54), float32], Primitive=1) -> Tensor[(1, 64, 54, 54), float32] {

%0 = layout_transform(%p0, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 56, 56, 16), float32] */;

%1 = nn.conv2d(%0, %p1, padding=[0, 0, 0, 0], data_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%2 = layout_transform(%p2, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%3 = add(%1, %2) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%4 = layout_transform(%p3, src_layout="NCHW", dst_layout="NCHW16c") /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%5 = add(%3, %4) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

%6 = add(%5, %5) /* ty=Tensor[(1, 4, 54, 54, 16), float32] */;

layout_transform(%6, src_layout="NCHW16c", dst_layout="NCHW") /* ty=Tensor[(1, 64, 54, 54), float32] */

};

%7(%x, %weight, meta[relay.Constant][0] /* ty=Tensor[(1, 64, 54, 54), float32] */, meta[relay.Constant][1] /* ty=Tensor[(1, 64, 54, 54), float32] */) /* ty=Tensor[(1, 64, 54, 54), float32] */

}

done
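The trace above is produced by a callback that the pass infra invokes around every pass. The control flow can be sketched in a few lines of plain Python. `run_with_trace` is a hypothetical helper, integers stand in for the module, and the pass name stands in for the pass info object:

```python
# Toy sketch of a trace hook invoked around every pass; names are hypothetical.

def run_with_trace(passes, program, trace=None):
    for name, fn in passes:
        if trace is not None:
            trace(program, name, True)    # before the pass
        program = fn(program)
        if trace is not None:
            trace(program, name, False)   # after the pass
    return program


events = []


def record(program, pass_name, is_before):
    phase = "before" if is_before else "after"
    events.append((pass_name, phase, program))


passes = [
    ("Double", lambda x: x * 2),
    ("AddOne", lambda x: x + 1),
]
final = run_with_trace(passes, 5, trace=record)
print(final)
for event in events:
    print(event)
```

A callback that only acts when `is_before` is true, like `print_ir` in this article, would see the first half of each pair of events.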

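Summary

This article covered how to write and invoke passes in TVM more conveniently using the pass infra. Different ways of invoking a pass were also discussed. Using tvm.transform.Sequential greatly simplifies the handling of multiple optimization passes and their dependencies. In addition, an example showed how to debug a pass using PrintIR and tracing.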
