What's New in TensorFlow 2.0
Installing TensorFlow 2.0 Alpha
This article only covers installation on Windows:
pip install tensorflow==2.0.0-alpha0      # CPU version
pip install tensorflow-gpu==2.0.0-alpha0  # GPU version
After installing the GPU version, you also need to set the environment variables:
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin;%PATH%
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\libx64;%PATH%
SET PATH=C:\tools\cuda\bin;%PATH%
For more details and installation instructions for other platforms, see: GPU support
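To confirm the install worked, a quick sanity check (the GPU line only matters for the GPU build):
import tensorflow as tf
print(tf.__version__)              # expected: 2.0.0-alpha0
print(tf.test.is_gpu_available())  # True if the GPU build can see a CUDA device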
Overview of the new features
tf.control_dependencies() is no longer needed, because in TensorFlow 2.0 all code runs in program order. With @tf.function and AutoGraph you can write more Pythonic code; for example, the following constructs are translated into their graph equivalents:
- for/while -> tf.while_loop (break and continue are supported)
- if -> tf.cond
- for _ in dataset -> dataset.reduce
AutoGraph supports arbitrary nesting of control flow, which makes it possible to implement many complex ML programs, such as sequence models, reinforcement learning, and custom training loops, in a concise, high-performance way.
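As an illustrative sketch (the function name, threshold, and input values are arbitrary), a @tf.function with plain Python control flow that AutoGraph lowers to tf.cond / tf.while_loop:
import tensorflow as tf

@tf.function
def clip_and_sum(x, threshold):
    # AutoGraph rewrites this Python loop and conditional into
    # tf.while_loop / tf.cond when the function is traced.
    total = tf.constant(0.0)
    for i in tf.range(tf.shape(x)[0]):
        if x[i] > threshold:
            total += threshold
        else:
            total += x[i]
    return total

print(clip_and_sum(tf.constant([1.0, 5.0, 2.0]), tf.constant(3.0)))
# expected: tf.Tensor(6.0, shape=(), dtype=float32)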
Refactor your code into smaller functions
In general you do not need to decorate every one of these smaller functions with tf.function; use tf.function only on the high-level computations, for example one step of training or the forward pass of your model.
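As a minimal sketch of that advice (the model, optimizer, and data are assumed to exist elsewhere), only the outer training step is decorated, while the small helper stays a plain Python function:
def compute_loss(labels, logits):
    # Small helper: no @tf.function needed here.
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

@tf.function  # decorate the high-level step, not every helper
def train_step(model, optimizer, x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = compute_loss(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss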
Use Keras layers and models to manage variables
Keras layers and models provide the convenient variables and trainable_variables properties, which recursively collect all dependent variables. This makes it easy to manage variables locally, right where they are used. Compare:
Plain TensorFlow:
def dense(x, W, b):
    return tf.nn.sigmoid(tf.matmul(x, W) + b)

@tf.function
def multilayer_perceptron(x, w0, b0, w1, b1, w2, b2 ...):
    x = dense(x, w0, b0)
    x = dense(x, w1, b1)
    x = dense(x, w2, b2)
    ...

# You still have to manage w_i and b_i, and their shapes are defined far away from the code.
TensorFlow with Keras:
# Each layer can be called, with a signature equivalent to linear(x)
layers = [tf.keras.layers.Dense(hidden_size, activation=tf.nn.sigmoid) for _ in range(n)]
perceptron = tf.keras.Sequential(layers)
# layers[3].trainable_variables => returns [w3, b3]
# perceptron.trainable_variables => returns [w0, b0, ...]
Keras layers/models inherit from tf.train.Checkpointable and are integrated with @tf.function, which makes it possible to directly checkpoint or export SavedModels from Keras objects. You do not necessarily have to use Keras's .fit() API to take advantage of these integrations.
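For instance, a minimal checkpointing sketch (the layer size and paths are illustrative, not from the original article):
model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
optimizer = tf.keras.optimizers.Adam()

# Track the model and optimizer state in one checkpoint object.
ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
manager = tf.train.CheckpointManager(ckpt, '/tmp/ckpts', max_to_keep=3)

manager.save()                           # write a checkpoint
ckpt.restore(manager.latest_checkpoint)  # restore it later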
Here's a transfer learning example that demonstrates how Keras makes it easy to collect a subset of relevant variables. Let's say you're training a multi-headed model with a shared trunk:
trunk = tf.keras.Sequential([...])
head1 = tf.keras.Sequential([...])
head2 = tf.keras.Sequential([...])
path1 = tf.keras.Sequential([trunk, head1])
path2 = tf.keras.Sequential([trunk, head2])

# Train on primary dataset
for x, y in main_dataset:
    with tf.GradientTape() as tape:
        prediction = path1(x)
        loss = loss_fn_head1(prediction, y)
    # Simultaneously optimize trunk and head1 weights.
    gradients = tape.gradient(loss, path1.trainable_variables)
    optimizer.apply_gradients(zip(gradients, path1.trainable_variables))

# Fine-tune second head, reusing the trunk
for x, y in small_dataset:
    with tf.GradientTape() as tape:
        prediction = path2(x)
        loss = loss_fn_head2(prediction, y)
    # Only optimize head2 weights, not trunk weights
    gradients = tape.gradient(loss, head2.trainable_variables)
    optimizer.apply_gradients(zip(gradients, head2.trainable_variables))

# You can publish just the trunk computation for other people to reuse.
tf.saved_model.save(trunk, output_path)
Combine tf.data.Datasets and @tf.function
When iterating over training data that fits in memory, feel free to use regular Python iteration. Otherwise, tf.data.Dataset is the best way to stream training data from disk. Datasets are iterables (not iterators), and work just like other Python iterables in Eager mode. You can fully utilize dataset async prefetching/streaming features by wrapping your code in tf.function(), which replaces Python iteration with the equivalent graph operations using AutoGraph.
@tf.function
def train(model, dataset, optimizer):
    for x, y in dataset:
        with tf.GradientTape() as tape:
            prediction = model(x)
            loss = loss_fn(prediction, y)
        gradients = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
If you use the Keras .fit() API, you won't have to worry about dataset iteration.
model.compile(optimizer=optimizer, loss=loss_fn)
model.fit(dataset)
The advantages of AutoGraph with Python control flow
AutoGraph provides a way to convert data-dependent control flow into graph-mode equivalents like tf.cond and tf.while_loop.
One common place where data-dependent control flow appears is in sequence models. tf.keras.layers.RNN wraps an RNN cell, allowing you to either statically or dynamically unroll the recurrence. For demonstration's sake, you could reimplement dynamic unroll as follows:
class DynamicRNN(tf.keras.Model):
    def __init__(self, rnn_cell):
        super(DynamicRNN, self).__init__()
        self.cell = rnn_cell

    def call(self, input_data):
        # [batch, time, features] -> [time, batch, features]
        input_data = tf.transpose(input_data, [1, 0, 2])
        outputs = tf.TensorArray(tf.float32, input_data.shape[0])
        state = self.cell.zero_state(input_data.shape[1], dtype=tf.float32)
        for i in tf.range(input_data.shape[0]):
            output, state = self.cell(input_data[i], state)
            outputs = outputs.write(i, output)
        return tf.transpose(outputs.stack(), [1, 0, 2]), state
Use tf.metrics to aggregate data and tf.summary to log it
To log summaries, use tf.summary.(scalar|histogram|...) and redirect it to a writer using a context manager. (If you omit the context manager, nothing will happen.) Unlike TF 1.x, the summaries are emitted directly to the writer; there is no separate "merge" op and no separate add_summary() call, which means that the step value must be provided at the call site.
summary_writer = tf.summary.create_file_writer('/tmp/summaries')
with summary_writer.as_default():
    tf.summary.scalar('loss', 0.1, step=42)
To aggregate data before logging it as summaries, use tf.metrics. Metrics are stateful; they accumulate values and return a cumulative result when you call .result(). Clear accumulated values with .reset_states().
def train(model, optimizer, dataset, log_freq=10):
    avg_loss = tf.keras.metrics.Mean(name='loss', dtype=tf.float32)
    for images, labels in dataset:
        loss = train_step(model, optimizer, images, labels)
        avg_loss.update_state(loss)
        if tf.equal(optimizer.iterations % log_freq, 0):
            tf.summary.scalar('loss', avg_loss.result(), step=optimizer.iterations)
            avg_loss.reset_states()

def test(model, test_x, test_y, step_num):
    loss = loss_fn(model(test_x), test_y)
    tf.summary.scalar('loss', loss, step=step_num)

train_summary_writer = tf.summary.create_file_writer('/tmp/summaries/train')
test_summary_writer = tf.summary.create_file_writer('/tmp/summaries/test')

with train_summary_writer.as_default():
    train(model, optimizer, dataset)

with test_summary_writer.as_default():
    test(model, test_x, test_y, optimizer.iterations)
Visualize the generated summaries by pointing TensorBoard at the summary log directory: tensorboard --logdir /tmp/summaries
Running TF 1.x code under TF 2.x
If you want to run 1.x code (except for contrib) in TensorFlow 2.0 without changing its implementation, only the following change is needed:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # turn off 2.x behavior
1. Replace tf.Session.run calls
Every tf.Session.run call should be replaced by a Python function:
- The feed_dict and tf.placeholders become the function arguments.
- The fetches become the function's return value.
You can step through and debug the function using standard Python tools like pdb. When you're satisfied that it works, add a tf.function decorator to make it run efficiently in graph mode. See the AutoGraph guide and the tf.function tutorial for more on how this works.
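As a minimal sketch of the pattern (the names and values are illustrative): a TF 1.x placeholder/fetch pair becomes an ordinary function argument and return value:
# TF 1.x style, for comparison:
#   x = tf.placeholder(tf.float32)
#   y = x * 2.0
#   sess.run(y, feed_dict={x: [1.0, 2.0]})

# TF 2.0 style: the placeholder becomes an argument,
# the fetched tensor becomes the return value.
@tf.function
def double(x):
    return x * 2.0

print(double(tf.constant([1.0, 2.0])))  # expected: tf.Tensor([2. 4.], ...)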
2. Use objects to track variables and losses
Use tf.Variable instead of tf.get_variable.
Every variable_scope can be converted to a Python object. Typically this will be a tf.keras.layers.Layer, a tf.keras.Model, or a tf.Module.
If you need to aggregate lists of variables, like tf.Graph.get_collection(tf.GraphKeys.VARIABLES), use the .variables and .trainable_variables attributes of the Layer and Model objects.
These Layer and Model classes implement several other properties that remove the need for global collections. For example, their .losses property replaces the tf.GraphKeys.LOSSES collection. See the Keras guides for details.
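A small sketch of what this looks like in practice (the layer sizes and regularizer strength are arbitrary):
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dense(1),
])
model.build(input_shape=(None, 4))

# Replaces tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES):
print(len(model.trainable_variables))  # 4: two kernels and two biases

# Replaces the tf.GraphKeys.LOSSES / REGULARIZATION_LOSSES collections:
print(model.losses)  # the l2 penalty tensor from the first Dense kernel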
Warning: Many tf.compat.v1 symbols use the global collections implicitly.
3. Upgrade your training loops
Use the highest-level API that works for your use case: prefer tf.keras.Model.fit over building your own training loops.
These high-level functions manage a lot of the low-level details that might be easy to miss if you write your own training loop. For example, they automatically collect the regularization losses and set the training=True argument when calling the model.
Use tf.data datasets for data input. These objects are efficient, expressive, and integrate well with TensorFlow. They can be passed directly to the tf.keras.Model.fit method:
model.fit(dataset, epochs=5)
They can also be iterated over directly in standard Python:
for example_batch, label_batch in dataset:
    break
# The "before" snippets below use the TF 1.x API surface via tf.compat.v1.
import tensorflow as tf
import tensorflow.compat.v1 as tf_v1
tf.compat.v1 = tf_v1
Low-level variables & operator execution
We'll first look at handling TensorFlow 1.x code that uses lower-level variables and TensorFlow operators rather than higher-level layer APIs.
If your existing codebase falls into this category, it probably uses variable scopes to control reuse and creates variables with tf.get_variable. You are also likely accessing collections either explicitly or implicitly (with methods like tf.global_variables and tf.losses.get_regularization_loss).
Your code is likely using tf.placeholders to set up inputs to your graph and session.run to execute it. You are also most likely initializing variables manually before you run the graph.
Below is a sample of lower-level TensorFlow 1.x code implemented with these patterns:
Before converting
in_a = tf.placeholder(dtype=tf.float32, shape=(2))
in_b = tf.placeholder(dtype=tf.float32, shape=(2))

def forward(x):
    with tf.variable_scope("matmul", reuse=tf.AUTO_REUSE):
        W = tf.get_variable("W", initializer=tf.ones(shape=(2, 2)),
                            regularizer=tf.contrib.layers.l2_regularizer(0.04))
        b = tf.get_variable("b", initializer=tf.zeros(shape=(2)))
        return W * x + b

out_a = forward(in_a)
out_b = forward(in_b)

reg_loss = tf.losses.get_regularization_loss(scope="matmul")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outs = sess.run([out_a, out_b, reg_loss],
                    feed_dict={in_a: [1, 0], in_b: [0, 1]})
In the converted code:
- The variables are local Python objects.
- The forward function still defines the calculation.
- The sess.run call is replaced with a call to forward.
- The optional tf.function decorator can be added for performance.
- The regularizations are calculated manually, without referring to any global collection.
W = tf.Variable(tf.ones(shape=(2, 2)), name="W")
b = tf.Variable(tf.zeros(shape=(2)), name="b")

@tf.function
def forward(x):
    return W * x + b

out_a = forward([1, 0])
print(out_a)

out_b = forward([0, 1])

regularizer = tf.keras.regularizers.l2(0.02)
reg_loss = regularizer(W)
tf.Tensor(
[[1. 0.]
 [1. 0.]], shape=(2, 2), dtype=float32)
No session or placeholders!
For tf.layers-based models
The tf.layers module used to contain layer functions that relied on variable_scopes to define and reuse variables.
Before converting
def model(x, training, scope='model'):
    with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
        x = tf.layers.conv2d(x, 32, 3, activation=tf.nn.relu,
                             kernel_regularizer=tf.contrib.layers.l2_regularizer(0.04))
        x = tf.layers.max_pooling2d(x, (2, 2), 1)
        x = tf.layers.flatten(x)
        x = tf.layers.dropout(x, 0.1, training=training)
        x = tf.layers.dense(x, 64, activation=tf.nn.relu)
        x = tf.layers.batch_normalization(x, training=training)
        x = tf.layers.dense(x, 10, activation=tf.nn.softmax)
        return x

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)
After converting
The resulting code is below. For the converted model, note:
- It was a simple stack of layers, so it fits neatly into a tf.keras.Sequential. (For more complex models see custom layers and models, and the functional API.)
- The model tracks the variables and regularization losses.
- The conversion was one-to-one because there is a direct mapping from tf.layers to tf.keras.layers.
Most arguments stayed the same; the main differences are:
- The training argument is passed to each layer by the model when it runs.
- The first argument to the function-layers, the input x, is gone because object layers separate building the model from calling the model.
Also note that:
- If you were using regularizers or initializers from tf.contrib, these have more argument changes than others.
- The code no longer writes to collections, so functions like tf.losses.get_regularization_loss will no longer return these values, potentially breaking your training loops.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           kernel_regularizer=tf.keras.regularizers.l2(0.02),
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10, activation='softmax')
])
train_data = tf.ones(shape=(1, 28, 28, 1))
test_data = tf.ones(shape=(1, 28, 28, 1))
train_out = model(train_data, training=True)
print(train_out)
tf.Tensor([[0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]], shape=(1, 10), dtype=float32)
test_out = model(test_data, training=False)
print(test_out)
tf.Tensor(
[[0.09657966 0.09668106 0.12381785 0.13422377 0.10953731 0.08846541
0.08248153 0.08863612 0.08141313 0.09816417]], shape=(1, 10), dtype=float32)
# Here are all the trainable variables.
len(model.trainable_variables)
8
# Here is the regularization loss.
model.losses
[<tf.Tensor: id=920, shape=(), dtype=float32, numpy=0.041291103>]
Mixed variables & tf.layers
Your existing projects might mix lower-level TF 1.x variables and operations with higher-level tf.layers. Sample code that does this in TF 1.x is shown below.
Before converting
def model(x, training, scope='model'):
    with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
        W = tf.get_variable(
            "W", dtype=tf.float32,
            initializer=tf.ones(shape=x.shape),
            regularizer=tf.contrib.layers.l2_regularizer(0.04),
            trainable=True)
        if training:
            x = x + W
        else:
            x = x + W * 0.5
        x = tf.layers.conv2d(x, 32, 3, activation=tf.nn.relu)
        x = tf.layers.max_pooling2d(x, (2, 2), 1)
        x = tf.layers.flatten(x)
        return x

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)
After converting
To convert this code, follow the pattern of mapping layers to layers as in the previous example.
The tf.variable_scope is effectively a layer of its own, so rewrite it as a tf.keras.layers.Layer. See the guide for details.
The general pattern is:
- Collect layer parameters in __init__.
- Build the variables in build.
- Execute the calculations in call, and return the result.
Some things to note:
- Subclassed Keras models & layers need to run in both v1 graphs (no automatic control dependencies) and in eager mode, so wrap the call() in a tf.function() to get autograph and automatic control dependencies.
- Don't forget to accept a training argument to call:
  - Sometimes it is a tf.Tensor
  - Sometimes it is a Python boolean
- Create model variables in the constructor or in def build() using self.add_weight():
  - In build you have access to the input shape, so you can create weights with a matching shape.
  - Using tf.keras.layers.Layer.add_weight allows Keras to track regularization losses.
- Don't keep tf.Tensors in your objects (a small sketch follows this list):
  - They might get created either in a tf.function or in the eager context, and these tensors behave differently.
  - Use tf.Variables for state; they are always usable from both contexts. tf.Tensors are only for intermediate values.
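A small sketch of the last point (the class and names are illustrative): keep mutable state in a tf.Variable so it works both eagerly and inside tf.function:
class Counter(tf.Module):
    def __init__(self):
        # State lives in a tf.Variable, not a tf.Tensor,
        # so it is usable from eager code and from tf.function.
        self.count = tf.Variable(0, dtype=tf.int64)

    @tf.function
    def increment(self):
        self.count.assign_add(1)
        return self.count

counter = Counter()
print(counter.increment().numpy())  # 1
print(counter.increment().numpy())  # 2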
# Create a custom layer for part of the model
class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=input_shape[1:],
            dtype=tf.float32,
            initializer=tf.keras.initializers.ones(),
            regularizer=tf.keras.regularizers.l2(0.02),
            trainable=True)

    # Call method will sometimes get used in graph mode,
    # training will get turned into a tensor
    @tf.function
    def call(self, inputs, training=None):
        if training:
            return inputs + self.w
        else:
            return inputs + self.w * 0.5

custom_layer = CustomLayer()
print(custom_layer([1]).numpy())
print(custom_layer([1], training=True).numpy())
[1.5]
[2.]
train_data = tf.ones(shape=(1, 28, 28, 1))
test_data = tf.ones(shape=(1, 28, 28, 1))

# Build the model including the custom layer
model = tf.keras.Sequential([
    CustomLayer(input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
])

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)
A note on Slim & contrib.layers
A large amount of older TensorFlow 1.x code uses the Slim library, which was packaged with TensorFlow 1.x as tf.contrib.layers. As a contrib module, this is no longer available in TensorFlow 2.0, even in tf.compat.v1. Converting code using Slim to TF 2.0 is more involved than converting repositories that use tf.layers. In fact, it may make sense to convert your Slim code to tf.layers first, then convert to Keras!
- Remove arg_scopes; all args need to be explicit.
- If you use them, split normalizer_fn and activation_fn into their own layers.
- Separable conv layers map to one or more different Keras layers (depthwise, pointwise, and separable Keras layers).
- Slim and tf.layers have different arg names & default values.
- Some args have different scales.
- If you use Slim pre-trained models, try out tf.keras.applications or TFHub (a brief sketch follows below).
Some tf.contrib layers might not have been moved to core TensorFlow but have instead been moved to the TF add-ons package.
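For instance, a hedged sketch of swapping a Slim pre-trained backbone for a Keras application (the specific model and input size are illustrative):
# Load an ImageNet-pretrained backbone from tf.keras.applications
# instead of a Slim checkpoint; drop the classification head for transfer learning.
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
backbone.trainable = False  # freeze for feature extraction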