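Notes on "Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform", published at CVPR 2018.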

Abstract

Despite that convolutional neural networks (CNN) have recently demonstrated high-quality reconstruction for single-image super-resolution (SR), recovering natural and realistic texture remains a challenging problem. In this paper, we show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate features of a few intermediate layers in a single network conditioned on semantic segmentation probability maps. This is made possible through a novel Spatial Feature Transform (SFT) layer that generates affine transformation parameters for spatial-wise feature modulation. SFT layers can be trained end-to-end together with the SR network using the same loss function. During testing, it accepts an input image of arbitrary size and generates a high-resolution image with just a single forward pass conditioned on the categorical priors. Our final results show that an SR network equipped with SFT can generate more realistic and visually pleasing textures in comparison to state-of-the-art SRGAN [27] and EnhanceNet [38].

Conclusion

We have explored the use of semantic segmentation maps as categorical prior for constraining the plausible solution space in SR. A novel Spatial Feature Transform (SFT) layer has been proposed to efficiently incorporate the categorical conditions into a CNN-based SR network. Thanks to the SFT layers, our SFT-GAN is capable of generating distinct and rich textures for multiple semantic regions in a super-resolved image in just a single forward pass. Extensive comparisons and a user study demonstrate the capability of SFT-GAN in generating realistic and visually pleasing textures, outperforming previous GAN-based methods [27, 38]. Our work currently focuses on SR of outdoor scenes.
Despite robust to out-of-category images, it does not consider priors of finer categories, especially for indoor scenes, e.g., furniture, appliance and silk. In such a case, it puts forward challenging requirements for segmentation tasks from an LR image. Future work aims at addressing these shortcomings. Furthermore, segmentation and SR may benefit from each other and jointly improve the performance.

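Key points

The focus of this paper is recovering natural texture more faithfully in SR.
Specifically, semantic segmentation probability maps are fed in to give the CNN categorical priors, so that the recovered texture matches the semantic class of each region.
The layer that implements this is called the Spatial Feature Transform (SFT) layer. It generates the parameters of an affine transformation applied to the spatial features, and it is trained jointly with the SR network.
Although SFT-GAN is robust to images of unknown categories, unknown categories remain a real problem.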

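Highlights

This is an SR work that exploits semantic segmentation information. The idea is logical and the experimental results are good. Fig. 1 of the paper illustrates it.
The idea can also be extended to other priors, such as depth maps, to refine the granularity of the texture.
Similar to BN, the features are modulated by an affine transformation, which is how the categorical prior is injected.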

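Limitations

The semantic segmentation maps are obtained by upsampling the LR image with bicubic interpolation and feeding it into a pretrained segmentation network [31], which is independent of the SR network.
The authors inject the categorical prior by applying an affine transformation to the features. This works, but there may be better ways.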

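Background

As Fig. 1 illustrates, without a prior on the category the solution space is hard to constrain, especially for two patches that look alike, such as the plant and the brick in that figure.

In earlier work, separate models were trained for different categories. Here, the authors instead want the semantic segmentation maps to be an input to the CNN. The key question is how to feed them in. Simply feeding the segmentation maps in at the input, or at an intermediate layer, does not work well.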

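Spatial Feature Transform

To make the segmentation maps an effective input, the Spatial Feature Transform (SFT) layer is introduced.

In fact, the idea of SFT originates from BN. BN applies an affine transformation to the features. Conditional normalization (CN) replaces the affine transformation in BN with a function learned from some condition. So how does SFT work?

Specifically, based on the prior, SFT outputs a modulation parameter pair \((\gamma, \beta)\). This pair applies an affine transformation to the intermediate feature maps \(F\): \(SFT(F \mid \gamma, \beta) = \gamma \odot F + \beta\), where \(\odot\) is the Hadamard (element-wise) product. In other words, SFT turns the categorical prior into modulation parameters.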
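As a minimal sketch of the formula above, the NumPy snippet below applies the modulation to a random feature map. The function name `sft`, the shapes and the example values are assumptions made for illustration, not the authors' code. The point is that \(\gamma\) and \(\beta\) have the same shape as \(F\), so every spatial location can receive its own scale and shift.

```python
import numpy as np

def sft(features, gamma, beta):
    """Spatial feature transform: element-wise scale and shift.

    All three arrays share the shape (C, H, W), so the modulation
    can differ at every channel and every spatial position.
    """
    if not (features.shape == gamma.shape == beta.shape):
        raise ValueError("features, gamma and beta must have the same shape")
    return gamma * features + beta

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 8, 8))   # hypothetical feature map (C, H, W)
gamma = np.ones_like(F)
beta = np.zeros_like(F)

# gamma = 1 and beta = 0 leave the features unchanged (identity modulation).
identity_out = sft(F, gamma, beta)

# Scale only the left half of the map, standing in for one semantic region.
gamma[:, :, :4] = 2.0
region_out = sft(F, gamma, beta)
```

A per-channel affine step, as in BN, would use one scalar pair per channel. Here the parameters vary over the spatial grid, which is what lets different semantic regions be modulated differently.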

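In the network it is implemented as follows.

Let us first look at the SFT structure.

As the figure in the paper shows, the segmentation probability maps are not fed into the SR network directly. They first pass through a shallow CNN, which we call the condition network.
The output of this network (the conditions) is shared by the intermediate layers throughout the network. Inside each SFT layer, the conditions pass through two separate 2-layer CNN branches, one producing \(\gamma\) and the other \(\beta\). The affine transformation is then applied, and that is all.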
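The structure described above can be sketched in NumPy as follows. This is a sketch under stated assumptions, not the authors' implementation: the 1x1 kernels, the channel widths, the leaky ReLU slope, the random weights and all names (`conv1x1`, `init_branch`, `SFTLayer`, and so on) are hypothetical. It only illustrates the data flow: the conditions are computed once and reused, and each SFT layer owns two small branches that output \(\gamma\) and \(\beta\).

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, weight, bias):
    """1x1 convolution on a (C_in, H, W) array: a per-pixel linear map."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def leaky_relu(x, slope=0.1):
    return np.where(x > 0, x, slope * x)

def init_branch(c_in, c_hidden, c_out):
    """Random weights for one two-layer branch (hypothetical sizes)."""
    return {
        "w1": rng.standard_normal((c_hidden, c_in)) * 0.1,
        "b1": np.zeros(c_hidden),
        "w2": rng.standard_normal((c_out, c_hidden)) * 0.1,
        "b2": np.zeros(c_out),
    }

def run_branch(params, x):
    hidden = leaky_relu(conv1x1(x, params["w1"], params["b1"]))
    return conv1x1(hidden, params["w2"], params["b2"])

class SFTLayer:
    """Maps shared conditions to (gamma, beta) and modulates the features."""

    def __init__(self, c_cond, c_feat):
        self.gamma_branch = init_branch(c_cond, c_cond, c_feat)
        self.beta_branch = init_branch(c_cond, c_cond, c_feat)

    def __call__(self, features, conditions):
        gamma = run_branch(self.gamma_branch, conditions)
        beta = run_branch(self.beta_branch, conditions)
        return gamma * features + beta

num_classes, c_cond, c_feat, H, W = 8, 16, 32, 6, 6

# Stand-in for the condition network: a small two-layer stack applied
# to the segmentation probability maps.
cond_net = init_branch(num_classes, c_cond, c_cond)
prob_maps = rng.random((num_classes, H, W))
prob_maps /= prob_maps.sum(axis=0, keepdims=True)
conditions = run_branch(cond_net, prob_maps)   # computed once

# The same conditions are reused by every SFT layer. The ordinary
# convolutions that sit between SFT layers are omitted here.
sft_layers = [SFTLayer(c_cond, c_feat) for _ in range(3)]
features = rng.standard_normal((c_feat, H, W))
for layer in sft_layers:
    features = layer(features, conditions)
```

Because the conditions are computed once, the cost of the condition network is paid a single time per image regardless of how many SFT layers consume it.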

The experiments in Section 4.3 show that directly concatenating the segmentation maps performs poorly.

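Super-resolution network

We first look at the segmentation network.

The LR image is first upsampled by bicubic interpolation and then passed through the segmentation network [31], which yields the semantic segmentation probability maps. This network is trained separately and is independent of the present work.
The experiments show that segmentation remains good even after downsampling by a factor of 4 (see Fig. 4). If the category is unknown, the object falls into the background class.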
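To make "probability maps" concrete, here is a small sketch of the last step only: turning per-pixel class scores into one probability map per class. The softmax, the toy scores, the class names and the background index are all assumptions for illustration. The notes above do not say how [31] normalizes its outputs, and the bicubic upsampling and the segmentation network itself are not reproduced here.

```python
import numpy as np

def probability_maps(scores):
    """Per-pixel softmax over the class axis of a (K, H, W) score array."""
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

BACKGROUND = 0  # hypothetical index of the background class

# Toy scores for 3 classes on a 2x2 image.
scores = np.array([
    [[5.0, 0.0], [0.0, 0.0]],   # background
    [[0.0, 4.0], [0.0, 3.0]],   # hypothetical class "plant"
    [[0.0, 0.0], [6.0, 0.0]],   # hypothetical class "building"
])
probs = probability_maps(scores)   # shape (K, H, W), sums to 1 per pixel
labels = probs.argmax(axis=0)      # hard label map, for inspection only
```

The SR network consumes the soft maps `probs`, not the hard labels. A pixel whose content matches none of the known classes would be expected to carry most of its probability in the background channel.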

The overall architecture is a GAN; see Section 3.2 of the paper.

The experiments are omitted here.
