Paper notes: "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks"

Code: https://github.com/junyanz/CycleGAN

Abstract

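The abstract introduces image-to-image translation (grayscale to color, image to semantic labels, edge map to photograph) and states the motivation of the work, which is to learn the translation without paired training images: "We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples." The authors want to learn a mapping G that translates images from domain X into domain Y and, conversely, a mapping F that translates images from domain Y back into domain X. The training images of the two domains are not paired. Each direction gets its own discriminator D, and generator and discriminator are trained adversarially (one tries to fool, the other tries to detect) so that a generated image y' = G(x) cannot be told apart from a real image y of domain Y, and vice versa. Training both directions together extends the original GAN structure into a cycle.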

Introduction

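Possibly the most poetic introduction of any computer science paper. The authors spend some space on the observation that a person can imagine any real-world scene rendered in Monet's style, even a scene Monet never painted, and then ask whether a computer can do the same. If it can, this sidesteps the very high cost of collecting, producing, and annotating paired training sets. They go on to explain why the GAN is extended with a cycle: an image from domain X can be mapped to many different outputs that all match the distribution of domain Y, so adding a reverse mapping that closes the loop constrains the translation more tightly and also helps avoid the mode collapse that is common in GAN training. The authors call this property cycle consistency.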

Related Work

The related work the authors build on covers GANs, image-to-image translation, unpaired image-to-image translation, neural style transfer, and cycle consistency.

Model

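The model's loss has two parts:

(1) Adversarial loss:

    For the mapping G: X -> Y with its discriminator D_Y:

    $$\mathcal{L}_{GAN}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{data}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{data}(x)}[\log(1 - D_Y(G(x)))]$$

    The mapping F: Y -> X with its discriminator D_X has an analogous adversarial loss, $\mathcal{L}_{GAN}(F, D_X, Y, X)$.

(2) Cycle consistency loss:

    $$\mathcal{L}_{cyc}(G, F) = \mathbb{E}_{x \sim p_{data}(x)}[\lVert F(G(x)) - x \rVert_1] + \mathbb{E}_{y \sim p_{data}(y)}[\lVert G(F(y)) - y \rVert_1]$$

Full objective, where \lambda weights the cycle consistency term against the two adversarial terms:

    $$\mathcal{L}(G, F, D_X, D_Y) = \mathcal{L}_{GAN}(G, D_Y, X, Y) + \mathcal{L}_{GAN}(F, D_X, Y, X) + \lambda \mathcal{L}_{cyc}(G, F)$$

    $$G^*, F^* = \arg\min_{G, F} \max_{D_X, D_Y} \mathcal{L}(G, F, D_X, D_Y)$$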

The later experiments illustrate the effect of each of these losses and show that none of them can be dropped.
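To make the cycle consistency term concrete, here is a minimal pure-Python sketch on toy vectors. Everything in it is illustrative rather than taken from the authors' code: the "generators" G, F_exact and F_wrong are hypothetical hand-written functions standing in for networks, the L1 distance is averaged over elements (an assumption, the formula above uses the plain L1 norm), and the default weight lam=10.0 is the value I recall the paper reporting, so treat it as an assumption too.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized vectors."""
    assert len(a) == len(b)
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, xs, ys):
    """Forward cycle x -> G(x) -> F(G(x)) plus backward cycle y -> F(y) -> G(F(y))."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


def generator_objective(adv_g, adv_f, cyc, lam=10.0):
    """Two adversarial terms plus the weighted cycle term (lam is assumed)."""
    return adv_g + adv_f + lam * cyc


# Hypothetical toy "generators": G maps X -> Y, F maps Y -> X.
def G(x):
    return [2.0 * v + 1.0 for v in x]


def F_exact(y):
    # True inverse of G, so both cycles reconstruct their input.
    return [(v - 1.0) / 2.0 for v in y]


def F_wrong(y):
    # Not an inverse of G, so the cycles drift away from the input.
    return [v / 2.0 for v in y]


xs = [[0.0, 1.0], [2.0, 3.0]]  # samples from domain X
ys = [[1.0, 3.0], [5.0, 7.0]]  # samples from domain Y

loss_exact = cycle_consistency_loss(G, F_exact, xs, ys)
loss_wrong = cycle_consistency_loss(G, F_wrong, xs, ys)
print(loss_exact, loss_wrong)
```

The pair (G, F_exact) gets zero cycle loss because the two mappings invert each other, while (G, F_wrong) is penalized. The adversarial terms alone would not notice this difference.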

Implementation

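The generator architecture is based on [3], which performs well on both style transfer and super-resolution, and it uses instance normalization. The discriminators are 70×70 PatchGANs, which classify whether each 70×70 image patch is real or fake. Such a patch-level discriminator has fewer parameters than one that judges the whole image [4,5,6].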
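The 70×70 figure can be checked with receptive-field arithmetic. The sketch below assumes a layer layout that is not stated in these notes: five 4×4 convolutions with strides 2, 2, 2, 1, 1 and padding 1, which is how I understand the commonly used PatchGAN discriminator to be configured. The helper names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of one output unit; layers are (kernel, stride), input to output."""
    rf = 1
    # Walk backwards from the output: each layer widens the field it sees.
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


def output_size(size, layers, padding=1):
    """Spatial size of the output map for a square input of side `size`."""
    for kernel, stride in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size


# Assumed PatchGAN layout: (kernel_size, stride) per convolution.
patchgan_layers = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

patch_size = receptive_field(patchgan_layers)
grid_size = output_size(256, patchgan_layers)
print(patch_size, grid_size)
```

Under these assumptions each output unit sees a 70×70 input patch, and a 256×256 image yields a 30×30 grid of real/fake decisions rather than a single score.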

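In the implementation, the authors replace the original GAN log-likelihood objective with the least-squares GAN loss [2], which trains more stably and gives higher-quality results: G is trained to minimize $\mathbb{E}_{x \sim p_{data}(x)}[(D(G(x)) - 1)^2]$, and D is trained to minimize $\mathbb{E}_{y \sim p_{data}(y)}[(D(y) - 1)^2] + \mathbb{E}_{x \sim p_{data}(x)}[D(G(x))^2]$.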
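As a small illustration of the least-squares objectives, the following sketch evaluates them on lists of discriminator scores. The function names and the toy scores are hypothetical. Real training would apply the same formulas to network outputs inside an autodiff framework.

```python
def lsgan_discriminator_loss(d_real, d_fake):
    """Push scores on real samples toward 1 and scores on generated samples toward 0."""
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return real_term + fake_term


def lsgan_generator_loss(d_fake):
    """The generator wants the discriminator to score its samples as 1."""
    return sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)


# A discriminator that separates real from fake perfectly...
perfect_d = lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0])
fooled_g = lsgan_generator_loss([0.0, 0.0])

# ...versus one that is undecided and outputs 0.5 everywhere.
undecided_d = lsgan_discriminator_loss([0.5, 0.5], [0.5, 0.5])
undecided_g = lsgan_generator_loss([0.5, 0.5])
print(perfect_d, fooled_g, undecided_d, undecided_g)
```

Unlike the log loss, the squared error stays finite and keeps a non-zero gradient when the discriminator confidently rejects a generated sample, which is one common explanation for the improved stability.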

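To reduce model oscillation [1], the authors also update D_X and D_Y with a lag: the discriminators are trained on a history buffer of about 50 previously generated images rather than only on the images the generators have just produced.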
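One way such a history buffer could work is sketched below. The class name ImageHistoryBuffer and the 50/50 swap rule are assumptions on my part (the notes only say that roughly 50 earlier images are kept), and integers stand in for image tensors.

```python
import random


class ImageHistoryBuffer:
    """Keeps up to `capacity` generated images for the discriminator update."""

    def __init__(self, capacity=50, rng=None):
        self.capacity = capacity
        self.images = []
        self.rng = rng or random.Random()

    def query(self, image):
        """Store `image` and return the sample the discriminator should see."""
        if len(self.images) < self.capacity:
            # Buffer not full yet: remember the image and use it directly.
            self.images.append(image)
            return image
        if self.rng.random() < 0.5:
            # Assumed rule: half of the time, swap with a random stored image.
            idx = self.rng.randrange(self.capacity)
            old = self.images[idx]
            self.images[idx] = image
            return old
        # Otherwise use the fresh image and leave the buffer untouched.
        return image


# Toy run with a small capacity; integers play the role of generated images.
buffer = ImageHistoryBuffer(capacity=3, rng=random.Random(0))
returned = [buffer.query(i) for i in range(10)]
print(returned)
```

Because the discriminator sometimes sees older generator outputs, it cannot overfit to the generator's latest behavior. That is the intuition behind the lagged update.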

Experimental Results (omitted)

Limitations

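CycleGAN's successes on unpaired image sets are concentrated on color and texture changes. Translations that require geometric changes (cat -> dog) mostly fail. In addition, the results still fall short of those obtained by training on paired datasets.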

1. Y. Taigman, A. Polyak, and L. Wolf. Unsupervised cross-domain image generation. arXiv preprint arXiv:1611.02200, 2016.

2. Multiclass generative adversarial networks with the l2 loss function.

3. J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, pages 694–711. Springer, 2016.

4. P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.

5. C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016.

6. C. Li and M. Wand. Precomputed real-time texture synthesis with Markovian generative adversarial networks. ECCV, 2016.
