An intriguing failing of convolutional neural networks and the CoordConv solution

NeurIPS 2018

2019-10-10 15:01:48

Paper: https://arxiv.org/pdf/1807.03247.pdf

Official TensorFlow Code: https://github.com/uber-research/coordconv

Unofficial PyTorch Code: https://github.com/walsvid/CoordConv

 

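Synced (机器之心): Convolutional neural networks "fall into a pit", and CoordConv fills it: https://zhuanlan.zhihu.com/p/39665894

Uber proposes CoordConv: solving the coordinate transform problem of ordinary CNNs: https://zhuanlan.zhihu.com/p/39919038

CoordConv, meant to rescue CNNs, gets mocked: does translating a coordinate really need training? https://zhuanlan.zhihu.com/p/39841356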

1. Given a feature map and a coordinate (x, y), how do we generate the corresponding relative coordinate map?

The following code is from: [ICCV19] AdaptIS: Adaptive Instance Selection Network [Github]

    def get_instances_maps(self, F, points, adaptive_input, controller_input):
        if isinstance(points, mx.nd.NDArray):
            self.num_points = points.shape[1]

        if getattr(self.controller_net, 'return_map', False):
            w = self.eqf(controller_input, points)
        else:
            w = self.eqf(controller_input, points)
            w = self.controller_net(w)

        points = F.reshape(points, shape=(-1, 2))
        x = F.repeat(adaptive_input, self.num_points, axis=0)
        x = self.add_coord_features(x, points)

        x = self.block0(x)
        x = self.adain(x, w)
        x = self.block1(x)

        return x

class AppendCoordFeatures(gluon.HybridBlock):
    def __init__(self, norm_radius, append_dist=True, spatial_scale=1.0):
        super(AppendCoordFeatures, self).__init__()
        self.xs = None
        self.spatial_scale = spatial_scale
        self.norm_radius = norm_radius
        self.append_dist = append_dist

    def _ctx_kwarg(self, x):
        if isinstance(x, mx.nd.NDArray):
            return {"ctx": x.context}
        return {}

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):
        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

        coords = F.concat(coord_rows, coord_cols, dim=1)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2),
                           shape=(0, 0, rows, cols))

        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features

    def hybrid_forward(self, F, x, coords):
        if isinstance(x, mx.nd.NDArray):
            self.xs = x.shape

        batch_size, rows, cols = self.xs[0], self.xs[2], self.xs[3]
        coord_features = self.get_coord_features(F, coords, rows, cols, batch_size, **self._ctx_kwarg(x))

        return F.concat(coord_features, x, dim=1)

The same get_coord_features method again, annotated with the values printed in a pdb session:

    def get_coord_features(self, F, points, rows, cols, batch_size, **ctx_kwarg):

        # (Pdb) points, rows, cols, batch_size
        # ([[61. 71.]] <NDArray 1x2 @gpu(0)>, 96, 96, 1)

        # row_array and col_array:
        # [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
        #  18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.
        #  36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52. 53.
        #  54. 55. 56. 57. 58. 59. 60. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71.
        #  72. 73. 74. 75. 76. 77. 78. 79. 80. 81. 82. 83. 84. 85. 86. 87. 88. 89.
        #  90. 91. 92. 93. 94. 95.]
        # <NDArray 96 @gpu(0)>

        # (Pdb) coord_rows
        # [[[[ 0.  0.  0. ...  0.  0.  0.]
        #    [ 1.  1.  1. ...  1.  1.  1.]
        #    [ 2.  2.  2. ...  2.  2.  2.]
        #    ...
        #    [93. 93. 93. ... 93. 93. 93.]
        #    [94. 94. 94. ... 94. 94. 94.]
        #    [95. 95. 95. ... 95. 95. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) coord_cols
        # [[[[ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    ...
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]
        #    [ 0.  1.  2. ... 93. 94. 95.]]]]
        # <NDArray 1x1x96x96 @gpu(0)>

        # (Pdb) add_xy
        # [[[[61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    ...
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]
        #    [61. 61. 61. ... 61. 61. 61.]]
        #
        #   [[71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    ...
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]
        #    [71. 71. 71. ... 71. 71. 71.]]]]
        # <NDArray 1x2x96x96 @gpu(0)>

        # (Pdb) if self.append_dist, then coord_features is:
        # [[[[-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    [-1.         -1.         -1.         ... -1.         -1.         -1.        ]
        #    ...
        #    [ 0.7619048   0.7619048   0.7619048  ...  0.7619048   0.7619048   0.7619048 ]
        #    [ 0.78571427  0.78571427  0.78571427 ...  0.78571427  0.78571427  0.78571427]
        #    [ 0.8095238   0.8095238   0.8095238  ...  0.8095238   0.8095238   0.8095238 ]]
        #
        #   [[-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    ...
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]
        #    [-1.         -1.         -1.         ...  0.52380955  0.54761904  0.5714286 ]]
        #
        #   [[ 1.          1.          1.         ...  1.          1.          1.        ]
        #    [ 1.          1.          1.         ...  1.          1.          1.        ]
        #    [ 1.          1.          1.         ...  1.          1.          1.        ]
        #    ...
        #    [ 1.          1.          1.         ...  0.9245947   0.9382886   0.95238096]
        #    [ 1.          1.          1.         ...  0.944311    0.9577231   0.9715336 ]
        #    [ 1.          1.          1.         ...  0.96421224  0.97735125  0.99088824]]]]
        # <NDArray 1x3x96x96 @gpu(0)>

        pdb.set_trace()
        row_array = F.arange(start=0, stop=rows, step=1, **ctx_kwarg)  ## (96,)
        col_array = F.arange(start=0, stop=cols, step=1, **ctx_kwarg)  ## (96,)
        coord_rows = F.repeat(F.reshape(row_array, (1, 1, rows, 1)), repeats=cols, axis=3)
        coord_cols = F.repeat(F.reshape(col_array, (1, 1, 1, cols)), repeats=rows, axis=2)

        coord_rows = F.repeat(coord_rows, repeats=batch_size, axis=0)
        coord_cols = F.repeat(coord_cols, repeats=batch_size, axis=0)

        coords = F.concat(coord_rows, coord_cols, dim=1)  ## (1, 2, 96, 96)

        add_xy = F.reshape(points * self.spatial_scale, shape=(0, 0, 1))  ## [[[61.] [71.]]] <NDArray 1x2x1 @gpu(0)>
        add_xy = F.reshape(F.repeat(add_xy, rows * cols, axis=2), shape=(0, 0, rows, cols))

        ## self.norm_radius: 42
        coords = (coords - add_xy) / (self.norm_radius * self.spatial_scale)  ## <NDArray 1x2x96x96 @gpu(0)>
        if self.append_dist:
            dist = F.sqrt(F.sum(F.square(coords), axis=1, keepdims=1))  ## <NDArray 1x1x96x96 @gpu(0)>
            coord_features = F.concat(coords, dist, dim=1)
        else:
            coord_features = coords

        coord_features = F.clip(coord_features, a_min=-1, a_max=1)
        return coord_features
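
The debug values above can be checked without MXNet. Below is a minimal NumPy sketch of the same computation, written with broadcasting instead of explicit repeat calls. The function name relative_coord_features and its argument layout are my own for illustration and are not part of AdaptIS; the point (61, 71), the 96x96 map size and norm_radius = 42 are taken from the pdb session.

```python
import numpy as np


def relative_coord_features(points, rows, cols, norm_radius,
                            spatial_scale=1.0, append_dist=True):
    """Hypothetical helper. points: (N, 2) array of (row, col) positions.

    Returns an array of shape (N, 2, rows, cols), or (N, 3, rows, cols)
    when append_dist is True, clipped to [-1, 1].
    """
    points = np.asarray(points, dtype=np.float32)
    n = len(points)
    row_idx = np.arange(rows, dtype=np.float32).reshape(1, rows, 1)
    col_idx = np.arange(cols, dtype=np.float32).reshape(1, 1, cols)
    scale = norm_radius * spatial_scale

    # Offset of every pixel from the point, normalised by the radius.
    d_rows = (row_idx - points[:, 0].reshape(n, 1, 1) * spatial_scale) / scale
    d_cols = (col_idx - points[:, 1].reshape(n, 1, 1) * spatial_scale) / scale
    d_rows = np.broadcast_to(d_rows, (n, rows, cols))
    d_cols = np.broadcast_to(d_cols, (n, rows, cols))

    channels = [d_rows, d_cols]
    if append_dist:
        # Euclidean distance to the point, in units of norm_radius.
        channels.append(np.sqrt(d_rows ** 2 + d_cols ** 2))
    return np.clip(np.stack(channels, axis=1), -1.0, 1.0)


features = relative_coord_features([[61.0, 71.0]], rows=96, cols=96, norm_radius=42)
print(features.shape)            # (1, 3, 96, 96)
print(features[0, 0, 95, 0])     # (95 - 61) / 42
print(features[0, 1, 0, 95])     # (95 - 71) / 42
print(features[0, 2, 95, 95])    # distance channel at the bottom-right pixel
```

The three printed values correspond to the bottom-right entries of the three channels in the dump (0.8095238, 0.5714286 and 0.99088824). Everything farther than norm_radius from the point along an axis saturates at -1 or 1, which is why most of the dump is constant.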

I also wrote a PyTorch version based on the MXNet code:

import torch
import torch.nn as nn


class AddCoords(nn.Module):
    """Relative coordinate map for one unbatched (C, H, W) tensor and one point.

    Unlike the MXNet version, this does not divide by norm_radius and does
    not append the distance channel.
    """

    def __init__(self):
        super().__init__()

    def forward(self, input_tensor, points):
        _, x_dim, y_dim = input_tensor.size()
        batch_size = 1

        xx_channel = torch.arange(x_dim).repeat(1, y_dim, 1)  ## torch.Size([1, 9, 9])
        yy_channel = torch.arange(y_dim).repeat(1, x_dim, 1).transpose(1, 2)  ## torch.Size([1, 9, 9])

        xx_channel = xx_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)
        yy_channel = yy_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)

        coords = torch.cat((xx_channel, yy_channel), dim=1)  ## torch.Size([1, 2, 9, 9])
        coords = coords.type(torch.FloatTensor)

        add_xy = torch.reshape(points, (1, 2, 1))  ## torch.Size([1, 2, 1])
        add_xy_ = add_xy.repeat(1, 1, x_dim * y_dim)  ## torch.Size([1, 2, 81])
        add_xy_ = torch.reshape(add_xy_, (1, 2, x_dim, y_dim))  ## torch.Size([1, 2, 9, 9])
        add_xy_ = add_xy_.type(torch.FloatTensor)

        coords = coords - add_xy_  ## torch.Size([1, 2, 9, 9])
        # torch.clamp replaces the NumPy round trip; the result is the same.
        coord_features = torch.clamp(coords, -1, 1)  ## torch.Size([1, 2, 9, 9])
        return coord_features.cuda()
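
The PyTorch port above handles a single point on an unbatched tensor and leaves out the norm_radius scaling and the distance channel of the MXNet code. As a rough sketch of how those pieces could be added back, the function below uses broadcasting over a batch. The name relative_coords, the (row, col) ordering of points and the choice to return the features on the input's device are my assumptions, not something taken from AdaptIS.

```python
import torch


def relative_coords(feature_map, points, norm_radius, append_dist=True):
    """Hypothetical helper.

    feature_map: (B, C, H, W) tensor; points: (B, 2) tensor of (row, col).
    Returns (B, 2, H, W), or (B, 3, H, W) when append_dist is True.
    """
    b, _, h, w = feature_map.shape
    device = feature_map.device
    points = points.to(device=device, dtype=torch.float32)

    rows = torch.arange(h, dtype=torch.float32, device=device).view(1, h, 1)
    cols = torch.arange(w, dtype=torch.float32, device=device).view(1, 1, w)

    # Normalised offsets of every pixel from each sample's point.
    d_rows = ((rows - points[:, 0].reshape(b, 1, 1)) / norm_radius).expand(b, h, w)
    d_cols = ((cols - points[:, 1].reshape(b, 1, 1)) / norm_radius).expand(b, h, w)

    channels = [d_rows, d_cols]
    if append_dist:
        channels.append(torch.sqrt(d_rows ** 2 + d_cols ** 2))
    return torch.stack(channels, dim=1).clamp(-1.0, 1.0)


x = torch.zeros(2, 8, 96, 96)
pts = torch.tensor([[61.0, 71.0], [10.0, 20.0]])
feats = relative_coords(x, pts, norm_radius=42)
out = torch.cat([feats, x], dim=1)  # coordinate channels first, as in hybrid_forward
print(feats.shape, out.shape)
```

Concatenating the coordinate channels in front of the feature map mirrors the F.concat(coord_features, x, dim=1) call in the MXNet hybrid_forward.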

 
