[Paper Reading] DGCNN: Dynamic Graph CNN for Learning on Point Clouds

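　　My graduation project has led me into graph neural networks, which I find fairly hard, so I am learning them bit by bit. The method in this paper is the main one used in the relation-modeling stage of "Rethinking Table Recognition using Graph Neural Networks".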

## Overview

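　　This paper improves on the classic PointNet. Its main goal is to design a CNN architecture that takes raw point clouds directly as input and applies to tasks such as classification and segmentation. The main contribution is a new differentiable network module, EdgeConv (edge convolution), which extracts local neighborhood information.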

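　　Its overall network architecture is shown in the figure below; the following points are worth noting:

- The overall architecture is similar to PointNet; the most important difference is that EdgeConv replaces the per-point MLP.
- Each EdgeConv module considers both global and local features. It consists of an edge-feature function (Figure 2, left) and an aggregation function (Figure 2, right).
- The K of the KNN graph in the EdgeConv module is a hyperparameter: K=20 in the classification network and K=30 in the segmentation network. For the table recognition task I use K=10.
- In the segmentation network, the global descriptor is concatenated with the local descriptors from every layer, and a prediction score is then output for each point.
- The fully connected MLP in each layer computes the edge features, which is what implements the dynamic graph convolution.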
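
The KNN graph mentioned in the list above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the paper's implementation: the function name `knn_indices` is hypothetical, and excluding each point from its own neighbor list is an assumption made here. The graph is "dynamic" because the same routine is applied again to the output features of each layer rather than only to the input coordinates.

```python
import numpy as np

def knn_indices(features, k):
    """Return, for each point, the indices of its k nearest neighbours.

    features: array of shape (n, f), one row per point.
    Assumption: a point is never listed as its own neighbour.
    """
    # Pairwise squared Euclidean distances, shape (n, n).
    sq_norms = np.sum(features ** 2, axis=1)
    dists = sq_norms[:, None] - 2.0 * features @ features.T + sq_norms[None, :]
    # Push self-matches infinitely far away so they are never selected.
    np.fill_diagonal(dists, np.inf)
    # Sort each row in ascending order and keep the k closest points.
    return np.argsort(dists, axis=1)[:, :k]

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
neighbours = knn_indices(points, k=2)
print(neighbours)
```

In a multi-layer network, calling `knn_indices` on each layer's output would rebuild the graph in feature space, so two points that are far apart in 3D can still become neighbors in deeper layers.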

## Edge Convolution

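Consider an $F$-dimensional point cloud with $n$ points, $\mathbf{X}=\left\{\mathbf{x}_{1}, \ldots, \mathbf{x}_{n}\right\} \subseteq \mathbb{R}^{F}$. In the simplest case $F=3$ (the x, y, z coordinates); each point may also carry extra information such as color or surface normals.

A directed graph $\mathcal{G}=(\mathcal{V}, \mathcal{E})$ represents the structure of the point cloud, with vertices $\mathcal{V}=\{1, \ldots, n\}$ and edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. The edge features are defined as $\mathbf{e}_{i j}=h_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)$, where $h_{\Theta}: \mathbb{R}^{F} \times \mathbb{R}^{F} \rightarrow \mathbb{R}^{F^{\prime}}$ is a mapping with learnable parameters $\Theta$ that derives edge features from node features.

Figure 2 (left) shows how the edge feature $\mathbf{e}_{i j}$ between a point $\mathbf{x}_{i}$ and a neighboring point $\mathbf{x}_{j}$ is computed. In my implementation $h$ is three fully connected layers built with tf.layers.dense. (Note: dense and fully connected are two names for the same thing.)

Figure 2 (right) shows how node features are updated by an aggregation operation, denoted $\square$ (for example sum or max): $\mathbf{x}_{i}^{\prime}=\square_{j:(i, j) \in \mathcal{E}} h_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)$. Depending on the requirements, there are four different choices of $h$ and $\square$: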
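- Treat the feature of $\mathbf{x}_{i}$ as a weighted sum over all surrounding points, which is analogous to convolution on images. Each kernel $\boldsymbol{\theta}_{m}$ has the same dimension as $\mathbf{x}$, and $\Theta=\left(\boldsymbol{\theta}_{1}, \ldots, \boldsymbol{\theta}_{M}\right)$ is the set of all kernels: $x_{i m}^{\prime}=\sum_{j:(i, j) \in \mathcal{E}} \boldsymbol{\theta}_{m} \cdot \mathbf{x}_{j}$
- Consider only global features, which is what PointNet does: $h_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)=h_{\Theta}\left(\mathbf{x}_{i}\right)$
- Consider only local features, so that the input is just the difference between a point and its neighbors: $h_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)=h_{\Theta}\left(\mathbf{x}_{j}-\mathbf{x}_{i}\right)$
- Consider global and local features together, which is the main form used in this paper: $h_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}\right)=\bar{h}_{\Theta}\left(\mathbf{x}_{i}, \mathbf{x}_{j}-\mathbf{x}_{i}\right)$
  - In this paper, $h$ is $e_{i j m}^{\prime}=\operatorname{ReLU}\left(\boldsymbol{\theta}_{m} \cdot\left(\mathbf{x}_{j}-\mathbf{x}_{i}\right)+\boldsymbol{\phi}_{m} \cdot \mathbf{x}_{i}\right)$, and the aggregation function is max: $x_{i m}^{\prime}=\max _{j:(i, j) \in \mathcal{E}} e_{i j m}^{\prime}$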
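
To make the last form concrete, here is a minimal NumPy sketch of a single EdgeConv step with ReLU and max aggregation. It is only an illustration under stated assumptions: `edge_conv` is a hypothetical name, `theta` and `phi` are single linear maps rather than a multi-layer MLP, and the neighbor indices are given by hand instead of being computed by KNN.

```python
import numpy as np

def edge_conv(x, neighbours, theta, phi):
    """One EdgeConv step with max aggregation.

    x:          (n, f)  point features
    neighbours: (n, k)  neighbour indices j for each point i
    theta, phi: (f, m)  weights applied to (x_j - x_i) and to x_i
    returns:    (n, m)  updated point features
    """
    x_i = x[:, None, :]    # (n, 1, f), broadcast over the k neighbours
    x_j = x[neighbours]    # (n, k, f)
    # e'_{ijm} = ReLU(theta_m . (x_j - x_i) + phi_m . x_i)
    edge = np.maximum((x_j - x_i) @ theta + x_i @ phi, 0.0)  # (n, k, m)
    # x'_{im} = max over the neighbours j of point i
    return edge.max(axis=1)

x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
neighbours = np.array([[1, 2], [0, 2], [0, 1]])
theta = np.eye(2)           # identity: edge feature is just ReLU(x_j - x_i)
phi = np.zeros((2, 2))      # zero: ignore the global term in this toy example
out = edge_conv(x, neighbours, theta, phi)
print(out)
```

Setting `theta` to zero instead recovers the PointNet-like case, where every edge feature depends on $\mathbf{x}_{i}$ alone; setting `phi` to zero, as in the example, gives the purely local case.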

## Code used in the table recognition task

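　　In my graduation project, the table recognition task, I mainly borrow the idea of edge conv and the network structure of the segmentation branch.

　　The purpose of this stage is to extract node features: the input has shape (25,900,133) (batch_size, node_num, feature_num), and the output is the node information after DGCNN processing, with shape (25,900,128). The edge convolution layer at its core is as follows: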

```python
import tensorflow as tf  # TensorFlow 1.x API (tf.layers)

# indexing_tensor is a k-nearest-neighbour helper defined elsewhere in the project (not shown here).


def edge_conv_layer(vertices_in, num_neighbors=30,
                    mpl_layers=(64, 64, 64),
                    aggregation_function=tf.reduce_max,
                    share_keyword=None,  # TBI
                    edge_activation=None):
    # Shape comments assume batch_size=25, 900 nodes, 64 features and num_neighbors=10.
    trans_space = vertices_in  # (25, 900, 64)
    # (batch, neighbour) index pairs of the nearest neighbours of every node
    indexing, _ = indexing_tensor(trans_space, num_neighbors)  # (25, 900, 10, 2)
    # change indexing to be not self-referential
    neighbour_space = tf.gather_nd(vertices_in, indexing)  # (25, 900, 10, 64)

    # Repeat each node's own features once per neighbour
    expanded_trans_space = tf.expand_dims(trans_space, axis=2)
    expanded_trans_space = tf.tile(expanded_trans_space, [1, 1, num_neighbors, 1])  # (25, 900, 10, 64)

    # Note: this is x_i - x_j, the opposite sign of x_j - x_i in the formula;
    # the learned weights of the first dense layer absorb the sign.
    diff = expanded_trans_space - neighbour_space  # (25, 900, 10, 64)
    edge = tf.concat([expanded_trans_space, diff], axis=-1)  # (25, 900, 10, 128)

    # Three fully connected layers compute the edge features
    for f in mpl_layers:
        edge = tf.layers.dense(edge, f, activation=tf.nn.relu)  # (25, 900, 10, 64)
    if edge_activation is not None:
        edge = edge_activation(edge)

    # Aggregate over the neighbour axis
    vertex_out = aggregation_function(edge, axis=2)  # (25, 900, 64)
    return vertex_out
```
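
The `indexing_tensor` helper is not shown in this post. Judging only from the shape comment `(25,900,10,2)` and from how its result is passed to `tf.gather_nd`, it presumably returns `[batch index, neighbor index]` pairs. The NumPy sketch below is a guess at that behavior, not the project's actual code: `indexing_tensor_sketch` and `gather_nd` are hypothetical names, and the real helper also returns a second value that is ignored here.

```python
import numpy as np

def indexing_tensor_sketch(space, k):
    """Hypothetical NumPy stand-in for the project's indexing_tensor helper.

    space:   (b, n, f) point features
    returns: (b, n, k, 2) integer pairs [batch index, neighbour index],
             with each point excluded from its own neighbour list (assumption).
    """
    b, n, _ = space.shape
    diff = space[:, :, None, :] - space[:, None, :, :]   # (b, n, n, f)
    dists = np.sum(diff ** 2, axis=-1)                    # (b, n, n)
    dists[:, np.arange(n), np.arange(n)] = np.inf         # drop self-matches
    nearest = np.argsort(dists, axis=-1)[:, :, :k]        # (b, n, k)
    batch = np.broadcast_to(np.arange(b)[:, None, None], nearest.shape)
    return np.stack([batch, nearest], axis=-1)

def gather_nd(params, indexing):
    """NumPy analogue of tf.gather_nd for (batch, point) index pairs."""
    return params[indexing[..., 0], indexing[..., 1]]

vertices = np.random.default_rng(0).normal(size=(2, 5, 4))
indexing = indexing_tensor_sketch(vertices, k=3)
neighbour_space = gather_nd(vertices, indexing)
print(indexing.shape, neighbour_space.shape)
```

The batch index is needed in every pair because `tf.gather_nd` indexes from the first axis; without it, neighbor indices from one sample would pick rows out of the wrong sample.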
