Paper Read: Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

Robust Deep Multi-modal Learning Based on Gated Information Fusion Network

2018-07-27 14:25:26

Paper: https://arxiv.org/pdf/1807.06233.pdf

Related Papers:

1. Infrared and visible image fusion methods and applications: A survey.

2. Chenglong Li, Xiao Wang, Lei Zhang, Jin Tang, Hejun Wu, and Liang Lin. WELD: Weighted Low-rank Decomposition for Robust Grayscale-Thermal Foreground Detection. IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 27(4): 725-738, 2017.

3. Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, and Jin Tang. RGB-T Object Tracking: Benchmark and Baseline. arXiv preprint.

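This paper addresses the multi-modal fusion problem and proposes a gate-based fusion strategy that fuses information from multiple modalities adaptively. The authors apply the method to object detection, with the overall pipeline described below.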

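The authors use two network branches to extract the features of the two modalities. The network is built from a standard VGG-16 and 8 extra convolutional layers. In addition, the authors propose a new GIF (Gated Information Fusion) network that fuses information across the modalities to obtain better results. The motivation is that the modalities carry complementary information, but not all of it is equally useful: some of it helps a lot, while some may be of poor quality and contribute little. Fusing the modalities adaptively therefore gives a better result.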

Gated Information Fusion Network (GIF):

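As shown in Fig. 2 of the paper, the input to the GIF network is the CNN feature maps that have already been extracted, here $F_1$ and $F_2$. The two feature maps are concatenated to obtain $F_G$. The network has two parts: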

1. the information fusion network (the part outside the dashed box in Fig. 2);

2. the weight generation network (WG Network, the dashed box in Fig. 2).

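The Weight Generation Network applies two $3 \times 3 \times 1$ convolution kernels to the concatenated feature map $F_G$ and feeds each result into a sigmoid function (the gate layer), which outputs the corresponding weights $w_1$ and $w_2$.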
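To make the gate computation concrete, here is a minimal NumPy sketch of the weight generation step. It is an illustration under assumptions, not the authors' code: I assume each $3 \times 3 \times 1$ kernel spans all channels of $F_G$ and produces a single-channel weight map, that zero padding keeps the spatial size, and that there is no bias term. The function names (`conv3x3_single`, `generate_weights`) and the toy sizes are hypothetical.

```python
import numpy as np

def conv3x3_single(x, kernel):
    """Zero-padded 3x3 convolution mapping (C, H, W) to one (H, W) map.

    Assumption: the kernel covers all C input channels, shape (C, 3, 3).
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # pad only the spatial axes
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            # Contract the channel axis for this kernel offset.
            out += np.tensordot(kernel[:, dy, dx],
                                padded[:, dy:dy + h, dx:dx + w], axes=1)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def generate_weights(f1, f2, kernel1, kernel2):
    """Concatenate the two feature maps and emit one gate map per modality."""
    f_g = np.concatenate([f1, f2], axis=0)          # F_G, shape (2k, H, W)
    w1 = sigmoid(conv3x3_single(f_g, kernel1))      # gate for modality 1
    w2 = sigmoid(conv3x3_single(f_g, kernel2))      # gate for modality 2
    return w1, w2

# Toy example with random features and kernels (hypothetical sizes).
rng = np.random.default_rng(0)
k, h, w = 4, 5, 5
f1 = rng.normal(size=(k, h, w))
f2 = rng.normal(size=(k, h, w))
kernel1 = rng.normal(size=(2 * k, 3, 3)) * 0.1
kernel2 = rng.normal(size=(2 * k, 3, 3)) * 0.1
w1, w2 = generate_weights(f1, f2, kernel1, kernel2)
print(w1.shape, w2.shape)  # (5, 5) (5, 5)
```

Because of the sigmoid, every entry of the two weight maps lies strictly between 0 and 1, so each gate can only attenuate its modality, per spatial location.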

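The information fusion network multiplies each original feature map element-wise by its weight to obtain the weighted feature maps, concatenates the two, and applies $1 \times 1 \times 2k$ convolution kernels to obtain the final feature map.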
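The fusion step can be sketched in a few lines of NumPy. This is only an illustration under assumptions: I take $k$ to be the number of channels per modality, treat the $1 \times 1 \times 2k$ convolution as a per-pixel linear map over the $2k$ concatenated channels, assume it outputs $k$ channels (the post does not say), and omit any bias. The name `fuse` and the projection matrix `proj` are hypothetical.

```python
import numpy as np

def fuse(f1, f2, w1, w2, proj):
    """Gate each modality, concatenate, then mix channels with a 1x1 conv.

    f1, f2: feature maps of shape (k, H, W)
    w1, w2: gate maps of shape (H, W), broadcast over the channel axis
    proj:   1x1 convolution weights of shape (k_out, 2k)
    """
    weighted = np.concatenate([f1 * w1, f2 * w2], axis=0)  # (2k, H, W)
    # A 1x1 convolution is a linear map over channels at every pixel.
    return np.einsum("oc,chw->ohw", proj, weighted)

# Toy check: open the gate for modality 1, close it for modality 2.
rng = np.random.default_rng(0)
k, h, w = 3, 4, 4
f1 = rng.normal(size=(k, h, w))
f2 = rng.normal(size=(k, h, w))
w1 = np.ones((h, w))
w2 = np.zeros((h, w))
proj = np.hstack([np.eye(k), np.eye(k)])  # sums the two modalities channel-wise
fused = fuse(f1, f2, w1, w2, proj)
print(np.allclose(fused, f1))  # True
```

With the second gate fully closed, the fused map reduces to the first modality's features, which is the behaviour the gating is meant to allow when one modality is unreliable.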

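To summarize the whole process: the two feature maps are concatenated into $F_G$, the WG Network turns $F_G$ into the two gate weights $w_1$ and $w_2$, each feature map is scaled by its weight, and the weighted feature maps are concatenated and fused by the $1 \times 1 \times 2k$ convolution.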
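In symbols, the steps above could be written roughly as follows. This is my own restatement of the description, not a copy of the paper's equations: $[\cdot,\cdot]$ denotes channel-wise concatenation, $*$ convolution, $\odot$ element-wise multiplication, and $\sigma$ the sigmoid. The kernel and bias names $C_{w_i}$, $b_{w_i}$, $C_J$, $b_J$ and the output names $F_F$, $F_J$ are assumed notation, and any nonlinearity after the last convolution is left out because the notes do not mention one.

$$
\begin{aligned}
F_G &= [F_1, F_2] \\
w_i &= \sigma\left(C_{w_i} * F_G + b_{w_i}\right), \quad i \in \{1, 2\} \\
F_F &= [\, w_1 \odot F_1,\; w_2 \odot F_2 \,] \\
F_J &= C_J * F_F + b_J
\end{aligned}
$$

Here $C_{w_1}$ and $C_{w_2}$ stand for the two $3 \times 3 \times 1$ kernels and $C_J$ for the $1 \times 1 \times 2k$ kernels.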

== Done !
