Transformers for Graph Representation
Do Transformers Really Perform Badfor Graph Representation?
1 Introduction
作者们发现关键问题在于如何补回Transformer模型的自注意力层丢失掉的图结构信息!不同于序列数据(NLP, Speech)或网格数据(CV),图的结构信息是图数据特有的属性,且对图的性质预测起着重要的作用。
There are many attempts of leveraging Transformer into the graph domain, but the only effective way is replacing some key modules (e.g., feature aggregation) in classic GNN variants by the softmax attention[47,7,22,48,58,43,13]
- [47] Graph attention networks. ICLR, 2018.
- [7] Graph transformer for graph-to-sequence learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 7464–7471, 2020.
- [22] Heterogeneous graph transformer. In Proceedings of The Web Conference 2020, pages 2704–2710, 2020.
- [48] Direct multi-hop attention based graph neural network.arXiv preprint arXiv:2009.14332, 2020.
- [58] Graph-bert: Only attention is needed forlearning graph representations.arXiv preprint arXiv:2001.05140, 2020.
- [43] Self-supervised graph transformer on large-scale molecular data. Advances in Neural Information ProcessingSystems, 33, 2020.
- [13] generalization of transformer networks to graphs. AAAI Workshop on Deep Learning on Graphs: Methods and Applications, 2021
- Centrality Encoding: capture the node importance in the graph. In particular, we leverage the degree centrality for the centrality encoding, where a learnable vectoris assigned to each node according to its degree and added to the node features in the input layer.
- Spatial Encoding: capture the structural relation between nodes.
- Edge Encoding
2 Graphormer
2.1 Structural Encodings in Graphormer
2.1.1 a Centrality Encoding
In Graphormer, we use the degree centrality, which is one of the standard centrality measures inliterature, as an additional signal to the neural network. To be specific, we develop a Centrality Encoding which assigns each node two real-valued embedding vectors according to its indegree and outdegree.
2.1.2 a Centrality Encoding
An advantage of Transformer is its global receptive field.
Spatial Encoding:
In this paper, we choose φ(vi,vj) to be the distance of the shortest path (SPD) between vi and vj if the two nodes are connected. If not, we set the output ofφto be a special value, i.e., -1. We assign each (feasible) output value a learnable scalar which will serve as a bias term in the self-attention module. Denote Aij as the (i,j)-element of the Query-Key product matrix A, we have:
2.1.3 Edge Encoding in the Attention
In many graph tasks, edges also have structural features.
In the first method, the edge features areadded to the associated nodes’ features [21,29].
- [21] Open graph benchmark: Datasets for machine learning on graphs.arXiv preprintarXiv:2005.00687, 2020.
- [29] Deepergcn: All you need to train deepergcns.arXiv preprint arXiv:2006.07739, 2020
In the second method, for each node, its associated edges’ features will be used together with the node features in the aggregation [15,51,25].
- [51] How powerful are graph neural networks?InInternational Conference on Learning Representations, 2019.
- [25] Semi-supervised classification with graph convolutional networks.arXiv preprint arXiv:1609.02907, 2016
However, such ways of using edge feature only propagate the edge information to its associated nodes, which may not be an effective way to leverage edge information in representation of the whole graph.
a new edge encoding method in Graphormer:
3.2 Implementation Details of Graphormer
Graphormer Layer:
- MHA: multi-head self-attention (MHA)
- FFN: the feed-forward blocks
- LN: the layer normalization
Special Node:
生成一个VNODE连接图中所有的点,而它与所有节点的 spatial encodings 是 a distinct learnable scalar
3 Experiments
3.1 OGB Large-Scale Challenge
3.2 Graph Representation
Transformers for Graph Representation的更多相关文章
- 论文解读(Graphormer)《Do Transformers Really Perform Bad for Graph Representation?》
论文信息 论文标题:Do Transformers Really Perform Bad for Graph Representation?论文作者:Chengxuan Ying, Tianle Ca ...
- 论文解读GALA《Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning》
论文信息 Title:<Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learn ...
- 论文解读(SUGRL)《Simple Unsupervised Graph Representation Learning》
Paper Information Title:Simple Unsupervised Graph Representation LearningAuthors: Yujie Mo.Liang Pen ...
- 论文解读(GMI)《Graph Representation Learning via Graphical Mutual Information Maximization》2
Paper Information 论文作者:Zhen Peng.Wenbing Huang.Minnan Luo.Q. Zheng.Yu Rong.Tingyang Xu.Junzhou Huang ...
- 论文解读(GMI)《Graph Representation Learning via Graphical Mutual Information Maximization》
Paper Information 论文作者:Zhen Peng.Wenbing Huang.Minnan Luo.Q. Zheng.Yu Rong.Tingyang Xu.Junzhou Huang ...
- 论文解读(GRCCA)《 Graph Representation Learning via Contrasting Cluster Assignments》
论文信息 论文标题:Graph Representation Learning via Contrasting Cluster Assignments论文作者:Chun-Yang Zhang, Hon ...
- 论文解读(MERIT)《Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning》
论文信息 论文标题:Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning ...
- 论文解读(SUBG-CON)《Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning》
论文信息 论文标题:Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning论文作者:Yizhu Ji ...
- 论文阅读 Dynamic Graph Representation Learning Via Self-Attention Networks
4 Dynamic Graph Representation Learning Via Self-Attention Networks link:https://arxiv.org/abs/1812. ...
随机推荐
- Day003 类型转换
类型转换 由于java是强类型语言,所以要进行有些运算的时候,需要用到类型转换 低------------------------------------------------------> ...
- java+selenium使用JS、键盘滑动滚动条
本篇文章介绍如何使用JS和键盘对象对页面进行滑动滚动条-------------主要针对java做自动化测试的同学 一:使用键盘对象操作滚动条 //导包 import org.openqa.selen ...
- 10.Debug
1.Debug模式 1.1 什么是Debug模式 是供程序员使用的程序调试工具,它可以用于查看程序的执行流程,也可以用于追踪程序执行过程来调试程序. 1.2 Debug介绍与操作流程 Debug调式, ...
- golang:TCP总结
在TCP/IP协议中,"IP地址+TCP或UDP端口号"唯一标识网络通讯中的一个进程."IP地址+端口号"就对应一个socket.欲建立连接的两个进程各自有一个 ...
- [bug] Springboot JPA使用Sort排序时的问题
参考 https://blog.csdn.net/qq_44039966/article/details/102713779
- [Windows] 屏幕截图 - FastStone Capture(FSCapture) v9.4 飞扬时空汉化绿色版(官方地址) 【清晰好用 已验证】
[Windows] 屏幕截图 - FastStone Capture(FSCapture) v9.4 飞扬时空汉化绿色版(官方地址) [复制链接] 愤怒の葡萄 电梯直达 楼主 发表于 2 ...
- Select Screen 0 with xrandr Ask QuestionScreen 0" here describes your whole virtual display made of these two outputs: eDP-1-
Screen 0" here describes your whole virtual display made of these two outputs: eDP-1-1: physica ...
- Linux下Shell实现服务器IP监测
实验室有一个服务器放在机房,装的是Ubuntu Server,IP为自动分配,因此一旦IP有变化就无法远程操作,必须去机房记录新的IP.学了几天Shell之后想,是不是可以定时检测其IP的变化,一旦有 ...
- 强哥MySQL学习笔记
数据库服务器:1.数据库2.数据表 数据表:1.表结构(字段)2.表数据(记录)3.表索引(加快检索) 表引擎:1.myisam2.innodb 查看表字段desc table;删除数据库:drop ...
- python 判断对象是否相等以及eq函数
当对两个点的实例进行值的比较时,比如p1=Point(1,1) p2=Point(1,2),判断p1==p2时__eq__()会被调用,用以判断两个实例是否相等.在上述代码中定义了只要x和y的坐标相同 ...