Paper information

Title: Rumor Detection with Self-supervised Learning on Texts and Social Graph
Authors: Yuan Gao, Xiang Wang, Xiangnan He, Huamin Feng, Yongdong Zhang
Source: 2022, arXiv

1 Introduction

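　　Motivation: exploit heterogeneous information, i.e. both the post texts and the social (propagation) graph.

　　Contributions: see the paper's own summary; they are not detailed in these notes.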

2 Methodology

　　The overall framework consists of four modules:

　　(1) propagation representation learning, which applies a GNN model on the propagation tree;

　　(2) semantic representation learning, which employs a text CNN model on the post contents;

　　(3) contrastive learning, which models the co-occurring relations among propagation and semantic representations;

　　(4) rumor prediction, which builds a predictor model upon the event representations.

2.1 Propagation Representation Learning

　　This module captures structural features.

　　Post features are encoded with a graph convolution layer:

　　　　$\mathbf{H}^{(l)}=\sigma\left(\mathbf{D}^{-\frac{1}{2}} \hat{\mathbf{A}} \mathbf{D}^{-\frac{1}{2}} \mathbf{H}^{(l-1)} \mathbf{W}^{(l)}\right)\quad\quad\quad(2)$

　　The graph-level representation of the post is obtained by mean pooling:

　　　　$\mathbf{g}=f_{\text {mean-pooling }}\left(\mathbf{H}^{(L)}\right)\quad\quad\quad(3)$
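
The two formulas above can be sketched in plain Python as follows. This is only an illustrative sketch under assumptions not stated in the notes: $\hat{\mathbf{A}}$ is taken to be the adjacency matrix with self-loops, $\mathbf{D}$ its degree matrix, and $\sigma$ is taken to be ReLU. The toy tree, features and identity weights are made up for the example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One GCN layer: ReLU(D^{-1/2} A_hat D^{-1/2} H W), assuming A_hat = A + I."""
    n = len(adj)
    # Add self-loops to obtain A_hat (assumption, standard GCN convention).
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Node degrees of A_hat, used for the symmetric normalisation.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(h):
    """Average the node embeddings into one graph-level vector g."""
    return [sum(col) / len(h) for col in zip(*h)]

# Toy propagation tree: source post 0 with two replies, 1 and 2.
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for illustration only
g = mean_pool(gcn_layer(adj, feats, weight))
```

In practice several such layers would be stacked and the weights learned; the sketch only shows the shape of the computation.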

2.2 Semantic Representation Learning

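　　This module captures semantic features.

　　First, multi-head attention is applied to the post features to obtain the initial word embeddings $\mathbf{Z} \in \mathbb{R}^{l \times d_{\text {model }}}$, where $l$ is the number of posts and $d_{\text {model }}$ is the dimension of each post's features: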

　　　　$\boldsymbol{Z}_{i}=f_{\text {attention }}\left(\boldsymbol{Q}_{i}, \boldsymbol{K}_{i}, \boldsymbol{V}_{i}\right)=f_{\text {softmax }}\left(\frac{\boldsymbol{Q}_{i} \boldsymbol{K}_{i}^{T}}{\sqrt{d_{k}}}\right) \boldsymbol{V}_{i}\quad\quad\quad(4)$

　　　　$\boldsymbol{Z}=f_{\text {multi-head }}(\boldsymbol{Q}, \boldsymbol{K}, \boldsymbol{V})=f_{\text {concatenate }}\left(\boldsymbol{Z}_{1}, \ldots, \boldsymbol{Z}_{h}\right) \boldsymbol{W}^{O}\quad\quad\quad(5)$
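
A minimal sketch of scaled dot-product attention and the multi-head concatenation, in plain Python. It assumes the per-head $\boldsymbol{Q}_i, \boldsymbol{K}_i, \boldsymbol{V}_i$ are passed in directly (in a real model they come from learned projections), and the output matrix `w_o` below is a hypothetical one chosen only to make the example easy to check.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for one head: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    out = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d_k) for k_row in k]
        weights = softmax(scores)
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, v))
                    for j in range(len(v[0]))])
    return out

def multi_head(heads_qkv, w_o):
    """Concatenate the per-head outputs along the feature axis, then project with W^O."""
    head_outputs = [attention(q, k, v) for q, k, v in heads_qkv]
    concat = [sum((h[i] for h in head_outputs), []) for i in range(len(head_outputs[0]))]
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*w_o)] for row in concat]

x = [[1.0, 0.0], [0.0, 1.0]]                               # two items, 2-dim features
w_o = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]      # hypothetical 4x2 projection
z = multi_head([(x, x, x), (x, x, x)], w_o)                 # two heads
```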

　　Next, a CNN further extracts textual information.

　　With a filter of receptive-field size $h$, each window of embeddings yields a feature $\boldsymbol{v}_{i}$:

　　　　$\boldsymbol{v}_{i}=\sigma\left(\boldsymbol{w} \cdot \boldsymbol{z}_{i: i+h-1}+\boldsymbol{b}\right)\quad\quad\quad(6)$

　　Sliding the filter over the whole sentence yields the feature map:

　　　　$\boldsymbol{v}=\left[\boldsymbol{v}_{1}, \boldsymbol{v}_{2}, \ldots, \boldsymbol{v}_{n-h+1}\right]\quad\quad\quad(7)$

　　Max-pooling over the feature map $\boldsymbol{v}$ gives the global representation $\hat{\boldsymbol{v}}$:

　　　　$\hat{\boldsymbol{v}}=f_{\text {max-pooling }}(\boldsymbol{v})\quad\quad\quad(8)$

　　Using $n$ feature maps and concatenating their pooled outputs gives the text representation $\mathbf{t}$:

　　　　$\mathbf{t}=f_{\text {concatenate }}\left(\hat{\boldsymbol{v}}_{1}, \hat{\boldsymbol{v}}_{2}, \ldots, \hat{\boldsymbol{v}}_{n}\right)\quad\quad\quad(9)$
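
The convolution, max-pooling and concatenation steps can be sketched as below. This is an illustration only: $\sigma$ is assumed to be ReLU, each filter produces a scalar per window, and the embeddings and filter weights are invented for the example.

```python
def conv_feature_map(z, w, b):
    """Slide one filter of width h = len(w) over the sequence z, applying ReLU."""
    h = len(w)
    feats = []
    for i in range(len(z) - h + 1):
        window = z[i:i + h]
        s = sum(wv * zv
                for w_row, z_row in zip(w, window)
                for wv, zv in zip(w_row, z_row)) + b
        feats.append(max(0.0, s))
    return feats

def text_cnn(z, filters):
    """Max-pool each feature map and concatenate the pooled values into t."""
    return [max(conv_feature_map(z, w, b)) for w, b in filters]

z = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.0, 0.0]]  # 4 positions, 2-dim embeddings
filters = [
    ([[1.0, 0.0], [0.0, 1.0]], 0.0),  # filter of width 2
    ([[1.0, 1.0]], 0.5),              # filter of width 1
]
t = text_cnn(z, filters)
```

Each filter contributes one entry to `t`, so the length of the text representation equals the number of filters.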

2.3 Contrastive Learning

　　The paper treats the structure-based representation $\boldsymbol{g}_{i}$ and the semantics-based representation $\boldsymbol{t}_{i}$ of the same post as a positive pair.

2.3.1 Propagation-Semantic Instance Discrimination (PSID)

　　　　${\large \mathcal{L}_{\mathrm{ssl}}=\sum\limits_{i \in C}-\log \left[\frac{\exp \left(s\left(\boldsymbol{g}_{i}, \boldsymbol{t}_{i}\right) / \tau\right)}{\sum\limits _{j \in C} \exp \left(s\left(\boldsymbol{g}_{i}, \boldsymbol{t}_{j}\right) / \tau\right)}\right]} \quad\quad\quad(10)$
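
A small sketch of this instance-level loss in plain Python. The notes do not say which similarity $s(\cdot,\cdot)$ is used, so cosine similarity is assumed here, and the temperature value is arbitrary.

```python
import math

def cosine(u, v):
    """Cosine similarity, used here as an assumed choice for s(., .)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def psid_loss(g, t, tau=0.5):
    """Sum over posts i of -log( exp(s(g_i,t_i)/tau) / sum_j exp(s(g_i,t_j)/tau) )."""
    loss = 0.0
    for i, g_i in enumerate(g):
        logits = [cosine(g_i, t_j) / tau for t_j in t]
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += log_denom - logits[i]
    return loss

g = [[1.0, 0.0], [0.0, 1.0]]          # propagation representations of two posts
t_aligned = [[1.0, 0.1], [0.1, 1.0]]  # text representations matching their own post
t_swapped = [[0.1, 1.0], [1.0, 0.1]]  # text representations matching the other post
loss_aligned = psid_loss(g, t_aligned)
loss_swapped = psid_loss(g, t_swapped)
```

The loss is small when each $\boldsymbol{g}_i$ is closest to its own $\boldsymbol{t}_i$ and grows when it is closer to another post's text representation.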

2.3.2 Propagation-Semantic Cluster Discrimination (PSCD)

　　Cluster-level contrastive learning:

　　　　$\begin{array}{l}\underset{\mathbf{S}_{G}}{\text{min}}\quad \sum\limits _{c \in C} \underset{\mathbf{a}_{1}}{\text{min}} \left\|E_{1}(\mathbf{g})-\mathbf{S}_{G} \mathbf{a}_{1}\right\|_{2}^{2}+\underset{\mathbf{S}_{T}}{\text{min}} \sum\limits _{c \in C} \underset{\mathbf{a}_{2}}{\text{min}} \left\|E_{2}(\mathbf{t})-\mathbf{S}_{T} \mathbf{a}_{2}\right\|_{2}^{2}\\\text { s.t. } \quad \mathbf{a}_{1}^{\top} \mathbf{1}=1, \quad \mathbf{a}_{2}^{\top} \mathbf{1}=1\end{array}\quad\quad\quad(11)$

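　　where:

- $\mathbf{S}_{G}\in \mathbb{R}^{d \times K}$ and $\mathbf{S}_{T} \in \mathbb{R}^{d \times K}$ are trainable centroid matrices for the structure-based and semantics-based views, respectively;
- $\mathbf{a}_{1}\in\{0,1\}^{K}$ and $\mathbf{a}_{2} \in\{0,1\}^{K}$ are the cluster assignments;
- $E_{1}$ and $E_{2}$ are the encoders.

　　This yields a set of optimal cluster assignments $\left\{\mathbf{a}_{1}^{*}, \mathbf{a}_{2}^{*} \mid c \in \mathcal{C}\right\}$, which serve as pseudo-labels (supervision signals) to enhance the post representations: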

　　　　$\mathcal{L}_{\mathrm{ssl}}=\sum\limits _{c \in C} l\left(f_{1}\left(E_{1}(\mathbf{g})\right), \mathbf{a}_{2}\right)+l\left(f_{2}\left(E_{2}(\mathbf{t})\right), \mathbf{a}_{1}\right) \quad\quad\quad(12)$

　　Here $l(\cdot)$ is the negative log-softmax function, $l(\cdot) = -\operatorname{LogSoftmax}\left(x_{i}\right)=-\log \left(\frac{\exp \left(x_{i}\right)}{\sum\limits _{j} \exp \left(x_{j}\right)}\right)$, and $f_{1}(\cdot)$ and $f_{2}(\cdot)$ are trainable classifiers.
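
The swapped pseudo-label idea can be sketched as follows. This is a simplified illustration, not the paper's implementation: the encoders $E_1$, $E_2$ are taken as identity, centroids are fixed rather than optimised, the one-hot assignment is the nearest centroid, and the classifiers `f1`, `f2` are hypothetical dot-product scorers.

```python
import math

def assign(x, centroids):
    """Index of the nearest centroid, i.e. the position of the 1 in the one-hot a."""
    dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]
    return dists.index(min(dists))

def neg_log_softmax(logits, target):
    """l(x, a) = -log softmax(x)[target]."""
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_denom - logits[target]

def pscd_loss(g_embs, t_embs, s_g, s_t, f1, f2):
    """Graph logits are scored against text pseudo-labels, and vice versa."""
    loss = 0.0
    for g, t in zip(g_embs, t_embs):
        a1 = assign(g, s_g)  # pseudo-label from the propagation view
        a2 = assign(t, s_t)  # pseudo-label from the semantic view
        loss += neg_log_softmax(f1(g), a2) + neg_log_softmax(f2(t), a1)
    return loss

s_g = [[1.0, 0.0], [0.0, 1.0]]  # K = 2 centroids per view (columns of S_G, S_T)
s_t = [[1.0, 0.0], [0.0, 1.0]]
f1 = lambda x: [sum(a * b for a, b in zip(x, c)) for c in s_g]  # hypothetical classifier
f2 = lambda x: [sum(a * b for a, b in zip(x, c)) for c in s_t]  # hypothetical classifier
g_embs = [[0.9, 0.1], [0.2, 0.8]]
t_embs = [[0.8, 0.0], [0.1, 0.7]]
loss = pscd_loss(g_embs, t_embs, s_g, s_t, f1, f2)
```

When the two views of a post fall into matching clusters the loss is low; mismatched views are penalised.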

　　The PSID and PSCD procedures are illustrated in Figure 3.

2.4 Rumor Prediction

　　A rumor detector is built on top of the propagation representation $g$ of post $c$:

　　　　$p(c)=\sigma(\mathbf{W} \mathbf{g}+\mathbf{b})\quad\quad\quad(13)$

　　Cross-entropy is used as the classification loss:

　　　　$\mathcal{L}_{\text {main }}=-\sum\limits_{c \in C} y \log (p(c))\quad\quad\quad(14)$

　　Total loss:

　　　　$\mathcal{L}=\mathcal{L}_{\text {main }}+\lambda \mathcal{L}_{\text {ssl }}\quad\quad\quad(15)$
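
Putting the prediction, the classification loss and the total loss together, a hedged sketch might look like this. It assumes $\sigma$ is a softmax over classes and that labels are class indices; the weights and the value of $\lambda$ (`lam`) are placeholders, not values from the paper.

```python
import math

def predict(w, b, g):
    """Class probabilities p(c) = softmax(W g + b) for one post."""
    logits = [sum(wi * gi for wi, gi in zip(row, g)) + bi for row, bi in zip(w, b)]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def main_loss(probs, labels):
    """Cross-entropy summed over posts; each label is the index of the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels))

def total_loss(l_main, l_ssl, lam=0.1):
    """L = L_main + lambda * L_ssl."""
    return l_main + lam * l_ssl

w = [[2.0, 0.0], [0.0, 2.0]]  # placeholder classifier weights
b = [0.0, 0.0]
batch = [[1.0, 0.0], [0.0, 1.0]]  # propagation representations g of two posts
labels = [0, 1]
probs = [predict(w, b, g) for g in batch]
l_main = main_loss(probs, labels)
l_total = total_loss(l_main, l_ssl=1.0, lam=0.5)
```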

　　The overall training procedure is given in Algorithm 1.

3 Experiments and Analyses

3.1 Dataset

3.2 Result

3.3 Ablation Analysis
