NLP Multi-Label Text Classification: Hierarchical Attention Network

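I have recently been working on a multi-label classification task and studied a hierarchical attention model, the Hierarchical Attention Network (HAN).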

In short, it stacks two attention layers: one over words and one over sentences.

First, the word level.

The input words are first embedded with word2vec, and a bidirectional GRU then extracts features from the embedded sequence.
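For reference, the word encoder can be written as in the HAN paper (Yang et al., 2016). The symbols $W_e$, $w_{it}$, $x_{it}$ and $h_{it}$ follow that paper's notation and are not defined elsewhere in this post; here $w_{it}$ is the $t$-th word of sentence $i$ and $T$ is the sentence length.

$$
\begin{aligned}
x_{it} &= W_e\, w_{it}, \quad t \in [1, T] \\
\overrightarrow{h}_{it} &= \overrightarrow{\mathrm{GRU}}(x_{it}) \\
\overleftarrow{h}_{it} &= \overleftarrow{\mathrm{GRU}}(x_{it}) \\
h_{it} &= [\overrightarrow{h}_{it};\ \overleftarrow{h}_{it}]
\end{aligned}
$$

Concatenating the forward and backward states is why the hidden vectors in the code have size `hidden_size*2`.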

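The words in a sentence do not matter equally for the classification, so an attention mechanism weights each word's hidden state before the states are summed into a sentence vector.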
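In the notation of the HAN paper (an assumption about what the original figure showed), word attention takes three steps: project each hidden state, score it against a learned context vector $u_w$, and take the softmax-weighted sum.

$$
\begin{aligned}
u_{it} &= \tanh(W_w h_{it} + b_w) \\
\alpha_{it} &= \frac{\exp(u_{it}^{\top} u_w)}{\sum_{t'} \exp(u_{it'}^{\top} u_w)} \\
s_i &= \sum_{t} \alpha_{it}\, h_{it}
\end{aligned}
$$

In the TensorFlow code, $W_w$, $b_w$ and $u_w$ correspond to `self.W_w_attention_word`, `self.W_b_attention_word` and `self.context_vecotor_word`.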

The TensorFlow implementation is as follows:

```python
# Method of the model class; assumes `import tensorflow as tf` (TF 1.x graph API).
def attention_word_level(self, hidden_state):
    """
    Word-level attention.

    input1: hidden_state: list of length sequence_length; each element has
            shape [batch_size*num_sentences, hidden_size*2]
    input2: self.context_vecotor_word: word-level context vector, broadcast
            against [batch_size*num_sentences, sequence_length, hidden_size*2]
    :return: sentence representation, shape [batch_size*num_sentences, hidden_size*2]
    """
    hidden_state_ = tf.stack(hidden_state, axis=1)  # [batch_size*num_sentences, sequence_length, hidden_size*2]

    # 0) one-layer feed-forward network applied to every word position
    hidden_state_2 = tf.reshape(
        hidden_state_, shape=[-1, self.hidden_size * 2])  # [batch_size*num_sentences*sequence_length, hidden_size*2]
    # W_w_attention_word: [hidden_size*2, hidden_size*2]
    hidden_representation = tf.nn.tanh(
        tf.matmul(hidden_state_2, self.W_w_attention_word) + self.W_b_attention_word)
    hidden_representation = tf.reshape(
        hidden_representation,
        shape=[-1, self.sequence_length, self.hidden_size * 2])  # [batch_size*num_sentences, sequence_length, hidden_size*2]

    # 1) logits for each word: dot product with the word-level context vector
    hidden_state_context_similiarity = tf.multiply(
        hidden_representation, self.context_vecotor_word)  # [batch_size*num_sentences, sequence_length, hidden_size*2]
    attention_logits = tf.reduce_sum(
        hidden_state_context_similiarity, axis=2)  # [batch_size*num_sentences, sequence_length]

    # Subtract the max for numerical stability (softmax is shift invariant).
    # keepdims (formerly keep_dims) keeps the reduced axis so the subtraction broadcasts.
    attention_logits_max = tf.reduce_max(
        attention_logits, axis=1, keepdims=True)  # [batch_size*num_sentences, 1]

    # 2) probability distribution over the words of each sentence
    p_attention = tf.nn.softmax(
        attention_logits - attention_logits_max)  # [batch_size*num_sentences, sequence_length]

    # 3) weighted sum of the original hidden states
    p_attention_expanded = tf.expand_dims(p_attention, axis=2)  # [batch_size*num_sentences, sequence_length, 1]
    sentence_representation = tf.multiply(
        p_attention_expanded, hidden_state_)  # [batch_size*num_sentences, sequence_length, hidden_size*2]
    sentence_representation = tf.reduce_sum(
        sentence_representation, axis=1)  # [batch_size*num_sentences, hidden_size*2]
    return sentence_representation
```
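To check the steps without building a TensorFlow graph, here is a minimal pure-Python sketch of the same computation for a single sentence. The function name `word_attention` and the toy inputs are hypothetical, and the batching and learned parameters of the real model are left out.

```python
import math


def word_attention(hidden_states, W, b, context):
    """Attention over one sentence.

    hidden_states: list of T word vectors, each of length D (D = hidden_size*2)
    W: D x D projection matrix; b: bias of length D; context: vector of length D
    Returns (weights, sentence_vector).
    """
    dim = len(context)
    # 0) feed-forward projection: u_t = tanh(h_t W + b)
    projected = [
        [math.tanh(sum(h[k] * W[k][j] for k in range(dim)) + b[j]) for j in range(dim)]
        for h in hidden_states
    ]
    # 1) one logit per word: dot product with the context vector
    logits = [sum(u_j * c_j for u_j, c_j in zip(u, context)) for u in projected]
    # 2) softmax, subtracting the max logit for numerical stability
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # 3) weighted sum of the original (unprojected) hidden states
    sentence_vector = [
        sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)
    ]
    return weights, sentence_vector


# Toy example (made-up numbers): 3 words, D = 2, identity projection, zero bias.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights, sentence = word_attention(hidden, identity, [0.0, 0.0], [1.0, 1.0])
print(weights)   # three positive weights that sum to 1
print(sentence)  # attention-weighted average of the three word vectors
```

The third word aligns best with the context vector, so it gets the largest weight. As in the TensorFlow code, the weights are applied to the raw hidden states, not to the projected ones.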

The sentence level works essentially the same way as the word level.

A bidirectional GRU takes the sentence vectors as input, and softmax computes the attention weights.
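The two levels compose as follows: word attention turns each sentence into one vector, and the same pooling over those vectors gives the document vector. This is a simplified sketch with hypothetical names (`attention_pool`, `document_vector`). It deliberately omits the GRU encoders and the tanh projection, so it shows only the hierarchy and is not the full model.

```python
import math


def attention_pool(vectors, context):
    """Softmax-weighted sum of `vectors`, scored by dot product with `context`."""
    logits = [sum(v_j * c_j for v_j, c_j in zip(v, context)) for v in vectors]
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]


def document_vector(document, word_context, sentence_context):
    """document: list of sentences; each sentence is a list of word vectors."""
    # Word level: pool the words of each sentence into one sentence vector.
    sentence_vectors = [attention_pool(words, word_context) for words in document]
    # Sentence level: the same pooling, applied to the sentence vectors.
    return attention_pool(sentence_vectors, sentence_context)


# Toy document (made-up numbers) with two sentences of different lengths.
doc = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [0.0, 0.0], [1.0, 1.0]],
]
# Zero context vectors give uniform weights, so both levels reduce to averaging.
doc_vec = document_vector(doc, word_context=[0.0, 0.0], sentence_context=[0.0, 0.0])
print(doc_vec)
```

In the batched TensorFlow version, this nesting is why the word-level tensors have a leading dimension of `batch_size*num_sentences`: every sentence of every document is processed as an independent row.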

Finally, the classification is computed from the sentence-level output.

The loss is an exponential loss.
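The post does not show the loss formula, so the sketch below rests on two assumptions: each label is decided independently with a sigmoid and a 0.5 threshold, and "exponential loss" means the standard form $\exp(-y f)$ with 0/1 targets mapped to -1/+1. The linked repository may define the loss differently. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each label is accepted or rejected independently."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def exponential_loss(logits, targets):
    """Mean of exp(-y * f) over labels, with 0/1 targets mapped to -1/+1."""
    signed = [2 * t - 1 for t in targets]
    return sum(math.exp(-y * z) for y, z in zip(signed, logits)) / len(logits)


# Toy example (made-up numbers): three labels, only the first one is true.
logits = [2.0, -1.0, 0.5]
targets = [1, 0, 0]
predicted = predict_labels(logits)
loss = exponential_loss(logits, targets)
print(predicted)  # [1, 0, 1]
print(loss)
```

Unlike single-label softmax classification, any number of labels can be predicted at once. Here the third label is a false positive, and it contributes the largest term to the loss.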

Source code on GitHub: https://github.com/zhaowei555/multi_label_classify/tree/master/han
