Open-domain QA

Overview

The system consists of two components: a Document Retriever and a Document Reader. Given a question, the Document Retriever returns the top five Wikipedia articles, and the Document Reader then processes these articles to extract the answer.

Document Retriever

The Retriever compares articles and questions as TF-IDF weighted bag-of-words vectors. Taking local word order into account with n-gram features improves performance further; in the paper, bigram counts performed best. To preserve speed and memory efficiency, the bigrams are mapped to $2^{24}$ bins using the feature hashing of (Weinberger et al., 2009) with an unsigned murmur3 hash.
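To make the retrieval step concrete, here is a minimal sketch of hashed n-gram TF-IDF ranking. It is not the paper's implementation: `zlib.crc32` stands in for murmur3 (which is not in the Python standard library), the tokenizer is a plain whitespace split, and the smoothed IDF and `log1p` term weighting are my own assumptions. The function names `hashed_ngrams`, `build_index` and `retrieve` are made up for illustration.

```python
import math
import zlib
from collections import Counter

NUM_BINS = 2 ** 24  # number of hash bins mentioned in the text


def hashed_ngrams(text):
    """Map the unigrams and bigrams of `text` to hashed bin ids."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    grams = tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    # Assumption: crc32 is only a stand-in for the murmur3 hash.
    return [zlib.crc32(g.encode("utf-8")) % NUM_BINS for g in grams]


def build_index(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    counts = [Counter(hashed_ngrams(d)) for d in docs]
    doc_freq = Counter(b for c in counts for b in c)
    n = len(docs)
    # Assumption: smoothed IDF, always positive.
    idf = {b: math.log((1 + n) / (1 + df)) + 1.0 for b, df in doc_freq.items()}
    vectors = [{b: math.log1p(tf) * idf[b] for b, tf in c.items()} for c in counts]
    return vectors, idf


def retrieve(question, vectors, idf, k=5):
    """Return the indices of the k documents with the highest dot-product score."""
    q_counts = Counter(hashed_ngrams(question))
    q_vec = {b: math.log1p(tf) * idf[b] for b, tf in q_counts.items() if b in idf}
    scores = [sum(w * vec.get(b, 0.0) for b, w in q_vec.items()) for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


docs = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "the eiffel tower is in paris france",
]
vectors, idf = build_index(docs)
top = retrieve("capital of germany", vectors, idf, k=2)
```

Here `top[0]` is the index of the Berlin document, because it shares the bigram "of germany" with the question in addition to the unigrams.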

Document Reader

The Document Reader consists of a multi-layer BiLSTM, which encodes the paragraph, and a recurrent network, which encodes the question. The two encodings are then combined to predict the answer span.

Paragraph encoding comprises the following parts:

Word embeddings:
- 300d GloVe. Only the 1000 most frequent question words are fine-tuned, because the representations of some key words such as what, how, which, many could be crucial for QA systems.
Exact match:
- Three simple binary features, indicating whether $p_i$ can be exactly matched to one question word in $q$, either in its original, lowercase or lemma form. The ablation analysis shows that these features are helpful.
Token features:
- POS, NER, and TF (term frequency)
Aligned question embedding:
- This embedding is effectively an attention mechanism between the question and the paragraph. It is computed as follows, where $\alpha(\cdot)$ is a single dense layer with ReLU nonlinearity:
  \[\begin{aligned}
  &a_{i,j} = \frac{\exp(\alpha(E(p_i))\cdot \alpha(E(q_j)))}{\sum_{j'}\exp(\alpha(E(p_i)) \cdot \alpha(E(q_{j'})))}\\
  &f_{align}(p_i) = \sum_j a_{i,j}E(q_j)
  \end{aligned}\]
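
The aligned question embedding can be sketched in a few lines of plain Python. This is only an illustration of the two formulas for a single paragraph token: the embeddings are toy lists, and the `weight` and `bias` of the dense layer are hypothetical values, not trained parameters.

```python
import math


def relu_dense(x, weight, bias):
    """alpha(.): a single dense layer followed by a ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aligned_question_embedding(p_emb, q_embs, weight, bias):
    """Compute f_align(p_i) = sum_j a_ij * E(q_j) for one paragraph token."""
    p_proj = relu_dense(p_emb, weight, bias)
    # Dot product between the projected paragraph token and each projected question token.
    scores = [sum(a * b for a, b in zip(p_proj, relu_dense(q, weight, bias)))
              for q in q_embs]
    attn = softmax(scores)  # a_ij, normalized over the question tokens j
    dim = len(q_embs[0])
    # Weighted sum of the original (unprojected) question embeddings.
    return [sum(a * q[d] for a, q in zip(attn, q_embs)) for d in range(dim)]


# Toy example: identity dense layer, two question tokens.
weight = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]
f_align = aligned_question_embedding([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], weight, bias)
```

In the toy example the paragraph token is closer to the first question token, so the first component of `f_align` is larger than the second.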

Question encoding

A recurrent network is applied on top of the word embeddings of the question tokens $q_i$, and the resulting hidden units are combined into a single vector: $\{q_1, \cdots, q_l\} \rightarrow q$. The vector $q$ is computed as follows:
\[
\begin{aligned}
& b_j = \frac{\exp(w \cdot q_j)}{\sum_{j'} \exp(w \cdot q_{j'})}\\
& q = \sum_j b_jq_j
\end{aligned}
\]
where $w$ is a learned weight vector and $b_j$ encodes the importance of each question word. I think this computation is very similar to self-attention over the question.
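
A minimal sketch of this weighted sum, assuming the hidden units $q_j$ are already available as plain lists; the function name `encode_question` and the toy values are mine, not from the paper.

```python
import math


def encode_question(hidden, w):
    """Combine question hidden units q_j into one vector q = sum_j b_j * q_j."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden]  # w . q_j
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # stable softmax
    total = sum(exps)
    b = [e / total for e in exps]  # importance b_j of each question word
    dim = len(hidden[0])
    q = [sum(bj * h[d] for bj, h in zip(b, hidden)) for d in range(dim)]
    return q, b


# Toy example: a zero weight vector gives every question word equal importance.
q_vec, importance = encode_question([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With a zero `w` the weights are uniform, so `q_vec` is just the average of the hidden units; a trained `w` would shift the weights toward the more informative question words.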

Prediction

The paragraph token vectors $p_i$ and the question vector $q$ are taken as input to train two classifiers that predict the start and end positions of the correct span:
\[\begin{aligned}
P_{start}(i) & \propto \exp(p_iW_sq)\\
P_{end}(i) & \propto \exp(p_iW_eq)
\end{aligned}
\]
Then select the best span from token $i$ to token $i'$ such that $i \leq i' \leq i+15$ and $P_{start}(i) \times P_{end}(i')$ is maximized.
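
The span search can be sketched as a brute-force scan over all allowed pairs. Since the normalization constants of $P_{start}$ and $P_{end}$ do not depend on the chosen span, maximizing the product of probabilities is the same as maximizing the sum of the unnormalized scores $p_iW_sq + p_{i'}W_eq$, which is what the sketch does. The helper names and the toy scores are my own, and a real implementation would likely vectorize this.

```python
def bilinear_scores(p_vecs, W, q):
    """Unnormalized log-scores p_i W q for every paragraph token."""
    Wq = [sum(w * x for w, x in zip(row, q)) for row in W]
    return [sum(a * b for a, b in zip(p, Wq)) for p in p_vecs]


def best_span(start_scores, end_scores, max_len=15):
    """Return (i, i') with i <= i' <= i + max_len maximizing start[i] + end[i']."""
    best, best_score = None, float("-inf")
    n = len(start_scores)
    for i in range(n):
        for j in range(i, min(i + max_len, n - 1) + 1):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


# Toy scores for a three-token paragraph.
span = best_span([0.1, 2.0, 0.3], [0.5, 0.1, 3.0])
```

For the toy scores the best span starts at token 1 and ends at token 2. Setting `max_len=0` restricts the search to single-token answers.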

Analysis

As the ablation analysis shows, the aligned question embedding and the exact_match feature play similar, complementary roles: removing either one alone makes little difference, but performance drops dramatically when both are removed.
