

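This paper combines several techniques: reinforcement learning, word embeddings, attention, question answering, bidirectional RNNs, and more. The model is fairly complex, but the figure below gives a rough picture of how it fits together. It would be even better if it also incorporated last year's VIN model (see "NIPS 2016 best paper: Value Iteration Networks explained").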


