Saw a tweet from Andrew Liam Trask: it sounds like the Oxford DeepNLP 2017 class has all of its videos, slides, and practicals up. Thanks, Andrew, for the tip!
Preamble
This repository contains the lecture slides and course description for the Deep Natural Language Processing course offered in Hilary Term 2017 at the University of Oxford.
This is an advanced course on natural language processing. Automatically processing natural language inputs and producing language outputs is a key component of Artificial General Intelligence. The ambiguities and noise inherent in human communication render traditional symbolic AI techniques ineffective for representing and analysing language data. Recently, statistical techniques based on neural networks have achieved a number of remarkable successes in natural language processing, leading to a great deal of commercial and academic interest in the field.
This is an applied course focussing on recent advances in analysing and generating speech and text using recurrent neural networks. We introduce the mathematical definitions of the relevant machine learning models and derive their associated optimisation algorithms. The course covers a range of applications of neural networks in NLP, including analysing latent dimensions in text, transcribing speech to text, translating between languages, and answering questions. These topics are organised into three high-level themes forming a progression from understanding the use of neural networks for sequential language modelling, to understanding their use as conditional language models for transduction tasks, and finally to approaches employing these techniques in combination with other mechanisms for advanced applications. Throughout the course the practical implementation of such models on CPU and GPU hardware is also discussed.
This course is organised by Phil Blunsom and delivered in partnership with the DeepMind Natural Language Research Group.
Lecturers
- Phil Blunsom (Oxford University and DeepMind)
- Chris Dyer (Carnegie Mellon University and DeepMind)
- Edward Grefenstette (DeepMind)
- Karl Moritz Hermann (DeepMind)
- Andrew Senior (DeepMind)
- Wang Ling (DeepMind)
- Jeremy Appleyard (NVIDIA)
TAs
- Yannis Assael
- Yishu Miao
- Brendan Shillingford
- Jan Buys
Timetable
Practicals
- Group 1 - Monday, 9:00-11:00 (Weeks 2-8), 60.05 Thom Building
- Group 2 - Friday, 16:00-18:00 (Weeks 2-8), Room 379
- Practical 1: word2vec
- Practical 2: text classification
- Practical 3: recurrent neural networks for text classification and language modelling
- Practical 4: open practical
Lectures
Public Lectures are held in Lecture Theatre 1 of the Maths Institute, on Tuesdays and Thursdays (except week 8), 16:00-18:00 (Hilary Term Weeks 1, 3-8).
Lecture Materials
1. Lecture 1a - Introduction [Phil Blunsom]
This lecture introduces the course and motivates why it is interesting to study language processing using Deep Learning techniques.
2. Lecture 1b - Deep Neural Networks Are Our Friends [Wang Ling]
This lecture revises basic machine learning concepts that students should know before embarking on this course.
3. Lecture 2a - Word Level Semantics [Ed Grefenstette]
Words are the core meaning bearing units in language. Representing and learning the meanings of words is a fundamental task in NLP and in this lecture the concept of a word embedding is introduced as a practical and scalable solution.
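As a rough illustration of why embeddings are practical, similarity between words reduces to simple vector arithmetic. The sketch below is not taken from the lecture: the three-dimensional vectors are invented for illustration, and `cosine_similarity` and `nearest` are hypothetical helper names.

```python
import math

# Toy 3-dimensional embeddings; the values are made up for illustration only.
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.0, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(word):
    """Return the other vocabulary word with the highest cosine similarity."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))
```

Real embeddings, such as those trained in Practical 1, have hundreds of dimensions, but the nearest-neighbour lookup works the same way.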
Reading
Embeddings Basics
- Firth, John R. "A synopsis of linguistic theory, 1930-1955." (1957): 1-32.
- Curran, James Richard. "From distributional to semantic similarity." (2004).
- Collobert, Ronan, et al. "Natural language processing (almost) from scratch." Journal of Machine Learning Research 12. Aug (2011): 2493-2537.
- Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems. 2013.
Datasets and Visualisation
- Finkelstein, Lev, et al. "Placing search in context: The concept revisited." Proceedings of the 10th international conference on World Wide Web. ACM, 2001.
- Hill, Felix, Roi Reichart, and Anna Korhonen. "Simlex-999: Evaluating semantic models with (genuine) similarity estimation." Computational Linguistics (2016).
- Maaten, Laurens van der, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9.Nov (2008): 2579-2605.
Blog posts
- Deep Learning, NLP, and Representations, Christopher Olah.
- Visualizing Top Tweeps with t-SNE, in Javascript, Andrej Karpathy.
Further Reading
- Hermann, Karl Moritz, and Phil Blunsom. "Multilingual models for compositional distributed semantics." arXiv preprint arXiv:1404.4641 (2014).
- Levy, Omer, and Yoav Goldberg. "Neural word embedding as implicit matrix factorization." Advances in neural information processing systems. 2014.
- Levy, Omer, Yoav Goldberg, and Ido Dagan. "Improving distributional similarity with lessons learned from word embeddings." Transactions of the Association for Computational Linguistics 3 (2015): 211-225.
- Ling, Wang, et al. "Two/Too Simple Adaptations of Word2Vec for Syntax Problems." HLT-NAACL. 2015.
4. Lecture 2b - Overview of the Practicals [Chris Dyer]
This lecture motivates the practical segment of the course.
5. Lecture 3 - Language Modelling and RNNs Part 1 [Phil Blunsom]
Language modelling is an important task of great practical use in many NLP applications. This lecture introduces language modelling, including traditional n-gram based approaches and more contemporary neural approaches. In particular the popular Recurrent Neural Network (RNN) language model is introduced and its basic training and evaluation algorithms described.
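For readers new to the traditional approach, here is a minimal sketch of a bigram language model with add-one smoothing, evaluated by perplexity. It is an illustration under stated assumptions rather than the lecture's own code: the function names are hypothetical, sentences are assumed to be pre-tokenised, and test tokens are assumed to appear in the training vocabulary.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenised sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])                 # every token that acts as a context
        bigrams.update(zip(padded[:-1], padded[1:]))
    # Words the model may predict: all training tokens plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, vocab

def prob(prev, word, unigrams, bigrams, vocab):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def perplexity(tokens, unigrams, bigrams, vocab):
    """Exponentiated average negative log-probability per predicted token."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_prob = sum(math.log(prob(p, w, unigrams, bigrams, vocab))
                   for p, w in zip(padded[:-1], padded[1:]))
    return math.exp(-log_prob / (len(padded) - 1))
```

Lower perplexity means the model finds the sentence less surprising; the same metric is used to evaluate RNN language models.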
Reading
Textbook
Blogs
- The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy.
- The unreasonable effectiveness of Character-level Language Models, Yoav Goldberg.
- Explaining and illustrating orthogonal initialization for recurrent neural networks, Stephen Merity.
6. Lecture 4 - Language Modelling and RNNs Part 2 [Phil Blunsom]
This lecture continues on from the previous one and considers some of the issues involved in producing an effective implementation of an RNN language model. The vanishing and exploding gradient problem is described and architectural solutions, such as Long Short Term Memory (LSTM), are introduced.
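The vanishing and exploding gradient problem can be seen in a one-unit recurrence. The sketch below is a simplified illustration, not the lecture's derivation: it assumes a scalar RNN with a single recurrent weight `w`, and `gradient_norm` is a hypothetical name. The gradient of the final state with respect to the initial state is a product of one factor per time step, so it shrinks or grows geometrically.

```python
import math

def gradient_norm(w, steps, h0=0.5, use_tanh=True):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = f(w * h_{t-1}).

    f is tanh when use_tanh is True and the identity otherwise.
    """
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w * h
        if use_tanh:
            h = math.tanh(pre)
            grad *= w * (1.0 - h * h)   # chain rule: d tanh(w*h)/dh = w * (1 - tanh^2)
        else:
            h = pre
            grad *= w                   # linear recurrence: each step multiplies by w
    return abs(grad)

vanishing = gradient_norm(0.5, steps=50)                   # shrinks towards zero
exploding = gradient_norm(1.5, steps=50, use_tanh=False)   # grows geometrically
```

In a full RNN the scalar `w` becomes the recurrent weight matrix and the same argument applies to the product of Jacobians.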
Reading
Textbook
Vanishing gradients, LSTMs etc.
- On the difficulty of training recurrent neural networks. Pascanu et al., ICML 2013.
- Long Short-Term Memory. Hochreiter and Schmidhuber, Neural Computation 1997.
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. Cho et al., EMNLP 2014.
- Blog: Understanding LSTM Networks, Christopher Olah.
Dealing with large vocabularies
- A scalable hierarchical distributed language model. Mnih and Hinton, NIPS 2009.
- A fast and simple algorithm for training neural probabilistic language models. Mnih and Teh, ICML 2012.
- On Using Very Large Target Vocabulary for Neural Machine Translation. Jean et al., ACL 2015.
- Exploring the Limits of Language Modeling. Jozefowicz et al., arXiv 2016.
- Efficient softmax approximation for GPUs. Grave et al., arXiv 2016.
- Notes on Noise Contrastive Estimation and Negative Sampling. Dyer, arXiv 2014.
- Pragmatic Neural Language Modelling in Machine Translation. Baltescu and Blunsom, NAACL 2015
Regularisation and dropout
- A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. Gal and Ghahramani, NIPS 2016.
- Blog: Uncertainty in Deep Learning, Yarin Gal.
Other stuff
- Recurrent Highway Networks. Zilly et al., arXiv 2016.
- Capacity and Trainability in Recurrent Neural Networks. Collins et al., arXiv 2016.
7. Lecture 5 - Text Classification [Karl Moritz Hermann]
This lecture discusses text classification, beginning with basic classifiers, such as Naive Bayes, and progressing through to RNNs and Convolution Networks.
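As a point of reference for the basic classifiers mentioned above, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The function names and the toy data in the usage note are assumptions made for illustration; they do not come from the course materials.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (tokens, label) pairs. Returns the counts needed to predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict(tokens, model):
    """Return the label maximising log P(label) + sum of log P(token | label)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Trained on a handful of labelled sentences such as `("great fun film".split(), "pos")` and `("boring dull film".split(), "neg")`, `predict` picks the label whose word distribution best explains the input. Neural classifiers replace these count-based estimates with learned representations.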
Reading
- Recurrent Convolutional Neural Networks for Text Classification. Lai et al. AAAI 2015.
- A Convolutional Neural Network for Modelling Sentences, Kalchbrenner et al. ACL 2014.
- Semantic compositionality through recursive matrix-vector spaces, Socher et al. EMNLP 2012.
- Blog: Understanding Convolution Neural Networks For NLP, Denny Britz.
- Thesis: Distributional Representations for Compositional Semantics, Hermann (2014).
8. Lecture 6 - Deep NLP on Nvidia GPUs [Jeremy Appleyard]
This lecture introduces Graphics Processing Units (GPUs) as an alternative to CPUs for executing Deep Learning algorithms. The strengths and weaknesses of GPUs are discussed as well as the importance of understanding how memory bandwidth and computation impact throughput for RNNs.
Reading
- Optimizing Performance of Recurrent Neural Networks on GPUs. Appleyard et al., arXiv 2016.
- Persistent RNNs: Stashing Recurrent Weights On-Chip, Diamos et al., ICML 2016
- Efficient softmax approximation for GPUs. Grave et al., arXiv 2016.
9. Lecture 7 - Conditional Language Models [Chris Dyer]
In this lecture we extend the concept of language modelling to incorporate prior information. By conditioning an RNN language model on an input representation we can generate contextually relevant language. This very general idea can be applied to transduce sequences into new sequences for tasks such as translation and summarisation, or images into captions describing their content.
Reading
- Recurrent Continuous Translation Models. Kalchbrenner and Blunsom, EMNLP 2013
- Sequence to Sequence Learning with Neural Networks. Sutskever et al., NIPS 2014
- Multimodal Neural Language Models. Kiros et al., ICML 2014
- Show and Tell: A Neural Image Caption Generator. Vinyals et al., CVPR 2015
10. Lecture 8 - Generating Language with Attention [Chris Dyer]
This lecture introduces one of the most important and influential mechanisms employed in Deep Neural Networks: Attention. Attention augments recurrent networks with the ability to condition on specific parts of the input and is key to achieving high performance in tasks such as Machine Translation and Image Captioning.
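The core computation can be sketched in a few lines. This is a simplified illustration that assumes dot-product scoring for brevity (the Bahdanau et al. paper in the reading list scores with a small feed-forward network instead), and `softmax` and `attend` are hypothetical helper names.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product attention over a list of equal-length encoder state vectors.

    Returns the attention weights and the weighted sum of states (the context).
    """
    # Score each input position by its similarity to the decoder query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weights-averaged encoder state.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

At each decoding step the context vector is fed to the decoder alongside its recurrent state, so the model can focus on different input positions for different output words.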
Reading
- Neural Machine Translation by Jointly Learning to Align and Translate. Bahdanau et al., ICLR 2015
- Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention. Xu et al., ICML 2015
- Incorporating structural alignment biases into an attentional neural translation model. Cohn et al., NAACL 2016
- BLEU: a Method for Automatic Evaluation of Machine Translation. Papineni et al., ACL 2002
11. Lecture 9 - Speech Recognition (ASR) [Andrew Senior]
Automatic Speech Recognition (ASR) is the task of transducing raw audio signals of spoken language into text transcriptions. This talk covers the history of ASR models, from Gaussian Mixtures to attention augmented RNNs, the basic linguistics of speech, and the various input and output representations frequently employed.
12. Lecture 10 - Text to Speech (TTS) [Andrew Senior]
This lecture introduces algorithms for converting written language into spoken language (Text to Speech). TTS is the inverse process to ASR, but there are some important differences in the models applied. Here we review traditional TTS models, and then cover more recent neural approaches such as DeepMind's WaveNet model.
13. Lecture 11 - Question Answering [Karl Moritz Hermann]
Reading
- Teaching machines to read and comprehend. Hermann et al., NIPS 2015
- Deep Learning for Answer Sentence Selection. Yu et al., NIPS Deep Learning Workshop 2014
14. Lecture 12 - Memory [Ed Grefenstette]
Reading
- Hybrid computing using a neural network with dynamic external memory. Graves et al., Nature 2016
- Reasoning about Entailment with Neural Attention. Rocktäschel et al., ICLR 2016
- Learning to transduce with unbounded memory. Grefenstette et al., NIPS 2015
- End-to-End Memory Networks. Sukhbaatar et al., NIPS 2015
15. Lecture 13 - Linguistic Knowledge in Neural Networks
Piazza
We will be using Piazza to facilitate class discussion during the course. Rather than emailing questions directly, I encourage you to post your questions on Piazza to be answered by your fellow students, instructors, and lecturers. However, please do note that all the lecturers for this course are volunteering their time and may not always be available to give a response.
Find our class page at: https://piazza.com/ox.ac.uk/winter2017/dnlpht2017/home
Assessment
The primary assessment for this course will be a take-home assignment issued at the end of the term. This assignment will ask questions drawing on the concepts and models discussed in the course, as well as from selected research publications. The nature of the questions will include analysing mathematical descriptions of models and proposing extensions, improvements, or evaluations to such models. The assignment may also ask students to read specific research publications and discuss their proposed algorithms in the context of the course. In answering questions students will be expected to both present coherent written arguments and use appropriate mathematical formulae, and possibly pseudo-code, to illustrate answers.
The practical component of the course will be assessed in the usual way.
Acknowledgements
This course would not have been possible without the support of DeepMind, The University of Oxford Department of Computer Science, Nvidia, and the generous donation of GPU resources from Microsoft Azure.