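I have been working with DL for about six months, and have built up some experience, run some experiments, and formed a few ideas and opinions of my own about it. For a while now I have wanted to broaden and deepen my knowledge of DL-related topics.

Then I came across a DL book in preparation for MIT Press, http://www.iro.umontreal.ca/~bengioy/dlbook/, and decided to read it and take notes along the way. This series of posts will be note-style. Please forgive anything that is poorly written, and fellow practitioners are welcome to offer corrections.

This post covers the first chapter of the book. Below are the points I personally found most important: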

Logistic regression can determine whether to recommend cesarean delivery (application area).

Naive Bayes can separate legitimate e-mail from spam e-mail (application area).

Feature Related:

It is not surprising that the choice of representation has an enormous effect on the performance of machine learning algorithms. What holds for input x often also holds for input x + epsilon, for a small epsilon. This is called the smoothness prior and is exploited in most applications of machine learning that involve real numbers.

Many artificial intelligence tasks can be solved by designing the right set of features to extract for that task, then providing these features to a simple machine learning algorithm. For example, a useful feature for speaker identification from sound is the pitch. For many tasks, however, it is hard to know which features to extract. One solution to this problem is to use machine learning to discover not only the mapping from representation to output but also the representation itself. This approach is known as representation learning. When designing features or algorithms for learning features, our goal is usually to separate the factors of variation that explain the observed data.

Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts. Representation learning algorithms can be supervised, unsupervised, or a combination of both (semi-supervised). Deep learning has not only changed the field of machine learning and influenced our understanding of human perception, it has revolutionized application areas such as speech recognition and image understanding.

Pylearn2 is a machine learning and deep learning library. A live online resource, http://www.deeplearning.net/book/guidelines, allows practitioners and researchers to share their questions and experience and keep abreast of developments in the art of deep learning.
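
To make the point about representation concrete, here is a minimal sketch of my own (not taken from the book). It assumes a toy dataset of two concentric rings and a hypothetical helper `best_threshold_accuracy` that finds the best single-threshold classifier on one feature. The same points are described once by their Cartesian x coordinate and once by their polar radius.

```python
import math
import random

def make_rings(n_per_class, rng):
    """Sample points from two concentric rings: class 0 near radius 1, class 1 near radius 3."""
    points, labels = [], []
    for label, radius in ((0, 1.0), (1, 3.0)):
        for _ in range(n_per_class):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.uniform(-0.2, 0.2)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels

def best_threshold_accuracy(values, labels):
    """Accuracy of the best single-threshold classifier on a 1-D feature."""
    best = 0.0
    for t in sorted(set(values)):
        predictions = [1 if v > t else 0 for v in values]
        acc = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
        best = max(best, acc, 1.0 - acc)  # allow either orientation of the threshold
    return best

rng = random.Random(0)
points, labels = make_rings(200, rng)

# Two representations of the same data.
xs = [x for x, _ in points]                     # Cartesian x coordinate
radii = [math.hypot(x, y) for x, y in points]   # polar radius

acc_cartesian_x = best_threshold_accuracy(xs, labels)
acc_polar_radius = best_threshold_accuracy(radii, labels)
print(f"best threshold on x:      {acc_cartesian_x:.2f}")
print(f"best threshold on radius: {acc_polar_radius:.2f}")
```

The learner is the same trivial threshold rule in both cases. Only the representation changes, and in this toy setup the radius feature separates the classes while the x coordinate does not.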

1.2 Machine Learning

Human brains also observe their own actions, which influence the world around them, and it appears that human brains try to learn the statistical dependencies between these actions and their consequences, so as to maximize future rewards. Bayesian machine learning attempts to formalize these priors as probability distributions, and once this is done, Bayes' theorem and the laws of probability (discussed in Chapter 3) dictate what the right predictions should be.

Overfitting occurs when capacity is too large compared to the number of examples, so that the learner does a good job on the training examples (it correctly guesses that they are likely configurations) but a very poor one on new examples (it does not discriminate well between the likely configurations and the unlikely ones). Underfitting occurs when instead the learner does not have enough capacity, so that even on the training examples it is not able to make good guesses: it does not manage to capture enough of the information present in the training examples, maybe because it does not have enough degrees of freedom to fit all the training examples.

The main reason we get underfitting (especially with deep learning) is not that we choose to have insufficient capacity but that obtaining high capacity in a learner that has strong priors often involves difficult numerical optimization. Numerical optimization methods attempt to find a configuration of some variables (often called parameters in machine learning) that minimizes or maximizes some given function of these parameters, which we call the objective function or training criterion. In the case of most deep learning algorithms, this difficulty in optimizing the training criterion is related to the fact that it is not convex in the parameters of the model. We believe that the issue of underfitting is central in deep learning algorithms and deserves a lot more attention from researchers.
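
The difficulty with non-convex training criteria can be seen in a one-parameter toy problem. The sketch below is my own illustration: the function `objective` is a hypothetical stand-in for a training criterion, chosen only because it has two separate minima, and the learning rate and step count are arbitrary choices.

```python
def objective(w):
    """A hypothetical non-convex training criterion with two local minima."""
    return w**4 - 3.0 * w**2 + w

def gradient(w):
    """Analytic derivative of the objective above."""
    return 4.0 * w**3 - 6.0 * w + 1.0

def gradient_descent(w0, learning_rate=0.01, steps=1000):
    """Plain gradient descent starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w

# Same optimizer, same objective, two different starting points.
w_left = gradient_descent(-2.0)
w_right = gradient_descent(2.0)
print(f"start -2.0 -> w = {w_left:.3f}, objective = {objective(w_left):.3f}")
print(f"start +2.0 -> w = {w_right:.3f}, objective = {objective(w_right):.3f}")
```

Both runs stop at a point where the gradient vanishes, but the run started on the right settles in the shallower minimum. A learner stuck there fits the training criterion worse than its capacity allows, which is the optimization-driven underfitting the notes describe.
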

Another machine learning concept that turns out to be important for understanding many deep learning algorithms is manifold learning. The manifold learning hypothesis (Cayton, 2005; Narayanan and Mitter, 2010) states that probability is concentrated around regions called manifolds, i.e., that most configurations are unlikely and that probable configurations are neighbors of other probable configurations. We define the dimension of a manifold as the number of orthogonal directions in which one can move and stay among probable configurations. This hypothesis of probability concentration seems to hold for most AI tasks of interest, as can be verified by the fact that most configurations of input variables are unlikely (pick pixel values randomly and you will almost never obtain a natural-looking image).
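
A rough numerical analogue of the "random pixels" argument, as a sketch of my own: assume the probable configurations lie within a small tolerance of the unit sphere (a stand-in manifold chosen only for convenience) and draw configurations uniformly from the cube [-1.5, 1.5]^dim. The function name `fraction_near_sphere` and all constants are hypothetical.

```python
import math
import random

def fraction_near_sphere(dim, n_samples, tol, rng):
    """Fraction of uniform samples from [-1.5, 1.5]^dim lying within tol of the unit sphere."""
    hits = 0
    for _ in range(n_samples):
        point = [rng.uniform(-1.5, 1.5) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in point))
        if abs(norm - 1.0) <= tol:
            hits += 1
    return hits / n_samples

rng = random.Random(0)
frac_2d = fraction_near_sphere(2, 20000, 0.05, rng)
frac_10d = fraction_near_sphere(10, 20000, 0.05, rng)
print(f"dim=2:  fraction near the manifold = {frac_2d:.4f}")
print(f"dim=10: fraction near the manifold = {frac_10d:.4f}")
```

Even in two dimensions only a small share of random configurations land near the manifold, and the share shrinks sharply as the dimension grows. Images have many thousands of dimensions, so randomly chosen pixel values essentially never land near the manifold of natural images.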

1.3 Historical Perspective and Neural Networks

Modern deep learning research takes a lot of its inspiration from neural network research of previous decades. Other major intellectual sources of concepts found in deep learning research include work on probabilistic modeling and graphical models, as well as work on manifold learning. The breakthrough came from a semi-supervised procedure: using unsupervised learning to learn one layer of features at a time and then fine-tuning the whole system with labeled data (Hinton et al., 2006; Bengio et al., 2007; Ranzato et al., 2007), described in Chapter 10. This initiated a lot of new research, and other ways of successfully training deep nets emerged. Even though unsupervised pre-training is sometimes unnecessary for datasets with a very large number of labels, it was the early success of unsupervised pre-training that led many new researchers to investigate deep neural networks. In particular, the use of rectifiers (Nair and Hinton, 2010b) as non-linearity and appropriate initialization allowing information to flow well both forward (to produce predictions from input) and backward (to propagate error signals) were later shown to enable training very deep supervised networks (Glorot et al., 2011a) without unsupervised pre-training.
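
To illustrate why initialization matters for forward information flow, here is a small sketch of my own. The notes do not say which initialization is meant, so the uniform limit `sqrt(6 / (fan_in + fan_out))` used below is my assumption of one common choice, and the width, depth, and the "tiny" limit of 0.01 are arbitrary. Only the forward pass is shown.

```python
import math
import random

def relu(z):
    """Rectifier non-linearity."""
    return max(0.0, z)

def make_layer(fan_in, fan_out, limit, rng):
    """Weight matrix with entries drawn uniformly from [-limit, limit]."""
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)] for _ in range(fan_out)]

def forward(x, layers):
    """Forward pass through a stack of bias-free rectifier layers."""
    h = x
    for weights in layers:
        h = [relu(sum(w * v for w, v in zip(row, h))) for row in weights]
    return h

def rms(values):
    """Root mean square of a list of activations."""
    return math.sqrt(sum(v * v for v in values) / len(values))

rng = random.Random(0)
width, depth = 50, 10
x = [rng.gauss(0.0, 1.0) for _ in range(width)]

# Assumed "appropriate" scale: limit tied to the layer sizes.
scaled_limit = math.sqrt(6.0 / (width + width))
scaled_layers = [make_layer(width, width, scaled_limit, rng) for _ in range(depth)]
# Naive alternative: very small weights regardless of layer size.
tiny_layers = [make_layer(width, width, 0.01, rng) for _ in range(depth)]

rms_scaled = rms(forward(x, scaled_layers))
rms_tiny = rms(forward(x, tiny_layers))
print(f"output RMS with size-scaled init: {rms_scaled:.3e}")
print(f"output RMS with tiny init:        {rms_tiny:.3e}")
```

With the size-scaled weights the signal still shrinks somewhat through ten rectifier layers but stays at a usable magnitude, whereas with tiny weights it is numerically close to zero by the output. The same scaling argument applies to error signals flowing backward, which this sketch does not compute.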

1.4 Recent Impact of Deep Learning Research

Since 2010, deep learning has had spectacular practical successes. It has led to much better acoustic models that have dramatically improved the state of the art in speech recognition. Deep neural nets are now used in deployed speech recognition systems, including voice search on Android (Dahl et al., 2010; Deng et al., 2010; Seide et al., 2011; Hinton et al., 2012). Deep convolutional nets have led to major advances in the state of the art for recognizing large numbers of different types of objects in images (now deployed in Google+ photo search). They have also had spectacular successes in pedestrian detection and image segmentation (Sermanet et al., 2013; Farabet et al., 2013; Couprie et al., 2013) and yielded superhuman performance in traffic sign classification (Ciresan et al., 2012). An organization called Kaggle runs machine learning competitions on the web. Deep learning has had numerous successes in these competitions:

http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview

http://www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html

http://deeplearning.net/deep-learning-research-groups-and-labs


This has led Yann LeCun and Yoshua Bengio to create a new conference on the subject. They called it the International Conference on Learning Representations (ICLR), to broaden the scope from just deep learning to the more general subject of representation learning (which includes topics such as sparse coding, which learns shallow representations, because shallow representation-learners can be used as building blocks for deep representation-learners).

In the examples of outstanding applications of deep learning described above, the impressive breakthroughs have mostly been achieved with supervised learning techniques for deep architectures. We believe that some of the most important future progress in deep learning will hinge on achieving a similar impact in the unsupervised and semi-supervised cases. Even though the scaling behavior of stochastic gradient descent is theoretically very good in terms of computations per update, these observations suggest a numerical optimization challenge that must be addressed. In addition to these numerical optimization difficulties, scaling up large and deep neural networks as they currently stand would require a substantial increase in computing power, which remains a limiting factor of our research. Training much larger models with the current hardware (or the hardware likely to be available in the next few years) will require a change in design and/or the ability to effectively exploit parallel computation. These raise non-obvious questions where fundamental research is also needed.

Furthermore, some of the biggest challenges remain in front of us regarding unsupervised deep learning. Powerful unsupervised learning is important for many reasons: Unsupervised learning allows a learner to take advantage of unlabeled data. Most of the data available to machines (and to humans and animals) is unlabeled, i.e., without a precise and symbolic characterization of its semantics and of the outputs desired from a learner. Humans and animals are also motivated, and this guides research into learning algorithms based on a reinforcement signal, which is much weaker than the signal required for supervised learning.
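
On the remark above that stochastic gradient descent scales well in computations per update: each update looks at a single example, so its cost does not depend on how many examples the dataset holds. The sketch below is my own minimal illustration on a hypothetical noise-free linear regression task. The function name, learning rate, and number of updates are assumptions made for the example.

```python
import random

def sgd_linear_regression(data, learning_rate=0.05, n_updates=2000, seed=0):
    """Fit y = w * x + b with SGD on the squared error, one example per update."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(n_updates):
        x, y = rng.choice(data)           # cost of this update is independent of len(data)
        error = (w * x + b) - y
        w -= learning_rate * error * x    # gradient of 0.5 * error**2 with respect to w
        b -= learning_rate * error        # gradient of 0.5 * error**2 with respect to b
    return w, b

# Hypothetical dataset generated from y = 3x + 1.
data_rng = random.Random(1)
data = [(x, 3.0 * x + 1.0) for x in (data_rng.uniform(-1.0, 1.0) for _ in range(10000))]

w, b = sgd_linear_regression(data)
print(f"learned w = {w:.3f}, b = {b:.3f}")
```

The 2000 updates touch at most a fifth of the 10000 examples, yet the parameters end up close to the generating values. This problem is convex, so it shows only the cheap-update property, not the non-convex optimization challenge raised in the notes.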





To summarize, some of the challenges we view as important for future breakthroughs in deep learning are the following:

1. How should we deal with the fundamental challenges behind unsupervised learning, such as intractable inference and sampling? See Chapters 15, 16, and 17.

2. How can we build and train much larger and more adaptive and reconfigurable deep architectures, thus maximizing the advantage one can draw from larger datasets? See Chapter 8.

3. How can we improve the ability of deep learning algorithms to disentangle the underlying factors of variation, or put more simply, make sense of the world around us? See Chapter 14 on this very basic question about what is involved in learning a good representation.

Copyright notice: this is an original article by the blog author and may not be reproduced without permission.
