Theories of Deep Learning

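I am using this course to dig into a few strategically important topics in depth. Red text marks concepts worth deeper study, along with points to watch out for that came to mind along the way.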

Lecture 01

Lecture01: Deep Learning Challenge. Is There Theory? (Donoho/Monajemi/Papyan)

Video link

Purely an introduction; not much to note.

Lecture 02

Video: Stats385 - Theories of Deep Learning - David Donoho - Lecture 2

Resources: http://deeplearning.net/reading-list/ [an interesting link]

Readings for this lecture

1 A mathematical theory of deep convolutional neural networks for feature extraction
2 Energy propagation in deep convolutional neural networks
3 Discrete deep feature extraction: A theory and new architectures
4 Topology reduction in deep convolutional feature extraction networks

Key points:

Unfamiliar concepts: energy propagation, topology reduction

Lecturer said:

"Deep learning is simply an era where brute force has suddenly exploded its potential."

"How to use brute-force methods (with limited scope) to yield results."

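An introduction to ImageNet, nothing much to say about it; then basic back-propagation.

One remark in passing:

Newton, the inventor of Newton's method, never imagined it being applied to something like neural networks. A slightly awkward aside.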

Common cost functions computed on the output [supplementary]; three are introduced:

Assume z is the actual output and t is the target output.

*squared error:*	E = (z-t)²/2
*cross entropy:*	E = -t log(z) - (1-t)log(1-z)
*softmax:*	E = -(z_i - log Σ_j exp(z_j)), where i is the correct class.
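
The three costs above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names are made up here, the cross-entropy version assumes z is a probability in (0, 1), and the softmax version assumes z is a list of raw scores (logits) with i the index of the correct class.

```python
import math

def squared_error(z, t):
    """E = (z - t)^2 / 2 for a scalar output z and target t."""
    return 0.5 * (z - t) ** 2

def binary_cross_entropy(z, t, eps=1e-12):
    """E = -t log(z) - (1 - t) log(1 - z); z is assumed to be a probability."""
    z = min(max(z, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -t * math.log(z) - (1.0 - t) * math.log(1.0 - z)

def softmax_cross_entropy(logits, correct_class):
    """E = -(z_i - log sum_j exp(z_j)); logits are raw scores."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return -(logits[correct_class] - log_sum_exp)
```

Subtracting the maximum before exponentiating does not change the value of the softmax cost; it only keeps the computation stable for large scores.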

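First difficult point:

Yann LeCun, one of the big names: http://yann.lecun.com/exdb/publis/pdf/lecun-88.pdf

Understanding back-propagation through a Lagrangian formulation, taken from the introduction of the linked paper.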
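
As a rough sketch of what that view looks like (the notation below is an assumption made for illustration and may differ from the paper's): treat the forward equations of an N-layer network as constraints, attach a multiplier b_k to each, and read back-propagation off the stationarity conditions.

```latex
% Assumed setup: x_k = F(W_k x_{k-1}) for k = 1..N, with cost C(x_N).
\mathcal{L}(W, x, b) = C(x_N) + \sum_{k=1}^{N} b_k^{\top}\bigl(x_k - F(W_k x_{k-1})\bigr)

% Stationarity in b_k recovers the forward pass:
\frac{\partial \mathcal{L}}{\partial b_k} = 0
  \;\Rightarrow\; x_k = F(W_k x_{k-1})

% Stationarity in x_N and in x_k (k < N) gives the backward recursion:
b_N = -\nabla C(x_N), \qquad
b_k = W_{k+1}^{\top}\,\mathrm{diag}\bigl(F'(W_{k+1} x_k)\bigr)\, b_{k+1}

% The weight gradient then follows from the multipliers:
\frac{\partial \mathcal{L}}{\partial W_k}
  = -\,\mathrm{diag}\bigl(F'(W_k x_{k-1})\bigr)\, b_k\, x_{k-1}^{\top}
```

With b_k = -∂C/∂x_k, the last line is the usual back-propagated gradient of C with respect to W_k.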

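The lecture then introduces common convolutional network architectures and the features each one brought in.

On regularization, the lecture notes that dropout has an effect equivalent to ridge regression. For a linear model this holds in expectation: averaging the squared loss over dropout masks yields a data-fit term plus an L2-type penalty on the weights.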
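
The dropout and ridge connection can be checked numerically for the simplest case. The sketch below assumes a linear model with dropout applied to the inputs, each kept independently with probability keep_prob; the data, weights, and function names are made up for illustration. It compares the exact expected dropout loss (by enumerating every mask) with a data-fit term plus a weighted L2 penalty.

```python
import itertools
import numpy as np

def expected_dropout_loss(X, y, w, keep_prob):
    """Exact E_M ||y - (X * M) w||^2 over Bernoulli(keep_prob) masks.

    The loss is a sum over rows, so by linearity of expectation it is
    enough to enumerate one mask shared by all rows.
    """
    d = X.shape[1]
    total = 0.0
    for mask in itertools.product([0.0, 1.0], repeat=d):
        m = np.array(mask)
        prob = np.prod(np.where(m == 1.0, keep_prob, 1.0 - keep_prob))
        total += prob * np.sum((y - (X * m) @ w) ** 2)
    return total

def ridge_like_loss(X, y, w, keep_prob):
    """Data fit with shrunken inputs plus a feature-weighted L2 penalty."""
    data_fit = np.sum((y - keep_prob * (X @ w)) ** 2)
    penalty = keep_prob * (1.0 - keep_prob) * np.sum((X ** 2).sum(axis=0) * w ** 2)
    return data_fit + penalty

X = np.array([[1.0, 2.0, 0.5],
              [0.0, -1.0, 1.5],
              [2.0, 0.5, -0.5],
              [1.0, 1.0, 1.0]])
y = np.array([1.0, 0.0, 2.0, -1.0])
w = np.array([0.5, -0.25, 1.0])

lhs = expected_dropout_loss(X, y, w, keep_prob=0.8)
rhs = ridge_like_loss(X, y, w, keep_prob=0.8)
```

When the columns of X are standardized to equal norm, the penalty reduces to a constant times the squared norm of w, which is the ridge penalty.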

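In the loss function, weight decay is the coefficient placed in front of the regularization term. The regularization term generally reflects the complexity of the model, so weight decay controls how much model complexity contributes to the loss. If the weight decay is large, a complex model incurs a correspondingly large loss value.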
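
A minimal sketch of that relationship, assuming the regularization term is the common squared L2 norm of the weights (the notes do not say which regularizer is meant, and the function name is hypothetical):

```python
import numpy as np

def regularized_loss(data_loss, weights, weight_decay):
    """Total loss = data loss + weight_decay * (1/2) * sum of squared weights."""
    penalty = 0.5 * sum(np.sum(w ** 2) for w in weights)
    return data_loss + weight_decay * penalty

small_weights = [np.array([0.1, -0.2])]  # a "simple" model
large_weights = [np.array([3.0, 4.0])]   # a "complex" model

# With the same data loss, the large-weight model is penalized far more,
# and the gap grows with the weight decay coefficient.
for weight_decay in (0.0, 0.01, 0.1):
    simple = regularized_loss(1.0, small_weights, weight_decay)
    complex_ = regularized_loss(1.0, large_weights, weight_decay)
    print(weight_decay, simple, complex_)
```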

Second difficult point:

The comparison of AlexNet vs. Olshausen and Field raises some deeper questions:

Why does AlexNet learn filters similar to Olshausen/Field?
Is there an implicit sparsity-promotion in training the network?
How would classification results change if we replaced the learned filters in the first layer with analytically defined wavelets, e.g. Gabors?
Filters in the first layer are spatially localized, oriented and bandpass. What properties do filters in remaining layers satisfy?
Can we derive mathematically?
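
For reference, an analytically defined Gabor filter of the kind mentioned in the questions above can be generated directly. This is a sketch using the standard real-valued Gabor form (a Gaussian envelope times an oriented cosine carrier); the parameter names and default values are choices made here, not taken from the lecture.

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real Gabor filter on a size x size grid (size is assumed to be odd).

    sigma: width of the Gaussian envelope (spatial localization)
    theta: orientation in radians
    wavelength: period of the cosine carrier (bandpass frequency)
    gamma: aspect ratio of the envelope; psi: phase offset
    """
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier oscillates along direction theta.
    x_rot = xs * np.cos(theta) + ys * np.sin(theta)
    y_rot = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + psi)
    return envelope * carrier

# A small bank at four orientations: localized, oriented, and bandpass.
bank = [gabor_kernel(11, sigma=3.0, theta=k * np.pi / 4, wavelength=6.0)
        for k in range(4)]
```

Such a bank could stand in for the first-layer filters in the experiment the question proposes; whether accuracy would hold up is exactly what is being asked.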

These topics appear to be developed in later lectures; I am marking them here for now.

Reference reading: sparse coding, paper

Batch Normalization:

One question raised here is quite interesting:

Does this imply filters can be learned in an unsupervised manner?

Third difficult point:

Visualizing convolutional layers, and how DeepDream works.
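
The core idea behind DeepDream is gradient ascent on the input: the weights stay fixed and the input is nudged to amplify the activations of a chosen layer. The toy sketch below assumes a single fixed linear layer with a ReLU in place of a trained convolutional network; the matrix W, the starting point, and the function names are all made up for illustration.

```python
import numpy as np

# Stand-in for a trained layer: three fixed difference-style filters.
W = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])

def objective(x):
    """Half the squared L2 norm of the layer's ReLU activations."""
    act = np.maximum(W @ x, 0.0)
    return 0.5 * np.sum(act ** 2)

def objective_grad(x):
    """Gradient of the objective with respect to the input x."""
    act = np.maximum(W @ x, 0.0)
    return W.T @ act

def dream(x0, steps=10, step_size=0.1):
    """Gradient ascent on the input, with a normalized step size."""
    x = x0.copy()
    history = [objective(x)]
    for _ in range(steps):
        g = objective_grad(x)
        x += step_size * g / (np.abs(g).mean() + 1e-8)
        history.append(objective(x))
    return x, history

x0 = np.array([0.1, 0.0, 0.2, 0.0])
x_final, history = dream(x0)
```

In a real network the gradient would come from back-propagating through the layers up to the chosen one, and the input would be an image rather than a four-element vector.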

Fourth difficult point:

One more difficult point to add: weight initialization strategies
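
As a reference point, here is a sketch of two widely used initialization schemes, Glorot (Xavier) uniform and He normal. The notes do not say which strategies the lecture covers, so treat the choice of these two as an assumption.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Uniform in [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_normal(fan_in, fan_out, rng):
    """Zero-mean normal with std = sqrt(2 / fan_in), suited to ReLU layers."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W_glorot = glorot_uniform(512, 512, rng)
W_he = he_normal(512, 512, rng)
```

Both schemes scale the weight variance by the layer width so that the magnitude of activations stays roughly constant from layer to layer at the start of training.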

Links:

Links to blog posts covering the difficult points above will be added here in the future.
