http://stats.stackexchange.com/questions/145768/importance-of-local-response-normalization-in-cnn

Caffe's explanation:

The local response normalization layer performs a kind of "lateral inhibition" by normalizing over local input regions. Lateral inhibition: at first glance it looks like an activation function.
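Concretely, AlexNet (Section 3.3) defines the response-normalized activity as b^i_{x,y} = a^i_{x,y} / (k + alpha * sum_j (a^j_{x,y})^2)^beta, where the sum runs over n neighbouring channels at the same spatial position (AlexNet uses k=2, n=5, alpha=1e-4, beta=0.75). Here is a minimal NumPy sketch of that formula; the function name and the (C, H, W) layout are my own choices:

```python
import numpy as np

def lrn_cross_channel(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Cross-channel LRN following AlexNet Sec. 3.3.

    a: activations of shape (C, H, W); n: how many neighbouring
    channels to sum over; k, alpha, beta: hyper-parameters
    (defaults are the values reported in the AlexNet paper).
    """
    a = np.asarray(a, dtype=np.float64)
    C = a.shape[0]
    b = np.empty_like(a)
    for i in range(C):
        lo = max(0, i - n // 2)
        hi = min(C - 1, i + n // 2)
        # squared activations at the same (x, y), summed over the
        # n channels surrounding channel i
        s = (a[lo:hi + 1] ** 2).sum(axis=0)
        b[i] = a[i] / (k + alpha * s) ** beta
    return b
```

Each channel is divided by an energy term pooled over its neighbours, so a strongly responding channel damps its neighbours' normalized outputs, which is the competition the lateral-inhibition analogy refers to.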

Several explanations, taken from the answers at the link above (quoted below; source as linked):

(1) It makes the optimization computation faster:

Here is my suggested answer, though I don't claim to be knowledgeable. When performing gradient descent on a linear model, the error surface is quadratic, with the curvature determined by XX^T, where X is your input. Now the ideal error surface for gradient descent has the same curvature in all directions (otherwise the step size is too small in some directions and too big in others). Normalising your inputs by rescaling them to mean zero, variance 1 helps and is fast: now the directions along each dimension all have the same curvature, which in turn bounds the curvature in the other directions.

The optimal solution would be to sphere/whiten the inputs to each neuron; however, this is computationally too expensive. LCN (local contrast normalization) can be justified as an approximate whitening, based on the assumption of a high degree of correlation between neighbouring pixels (or channels). So I would claim the benefit is that the error surface is more benign for SGD: a single learning rate works well across the input dimensions (of each neuron).
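The conditioning argument is easy to check numerically. Below is a toy sketch (dimensions, scales, and seed are all illustrative): for squared loss on a linear model the curvature matrix is XX^T/m, its eigenvalue spread is huge when input dimensions have very different scales, and standardizing each dimension to mean 0, variance 1 brings the spread close to 1:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two input dimensions with wildly different scales; for squared loss
# on a linear model the curvature (Hessian) is X @ X.T / m.
X = rng.normal(size=(2, 10000)) * np.array([[100.0], [1.0]])

def curvature_spread(X):
    H = X @ X.T / X.shape[1]           # curvature matrix
    w = np.linalg.eigvalsh(H)
    return w.max() / w.min()           # condition number

print(curvature_spread(X))             # ~1e4: one direction far steeper

# mean 0, variance 1 per dimension, as the answer suggests
Xn = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
print(curvature_spread(Xn))            # ~1: same curvature everywhere
```

With roughly equal curvature in every direction, a single step size works everywhere, which is exactly the benefit claimed above.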

(2) The NIPS 2012 paper mentions that with ReLU this normalization is not strictly required, but it is still added because it improves generalization; the results improved by about 2% (it is not stated which activation function this was measured with).

Indeed, there seems to be no good explanation in a single place. The best is to read the articles it comes from:

The original AlexNet article explains a bit in Section 3.3:

Krizhevsky, Sutskever, and Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012. www.cs.toronto.edu/~fritz/absps/imagenet.pdf
The exact way of doing this was proposed in (but not much extra info here):

Kevin Jarrett, Koray Kavukcuoglu, Marc’Aurelio Ranzato and Yann LeCun, What is the best Multi-Stage Architecture for Object Recognition?, ICCV 2009. yann.lecun.com/exdb/publis/pdf/jarrett-iccv-09.pdf
It was inspired by computational neuroscience:

S. Lyu and E. Simoncelli. Nonlinear image representation using divisive normalization. CVPR 2008. www.cns.nyu.edu/pub/lcv/lyu08b.pdf . This paper goes deeper into the math, and is in accordance with the answer of seanv507.
N. Pinto, D. D. Cox, and J. J. DiCarlo. Why is real-world visual object recognition hard? PLoS Computational Biology, 2008. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0040027
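For completeness, here is how such a layer is typically declared in a Caffe prototxt. This is a sketch: the layer and blob names are illustrative, and the hyper-parameters shown are the AlexNet values (lrn_param also accepts k, which defaults to 1 in Caffe):

```
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"   # output of the preceding conv/ReLU stage
  top: "norm1"
  lrn_param {
    local_size: 5   # n: number of channels to sum over
    alpha: 0.0001
    beta: 0.75
  }
}
```

By default Caffe normalizes ACROSS_CHANNELS, which matches the AlexNet formula; setting norm_region to WITHIN_CHANNEL normalizes over a spatial neighbourhood within each channel instead.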
