Ablation Study
We often come across 'ablation study' in machine learning papers, for example, in this paper with the original R-CNN, it has a section of ablation studies. But what does this means?
Well, we know that when we build a model, we usually have different components of the model. If we remove some component of the model, what's the effect on the model? This is a very coarse definition of ablation study - we want to see the contributions of some proposed components in the model by comparing the model including this component with that without this component.
In the above paper, in order to see the effect of fine-tuning of the CNN, the authors analyzed the performance of the model with the fine-tuning and the performance of it without the fine-tuning. This way, we can easily see the effect of the fine-tuning.
The following I copied from the answer of Jonathan Uesato on Quora, it explains very well:
- An LSTM has 4 gates: feature, input, output, forget. We might ask: are all 4 necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable example (which is simpler).
- If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
- If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
- Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.
Ablation Study的更多相关文章
- 深度学习研究理解5:Visualizing and Understanding Convolutional Networks(转)
Visualizing and understandingConvolutional Networks 本文是Matthew D.Zeiler 和Rob Fergus于(纽约大学)13年撰写的论文,主 ...
- 《DSOD:Learning Deeply Supervised Object Detectors from Scratch》翻译
原文地址:https://arxiv.org/pdf/1708.01241 DSOD:从零开始学习深度有监督的目标检测器 Abstract摘要: 我们提出了深入的监督对象检测器(DSOD),一个框架, ...
- 论文笔记(2):Deep Crisp Boundaries: From Boundaries to Higher-level Tasks
---------------------------------------------------------------------------------------------------- ...
- SCNN车道线检测--(SCNN)Spatial As Deep: Spatial CNN for Traffic Scene Understanding(论文解读)
Spatial As Deep: Spatial CNN for Traffic Scene Understanding 收录:AAAI2018 (AAAI Conference on Artific ...
- [Arxiv1706] Few-Example Object Detection with Model Communication 论文笔记
p.p1 { margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px "Helvetica Neue"; color: #042eee } p. ...
- [论文解读]CNN网络可视化——Visualizing and Understanding Convolutional Networks
概述 虽然CNN深度卷积网络在图像识别等领域取得的效果显著,但是目前为止人们对于CNN为什么能取得如此好的效果却无法解释,也无法提出有效的网络提升策略.利用本文的反卷积可视化方法,作者发现了AlexN ...
- (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search
The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...
- Dual Attention Network for Scene Segmentation
Dual Attention Network for Scene Segmentation 原始文档 https://www.yuque.com/lart/papers/onk4sn 在本文中,我们通 ...
- 【中文版 | 论文原文】BERT:语言理解的深度双向变换器预训练
BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding 谷歌AI语言组论文<BERT:语言 ...
随机推荐
- pdf的使用遇到的问题
http://blog.csdn.net/atluckstar/article/details/77688972 回答网友提问 2015-7-28 因为好多人问能不能显示中文的问题,我总结大致分为两 ...
- KL散度=交叉熵-熵
熵:可以表示一个事件A的自信息量,也就是A包含多少信息. KL散度:可以用来表示从事件A的角度来看,事件B有多大不同. 交叉熵:可以用来表示从事件A的角度来看,如何描述事件B. 一种信息论的解释是: ...
- 201671010450-姚玉婷-实验十四 团队项目评审&课程学习总结
项目 内容 所属科目 软件工程http://www.cnblogs.com/nwnu-daizh 作业要求 https://www.cnblogs.com/nwnu-daizh/p/11093584. ...
- python网页内容提取神器lxml
一.Xpath是什么 XPath 是一门在 XML 文档中查找信息的语言.XPath 用于在 XML 文档中通过元素和属性进行导航. XPath 使用路径表达式在 XML 文档中进行导航 XPath ...
- RMP和YUM软件安装
1.卸载RPM包 rpm -e rpm包的名称 2.安装rpm包 rmp -ivh xxx.rpm 3.查询yum服务器是否有需要安装的软件 yum list|grep xxx软件列表 4.yum安装 ...
- NOIP 2015 推销员
洛谷 P2672 推销员 洛谷传送门 JDOJ 2994: [NOIP2015]推销员 T4 JDOJ传送门 Description 阿明是一名推销员,他奉命到螺丝街推销他们公司的产品.螺丝街是一条死 ...
- Django 1.11 bootstrap样式文件无法加载问题解决
先吐槽一波,多看官方教程,多找对应版本解决方法,多思考!... 在调试模式下面,打开页面无法加载bootstrap.min.css样式,解决思路如下: 查看settings文件INSTALL_APP配 ...
- 你好,我叫Flask
首先,要看你学没学过Django 如果学过Django 的同学,请从头看到尾,如果没有学过Django的同学,并且不想学习Django的同学,轻饶过第一部分 一. Python 现阶段三大主流Web框 ...
- concurrent(六)同步辅助器CyclicBarrier & 源码分析
参考文档:Java多线程系列--“JUC锁”10之 CyclicBarrier原理和示例:https://www.cnblogs.com/skywang12345/p/3533995.html简介Cy ...
- Azure容器监控部署(下)
上文已经基本完成了环境的搭建,prometheus可以以https的方式从node_exporter和cAdvisor上pull到数据,访问grafana时也可以以https的方式访问,安全性得到了一 ...