Neural Architecture Search — Limitations and Extensions
Neural Architecture Search — Limitations and Extensions
2019-09-16 07:46:09
This blog is from: https://towardsdatascience.com/neural-architecture-search-limitations-and-extensions-8141bec7681f
For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another dataset. Fully automating this procedure allows even more users to benefit from the great progress that has been made in ML to date.
This is called AutoML, and it can cover many parts of predictive modelling such as architecture search and hyperparameter optimization. In this post, I focus on the former, as there has been a recent explosion of methods that search for the “best” architecture for a given dataset. The results presented are based on joint work with Jonathan Lorraine.
Importance of Neural Architecture Search
As a side note, keep in mind that a neural network with just a single hidden layer and non-linear activation function is able to represent any function possible, provided that there are sufficient neurons in that layer (UAT). However, such a simple architecture, though it is theoretically able to learn any function, does not represent the hierarchical processing that occurs in the human visual cortex. The architecture of a neural network gives it an inductive bias, and shallow, wide networks that do not use convolutions are significantly worse at image classification tasks than deep, convolutional networks. Thus, in order for neural networks to generalize and not overfit the training dataset, it’s important to find architectures with the right inductive bias (regardless if those architectures are inspired by the brain or not).
Neural Architecture Search (NAS) Overview
NAS was an inspiring work out of Google that lead to several follow up works such as ENAS, PNAS, and DARTS. It involves training a recurrent neural network (RNN) controller using reinforcement learning (RL) to automatically generate architectures. These architectures then have their weights trained and are evaluated on a validation set. Their performance on the validation set is the reward signal for the controller which then increases its probability of generating architectures that have done well, and decreases the probability of generating architectures that have done poorly. For non-technical readers, this essentially takes the process of a human manually tweaking a neural network and learning what works well, and automating it. The idea of automatically creating NN architectures was not coined by NAS, as other approaches using methods such as genetic algorithms existed long before, but NAS effectively used RL to efficiently search a space that is prohibitively large to search exhaustively. Below, the components of NAS are analyzed in a bit more depth, before I go on to discuss the limitations of the method as well as its more efficient successor ENAS, as well as an interesting failure mode. The next 2 subsections are best understood while comparing the text again the below figure showing how architectures are sampled and trained:

Diagram of how NAS Works
LSTM Controller
The controller generates architectures by making a series of choices for a pre-defined amount of time steps. For example, when generating a convolutional architecture, the controller begins by only creating architectures with 6 layers in them. For each layer, just 4 decisions are made by the controller: filter height, filter width, number of filters, and stride (so 24 time steps). Assuming that the first layer is numbered 0, then the decisions
Neural Architecture Search — Limitations and Extensions的更多相关文章
- 论文笔记:Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells 2019-04- ...
- (转)Illustrated: Efficient Neural Architecture Search ---Guide on macro and micro search strategies in ENAS
Illustrated: Efficient Neural Architecture Search --- Guide on macro and micro search strategies in ...
- (转)The Evolved Transformer - Enhancing Transformer with Neural Architecture Search
The Evolved Transformer - Enhancing Transformer with Neural Architecture Search 2019-03-26 19:14:33 ...
- 论文笔记:ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware 2019-03-19 16:13:18 Pape ...
- 论文笔记:Progressive Neural Architecture Search
Progressive Neural Architecture Search 2019-03-18 20:28:13 Paper:http://openaccess.thecvf.com/conten ...
- 论文笔记:Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation2019-03-18 14:4 ...
- 论文笔记系列-Neural Architecture Search With Reinforcement Learning
摘要 神经网络在多个领域都取得了不错的成绩,但是神经网络的合理设计却是比较困难的.在本篇论文中,作者使用 递归网络去省城神经网络的模型描述,并且使用 增强学习训练RNN,以使得生成得到的模型在验证集上 ...
- 小米造最强超分辨率算法 | Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search
本篇是基于 NAS 的图像超分辨率的文章,知名学术性自媒体 Paperweekly 在该文公布后迅速跟进,发表分析称「属于目前很火的 AutoML / Neural Architecture Sear ...
- Research Guide for Neural Architecture Search
Research Guide for Neural Architecture Search 2019-09-19 09:29:04 This blog is from: https://heartbe ...
随机推荐
- 关于如何往Jupyter notebook添加可选的kernel
关于如何往Jupyter notebook添加可选的kernel 1. Anaconda知识预热 管理虚拟环境 关于如何安装Anaconda,这里就不再一一赘述,安装完Anaconda,接下来我们就可 ...
- vue.js下移动端绑定click事件失效,pc端正常的问题
原因可能是 我在项目中使用到了 better-scroll,默认它会阻止 touch 事件.所以在配置中需要加上 click: true 即可. 例如: mounted () { this.scrol ...
- Trunk 实现跨交换机 VLAN 通信
当网络中有多台交换机时,位于不同交换机上的相同VLAN的主机之间时如何通信的呢?我们使用Trunk实现跨交换机VLAN通信.还有以太网通道的操作哦. 实验拓扑 两台交换机直连,每台下面再连接两台VPC ...
- Linux命令——pidof
参考:Linux pidof Command Examples To Find PID of A Program/Command Linux pidof Command Tutorial for Be ...
- ps aux|grep *** 解释
对进程进行监测和控制,首先必须要了解当前进程的情况,也就是需要查看当前进程, 而ps命令(Process Status)就是最基本同时也是非常强大的进程查看命令. 使用该命令 可以确定有哪些进程正在运 ...
- javascript取元素里面的所有文本内容,过滤掉标签
textContent主要用法 备注:工作要取富文本里面的内容,但是只取开头前50个左右字符串,就想到textContent,大致总结了一下,大家可以借鉴参考一下textContent有更加信息的内容 ...
- 前端安全问题之CSRF和XSS
一.CSRF 1.什么是 CSRF CSRF(全称 Cross-site request forgery),即跨站请求伪造 2.攻击原理 用户登录A网站,并生成 Cookie,在不登出的情况下访问危险 ...
- python coding style guide 的快速落地实践——业内python 编码风格就pep8和谷歌可以认作标准
python coding style guide 的快速落地实践 机器和人各有所长,如coding style检查这种可自动化的工作理应交给机器去完成,故发此文帮助你在几分钟内实现coding st ...
- Kotlin注解深入解析与实例剖析
在上一次https://www.cnblogs.com/webor2006/p/11522798.html中学习了Kotlin注解相关的东东,这次继续对Kotlin的注解继续学习: 注解也可以拥有自己 ...
- http消息与webservice
别人的:在一台配置较低的PC上,同时开启服务端与客户端,10000条数据,使用基于http的消息逐条进行传递,从开始传递至全部接收并处理完毕,大概需要465秒的时间:而在同一台机器上,使用WebSer ...