Neural Architecture Search — Limitations and Extensions
2019-09-16 07:46:09
This post is reproduced from: https://towardsdatascience.com/neural-architecture-search-limitations-and-extensions-8141bec7681f
For the past couple of years, researchers and companies have been trying to make deep learning more accessible to non-experts by providing access to pre-trained computer vision or machine translation models. Using a pre-trained model for another task is known as transfer learning, but it still requires sufficient expertise to fine-tune the model on another dataset. Fully automating this procedure allows even more users to benefit from the great progress that has been made in ML to date.
This is called AutoML, and it can cover many parts of predictive modelling such as architecture search and hyperparameter optimization. In this post, I focus on the former, as there has been a recent explosion of methods that search for the “best” architecture for a given dataset. The results presented are based on joint work with Jonathan Lorraine.
Importance of Neural Architecture Search
As a side note, keep in mind that a neural network with just a single hidden layer and a non-linear activation function can approximate any continuous function arbitrarily well, provided that the layer has enough neurons (the Universal Approximation Theorem, UAT). However, such a simple architecture, though theoretically able to approximate any function, does not capture the hierarchical processing that occurs in the human visual cortex. The architecture of a neural network gives it an inductive bias: shallow, wide networks that do not use convolutions are significantly worse at image classification than deep, convolutional networks. Thus, for neural networks to generalize rather than overfit the training dataset, it is important to find architectures with the right inductive bias (regardless of whether those architectures are inspired by the brain).
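For readers who want the precise claim behind "UAT", the classical statement (Cybenko, 1989) says: for any continuous function $f$ on a compact set $K \subset \mathbb{R}^d$ and any tolerance $\varepsilon > 0$, there exist a width $N$, weights $w_i, b_i$, and coefficients $\alpha_i$ such that

$$\sup_{x \in K}\left|\, f(x) - \sum_{i=1}^{N} \alpha_i\,\sigma(w_i^\top x + b_i) \right| < \varepsilon,$$

where $\sigma$ is a sigmoidal activation. Note that the required width $N$ can grow very quickly with the desired precision, which is one intuition for why depth and architecture matter in practice.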
Neural Architecture Search (NAS) Overview
NAS was an inspiring work out of Google that led to several follow-up works such as ENAS, PNAS, and DARTS. It involves training a recurrent neural network (RNN) controller with reinforcement learning (RL) to automatically generate architectures. These architectures then have their weights trained and are evaluated on a validation set. Their validation performance is the reward signal for the controller, which then increases the probability of generating architectures that performed well and decreases the probability of generating architectures that performed poorly. For non-technical readers: this essentially takes the process of a human manually tweaking a neural network and learning what works, and automates it. The idea of automatically creating NN architectures did not originate with NAS (approaches based on genetic algorithms, for example, existed long before), but NAS used RL to efficiently search a space that is prohibitively large to search exhaustively. Below, the components of NAS are analyzed in a bit more depth, before I discuss the limitations of the method, its more efficient successor ENAS, and an interesting failure mode. The next two subsections are best understood while comparing the text against the figure below, which shows how architectures are sampled and trained:
Diagram of how NAS Works
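To see the reward mechanism in isolation, here is a self-contained toy sketch of the REINFORCE update described above (my own illustration, not the paper's code). The "architectures" are just four categorical choices and their "validation accuracies" are made up, but the update rule, raising the log-probability of above-baseline architectures and lowering the rest, is the same signal NAS uses:

```python
import torch

# Policy: a distribution over 4 hypothetical architectures.
logits = torch.zeros(4, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.1)
# Stand-in for "train the child model, then measure validation accuracy":
fake_val_accuracy = [0.52, 0.91, 0.60, 0.43]
baseline = 0.0  # moving-average baseline reduces gradient variance

for step in range(300):
    dist = torch.distributions.Categorical(logits=logits)
    arch = dist.sample()                     # controller "generates" an architecture
    reward = fake_val_accuracy[arch.item()]  # its validation accuracy is the reward
    baseline = 0.9 * baseline + 0.1 * reward
    # REINFORCE: scale the log-prob of the sampled choice by its advantage
    loss = -(reward - baseline) * dist.log_prob(arch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.softmax(logits, dim=0))  # most mass lands on architecture 1 (reward 0.91)
```

Running this, the policy's probability mass concentrates on the highest-reward choice, which is exactly the behavior the controller exhibits over the space of real architectures.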
LSTM Controller
The controller generates architectures by making a series of choices for a pre-defined number of time steps. For example, when generating a convolutional architecture, the controller begins by creating only architectures with 6 layers. For each layer, the controller makes just 4 decisions: filter height, filter width, number of filters, and stride (24 time steps in total). Assuming that the first layer is numbered 0, the decisions for layer i are made at time steps 4i through 4i + 3.
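To make this sequential decision process concrete, here is a minimal PyTorch sketch (again my own illustration, not the original implementation) of an LSTM controller that makes these 4 decisions for each of 6 layers, 24 sampling steps in total. The candidate values in CHOICES and all network sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Illustrative candidate values for each per-layer decision.
CHOICES = {
    "filter_height": [1, 3, 5, 7],
    "filter_width":  [1, 3, 5, 7],
    "num_filters":   [24, 36, 48, 64],
    "stride":        [1, 2, 3],
}
NUM_LAYERS = 6  # 6 layers x 4 decisions = 24 time steps

class ControllerRNN(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.hidden = hidden
        self.cell = nn.LSTMCell(hidden, hidden)
        self.start = nn.Parameter(torch.zeros(1, hidden))  # learned start input
        # One softmax head per decision type.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden, len(vals)) for name, vals in CHOICES.items()}
        )

    def sample(self):
        """Sample one architecture; return it with the log-prob of each decision."""
        h = torch.zeros(1, self.hidden)
        c = torch.zeros(1, self.hidden)
        x = self.start
        arch, log_probs = [], []
        for layer in range(NUM_LAYERS):
            layer_cfg = {}
            for name, values in CHOICES.items():  # time steps 4i .. 4i+3
                h, c = self.cell(x, (h, c))
                dist = torch.distributions.Categorical(logits=self.heads[name](h))
                idx = dist.sample()
                log_probs.append(dist.log_prob(idx))
                layer_cfg[name] = values[idx.item()]
                x = h  # simplified: feed the hidden state forward as the next input
            arch.append(layer_cfg)
        return arch, torch.cat(log_probs)

controller = ControllerRNN()
arch, log_probs = controller.sample()
print(arch[0])  # e.g. {'filter_height': 3, 'filter_width': 5, 'num_filters': 48, 'stride': 1}
```

The returned log-probabilities are what the REINFORCE update above would scale by the validation-accuracy advantage; in the real method the sampled configuration is compiled into a child network whose validation accuracy supplies the reward.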