(Repost) Deep learning architecture diagrams
FastML
Machine learning made easy
2016-09-30
Like a wild stream after a wet season in the African savanna diverges into many smaller streams forming lakes and puddles, deep learning has diverged into a myriad of specialized architectures. Each architecture has a diagram. Here are some of them.
Neural networks are conceptually simple, and that’s their beauty. A bunch of homogeneous, uniform units, arranged in layers, weighted connections between them, and that’s all. At least in theory. Practice turned out to be a bit different. Instead of feature engineering, we now have architecture engineering, as described by Stephen Merity:
The romanticized description of deep learning usually promises that the days of hand crafted feature engineering are gone - that the models are advanced enough to work this out themselves. Like most advertising, this is simultaneously true and misleading.
Whilst deep learning has simplified feature engineering in many cases, it certainly hasn’t removed it. As feature engineering has decreased, the architectures of the machine learning models themselves have become increasingly more complex. Most of the time, these model architectures are as specific to a given task as feature engineering used to be.
To clarify, this is still an important step. Architecture engineering is more general than feature engineering and provides many new opportunities. Having said that, however, we shouldn’t be oblivious to the fact that where we are is still far from where we intended to be.
Not quite as bad as the doings of architecture astronauts, but not too good either.
An example of an architecture specific to a given task
LSTM diagrams
How to explain those architectures? Naturally, with a diagram. A diagram will make it all crystal clear.
Let’s first inspect the two most popular types of networks these days, CNN and LSTM. You’ve already seen a convnet diagram, so let’s turn to the iconic LSTM:
It’s easy, just take a closer look:
As they say, in mathematics you don’t understand things, you just get used to them.
Fortunately, there are good explanations, for example Understanding LSTM Networks and Written Memories: Understanding, Deriving and Extending the LSTM.
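If the diagram doesn't do it for you, code sometimes helps. Below is a minimal sketch of a single LSTM step in plain Python, using the common formulation with forget, input, and output gates plus a candidate cell value. The names `lstm_step`, `init_params`, and the parameter keys (`Wf`, `bf`, and so on) are made up for this illustration, and the weights are just small random numbers, so treat it as a reading aid rather than a reference implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    # W is a list of rows; computes W @ v + b
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def init_params(n_in, n_hidden, seed=0):
    # Hypothetical helper: one weight matrix and one bias vector per gate
    rng = random.Random(seed)
    params = {}
    for gate in "fiog":
        params["W" + gate] = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden)]
            for _ in range(n_hidden)
        ]
        params["b" + gate] = [0.0] * n_hidden
    return params


def lstm_step(x, h_prev, c_prev, p):
    z = x + h_prev  # list concatenation: [x, h_prev]
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]    # forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]    # input gate
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]    # output gate
    g = [math.tanh(a) for a in affine(p["Wg"], z, p["bg"])]  # candidate cell value
    # New cell state: keep part of the old one, write part of the candidate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed view of the cell state
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Run a short sequence through the cell, starting from zero state
params = init_params(n_in=3, n_hidden=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([1.0, 0.5, -0.5], [0.0, 1.0, 0.0]):
    h, c = lstm_step(x, h, c, params)
```

The whole thing is four affine maps, a few squashing functions, and two elementwise updates. The diagram shows the same structure, just with more arrows.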
LSTM still too complex? Let’s try a simplified version, GRU (Gated Recurrent Unit). Trivial, really.
Especially this one, called minimal GRU.
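For comparison, here is a rough sketch of one step of a standard GRU in plain Python: an update gate, a reset gate, and no separate cell state. It uses the convention where the update gate weights the candidate state (some papers swap the roles of `z` and `1 - z`). The function name `gru_step` and the parameter keys are assumptions made for this example, and it shows the ordinary GRU, not the minimal variant.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    # W is a list of rows; computes W @ v + b
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def gru_step(x, h_prev, p):
    xh = x + h_prev  # list concatenation: [x, h_prev]
    z = [sigmoid(a) for a in affine(p["Wz"], xh, p["bz"])]  # update gate
    r = [sigmoid(a) for a in affine(p["Wr"], xh, p["br"])]  # reset gate
    # The candidate state sees the previous state only through the reset gate
    x_rh = x + [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a) for a in affine(p["Wh"], x_rh, p["bh"])]
    # Interpolate between the old state and the candidate
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


# Tiny hand-written parameters: 1 input, 1 hidden unit (illustrative values)
params = {
    "Wz": [[0.3, -0.2]], "bz": [0.0],
    "Wr": [[0.1, 0.4]], "br": [0.0],
    "Wh": [[0.5, 0.5]], "bh": [0.0],
}
h = [0.0]
for x in ([1.0], [-1.0], [0.5]):
    h = gru_step(x, h, params)
```

Compared with the LSTM, there is one gate fewer and the hidden state doubles as the memory.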
More diagrams
Various modifications of LSTM are now common. Here’s one, called deep bidirectional LSTM:
DB-LSTM, PDF
The rest are pretty self-explanatory, too. Let’s start with a combination of CNN and LSTM, since you have both under your belt now:
Convolutional Residual Memory Network, 1606.05262
Dynamic NTM, 1607.00036
Evolvable Neural Turing Machines, PDF
Recurrent Model Of Visual Attention, 1406.6247
Unsupervised Domain Adaptation By Backpropagation, 1409.7495
Deeply Recursive CNN For Image Super-Resolution, 1511.04491
This diagram of a multilayer perceptron with synthetic gradients scores high on clarity:
MLP with synthetic gradients, 1608.05343
Every day brings more. Here’s a fresh one, again from Google:
Google’s Neural Machine Translation System, 1609.08144
And Now for Something Completely Different
Drawings from the Neural Network ZOO are pleasantly simple, but, unfortunately, serve mostly as eye candy. For example:
LSM, ESN and ELM
These look like not-fully-connected perceptrons, but are supposed to represent a Liquid State Machine, an Echo State Network, and an Extreme Learning Machine.
How does LSM differ from ESN? That’s easy, it has green neurons with triangles. But how does ESN differ from ELM? Both have blue neurons.
Seriously, while similar, ESN is a recurrent network and ELM is not. And this kind of thing should probably be visible in an architecture diagram.
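The difference is easier to see in code than in the drawings. The sketch below, in plain Python, computes only the hidden activations of the two models: both use fixed random weights, but the ESN feeds its previous state back in while the ELM does not. The function names `elm_hidden` and `esn_states` and the weight scales are assumptions for this illustration. Training the linear readout is left out, and so is the usual scaling of the reservoir matrix that a real ESN would need.

```python
import math
import random


def rand_matrix(rows, cols, rng, scale):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def elm_hidden(x, W_in):
    # ELM: a feedforward random projection; the output depends only on x
    return [math.tanh(a) for a in matvec(W_in, x)]


def esn_states(xs, W_in, W_res):
    # ESN: the hidden state feeds back through fixed recurrent weights W_res
    h = [0.0] * len(W_res)
    states = []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_res, h))]
        h = [math.tanh(v) for v in pre]
        states.append(h)
    return states


rng = random.Random(0)
n_in, n_hidden = 2, 4
W_in = rand_matrix(n_hidden, n_in, rng, 0.5)       # fixed, never trained
W_res = rand_matrix(n_hidden, n_hidden, rng, 0.5)  # fixed, never trained

# Two sequences that end with the same input but start differently
seq_a = [[1.0, 0.0], [0.5, 0.5]]
seq_b = [[-1.0, 0.0], [0.5, 0.5]]

esn_a = esn_states(seq_a, W_in, W_res)[-1]
esn_b = esn_states(seq_b, W_in, W_res)[-1]
elm_a = elm_hidden(seq_a[-1], W_in)
elm_b = elm_hidden(seq_b[-1], W_in)
```

Here `elm_a` and `elm_b` are identical because the ELM has no memory, while `esn_a` and `esn_b` differ because the ESN state carries history. That single feedback connection is what the diagram fails to show.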
Posted by Zygmunt Z. 2016-09-30 basics, neural-networks
Copyright © 2016 - Zygmunt Z. - Powered by Octopress