(转)Awesome Knowledge Distillation
Awesome Knowledge Distillation
2018-07-19 10:38:40
Reference:https://github.com/dkozlov/awesome-knowledge-distillation
Papers
- Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
- Model Compression, Rich Caruana, 2006
- Dark knowledge, Geoffrey Hinton , OriolVinyals & Jeff Dean, 2014
- Learning with Pseudo-Ensembles, Philip Bachman, Ouais Alsharif, Doina Precup, 2014
- Distilling the Knowledge in a Neural Network, Hinton, J.Dean, 2015
- Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, 2015
- Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization, Baohan Xu, Yanwei Fu, Yu-Gang Jiang, Boyang Li, Leonid Sigal, 2015
- Distilling Model Knowledge, George Papamakarios, 2015
- Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
- Learning Using Privileged Information: Similarity Control and Knowledge Transfer, Vladimir Vapnik, Rauf Izmailov, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
- Do deep convolutional nets really need to be deep and convolutional?, Gregor Urban, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdelrahman Mohamed, Matthai Philipose, Matt Richardson, 2016
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015
- Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
- Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
- Sequence-Level Knowledge Distillation, deeplearning-papernotes, Yoon Kim, Alexander M. Rush, 2016
- MobileID: Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang and Xiaoou Tang, 2016
- Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji,2016
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2017
- Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, 2017
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
- Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
- Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas and Francois Fleuret, 2017
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model, Jiasen Lu1, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra 2017
- Learning Efficient Object Detection Models with Knowledge Distillation, Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker, 2017
- Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification, Chong Wang, Xipeng Lan and Yangang Zhang, 2017
- Learning Transferable Architectures for Scalable Image Recognition, Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le, 2017
- Revisiting knowledge transfer for training object class detectors, Jasper Uijlings, Stefan Popov, Vittorio Ferrari, 2017
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
- Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
- Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, 2017
- Interpreting Deep Classifiers by Visual Distillation of Dark Knowledge, Kai Xu, Dae Hoon Park, Chang Yi, Charles Sutton, 2018
- Efficient Neural Architecture Search via Parameters Sharing, Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean, 2018
- Transparent Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Albert Gordo, 2018
- Defensive Collaborative Multi-task Training - Defending against Adversarial Attack towards Deep Neural Networks, Derek Wang, Chaoran Li, Sheng Wen, Yang Xiang, Wanlei Zhou, Surya Nepal, 2018
- Deep Co-Training for Semi-Supervised Image Recognition, Siyuan Qiao, Wei Shen, Zhishuai Zhang, Bo Wang, Alan Yuille, 2018
- Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2018
- Multimodal Recurrent Neural Networks with Information Transfer Layers for Indoor Scene Labeling, Abrar H. Abdulnabi, Bing Shuai, Zhen Zuo, Lap-Pui Chau, Gang Wang, 2018
Videos
- Dark knowledge, Geoffrey Hinton, 2014
- Model Compression, Rich Caruana, 2016
Implementations
MXNet
PyTorch
- Attention Transfer
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
- Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
- A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
- Mean teachers are better role models
Lua
Torch
- Distilling knowledge to specialist ConvNets for clustered classification
- Sequence-Level Knowledge Distillation, Neural Machine Translation on Android
- cifar.torch distillation
Theano
- FitNets: Hints for Thin Deep Nets
- Transfer knowledge from a large DNN or an ensemble of DNNs into a small DNN
Lasagne + Theano
Tensorflow
- Deep Model Compression: Distilling Knowledge from Noisy Teachers
- Distillation
- An example application of neural network distillation to MNIST
- Data-free Knowledge Distillation for Deep Neural Networks
- Inspired by net2net, network distillation
- Deep Reinforcement Learning, knowledge transfer
- Knowledge Distillation using Tensorflow
Caffe
- Face Model Compression by Distilling Knowledge from Neurons
- KnowledgeDistillation Layer (Caffe implementation)
- Knowledge distillation, realized in caffe
- Cross Modal Distillation for Supervision Transfer
Keras
- Knowledge distillation with Keras
- keras google-vision's distillation
- Distilling the knowledge in a Neural Network
(转)Awesome Knowledge Distillation的更多相关文章
- Focal and Global Knowledge Distillation for Detectors
一. 概述 论文地址:链接 代码地址:链接 论文简介: 此篇论文是在CGNet上增加部分限制loss而来 核心部分是将gt框变为mask进行蒸馏 注释:仅为阅读论文和代码,未进行试验,如有漏错请不吝指 ...
- Feature Fusion for Online Mutual Knowledge Distillation (CVPR 2019)
一.解决问题 如何将特征融合与知识蒸馏结合起来,提高模型性能 二.创新点 支持多子网络分支的在线互学习 子网络可以是相同结构也可以是不同结构 应用特征拼接.depthwise+pointwise,将特 ...
- The Brain as a Universal Learning Machine
The Brain as a Universal Learning Machine This article presents an emerging architectural hypothesis ...
- Classifying plankton with deep neural networks
Classifying plankton with deep neural networks The National Data Science Bowl, a data science compet ...
- [综述]Deep Compression/Acceleration深度压缩/加速/量化
Survey Recent Advances in Efficient Computation of Deep Convolutional Neural Networks, [arxiv '18] A ...
- Bag of Tricks for Image Classification with Convolutional Neural Networks论文笔记
一.高效的训练 1.Large-batch training 使用大的batch size可能会减小训练过程(收敛的慢?我之前训练的时候挺喜欢用较大的batch size),即在相同的迭代次数 ...
- 论文笔记:Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells 2019-04- ...
- ICML 2018 | 从强化学习到生成模型:40篇值得一读的论文
https://blog.csdn.net/y80gDg1/article/details/81463731 感谢阅读腾讯AI Lab微信号第34篇文章.当地时间 7 月 10-15 日,第 35 届 ...
- 【转载】NeurIPS 2018 | 腾讯AI Lab详解3大热点:模型压缩、机器学习及最优化算法
原文:NeurIPS 2018 | 腾讯AI Lab详解3大热点:模型压缩.机器学习及最优化算法 导读 AI领域顶会NeurIPS正在加拿大蒙特利尔举办.本文针对实验室关注的几个研究热点,模型压缩.自 ...
随机推荐
- git时光机操作
A状态:代码版本A B状态:代码版本B(比A状态时增加了图片.代码) 这时,git add. git commit -m"" .push之前,意识到忘了让git忽略图片的添加,就: ...
- uvalive 3353 Optimal Bus Route Design
题意: 给出n个点,以及每个点到其他点的有向距离,要求设计线路使得每一个点都在一个环中,如果设计的线路拥有最小值,那么这个线路就是可选的.输出这个最小值或者说明最小线路不存在. 思路: 在DAG的最小 ...
- Spark学习之路 (十四)SparkCore的调优之资源调优JVM的GC垃圾收集器
一.概述 垃圾收集 Garbage Collection 通常被称为“GC”,它诞生于1960年 MIT 的 Lisp 语言,经过半个多世纪,目前已经十分成熟了. jvm 中,程序计数器.虚拟机栈.本 ...
- django的母板和继承
Django模板中只需要记两种特殊符号: {{ }}和 {% %} {{ }}表示变量,在模板渲染的时候替换成值,{% %}表示逻辑相关的操作. 母板 <!DOCTYPE html> & ...
- 异常检测LOF
局部异常因子算法-Local Outlier Factor(LOF)在数据挖掘方面,经常需要在做特征工程和模型训练之前对数据进行清洗,剔除无效数据和异常数据.异常检测也是数据挖掘的一个方向,用于反作弊 ...
- python中__call__()方法的用法
__call__()的用法 __call__()方法能够让类的实例对象,像函数一样被调用: >>> >>> class A(object): def __call_ ...
- html5水平方向重力感应
html5图片随手机重力感应而移动 <!DOCTYPE html> <html lang="zh-cn"><head><meta http ...
- watch解放你的双手
有时候我们需要重复执行某个命令,观察某个文件和某个结果的变化情况.可以写脚本去实现这些需求,但是有更简单的方法,本文档要介绍的就是watch命令. 1. 以固定时间反复执行某个命令 root@jaki ...
- SQL注入(dvwa环境)
首先登录DVWA主页: 1.修改安全级别为LOW级(第一次玩别打脸),如图中DVWA Security页面中. 2.进入SQL Injection页面,出错了.(心里想着这DVWA是官网下的不至于玩不 ...
- Oracle笔记 #01# 简单分页
rownum是Oracle为查询结果分配的有序编号(总是从1~n).言下之意,rownum字段本来并不存在于表中,而是经查询后才分配的. 举一个例子: SELECT rownum, name, pri ...