D. Alistarh, D. Grubic, J. Li, R. Tomioka, and M. Vojnovic, "QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding," Advances in Neural Information Processing Systems, vol. 30, 2017, Accessed: Jul. 31, 2021. [Online]. Availabl…
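Because they scale well, parallel implementations of stochastic gradient descent (SGD) have been a focus of recent research. The key obstacle to parallelizing SGD is the high bandwidth cost of exchanging gradient updates between nodes. Researchers have therefore proposed heuristic gradient-compression methods in which nodes transmit only compressed gradients. Although these heuristics work well in practice, they do not always converge. This paper proposes Quantized SGD (QSGD), a family of compression schemes with convergence guarantees and good practical performance. QSGD lets the user smoothly trade off communication bandwidth against convergence time: nodes can adjust the number of bits sent in each iteration, possibly at the cost of higher variance. This trade-off is…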
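The bits-versus-variance trade-off can be made concrete with a small sketch of stochastic quantization in the style of QSGD. This is an illustration under stated assumptions, not the authors' implementation: the function name `quantize`, the level count `s`, and the pure-Python list representation are all hypothetical, and the encoding step that actually saves bandwidth is omitted. Each coordinate is rounded at random to one of `s` uniform levels of `|v_i| / ||v||_2`, so the result is unbiased; fewer levels mean fewer bits but higher variance.

```python
import math
import random


def quantize(v, s, rng=random):
    """Stochastically round each coordinate of v onto s uniform levels.

    Assumed scheme: levels are multiples of ||v||_2 / s, and rounding up
    happens with probability equal to the fractional part, so E[q] = v.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0] * len(v)
    out = []
    for x in v:
        scaled = abs(x) / norm * s   # position on the level grid, in [0, s]
        lower = math.floor(scaled)   # index of the level just below
        # Round up with probability (scaled - lower); this keeps the estimate unbiased.
        level = lower + (1 if rng.random() < scaled - lower else 0)
        out.append(math.copysign(norm * level / s, x))
    return out


# Hypothetical gradient vector; a smaller s gives a coarser (cheaper, noisier) result.
grad = [0.3, -1.2, 0.0, 2.5]
coarse = quantize(grad, s=1, rng=random.Random(0))
fine = quantize(grad, s=16, rng=random.Random(0))
```

With `s=1` every coordinate is either zero or plus/minus the full norm, which needs very few bits per coordinate; with `s=16` the output stays closer to `grad` on every draw.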
J. N. Tsitsiklis and Z.-Q. Luo, "Communication complexity of convex optimization," Journal of Complexity, vol. 3, no. 3, pp. 231–243, Sep. 1987, doi: 10.1016/0885-064x(87)90013-6. Problem statement: Two users each hold a convex function \(f_i\) and exchange as few binary messages as possible in order to find the minimizer of \(f_1+f_2\). Basic definitions \(…
Paper: "A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding". Pruning: learn only the important connections. All connections with weights below a threshold are removed from the network. Retrain the network to learn the…
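The pruning step described above can be sketched as follows. This is a minimal illustration, not the paper's code: the names `prune_by_magnitude` and `masked_sgd_step` are hypothetical, "below a threshold" is assumed to mean absolute value below the threshold, and the retraining step is assumed to keep pruned connections fixed at zero.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose magnitude is below threshold; return (pruned, mask)."""
    mask = [abs(w) >= threshold for w in weights]
    pruned = [w if keep else 0.0 for w, keep in zip(weights, mask)]
    return pruned, mask


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """One retraining step that updates surviving weights and keeps pruned ones at zero."""
    return [w - lr * g if keep else 0.0
            for w, g, keep in zip(weights, grads, mask)]


# Hypothetical flat weight vector and gradient, for illustration only.
weights = [0.8, -0.05, 0.02, -1.3]
pruned, mask = prune_by_magnitude(weights, threshold=0.1)
updated = masked_sgd_step(pruned, [0.5, 0.5, 0.5, 0.5], mask)
```

Carrying the mask into retraining is the design choice that matters here: without it, gradient updates would immediately turn the removed connections back on.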
B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Apr. 2017…
Paper information. Title: Efficient Graph Convolution for Joint Node Representation Learning and Clustering. Authors: Chakib Fettal, Lazhar Labiod, Mohamed Nadif. Source: 2021, WSDM. Paper link: download. Code: download. 1 Introduction: Solves node embedding and clustering within a single unified framework. 2 Method: Overall framework: 2.1 Joint Graph Rep…
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices…
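Author: Tsaipei Wang, Member, IEEE. Published in: IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART B: CYBERNETICS, VOL. 41, NO. 3, JUNE 2011. This is a paper on cluster ensembles. The author proposes a cluster-ensemble method named CA-Tree, built on a hierarchical structure (a dendrogram). The structure is broadly the same as in hierarchical clustering (HC), though the method is reported to perform better than HC, and it is also suited to relatively large data samples…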
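Paper notes: G. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network." 2015. How can the knowledge of an ensemble of models, or of one very large model, be compressed into a small model that is easier to deploy? Very large models are trained because they extract the structure in the data more easily (why?). Knowledge should be understood as the mapping from inputs to outputs, not as the learned parameter values. A model's generalization comes from the relative probabilities of the wrong answers (a BMW is more likely to be mistaken for a truck than for a carrot), and generalization is…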
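The point about relative probabilities of wrong answers can be illustrated with the temperature-scaled softmax used in the distillation paper. The sketch below is an illustration only: the function name `softmax_with_temperature` and the logit values are hypothetical, and the temperature is taken from the paper itself, not from the notes above. Raising the temperature softens the distribution so that the small probabilities on wrong classes become visible to a student model.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical teacher logits for the classes (BMW, truck, carrot).
teacher_logits = [10.0, 5.0, -2.0]
hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=5.0)
```

At temperature 1 almost all mass sits on "BMW"; at temperature 5 "truck" receives a clearly larger share than "carrot", which is the relative-similarity signal the notes describe.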
References: 1. PBA_paper; 2. github; 3. Berkeley_blog; 4. pabbeel_berkeley_EECS_homepage; End…