D. Alistarh, D. Grubic, J. Li, R. Tomioka, and M. Vojnovic, "QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding," Advances in Neural Information Processing Systems, vol. 30, 2017.
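The citation stands alone here; as a reminder of the mechanism named in the title, below is a minimal numpy sketch of QSGD-style stochastic gradient quantization: each coordinate is rounded, unbiasedly, to one of s uniform levels of |v_i|/||v||_2. The function name and the choice of s are illustrative; the paper's Elias coding of the resulting integer levels is omitted.

```python
import numpy as np

def qsgd_quantize(v, s, rng=np.random.default_rng()):
    """Stochastically quantize v to s levels per coordinate (QSGD-style).

    Each |v_i| / ||v||_2 falls in a bucket [l/s, (l+1)/s); it is rounded up
    with probability proportional to its position inside the bucket, so the
    quantizer is unbiased: E[Q(v)] = v.
    """
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    ratio = np.abs(v) / norm                # in [0, 1]
    l = np.floor(ratio * s)                 # lower quantization level
    p = ratio * s - l                       # probability of rounding up
    level = l + (rng.random(v.shape) < p)   # stochastic rounding
    return norm * np.sign(v) * level / s
```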
J. N. Tsitsiklis and Z.-Q. Luo, "Communication complexity of convex optimization," Journal of Complexity, vol. 3, no. 3, pp. 231–243, Sep. 1987, doi: 10.1016/0885-064x(87)90013-6. Problem description: two users each hold a convex function \(f_i\) and must exchange as few binary messages as possible to locate the minimizer of \(f_1+f_2\). Basic definitions: \(…
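To make the setup concrete, here is a toy one-dimensional simulation of such a protocol (an illustration, not the paper's construction): the minimizer of \(f_1+f_2\) is located by bisection, with user 2 sending its derivative at each midpoint quantized to k bits and user 1 replying with a 1-bit search direction, so communication is roughly (k+1) bits per round over O(log(1/ε)) rounds. The function names, the derivative bound G, and the bit budget k are assumptions.

```python
import numpy as np

def quantize(x, G, k):
    """Encode x in [-G, G] with k bits (uniform quantizer); return the decoded value."""
    levels = 2 ** k
    idx = int(np.clip((x + G) / (2 * G) * (levels - 1), 0, levels - 1))
    return -G + idx * 2 * G / (levels - 1)

def bisect_protocol(df1, df2, lo, hi, G=10.0, k=16, tol=1e-6):
    """Toy protocol: user 1 holds df1, user 2 holds df2 (derivatives of convex f1, f2).

    Each round: user 2 sends df2(mid) quantized to k bits; user 1 adds its own
    derivative, decides the search direction, and replies with 1 bit.
    Returns the approximate minimizer of f1 + f2 and the bits exchanged.
    """
    bits = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        q = quantize(df2(mid), G, k)    # k bits from user 2
        bits += k
        if df1(mid) + q > 0:            # combined slope positive -> minimizer is left
            hi = mid
        else:
            lo = mid
        bits += 1                       # 1-bit direction reply from user 1
    return 0.5 * (lo + hi), bits

# Example: f1(x) = (x-1)^2, f2(x) = (x+3)^2; the minimizer of the sum is -1.
x_star, used = bisect_protocol(lambda x: 2 * (x - 1), lambda x: 2 * (x + 3), -10, 10)
```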
Paper <A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding>. Pruning works by learning only the important connections: all connections with weights below a threshold are removed from the network, and the network is then retrained to learn the…
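A minimal numpy sketch of the magnitude-pruning step just described, assuming a dense weight matrix and a hand-picked threshold (both illustrative); the mask keeps pruned weights at exactly zero during retraining. The pipeline's later quantization and Huffman-coding stages are omitted here.

```python
import numpy as np

def prune_by_magnitude(W, threshold):
    """Drop connections whose |weight| falls below the threshold; return pruned W and mask."""
    mask = np.abs(W) >= threshold
    return W * mask, mask

def masked_sgd_step(W, grad, mask, lr=0.01):
    """Retraining step that updates only the surviving connections,
    so pruned weights stay exactly zero."""
    return W - lr * (grad * mask)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(256, 128))
W, mask = prune_by_magnitude(W, threshold=0.05)
print(f"sparsity: {1 - mask.mean():.1%}")   # fraction of connections removed
```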
B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Apr. 2017…
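This is the FedAvg paper: each round, clients run local SGD from the current global weights on their own data, and the server averages the returned weights, weighted by local dataset size. Below is a minimal numpy sketch of one such round; the least-squares client model, learning rate, and epoch counts are stand-in assumptions.

```python
import numpy as np

def client_update(w, X, y, lr=0.1, epochs=5):
    """Local SGD on one client's data (least-squares model as a stand-in)."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w_global, clients):
    """One FedAvg round: broadcast w_global, let clients train locally,
    then return the data-size-weighted average of the client weights."""
    n_total = sum(len(y) for _, y in clients)
    w_new = np.zeros_like(w_global)
    for X, y in clients:
        w_new += (len(y) / n_total) * client_update(w_global, X, y)
    return w_new

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):                       # three clients with non-identical data
    X = rng.normal(size=(40, 2))
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=40)))
w = np.zeros(2)
for _ in range(20):                      # several communication rounds
    w = fedavg_round(w, clients)
```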
Paper notes: G. Hinton, O. Vinyals, and J. Dean, "Distilling the Knowledge in a Neural Network." 2015. How can the knowledge of an ensemble of models, or of one very large model, be compressed into a single small model that is easier to deploy? Very large models are trained because they extract the structure in the data more easily (why?). Knowledge should be understood as the input-to-output mapping, not as the learned parameter values. A model's ability to generalize comes from the relative probabilities it assigns to wrong answers (a BMW is more likely to be misclassified as a truck than as a carrot), and generalization is lear…
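The training objective the paper proposes combines cross-entropy against the teacher's temperature-softened outputs (scaled by T^2 so its gradients stay comparable to the hard loss) with ordinary cross-entropy on the true labels. A minimal numpy sketch, where T and the mixing weight alpha are illustrative choices:

```python
import numpy as np

def log_softmax(z, T=1.0):
    """Temperature-scaled log-softmax; larger T gives softer distributions."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)       # numerical stability
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style distillation: soft cross-entropy against the teacher's
    temperature-softened outputs (scaled by T^2) plus standard cross-entropy
    on the hard labels."""
    p_teacher = np.exp(log_softmax(teacher_logits, T))
    soft = -(p_teacher * log_softmax(student_logits, T)).sum(axis=-1).mean() * T * T
    hard = -log_softmax(student_logits)[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard
```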