Sparse Coding

  • Sparse coding is a class of unsupervised methods for learning sets of over-complete bases to represent data efficiently. The aim of sparse coding is to find a set of basis vectors $\phi_1, \dots, \phi_k$ such that an input vector $x \in \mathbb{R}^n$ can be represented as a linear combination of these basis vectors:

    $$x = \sum_{i=1}^{k} a_i \phi_i$$

  • The advantage of having an over-complete basis (more basis vectors than input dimensions, $k > n$) is that our basis vectors are better able to capture structures and patterns inherent in the input data.
  • However, with an over-complete basis, the coefficients $a_i$ are no longer uniquely determined by the input vector $x$. Therefore, in sparse coding, we introduce the additional criterion of sparsity to resolve the degeneracy introduced by over-completeness.
  • The sparse coding cost function is defined on a set of $m$ input vectors $x^{(1)}, \dots, x^{(m)}$ as:

    $$\min_{a_i^{(j)},\, \phi_i} \; \sum_{j=1}^{m} \left( \left\lVert x^{(j)} - \sum_{i=1}^{k} a_i^{(j)} \phi_i \right\rVert^2 + \lambda \sum_{i=1}^{k} S\!\left(a_i^{(j)}\right) \right)$$

    where $S(\cdot)$ is a sparsity cost function which penalizes $a_i$ for being far from zero. We can interpret the first term of the sparse coding objective as a reconstruction term which tries to force the algorithm to provide a good representation of $x$, and the second term as a sparsity penalty which forces our representation of $x$ (i.e., the learned features) to be sparse. The constant $\lambda$ is a scaling constant that determines the relative importance of these two contributions.

  • Although the most direct measure of sparsity is the $L_0$ norm, it is non-differentiable and difficult to optimize in general. In practice, common choices for the sparsity cost $S(\cdot)$ are the $L_1$ penalty $S(a_i) = |a_i|$ and the log penalty $S(a_i) = \log(1 + a_i^2)$.
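  • It is also possible to make the sparsity penalty arbitrarily small by scaling down $a_i$ and scaling up $\phi_i$ by some large constant. To prevent this from happening, we will constrain $\lVert \phi_i \rVert^2$ to be less than some constant $C$. The full sparse coding cost function hence is:

    $$\begin{aligned} \min_{a_i^{(j)},\, \phi_i} \quad & \sum_{j=1}^{m} \left( \left\lVert x^{(j)} - \sum_{i=1}^{k} a_i^{(j)} \phi_i \right\rVert^2 + \lambda \sum_{i=1}^{k} S\!\left(a_i^{(j)}\right) \right) \\ \text{subject to} \quad & \lVert \phi_i \rVert^2 \le C, \quad \forall i = 1, \dots, k \end{aligned}$$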

    where the constant is usually set

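  • One problem is that the norm constraint cannot be enforced using simple gradient-based methods. Hence, in practice, this constraint is weakened to a "weight decay" term designed to keep the entries of the basis small. Writing $D = [\phi_1, \dots, \phi_k]$ for the matrix whose columns are the basis vectors and $a$ for the coefficient vector of a single input $x$, the cost for that input becomes:

    $$J(D, a) = \lVert Da - x \rVert_2^2 + \lambda \lVert a \rVert_1 + \gamma \lVert D \rVert_F^2$$

    where $\lVert D \rVert_F^2$ is the sum of the squared entries of $D$ and $\gamma$ controls the strength of the weight decay.
  • Another problem is that the $L_1$ norm is not differentiable at 0, and hence poses a problem for gradient-based methods. To "smooth out" the $L_1$ norm so that gradient descent can be used, we use $\sqrt{a_i^2 + \epsilon}$ in place of $|a_i|$, where $\epsilon$ is a "smoothing parameter" which can also be interpreted as a sort of "sparsity parameter" (to see this, observe that when $\epsilon$ is large compared to $a_i^2$, the sum $a_i^2 + \epsilon$ is dominated by $\epsilon$, and taking the square root yields approximately $\sqrt{\epsilon}$).
  • Hence, the final objective function is:

    $$J(D, a) = \lVert Da - x \rVert_2^2 + \lambda \sum_{i=1}^{k} \sqrt{a_i^2 + \epsilon} + \gamma \lVert D \rVert_F^2$$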
  • The set of basis vectors is called the "dictionary" $D$. $D$ is "adapted" to an input $x$ if it can represent it with a few basis vectors, that is, if there exists a sparse vector $a \in \mathbb{R}^k$ such that $x \approx Da$. We call $a$ the sparse code.
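
As a small numerical illustration of why the sparsity term resolves the ambiguity of an over-complete basis, the sketch below evaluates a smoothed objective (reconstruction error, plus a sqrt(a_i^2 + eps) sparsity penalty, plus weight decay on the dictionary) for two different codes that both reconstruct the same input exactly. The toy dictionary, the parameter values and the function names `smoothed_l1` and `objective` are assumptions made for this example only, not part of the original notes.

```python
import numpy as np

def smoothed_l1(a, eps=1e-6):
    """Smooth stand-in for the L1 norm: sum_i sqrt(a_i^2 + eps)."""
    return np.sum(np.sqrt(a ** 2 + eps))

def objective(D, a, x, lam=0.1, gamma=0.01, eps=1e-6):
    """Reconstruction error + smoothed sparsity penalty + weight decay on D."""
    residual = D @ a - x
    return residual @ residual + lam * smoothed_l1(a, eps) + gamma * np.sum(D ** 2)

# Toy over-complete dictionary (assumed): k = 3 basis vectors as columns, n = 2.
D = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
x = np.array([1.0, 1.0])

a_dense = np.array([1.0, 1.0, 0.0])   # x = phi_1 + phi_2 (two active coefficients)
a_sparse = np.array([0.0, 0.0, 1.0])  # x = phi_3 (one active coefficient)

cost_dense = objective(D, a_dense, x)
cost_sparse = objective(D, a_sparse, x)
print(cost_dense, cost_sparse)
```

Both codes have zero reconstruction error and share the same weight decay term, so the only difference between the two costs comes from the sparsity penalty, which favours the code with a single active coefficient.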

Learning

  • Learning a set of basis vectors using sparse coding consists of performing two separate optimizations in alternation (i.e., an alternating optimization method):

    • The first is an optimization over the coefficients $a_i$ for each training example.
    • The second is an optimization over the basis vectors $\phi_i$ across many training examples at once.
  • This classical scheme, which alternates between optimizing the dictionary $D$ and the sparse codes, can achieve good results, but it is very slow.
  • A significant limitation of sparse coding is that even after a set of basis vectors has been learned, an optimization must still be performed to obtain the coefficients that "encode" a new data example. This significant "runtime" cost means that sparse coding is computationally expensive even at test time, especially compared to typical feed-forward architectures.
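
The alternating scheme described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method used in the original notes: it uses the plain L1 penalty, solves the coefficient step with ISTA (iterative soft-thresholding), and handles the dictionary step with projected gradient descent that keeps every basis vector inside the unit ball (i.e., the norm constraint with C = 1). All function names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|, applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def objective(D, A, X, lam):
    """||X - D A||_F^2 + lam * sum |A_ij| (reconstruction + L1 sparsity cost)."""
    return np.sum((X - D @ A) ** 2) + lam * np.sum(np.abs(A))

def update_codes(D, A, X, lam, n_iter=50):
    """Step 1: with D fixed, run ISTA iterations on the coefficients A."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def update_dictionary(D, A, X, n_iter=50):
    """Step 2: with A fixed, take projected gradient steps on D."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2
    if L == 0.0:
        return D  # all codes are zero, nothing to fit
    for _ in range(n_iter):
        grad = 2.0 * (D @ A - X) @ A.T
        D = D - grad / L
        # Project each column (basis vector) back onto the unit ball.
        D = D / np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D

def learn_dictionary(X, k, lam=0.1, n_outer=20, seed=0):
    """Alternate between the two steps; X holds one example per column."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    D = rng.standard_normal((n, k))
    D /= np.linalg.norm(D, axis=0)  # start with unit-norm basis vectors
    A = np.zeros((k, m))
    history = [objective(D, A, X, lam)]
    for _ in range(n_outer):
        A = update_codes(D, A, X, lam)   # optimize coefficients per example
        D = update_dictionary(D, A, X)   # optimize basis over all examples
        history.append(objective(D, A, X, lam))
    return D, A, history

# Synthetic data, assumed for illustration: 200 examples in 8 dimensions.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 200))
D, A, history = learn_dictionary(X, k=16)
print(history[0], history[-1])
```

Note that encoding a new example with the learned dictionary still means calling `update_codes` with `D` held fixed, which is exactly the test-time optimization cost mentioned above.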

Remarks

  • In my view, because dictionary learning enforces sparseness on the code, the matrix reconstructed from the sparse codes drops much of the noise in the original matrix, i.e., it has a denoising effect. Hence, sparse coding could be used to denoise images.
